Postgres vs Elasticsearch while enriching data
Vlad Somov @ Salt Edge Inc.
Unstructured Data Enrichment
[Diagram: incoming raw data ("Some Transaction Description Website") is matched by keywords (Keyword1, Keyword2) and website to structured identified data (Name, Tag).]
Basic Setup Performance
[Bar chart, ~4 mln records, time in seconds]
                 Min     Average   Max
Postgres         2.10    9.88      28.73
Elasticsearch    0.73    0.99      1.37
B-tree index structure
[Animated diagram: searching for key 39. The tree consists of a meta page, a root node (3, 39, 68), internal nodes (3, 15, 28), (39, 42, 55) and (68, 89, 94), and a leaf level holding 3 9 15 21 29 32 39 42 42 48 55 68 68 77 89 93 94 98.]
Why is it useful?
• A B-tree index keeps the values inside each node sorted.

• A B-tree is balanced.

• Nodes on the same level are connected by a doubly linked list.
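To illustrate why sorted nodes and linked siblings matter, here is a minimal Python sketch of the leaf level only. The `Leaf` class and the `build_leaves` and `range_scan` helpers are hypothetical names for illustration, not PostgreSQL internals, and only the right-hand sibling link is modeled. The keys are the leaf values from the B-tree slide.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:
    """One leaf page: sorted keys plus a link to the right sibling."""
    keys: List[int]
    next: Optional["Leaf"] = None


def build_leaves(sorted_keys: List[int], page_size: int) -> List[Leaf]:
    """Split already-sorted keys into linked leaf pages."""
    leaves = [Leaf(sorted_keys[i:i + page_size])
              for i in range(0, len(sorted_keys), page_size)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_scan(leaves: List[Leaf], low: int, high: int) -> List[int]:
    """Return all keys in [low, high]."""
    # Stand-in for the root-to-leaf descent: binary search over the
    # first key of every page to find where the scan has to start.
    first_keys = [leaf.keys[0] for leaf in leaves]
    start = max(bisect_left(first_keys, low) - 1, 0)
    leaf: Optional[Leaf] = leaves[start]
    result: List[int] = []
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next  # follow the sibling link, no new descent
    return result


keys = [3, 9, 15, 21, 29, 32, 39, 42, 42, 48, 55, 68, 68, 77, 89, 93, 94, 98]
leaves = build_leaves(keys, page_size=6)
print(range_scan(leaves, 39, 55))
```

Because the keys are sorted, the scan can stop at the first key above the upper bound, and because siblings are linked, it never has to go back to the root.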
Performance after multicolumn index on country_id and merchant_type
[Bar chart, ~4 mln records, time in seconds]
                               Min     Average   Max
Postgres                       2.10    9.88      28.73
Elasticsearch                  0.73    0.99      1.37
Postgres + multicolumn index   2.28    5.09      10.19
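A multicolumn index of this kind could be sketched as follows. To keep the snippet self-contained it uses SQLite from the Python standard library as a stand-in for Postgres; the `CREATE INDEX` statement itself has the same shape in both. The table name `merchants`, the `name` column, and the index name are assumptions; only `country_id` and `merchant_type` come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE merchants ("
    " id INTEGER PRIMARY KEY, name TEXT,"
    " country_id INTEGER, merchant_type TEXT)"
)
conn.executemany(
    "INSERT INTO merchants (name, country_id, merchant_type) VALUES (?, ?, ?)",
    [(f"merchant-{i}", i % 50, "online" if i % 2 else "retail")
     for i in range(1000)],
)

# One index over both columns that appear together in the WHERE clause.
conn.execute(
    "CREATE INDEX idx_merchants_country_type"
    " ON merchants (country_id, merchant_type)"
)

# Ask the planner how it would run a query constrained on both columns.
plan = conn.execute(
    "EXPLAIN QUERY PLAN"
    " SELECT name FROM merchants"
    " WHERE country_id = ? AND merchant_type = ?",
    (7, "online"),
).fetchall()
plan_text = " ".join(row[3] for row in plan)
print(plan_text)
```

The printed plan should name the new index, which means both constraints are resolved in a single index lookup instead of a full table scan.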
What is GiST
Generalized Search Tree
• In GiST, each leaf entry contains a logical expression (predicate) and a pointer to a TID; the indexed data must satisfy that expression.

• Faster on insert and update.
What is GIN
Generalized Inverted Index
• It is a B-tree of elements, where each element links to another B-tree or a plain list of TIDs.

• Faster and more accurate on select.
Welcome to ruby meditation.
All of us love ruby.
Does everyone love meditation?
[Animated diagram: a GIN index over the three documents. The upper node holds Everyone, Of, Welcome; the leaf entries All, Does, Everyone, Love, Meditation, Of, Ruby, To, Welcome each point to a list of TIDs such as 0,1, 1,5 and 2,1. A search for "love ruby" looks up "love" and "ruby" and arrives at TID 1,5.]
The yellow rectangles are TIDs. The first number is the page number and the second is the position on the page.
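The lookup shown in the diagram can be sketched as a plain inverted index in Python. This is a simplified illustration, not how GIN is implemented: a dictionary stands in for the B-tree of words, and the assignment of TIDs to sentences is an assumption inferred from the slide.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set, Tuple

TID = Tuple[int, int]  # (page number, position on the page)

# TIDs as they appear to be assigned in the diagram (an assumption).
documents: Dict[TID, str] = {
    (0, 1): "Welcome to ruby meditation.",
    (1, 5): "All of us love ruby.",
    (2, 1): "Does everyone love meditation?",
}


def build_index(docs: Dict[TID, str]) -> Dict[str, List[TID]]:
    """Map every lowercased word to a sorted posting list of TIDs."""
    postings: Dict[str, Set[TID]] = defaultdict(set)
    for tid, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            postings[word].add(tid)
    return {word: sorted(tids) for word, tids in sorted(postings.items())}


def search_all(index: Dict[str, List[TID]], query: str) -> List[TID]:
    """Return TIDs of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))  # intersect posting lists
    return sorted(result)


index = build_index(documents)
print(index["ruby"])                   # [(0, 1), (1, 5)]
print(index["love"])                   # [(1, 5), (2, 1)]
print(search_all(index, "love ruby"))  # [(1, 5)]
```

Only the posting lists of the query words are read, so the cost depends on how common the words are rather than on the size of the table.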
gin_trgm_ops
A trigram is a group of three consecutive characters
taken from a string.
We can measure the similarity of two strings by counting the
number of trigrams they share.
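As a rough illustration, trigram similarity can be sketched in a few lines of Python. This approximates what the pg_trgm extension does (lowercasing, padding each word with two leading spaces and one trailing space, and dividing the shared trigrams by the size of the union); the function names and the ASCII-only word splitting are simplifying assumptions.

```python
import re
from typing import Set


def trigrams(text: str) -> Set[str]:
    """Return the set of trigrams of a string, padded per word."""
    result: Set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("cat")))       # ['  c', ' ca', 'at ', 'cat']
print(similarity("ruby", "ruby"))    # 1.0
print(similarity("ruby", "rubi"))
```

A GIN index with gin_trgm_ops stores these trigrams as its keys, which is why it can narrow down candidates for a substring or similarity match on a website name without scanning every row.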
Performance after gin index on websites
[Bar chart, ~4 mln records, time in seconds]
                                                 Min     Average   Max
Postgres                                         2.10    9.88      28.73
Elasticsearch                                    0.75    0.99      1.37
Postgres + multicolumn index                     2.28    5.09      10.19
Postgres + gin index with trgm_ops on websites   0.26    0.34      0.55
How Elasticsearch works
• It runs all incoming data through an analyzer (either a custom one or the default one).

• Each analyzer has exactly one tokenizer.

• It has zero or more token filters.

• The tokenizer may be preceded by one or more character filters.
How does an analyzer work?
Input --String--> Char Filter --String--> Tokenizer --Tokens--> Token Filter --Tokens--> Output
Example
The 2 QUICK <p>Brown-Foxes</p> jumped over the lazy dog's bone.
html_strip
The 2 QUICK Brown-Foxes jumped over the lazy dog's bone.
standard tokenizer
The, 2, QUICK, Brown, Foxes, jumped, over, the, lazy, dog's, bone
lowercase
the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone
stop (removes: the, the)
2, quick, brown, foxes, jumped, over, lazy, dog's, bone
snowball (stems: fox, jump, lazi, dog)
2, quick, brown, fox, jump, over, lazi, dog, bone
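The stages of this example can be mimicked with a small Python pipeline. It is a simplified sketch, not Elasticsearch code: the regular expressions only approximate `html_strip` and the standard tokenizer, the stop-word list is a small illustrative subset, and the snowball stemming step is left out because it would need an external library.

```python
import re
from typing import List

STOP_WORDS = {"a", "an", "and", "of", "the", "to"}  # illustrative subset


def char_filter(text: str) -> str:
    """Rough html_strip: replace anything that looks like a tag by a space."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenizer(text: str) -> List[str]:
    """Split on non-alphanumerics, keeping an apostrophe suffix such as 's."""
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)


def lowercase_filter(tokens: List[str]) -> List[str]:
    return [token.lower() for token in tokens]


def stop_filter(tokens: List[str]) -> List[str]:
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text: str) -> List[str]:
    """Char filter, then tokenizer, then token filters in order."""
    tokens = tokenizer(char_filter(text))
    for token_filter in (lowercase_filter, stop_filter):
        tokens = token_filter(tokens)
    return tokens


tokens = analyze(
    "The 2 QUICK <p>Brown-Foxes</p> jumped over the lazy dog's bone."
)
print(tokens)
# ['2', 'quick', 'brown', 'foxes', 'jumped', 'over', 'lazy', "dog's", 'bone']
```

The order of the filters matters: the stop filter runs after lowercasing, otherwise the capitalized "The" would survive.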
Postgres full-text search implementation
• We can use the tsvector type, produced by the to_tsvector function, to achieve almost the same functionality.

• To improve performance, we can store the to_tsvector values in a separate tsvector column.

• To build a query, use to_tsquery with the operators & (AND), | (OR) and <-> (FOLLOWED BY).

• plainto_tsquery works with plain text, so you don't need to insert any special symbols. It inserts & between the tokens.

• phraseto_tsquery also works with plain text, but requires the tokens to follow each other. It inserts <-> between the tokens.
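The meaning of the three operators can be illustrated on a single document with a toy model in Python. This is not Postgres code: `to_vector` only lowercases and records word positions, whereas the real to_tsvector also stems words and drops stop words according to the text search configuration. All function names here are hypothetical.

```python
import re
from typing import Dict, List


def to_vector(text: str) -> Dict[str, List[int]]:
    """Map each lowercased word to its 1-based positions in the text."""
    vector: Dict[str, List[int]] = {}
    words = re.findall(r"[a-z0-9]+", text.lower())
    for position, word in enumerate(words, start=1):
        vector.setdefault(word, []).append(position)
    return vector


def match_and(vector: Dict[str, List[int]], terms: List[str]) -> bool:
    """a & b: every term occurs somewhere in the document."""
    return all(term in vector for term in terms)


def match_or(vector: Dict[str, List[int]], terms: List[str]) -> bool:
    """a | b: at least one term occurs in the document."""
    return any(term in vector for term in terms)


def match_phrase(vector: Dict[str, List[int]], terms: List[str]) -> bool:
    """a <-> b: the terms occur at consecutive positions, in order."""
    if not terms:
        return False
    return any(
        all(start + offset in vector.get(term, [])
            for offset, term in enumerate(terms))
        for start in vector.get(terms[0], [])
    )


doc = to_vector("All of us love ruby.")
print(doc)                                  # {'all': [1], 'of': [2], 'us': [3], 'love': [4], 'ruby': [5]}
print(match_and(doc, ["ruby", "love"]))     # True
print(match_phrase(doc, ["love", "ruby"]))  # True
print(match_phrase(doc, ["ruby", "love"]))  # False
```

The phrase check needs word positions, while the AND and OR checks only need to know whether a word is present.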
RUM access method
• Based on the GIN access method code

• Solves slow ranking

• Solves slow phrase search (tsquery with the <-> operator)

• Supports an index on a tsquery column
Welcome to ruby meditation.
All of us love ruby.
Does everyone love meditation?
ruby, meditation, love
[Animated diagram: the index from the GIN example, now with TIDs 0,1, 1,5, 2,1 and 8,4 in the posting lists. Next to each TID, a green rectangle stores the position of the word in the document. The search for "love ruby" is repeated on this index.]
The number in the green rectangle is the word position in the document.
Conclusion
• Postgres can also be fast.

• Multicolumn indexes can improve performance if your search has constraints on several columns.

• For fast text search, prefer GIN when the table is rarely updated; otherwise use GiST.

• Use GIN with gin_trgm_ops for text search. If it is still slow, try the tsvector data type with a GIN index on it.

• When you have an "inverse full-text search" problem, store the queries in your table in a tsquery column and treat the incoming data as the document. Add a RUM index with rum_tsquery_ops on the query column for fast classification.

• Before moving to another tool, analyze both the current and the new tool and verify whether the move is worth it.
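The "inverse full-text search" idea can be sketched in Python: the stored rows are the queries, and each incoming description is the document tested against them. The rules, tags and sample descriptions below are invented for illustration, and the linear scan over all rules is exactly the work that an index on the query column is meant to avoid.

```python
import re
from typing import Dict, FrozenSet, List

# Hypothetical stored queries: every keyword must occur in the description,
# the equivalent of a tsquery such as 'keyword1 & keyword2'.
RULES: Dict[str, FrozenSet[str]] = {
    "Coffee shop": frozenset({"coffee", "espresso"}),
    "Online bookstore": frozenset({"books", "online"}),
    "Taxi": frozenset({"taxi"}),
}


def tokenize(description: str) -> FrozenSet[str]:
    """Lowercase the description and split it into a set of words."""
    return frozenset(re.findall(r"[a-z0-9]+", description.lower()))


def classify(description: str) -> List[str]:
    """Return the tags of all stored queries the description satisfies."""
    tokens = tokenize(description)
    return sorted(tag for tag, keywords in RULES.items() if keywords <= tokens)


print(classify("POS 1234 ESPRESSO BAR coffee to go"))    # ['Coffee shop']
print(classify("Online order: books and a taxi ride"))   # ['Online bookstore', 'Taxi']
```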
email: vlad.somov@icloud.com
twitter: @vsomov93
Questions?

New features in Rails 6 - Nihad Abbasov (RUS) | Ruby Meditation 26
 
Security Scanning Overview - Tetiana Chupryna (RUS) | Ruby Meditation 26
Security Scanning Overview - Tetiana Chupryna (RUS) | Ruby Meditation 26Security Scanning Overview - Tetiana Chupryna (RUS) | Ruby Meditation 26
Security Scanning Overview - Tetiana Chupryna (RUS) | Ruby Meditation 26
 
Teach your application eloquence. Logs, metrics, traces - Dmytro Shapovalov (...
Teach your application eloquence. Logs, metrics, traces - Dmytro Shapovalov (...Teach your application eloquence. Logs, metrics, traces - Dmytro Shapovalov (...
Teach your application eloquence. Logs, metrics, traces - Dmytro Shapovalov (...
 
Best practices. Exploring - Ike Kurghinyan (RUS) | Ruby Meditation 26
Best practices. Exploring - Ike Kurghinyan (RUS) | Ruby Meditation 26Best practices. Exploring - Ike Kurghinyan (RUS) | Ruby Meditation 26
Best practices. Exploring - Ike Kurghinyan (RUS) | Ruby Meditation 26
 
Road to A/B testing - Alexey Vasiliev (ENG) | Ruby Meditation 25
Road to A/B testing - Alexey Vasiliev (ENG) | Ruby Meditation 25Road to A/B testing - Alexey Vasiliev (ENG) | Ruby Meditation 25
Road to A/B testing - Alexey Vasiliev (ENG) | Ruby Meditation 25
 
Concurrency in production. Real life example - Dmytro Herasymuk | Ruby Medita...
Concurrency in production. Real life example - Dmytro Herasymuk | Ruby Medita...Concurrency in production. Real life example - Dmytro Herasymuk | Ruby Medita...
Concurrency in production. Real life example - Dmytro Herasymuk | Ruby Medita...
 
Data encryption for Ruby web applications - Dmytro Shapovalov (RUS) | Ruby Me...
Data encryption for Ruby web applications - Dmytro Shapovalov (RUS) | Ruby Me...Data encryption for Ruby web applications - Dmytro Shapovalov (RUS) | Ruby Me...
Data encryption for Ruby web applications - Dmytro Shapovalov (RUS) | Ruby Me...
 
Rails App performance at the limit - Bogdan Gusiev
Rails App performance at the limit - Bogdan GusievRails App performance at the limit - Bogdan Gusiev
Rails App performance at the limit - Bogdan Gusiev
 
GDPR. Next Y2K in 2018? - Anton Tkachov | Ruby Meditation #23
GDPR. Next Y2K in 2018? - Anton Tkachov | Ruby Meditation #23GDPR. Next Y2K in 2018? - Anton Tkachov | Ruby Meditation #23
GDPR. Next Y2K in 2018? - Anton Tkachov | Ruby Meditation #23
 

Kürzlich hochgeladen

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 

Kürzlich hochgeladen (20)

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 

Postgres vs Elasticsearch while enriching data - Vlad Somov | Ruby Meditation #23

  • 1. Postgres vs Elasticsearch while enriching data. Vlad Somov @ Salt Edge Inc.
  • 2-3. Unstructured data enrichment (diagram). Incoming raw data (for example, some transaction description and a website) is matched against stored records with the fields Keyword1, Keyword2, Website, Name and Tag, producing structured, identified data.
  • 4-6. Basic setup performance, ~4 mln records (bar chart, seconds). Postgres: min 2.10, average 9.88, max 28.73. Elasticsearch: min 0.73, average 0.99, max 1.37.
  • 7-12. B-tree index structure (diagram). Meta page; root node 3, 39, 68; internal nodes (3, 15, 28), (39, 42, 55), (68, 89, 94); leaf values 3 9 15 21 29 32 39 42 42 48 55 68 68 77 89 93 94 98. Slides 8-12 animate a lookup of the key 39 from the root down to the leaf level.
  • 13. Why is it useful? • A B-tree index keeps the values inside each node sorted. • A B-tree is balanced. • Nodes on the same level are connected by a doubly linked list.
  • 14-17. Performance after adding a multicolumn index on country_id and merchant_type, ~4 mln records (bar chart, seconds). Postgres: min 2.1, average 9.88, max 28.73. Elasticsearch: min 0.73, average 0.99, max 1.37. Postgres + multicolumn index: min 2.28, average 5.09, max 10.19.
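
A minimal sketch of why sorted keys help with multicolumn constraints. It is not how Postgres stores a B-tree; a sorted Python list and `bisect` stand in for the ordered index, and the table rows and the `lookup` helper are hypothetical.

```python
from bisect import bisect_left

# Hypothetical table rows: (row_id, country_id, merchant_type).
table = [
    (0, 1, "cafe"),
    (1, 2, "shop"),
    (2, 1, "shop"),
    (3, 1, "cafe"),
    (4, 2, "cafe"),
]

# The "index": composite keys kept in sorted order, each ending with the row id.
index = sorted((country_id, merchant_type, row_id)
               for row_id, country_id, merchant_type in table)


def lookup(index, country_id, merchant_type):
    """Return row ids matching both columns using two binary searches."""
    # A shorter tuple sorts before any longer tuple that shares its prefix,
    # so this finds the first entry for (country_id, merchant_type).
    lo = bisect_left(index, (country_id, merchant_type))
    # float("inf") is larger than any row id, so this finds the end of the run.
    hi = bisect_left(index, (country_id, merchant_type, float("inf")))
    return [row_id for _, _, row_id in index[lo:hi]]


print(lookup(index, 1, "cafe"))
print(lookup(index, 2, "shop"))
```

Because the matching entries are contiguous in sorted order, the lookup touches only that slice instead of scanning every row.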
  • 18. What is GiST (Generalized Search Tree)? • In GiST each leaf contains a logical expression and a pointer to a TID; the indexed data must satisfy that expression. • Faster on insert and update. What is GIN (Generalized Inverted Index)? • It is a B-tree of elements, each of which is linked to another B-tree or a plain list of TIDs. • Faster and more accurate on select.
  • 19-21. GIN example (diagram). Three documents are indexed: "Welcome to ruby meditation.", "All of us love ruby." and "Does everyone love meditation?". The index is a tree over the words All, Does, Everyone, Love, Meditation, Of, Ruby, To and Welcome, and each word points to the TIDs (0,1; 1,5; 2,1) of the rows that contain it. The yellow rectangles are TIDs: the first number is a page number and the second is the position on that page. Slides 20-21 animate a search for "ruby" and "love", which ends at TID 1,5.
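
The idea behind the diagram can be sketched as a tiny inverted index. This is an illustration only: a real GIN index keeps its keys in a B-tree and its posting lists sorted, while this sketch uses a plain dictionary of sets. The assignment of TIDs to sentences is an assumption based on the search result shown on the slides.

```python
import re
from collections import defaultdict

# Assumed mapping of TIDs (page, position) to the three example sentences.
docs = {
    (0, 1): "Welcome to ruby meditation.",
    (1, 5): "All of us love ruby.",
    (2, 1): "Does everyone love meditation?",
}


def build_inverted_index(docs):
    """Map each lowercased word to the set of TIDs whose text contains it."""
    index = defaultdict(set)
    for tid, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(tid)
    return index


def search_all(index, *words):
    """Return the TIDs that contain every given word (AND semantics)."""
    postings = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(docs)
print(search_all(index, "ruby", "love"))
```

Searching for several words is an intersection of their posting lists, which is why selects on such an index are cheap.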
  • 22. gin_trgm_ops A trigram is a group of three consecutive characters taken from a string. We can measure the similarity of two strings by counting the number of trigrams they share.
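
A rough sketch of trigram similarity. It approximates what the pg_trgm extension does (lowercase the text, pad each word with two leading spaces and one trailing space, then compare trigram sets), but it is a hypothetical re-implementation for illustration, not the extension's code.

```python
import re


def trigrams(text):
    """Return the set of padded trigrams for every word in the text."""
    grams = set()
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = f"  {word} "  # two spaces before, one after
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("cat")))
print(round(similarity("word", "two words"), 4))
```

A GIN index with gin_trgm_ops indexes these trigrams, so candidate rows can be found by looking up the trigrams of the search string.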
  • 23-27. Performance after adding a GIN index on websites, ~4 mln records (bar chart, seconds). Postgres: min 2.1, average 9.88, max 28.73. Elasticsearch: min 0.75, average 0.99, max 1.37. Postgres + multicolumn index: min 2.28, average 5.09, max 10.19. Postgres + GIN index with trgm_ops on websites: min 0.26, average 0.34, max 0.55.
  • 28. How Elasticsearch works • It runs an analyzer (a custom one or the default) over all incoming data. • Each analyzer has exactly one tokenizer. • The tokenizer is followed by zero or more token filters. • The tokenizer may be preceded by one or more character filters.
  • 31-34. How does an analyzer work? Input -> (string) -> char filter -> (string) -> tokenizer -> (tokens) -> token filter -> (tokens) -> output.
  • 36-41. Example. Input: The 2 QUICK <p>Brown-Foxes</p> jumped over the lazy dog's bone. • html_strip: The 2 QUICK Brown-Foxes jumped over the lazy dog's bone. • standard tokenizer: The, 2, QUICK, Brown, Foxes, jumped, over, the, lazy, dog's, bone • lowercase: the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone • stop: 2, quick, brown, foxes, jumped, over, lazy, dog's, bone (both occurrences of "the" are removed) • snowball: 2, quick, brown, fox, jump, over, lazi, dog, bone
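
The analyzer chain can be sketched as plain function composition. This is a toy model, not Elasticsearch code: the regular expressions and the stop-word list are simplified assumptions, and the snowball stemming step is left out because it needs a stemming library.

```python
import re

# Tiny illustrative stop-word list; real analyzers ship much longer ones.
STOP_WORDS = {"the", "a", "an", "and", "of"}


def html_strip(text):
    """Char filter: replace HTML tags with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Tokenizer: split on non-alphanumerics, keeping apostrophe suffixes."""
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)


def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [token.lower() for token in tokens]


def stop(tokens):
    """Token filter: drop stop words."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text):
    """Run the char filter, then the tokenizer, then each token filter."""
    tokens = tokenize(html_strip(text))
    for token_filter in (lowercase, stop):
        tokens = token_filter(tokens)
    return tokens


print(analyze("The 2 QUICK <p>Brown-Foxes</p> jumped over the lazy dog's bone."))
```

Each stage only sees the output of the previous one, which is what lets filters be swapped or reordered independently.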
  • 42. Postgres full-text search implementation • The tsvector type, built with the to_tsvector function, gives almost the same functionality. • To improve performance, store the to_tsvector values in a separate tsvector column. • Queries are built with to_tsquery, using the operators & (AND), | (OR) and <-> (followed by). • plainto_tsquery takes plain text, so you don't need to insert any special symbols; it inserts & between tokens. • phraseto_tsquery also takes plain text but requires the tokens to be next to each other; it inserts <-> between tokens.
  • 43. Rum access method • Based on GIN access method code • Solves slow ranking • Solves slow phrase search (tsquery with <-> operator) • Supports index on tsquery column
  • 44-46. RUM example (diagram). The same three documents and word tree as in the GIN example ("Welcome to ruby meditation.", "All of us love ruby.", "Does everyone love meditation?"), plus the text "ruby, meditation, love" with TID 8,4. Each TID now also carries a number in a green rectangle: the position of the word in the document. Slides 45-46 animate a search for "love" followed by "ruby", which ends at TID 1,5.
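
Storing word positions next to each posting is what makes phrase search cheap. The sketch below is a simplified, hypothetical model of that idea (a dictionary of positions per document), not the RUM implementation; the document ids 1 to 3 are placeholders.

```python
import re
from collections import defaultdict

# Placeholder document ids for the three example sentences.
docs = {
    1: "Welcome to ruby meditation.",
    2: "All of us love ruby.",
    3: "Does everyone love meditation?",
}


def build_positional_index(docs):
    """Map word -> {doc_id: [1-based positions of the word in that doc]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        words = re.findall(r"\w+", text.lower())
        for position, word in enumerate(words, start=1):
            index[word].setdefault(doc_id, []).append(position)
    return index


def phrase_search(index, first, second):
    """Return docs where `second` directly follows `first` (like first <-> second)."""
    first_postings = index.get(first, {})
    second_postings = index.get(second, {})
    hits = set()
    for doc_id in first_postings.keys() & second_postings.keys():
        following = set(second_postings[doc_id])
        if any(position + 1 in following for position in first_postings[doc_id]):
            hits.add(doc_id)
    return hits


index = build_positional_index(docs)
print(phrase_search(index, "love", "ruby"))
```

Without positions in the index, the adjacency check would require re-reading each candidate document.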
  • 47. Conclusion • Postgres can also be fast. • Multicolumn indexes can improve performance if your search has constraints on several columns. • For fast text search, prefer GIN when the table is rarely updated; otherwise use GiST. • Use GIN with trgm_ops when using full-text search. If full-text search is still slow, try the tsvector data type with a GIN index on it. • If you have an 'inverse full-text search' problem, store the queries in your table as a tsquery column and treat incoming data as the document. Add a RUM index with tsquery_ops on the query column for fast classification. • Before moving to another tool, analyze both the current and the new tool and verify whether the move is worth it.
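
The 'inverse full-text search' idea can be sketched as follows: the stored rows are queries, and each incoming description is the document tested against them. This is a naive linear scan for illustration; the rule names and keywords are hypothetical, and the point of an index on the query column is to avoid checking every rule.

```python
import re

# Hypothetical stored queries: label -> words that must all be present (AND).
RULES = {
    "Coffee shop": {"starbucks"},
    "Ride sharing": {"uber", "trip"},
}


def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def classify(description, rules):
    """Return the labels of all stored queries that the description satisfies."""
    words = tokens(description)
    return sorted(label for label, required in rules.items() if required <= words)


print(classify("UBER *TRIP 1234 help.uber.com", RULES))
```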