SlideShare ist ein Scribd-Unternehmen logo
1 von 88
Why databases cry at night
Michael Yarichuk
Hibernating Rhinos
Magic?
This Photo by Unknown Author is licensed under CC BY-NC-ND
This Photo by Unknown Author
is licensed under CC BY-SA
Query
Nope. Not magic!
Databases are abstractions
The photo is licensed under CC BY-SA.
The Law of Leaky Abstractions
"All non-trivial abstractions, to some degree, are leaky."
- Joel Spolsky
Many shades of grey databases
Many shades of grey databases
RDBMS Key/Value Store Document Database Graph Database
MS-SQL LMDB RavenDB Neo4j
MySQL BerkeleyDB MongoDB OrientDB*
Oracle Cassandra* CouchDB ArangoDB
PostgreSQL Dynamo CosmosDB JanusGraph
Very different databases have
the same reasons for crying.
Things we will take a look at
Storage algorithms & Storage
Indexing & Queries
Network
Let’s start with something simple:
Storage.
It just works, no?
Famous last words!
Before we discuss storage, here is a riddle…
RavenDB server-wide backup failed
• The instance had multiple databases in single instance
• Plenty of memory and cores, resource usage is small
• Nothing else was running on the machine EXCEPT RavenDB
• Scheduled backup tasks fail soon after they started
The backup tasks started at the
same time!
Also, there were gazillion of databases
SAN
Storage
Database
A
Database
B
Database
C
Database
D
Database
E Database
F
Database
G
...
Database
FF
RavenDB’s failing backups
Approx. 200 databases doing backups at the same time WILL
cause storage saturation!
The solution was rather simple
Disk queue length can be an… issue
16
KB
8
KB
10
KB
16
KB
12
KB
12
KB
Disk Queue Length
Disk Write
Disk queue depth
16
KB
8
KB
10
KB
16
KB
12
KB
12
KB
Disk Queue Depth = 2
Disk Write
16
KB
8
KB
10
KB
16
KB
12
KB
8
KBThread 1
Thread 2
What can we do about storge issues?
• Load test database code to ensure
• Write-through throughput
• Enough IOPS for expected production load (disk queue length is <= 2)
• Cloud  provision IOPS
• Load-test application to find limits of the system
• Monitoring! (too long queues = storage bottleneck)
Storage performance benchmarks
• Sysinternals Process Monitor
• CrystalDiskMark
• ATTO Disk Benchmark
• (Many) other tools
CrystalDiskMark
• Random/sequential I/O?
• Queues/Threads (queue depth/length)
• Size of each read/write
But wait, there is more!
(about storage)
A tale of two primary keys
• One embedded transactional database engine (LMDB)
• 100 transactions, 100 key/value writes per transaction
• Two databases, keys and values have the same size
• One uses sequential keys (UuidCreateSequential)
• One uses random keys (UuidCreate)
A tale of two primary keys
0
20
40
60
80
100
120
140
160
180
0 2 4 6 8 10 12 14 16
#ofseeksperTX
Transaction #
B-Tree seeks per write TX
Random
Sequential
A tale of two primary keys
0
0.005
0.01
0.015
0.02
0.025
0.03
0.035
0.04
0.045
0.05
0 2 4 6 8 10 12 14 16
TotalseeklatencyperTX
Transaction #
Total seek latency per TX
Random
Sequential
Process Monitor
Types of OS operations to listen (file activity, network activity, etc)
Types of operations
Why?
Storage algorithms (page-oriented storage engines)
• B-tree, B+ Tree
• Optimized for reads
• Optimized for sequential data
B+ tree keys
Sequential keys Non-sequential keys
The cost of hops in the tree
Disk seek
https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
Sequential reads
Minimize performance impact of keys
• Sequential keys allow better performance
• 1,2,3,4,5
• Users/1, Users/2, Users/3
• Notice: B-Trees are used to store data AND indexes
• Query performance!
There is another kind of storage
implementation…
The tale of an occasionally slow database
• Sometimes, Cassandra database was fast and sometimes not
• This happened non-deterministically
A page in
Cassandra
documentation!
https://docs.datastax.com/en/dse-trblshoot/doc/troubleshooting/slowReads.html
Storage algorithms (log-structured storage engines)
Users/1
Users/3
In-memory Storage
Users/3
Users/5
Users/22
Users/3
SSTable SSTable SSTable
Usually a B-Tree or Skip List
Storage algorithms (log-structured storage engines)
Users/1
Users/3
In-memory Storage
Users/3
Users/5
Users/22
Users/3
Insert users/44
Users/44
Storage algorithms (log-structured storage engines)
Users/1
Users/3
In-memory Storage
Users/5
Users/3
Users/22
Users/3
Update users/3
Users/44
Users/3
(update)
Storage algorithms (log-structured storage engines)
Users/1
Users/3
In-memory Storage
Users/5
Users/3
Users/22
Users/3
Delete users/5
Users/44
Users/3 (update)
Users/5 (delete)
Storage algorithms (log-structured storage engines)
Users/1
Users/3
In-memory Storage
Users/5
Users/3
Users/22
Users/3
Flush!
Users/44
Users/3 (update)
Users/5 (delete)
Storage algorithms (log-structured storage engines)
Users/1
Users/3
In-memory Storage
Users/5
Users/3
Users/22
Users/3
(appended at the end)
Users/44
Users/3 (update)
Users/5 (delete)
Storage algorithms (log-structured storage engines)
Users/1
Users/3
In-memory Storage
Users/5
Users/3
Users/22
Users/3
Reading requires searching ALL SSTables!
Users/44
Users/3 (update)
Users/5 (delete)
Storage algorithms – LSM Tree compaction
This is roughly O(n*log(n)) operation!
Key/Value 1
Key/Value 3
Key/Value 5
Key/Value 3
Key/Value 9
Key/Value 12
Key/Value 5
Key/Value 12
Key/Value 18
Key/Value 1
Key/Value 3
Key/Value 5
Key/Value 9
Key/Value 12
Key/Value 18
TX1
TX2
TX3
Storage algorithms – LSM Tree compaction
In-memory Storage
Users/1
Users/3
Users/22
Users/44
After compaction
Compaction strategies
• Compaction strategies  WHEN compaction is triggered?
• Leveled  optimization for inserts
• SizeTiered  optimization for reads
• Time-window  optimization for TimeSeries/immutable data
CQL
Now, let’s talk about another
important feature…
ACID guarantees!
• Atomicity, Consistency, Isolation, Durability
• Note: not all DBs support it
• All of RDBMS
• Some NoSQLs – RavenDB, LevelDB, LMDB
Write-ahead log (WAL) - Atomicity, Durability
Put "A" Put "B" Commit Put "X" CommitWAL
Data Storage
Flush
Writing
Write-ahead log (WAL)
• "Write-Through" writes – no caching (otherwise no durability!)
• Lots of small writes (overhead of each write)
Write-Through
Volatile
Buffer
Regular Write Periodic Flush
ATTO Disk Benchmark
Benchmarking modes
Buffered vs. Write-through
With buffers and caching (GBs/sec) No caching, Write-Through (MBs/sec)
NVMe SSD (Samsung 860 EVO m.2)
Let's talk a bit about indexing and querying
Storage algorithms & Storage
Indexing & Queries
Network
Are those queries different?
Let’s start from something… simple.
First, we define an index
We create an index that covers city and country fields of ShipTo
Then we do some queries
Fetching orders that were shipped to Paris or Lyon
So far so good…
And another query
Fetching orders that were shipped to all of France
It’s a gotcha!
Why?
Field 1 Field 2
Lyon France
Paris France
Oslo Norway
In some databases (like MongoDB)
• Indexed fields are concatenated into single index key
• Filtering only by prefix
Index Key Record IDs
LyonFrance [7],[1],[4]
ParisFrance [5],[6]
OsloNorway [12],[2],[9],[34]
The values are concatenated!
Why?
Field 1 Field 2
Lyon France
Paris France
Oslo Norway
In some databases (RavenDB, any Lucene-based index)
• Indexed terms stored separate
• Filtering by one or both fields in any order (union/intersect as needed)
Index Key Record IDs
Lyon [7],[1],[4]
Paris [5],[6]
Oslo [12],[2],[9],[34]
Index Key Record IDs
France [7],[1],[4],[5],[6]
Norway [12],[2],[9],[34]
ShipTo.City Index
ShipTo.Country
Index
Collection/table scans are easily overlooked
Collection scan - development
• Small amount of data
• Extremely small query latency
Collection scan - production
• Large amount of data
• HUGE latency (quite often!)
Latency: 50ms vs 50 hours
Indexing
• Indexes are stored as trees (usually B-trees)
• Updates have non-trivial complexity!
https://commons.wikimedia.org/wiki/File:Trie_example.svg
Indexing
Search time complexity (WHERE clause):
O(log(N)) + O(log(M)) + O(Max(K,P))
Where:
• N and M are amount of rows in indexes
• K and P are result sets of index searches
And if we use RDBMS, things become even
more interesting...
Join Algorithm Complexity
Merge Join O(n*log(n) + m*log(m))
Hash Join O(n + m)
Index Join O(m*log(n))
And if we have a non-trivial query...
https://dev.to/tyzia/example-of-complex-sql-query-to-get-as-much-data-as-possible-from-database-9he
Those are 10 JOIN statements!
…we have complexity between
O(log(n)) and O(too much)!
More often than not it is O(too much)…
What can (should!) we do?
• RDBMS
• Proper indexing (kinda obvious, but still )
• Optimize (remove unnecessary JOINs – depends on business logic)
• Reduce query complexity
• Replace ‘row by row’ cursors with set based queries
• Reduce the amount of work queries do (for example, unnecessary sub-queries)
• Remove ORDER BY where it makes sense (huge overhead)
• Other optimizations are possible
• NoSQL
• Proper modeling
• Well planned indexing
Let’s talk a bit about NoSQL Data Modeling
• Documents are independent
• Transaction borders
• Depend on context!
• Depend on query pattern
• Depend on data growth
Well planned indexing?
Well planned indexing?
Networking can be a bottleneck too!
Storage algorithms & Storage
Indexing & Queries
Network
Here is a riddle: why a query with 100 results
consistently took several seconds to complete?
Hint: the request spends < 10ms on the server
The investigation
1. Look at query latency on the server
The investigation
2. Look at Fiddler timings Response Size
Latency
Network bandwith is not infinite!
3mb
document
2mb
document
5mb
document
Query Results
Database Client API
Solution: server-side projections (NoSQL)
Server-side Projection
RQL
Solution: server-side projections (NoSQL)
Server-side Projection
MongoDB API
Also… database requests can be an
interesting issue…
Network overhead
https://github.com/dajuric/simple-http
TCP handshake
Network overhead
Round Trip Time (RTT)
• Physical distance (insignificant for LANs)
• Bandwidth
• Network hops
Round-trip Time
So, what can we do?
What can we do?
• Refactor to reduce number of requests (kinda obvious, but still…)
• NHibernate – Future Queries
• Entity Framework - QueryFuture
• RavenDB – Lazy Queries
May sound trivial, but…
Do take a look at database traffic while stress testing and if possible in
production too.
• Fiddler
• Wireshark
• Profilers
• Any other tool to inspect traffic
To sum it up
• Databases are abstractions
• Abstractions are leaky and might be the cause of perf issues
• Such perf issues can be dealt with (if we know about the "leak"!)
Questions?
michael.yarichuk@hibernatingrhinos.com
@myarichuk
This Photo by Unknown author is licensed under CC BY-SA.
https://github.com/ravendb/ravendb
https://github.com/myarichuk/PerfDemo-Sequential-vs-Random-Key

Weitere ähnliche Inhalte

Was ist angesagt?

Big data 101 for beginners riga dev days
Big data 101 for beginners riga dev daysBig data 101 for beginners riga dev days
Big data 101 for beginners riga dev daysDuyhai Doan
 
Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015Roy Russo
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsKorea Sdec
 
Spark Cassandra 2016
Spark Cassandra 2016Spark Cassandra 2016
Spark Cassandra 2016Duyhai Doan
 
Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Vinay Kumar
 
Elastic Search
Elastic SearchElastic Search
Elastic SearchNavule Rao
 
Elasticsearch - Devoxx France 2012 - English version
Elasticsearch - Devoxx France 2012 - English versionElasticsearch - Devoxx France 2012 - English version
Elasticsearch - Devoxx France 2012 - English versionDavid Pilato
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextRafał Kuć
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchSperasoft
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1Maruf Hassan
 
Big Data Processing using Apache Spark and Clojure
Big Data Processing using Apache Spark and ClojureBig Data Processing using Apache Spark and Clojure
Big Data Processing using Apache Spark and ClojureDr. Christian Betz
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentationDuyhai Doan
 
Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with ElasticsearchSamantha Quiñones
 
Building a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearchBuilding a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearchMark Greene
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data AnalyticsFelipe
 
Cassandra introduction 2016
Cassandra introduction 2016Cassandra introduction 2016
Cassandra introduction 2016Duyhai Doan
 
Cassandra 3 new features 2016
Cassandra 3 new features 2016Cassandra 3 new features 2016
Cassandra 3 new features 2016Duyhai Doan
 
Big data 101 for beginners devoxxpl
Big data 101 for beginners devoxxplBig data 101 for beginners devoxxpl
Big data 101 for beginners devoxxplDuyhai Doan
 
23 October 2013 - AWS 201 - A Walk through the AWS Cloud: Introduction to Ama...
23 October 2013 - AWS 201 - A Walk through the AWS Cloud: Introduction to Ama...23 October 2013 - AWS 201 - A Walk through the AWS Cloud: Introduction to Ama...
23 October 2013 - AWS 201 - A Walk through the AWS Cloud: Introduction to Ama...Amazon Web Services
 

Was ist angesagt? (20)

Big data 101 for beginners riga dev days
Big data 101 for beginners riga dev daysBig data 101 for beginners riga dev days
Big data 101 for beginners riga dev days
 
Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and models
 
Spark Cassandra 2016
Spark Cassandra 2016Spark Cassandra 2016
Spark Cassandra 2016
 
Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Roaring with elastic search sangam2018
Roaring with elastic search sangam2018
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
 
Elasticsearch - Devoxx France 2012 - English version
Elasticsearch - Devoxx France 2012 - English versionElasticsearch - Devoxx France 2012 - English version
Elasticsearch - Devoxx France 2012 - English version
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
 
memcached Distributed Cache
memcached Distributed Cachememcached Distributed Cache
memcached Distributed Cache
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1
 
Big Data Processing using Apache Spark and Clojure
Big Data Processing using Apache Spark and ClojureBig Data Processing using Apache Spark and Clojure
Big Data Processing using Apache Spark and Clojure
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentation
 
Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with Elasticsearch
 
Building a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearchBuilding a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearch
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data Analytics
 
Cassandra introduction 2016
Cassandra introduction 2016Cassandra introduction 2016
Cassandra introduction 2016
 
Cassandra 3 new features 2016
Cassandra 3 new features 2016Cassandra 3 new features 2016
Cassandra 3 new features 2016
 
Big data 101 for beginners devoxxpl
Big data 101 for beginners devoxxplBig data 101 for beginners devoxxpl
Big data 101 for beginners devoxxpl
 
23 October 2013 - AWS 201 - A Walk through the AWS Cloud: Introduction to Ama...
23 October 2013 - AWS 201 - A Walk through the AWS Cloud: Introduction to Ama...23 October 2013 - AWS 201 - A Walk through the AWS Cloud: Introduction to Ama...
23 October 2013 - AWS 201 - A Walk through the AWS Cloud: Introduction to Ama...
 

Ähnlich wie Why databases cry at night

NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]Huy Do
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware ProvisioningMongoDB
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBjhugg
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...javier ramirez
 
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...javier ramirez
 
Implementing the Databese Server session 02
Implementing the Databese Server session 02Implementing the Databese Server session 02
Implementing the Databese Server session 02Guillermo Julca
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrRahul Jain
 
SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.Julian Hyde
 
Know thy cost (or where performance problems lurk)
Know thy cost (or where performance problems lurk)Know thy cost (or where performance problems lurk)
Know thy cost (or where performance problems lurk)Oren Eini
 
MySQL And Search At Craigslist
MySQL And Search At CraigslistMySQL And Search At Craigslist
MySQL And Search At CraigslistJeremy Zawodny
 
Vote NO for MySQL
Vote NO for MySQLVote NO for MySQL
Vote NO for MySQLUlf Wendel
 
QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...javier ramirez
 
Scaling with mongo db (with notes)
Scaling with mongo db (with notes)Scaling with mongo db (with notes)
Scaling with mongo db (with notes)emiltamas
 
Your backend architecture is what matters slideshare
Your backend architecture is what matters slideshareYour backend architecture is what matters slideshare
Your backend architecture is what matters slideshareColin Charles
 

Ähnlich wie Why databases cry at night (20)

NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDB
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
 
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
 
NOsql Presentation.pdf
NOsql Presentation.pdfNOsql Presentation.pdf
NOsql Presentation.pdf
 
Implementing the Databese Server session 02
Implementing the Databese Server session 02Implementing the Databese Server session 02
Implementing the Databese Server session 02
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache Solr
 
No sq lv1_0
No sq lv1_0No sq lv1_0
No sq lv1_0
 
SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.
 
Know thy cost (or where performance problems lurk)
Know thy cost (or where performance problems lurk)Know thy cost (or where performance problems lurk)
Know thy cost (or where performance problems lurk)
 
Database Technologies
Database TechnologiesDatabase Technologies
Database Technologies
 
MySQL And Search At Craigslist
MySQL And Search At CraigslistMySQL And Search At Craigslist
MySQL And Search At Craigslist
 
Vote NO for MySQL
Vote NO for MySQLVote NO for MySQL
Vote NO for MySQL
 
NoSQL.pptx
NoSQL.pptxNoSQL.pptx
NoSQL.pptx
 
NoSql Databases
NoSql DatabasesNoSql Databases
NoSql Databases
 
QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...
 
Scaling with mongo db (with notes)
Scaling with mongo db (with notes)Scaling with mongo db (with notes)
Scaling with mongo db (with notes)
 
Your backend architecture is what matters slideshare
Your backend architecture is what matters slideshareYour backend architecture is what matters slideshare
Your backend architecture is what matters slideshare
 
NoSQL
NoSQLNoSQL
NoSQL
 

Kürzlich hochgeladen

%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benonimasabamasaba
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024VictoriaMetrics
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Bert Jan Schrijver
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfonteinmasabamasaba
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in sowetomasabamasaba
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...masabamasaba
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrainmasabamasaba
 
tonesoftg
tonesoftgtonesoftg
tonesoftglanshi9
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...SelfMade bd
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 

Kürzlich hochgeladen (20)

%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 

Why databases cry at night

  • 1. Why databases cry at night Michael Yarichuk Hibernating Rhinos
  • 2.
  • 3.
  • 4.
  • 5.
  • 6. Magic? This Photo by Unknown Author is licensed under CC BY-NC-ND This Photo by Unknown Author is licensed under CC BY-SA Query
  • 8. Databases are abstractions The photo is licensed under CC BY-SA.
  • 9. The Law of Leaky Abstractions "All non-trivial abstractions, to some degree, are leaky." - Joel Spolsky
  • 10. Many shades of grey databases
  • 11. Many shades of grey databases RDBMS Key/Value Store Document Database Graph Database MS-SQL LMDB RavenDB Neo4j MySQL BerkeleyDB MongoDB OrientDB* Oracle Cassandra* CouchDB ArangoDB PostgreSQL Dynamo CosmosDB JanusGraph
  • 12. Very different databases have the same reasons for crying.
  • 13. Things we will take a look at Storage algorithms & Storage Indexing & Queries Network
  • 14. Let’s start with something simple: Storage. It just works, no? Famous last words!
  • 15. Before we discuss storage, here is a riddle… RavenDB server-wide backup failed • The instance had multiple databases in single instance • Plenty of memory and cores, resource usage is small • Nothing else was running on the machine EXCEPT RavenDB • Scheduled backup tasks fail soon after they started
  • 16. The backup tasks started at the same time!
  • 17. Also, there were gazillion of databases SAN Storage Database A Database B Database C Database D Database E Database F Database G ... Database FF
  • 18.
  • 19. RavenDB’s failing backups Approx. 200 databases doing backups at the same time WILL cause storage saturation!
  • 20. The solution was rather simple
  • 21. Disk queue length can be an… issue 16 KB 8 KB 10 KB 16 KB 12 KB 12 KB Disk Queue Length Disk Write
  • 22. Disk queue depth 16 KB 8 KB 10 KB 16 KB 12 KB 12 KB Disk Queue Depth = 2 Disk Write 16 KB 8 KB 10 KB 16 KB 12 KB 8 KBThread 1 Thread 2
  • 23. What can we do about storge issues? • Load test database code to ensure • Write-through throughput • Enough IOPS for expected production load (disk queue length is <= 2) • Cloud  provision IOPS • Load-test application to find limits of the system • Monitoring! (too long queues = storage bottleneck)
  • 24. Storage performance benchmarks • Sysinternals Process Monitor • CrystalDiskMark • ATTO Disk Benchmark • (Many) other tools
  • 25. CrystalDiskMark • Random/sequential I/O? • Queues/Threads (queue depth/length) • Size of each read/write
  • 26. But wait, there is more! (about storage)
  • 27. A tale of two primary keys • One embedded transactional database engine (LMDB) • 100 transactions, 100 key/value writes per transaction • Two databases, keys and values have the same size • One uses sequential keys (UuidCreateSequential) • One uses random keys (UuidCreate)
  • 28. A tale of two primary keys 0 20 40 60 80 100 120 140 160 180 0 2 4 6 8 10 12 14 16 #ofseeksperTX Transaction # B-Tree seeks per write TX Random Sequential
  • 29. A tale of two primary keys 0 0.005 0.01 0.015 0.02 0.025 0.03 0.035 0.04 0.045 0.05 0 2 4 6 8 10 12 14 16 TotalseeklatencyperTX Transaction # Total seek latency per TX Random Sequential
  • 30. Process Monitor Types of OS operations to listen (file activity, network activity, etc) Types of operations
  • 31. Why?
  • 32. Storage algorithms (page-oriented storage engines) • B-tree, B+ Tree • Optimized for reads • Optimized for sequential data
  • 33. B+ tree keys Sequential keys Non-sequential keys
  • 34. The cost of hops in the tree Disk seek https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html Sequential reads
  • 35. Minimize performance impact of keys • Sequential keys allow better performance • 1,2,3,4,5 • Users/1, Users/2, Users/3 • Notice: B-Trees are used to store data AND indexes • Query performance!
  • 36. There is another kind of storage implementation…
  • 37. The tale of an occasionally slow database • Sometimes, Cassandra database was fast and sometimes not • This happened non-deterministically A page in Cassandra documentation! https://docs.datastax.com/en/dse-trblshoot/doc/troubleshooting/slowReads.html
  • 38. Storage algorithms (log-structured storage engines) Users/1 Users/3 In-memory Storage Users/3 Users/5 Users/22 Users/3 SSTable SSTable SSTable Usually a B-Tree or Skip List
  • 39. Storage algorithms (log-structured storage engines) Users/1 Users/3 In-memory Storage Users/3 Users/5 Users/22 Users/3 Insert users/44 Users/44
  • 40. Storage algorithms (log-structured storage engines) Users/1 Users/3 In-memory Storage Users/5 Users/3 Users/22 Users/3 Update users/3 Users/44 Users/3 (update)
  • 41. Storage algorithms (log-structured storage engines) Users/1 Users/3 In-memory Storage Users/5 Users/3 Users/22 Users/3 Delete users/5 Users/44 Users/3 (update) Users/5 (delete)
  • 42. Storage algorithms (log-structured storage engines) Users/1 Users/3 In-memory Storage Users/5 Users/3 Users/22 Users/3 Flush! Users/44 Users/3 (update) Users/5 (delete)
  • 43. Storage algorithms (log-structured storage engines) Users/1 Users/3 In-memory Storage Users/5 Users/3 Users/22 Users/3 (appended at the end) Users/44 Users/3 (update) Users/5 (delete)
  • 44. Storage algorithms (log-structured storage engines) Users/1 Users/3 In-memory Storage Users/5 Users/3 Users/22 Users/3 Reading requires searching ALL SSTables! Users/44 Users/3 (update) Users/5 (delete)
  • 45. Storage algorithms – LSM Tree compaction This is roughly O(n*log(n)) operation! Key/Value 1 Key/Value 3 Key/Value 5 Key/Value 3 Key/Value 9 Key/Value 12 Key/Value 5 Key/Value 12 Key/Value 18 Key/Value 1 Key/Value 3 Key/Value 5 Key/Value 9 Key/Value 12 Key/Value 18 TX1 TX2 TX3
  • 46. Storage algorithms – LSM Tree compaction In-memory Storage Users/1 Users/3 Users/22 Users/44 After compaction
  • 47. Compaction strategies • Compaction strategies  WHEN compaction is triggered? • Leveled  optimization for inserts • SizeTiered  optimization for reads • Time-window  optimization for TimeSeries/immutable data CQL
  • 48. Now, let’s talk about another important feature…
  • 49. ACID guarantees! • Atomicity, Consistency, Isolation, Durability • Note: not all DBs support it • All of RDBMS • Some NoSQLs – RavenDB, LevelDB, LMDB
  • 50. Write-ahead log (WAL) - Atomicity, Durability Put "A" Put "B" Commit Put "X" CommitWAL Data Storage Flush Writing
  • 51. Write-ahead log (WAL) • "Write-Through" writes – no caching (otherwise no durability!) • Lots of small writes (overhead of each write) Write-Through Volatile Buffer Regular Write Periodic Flush
  • 53. Buffered vs. Write-through With buffers and caching (GBs/sec) No caching, Write-Through (MBs/sec) NVMe SSD (Samsung 860 EVO m.2)
  • 54. Let's talk a bit about indexing and querying Storage algorithms & Storage Indexing & Queries Network
  • 55. Are those queries different?
  • 56. Let’s start from something… simple.
  • 57. First, we define an index We create an index that covers city and country fields of ShipTo
  • 58. Then we do some queries Fetching orders that were shipped to Paris or Lyon
  • 59. So far so good…
  • 60. And another query Fetching orders that were shipped to all of France
  • 62. Why? Field 1 Field 2 Lyon France Paris France Oslo Norway In some databases (like MongoDB) • Indexed fields are concatenated into single index key • Filtering only by prefix Index Key Record IDs LyonFrance [7],[1],[4] ParisFrance [5],[6] OsloNorway [12],[2],[9],[34] The values are concatenated!
  • 63. Why? Field 1 Field 2 Lyon France Paris France Oslo Norway In some databases (RavenDB, any Lucene-based index) • Indexed terms stored separate • Filtering by one or both fields in any order (union/intersect as needed) Index Key Record IDs Lyon [7],[1],[4] Paris [5],[6] Oslo [12],[2],[9],[34] Index Key Record IDs France [7],[1],[4],[5],[6] Norway [12],[2],[9],[34] ShipTo.City Index ShipTo.Country Index
  • 64. Collection/table scans are easily overlooked Collection scan - development • Small amount of data • Extremely small query latency Collection scan - production • Large amount of data • HUGE latency (quite often!) Latency: 50ms vs 50 hours
  • 65. Indexing • Indexes are stored as trees (usually B-trees) • Updates have non-trivial complexity! https://commons.wikimedia.org/wiki/File:Trie_example.svg
  • 66. Indexing Search time complexity (WHERE clause): O(log(N)) + O(log(M)) + O(Max(K,P)) Where: • N and M are amount of rows in indexes • K and P are result sets of index searches
  • 67. And if we use RDBMS, things become even more interesting... Join Algorithm Complexity Merge Join O(n*log(n) + m*log(m)) Hash Join O(n + m) Index Join O(m*log(n))
  • 68. And if we have a non-trivial query... https://dev.to/tyzia/example-of-complex-sql-query-to-get-as-much-data-as-possible-from-database-9he Those are 10 JOIN statements!
  • 69. …we have complexity between O(log(n)) and O(too much)! More often than not it is O(too much)…
  • 70. What can (should!) we do? • RDBMS • Proper indexing (kinda obvious, but still ) • Optimize (remove unnecessary JOINs – depends on business logic) • Reduce query complexity • Replace ‘row by row’ cursors with set based queries • Reduce the amount of work queries do (for example, unnecessary sub-queries) • Remove ORDER BY where it makes sense (huge overhead) • Other optimizations are possible • NoSQL • Proper modeling • Well planned indexing
  • 71. Let’s talk a bit about NoSQL Data Modeling • Documents are independent • Transaction borders • Depend on context! • Depend on query pattern • Depend on data growth
  • 74. Networking can be a bottleneck too! Storage algorithms & Storage Indexing & Queries Network
  • 75. Here is a riddle: why a query with 100 results consistently took several seconds to complete? Hint: the request spends < 10ms on the server
  • 76. The investigation 1. Look at query latency on the server
  • 77. The investigation 2. Look at Fiddler timings Response Size Latency
  • 78. Network bandwith is not infinite! 3mb document 2mb document 5mb document Query Results Database Client API
  • 79. Solution: server-side projections (NoSQL) Server-side Projection RQL
  • 80. Solution: server-side projections (NoSQL) Server-side Projection MongoDB API
  • 81. Also… database requests can be an interesting issue…
  • 83. Network overhead Round Trip Time (RTT) • Physical distance (insignificant for LANs) • Bandwidth • Network hops Round-trip Time
  • 84. So, what can we do?
  • 85. What can we do? • Refactor to reduce number of requests (kinda obvious, but still…) • NHibernate – Future Queries • Entity Framework - QueryFuture • RavenDB – Lazy Queries
  • 86. May sound trivial, but… Do take a look at database traffic while stress testing and if possible in production too. • Fiddler • Wireshark • Profilers • Any other tool to inspect traffic
  • 87. To sum it up • Databases are abstractions • Abstractions are leaky and might be the cause of perf issues • Such perf issues can be dealt with (if we know about the "leak"!)
  • 88. Questions? michael.yarichuk@hibernatingrhinos.com @myarichuk This Photo by Unknown author is licensed under CC BY-SA. https://github.com/ravendb/ravendb https://github.com/myarichuk/PerfDemo-Sequential-vs-Random-Key

Hinweis der Redaktion

  1. Do not forget: say thanks for coming to my talk!
  2. I have been working on internals of a NoSQL database for some time now. Mostly I am working on clusters and distributed stuff like replication. Also, I do support calls and in this talk I will talk about various performance issues I have seen
  3. … a company called Hibernating Rhinos. We have created profilers for ORMs, but our main business is (click to next slide)
  4. …our main business is RavenDB – NoSQL document database .Net Core  MENTION that RavenDB is built on .Net Core
  5. Many times developers deal with databases by just tweaking settings and blindly playing around.
  6. It is very tempting to think of databases as magic boxes You query a database and get answers to complex queries… Things simply… work.
  7. …but databases are not magic!
  8. Databases are abstractions to complex internals  Many things can go wrong under the outermost layer  the API
  9. Why things can go wrong? Because ALL abstractions are leaky
  10. There are many types of databases Give concrete examples of each type of database RDBMS: MySQL, PostgreSQL, MS-SQL, Oracle Document DBs: CouchDB, MongoDB, RavenDB GraphDB: Neo4j, OrientDB Key/value store: LMDB, BerkeleyDB, Cassandra (hybrid between table and key/value)
  11. Note that there are hybrid databases such as Cassandra (hybrid column-family store) and OrientDB (hybrid document & graph) (more than 400 database companies)
  12. Databases are different, but deep under the hood they use the same algorithms
  13. We will take a look at multiple reasons why databases will cry. Note: mention that things we take a look at only scratch the surface of potential issues
  14. We will start from storage..
  15. Before we jump into details, here is a riddle… Mention this is a real support case involve audience here  ask them to theorize
  16. Mention that looking at monitoring numbers (more specifically info about disk perf)
  17. * Mention that this is relevant to many types of DBs Mention it was SAN timeout  Those database were stored at SAN disk SHARED with other systems
  18. The solution? Make the backups happen one after another (so the storage won’t be overloaded)
  19. How many concurrent IO the storage supports?
  20. Mention that IOPS is input/output operations per second
  21. Load testing of software is understandable, but what about load testing of hardware (storage)? Tools are needed for it…
  22. Sequential/Random Sequential – accessing files sequentially (for things like video streaming) Random – accessing files randomly  more seeks Queues – how much writes each process does concurrently Threads – how much processes access the disk concurrently
  23. And now, let’s take a look at yet another reason why databases can cry at night
  24. Tell a bit about LMDB  what is it? (OpenLDAP, some Linux distros such as Ubuntu, Debian and Fedora) Tell about WHAT kind of thing I checked,
  25. Just a note – this information is gathered using Sysinternals Process Monitor In Linux there is an equivalent - a strace (it is a tool) Process Monitor is useful to trace the ways any application accesses storage, including OS calls and latencies (in this context)
  26. Not the only kind of trees that are used, but B-Trees are the most prevalent * B-Trees are self-balancing! Used in: CouchDB, RavenDB, MS-SQL, Oracle
  27. Non-sequential keys in B trees can cause it being much deeper. In here – the same data but with different keys Conclusion: different data patterns can give DIFFERENT performance on the same engine
  28. How many is many?  I am talking about millions of data records here
  29. Mention that this is a true story: Tell about a user that told such story at a workshop (London)
  30. Log Structured Merge Trees are used in database storage engines Mention that it is optimized for writes * Used in Lucene, LevelDB, HBase, SQLite4, Apache Cassandra, Tarantool (mail.ru) the data is kept in multiple different structures, each is optimized for different storages  (memory/disk); * SSTable (each column)  ordered immutable map from keys to values * data is synchronized between the two structures efficiently, in batches
  31. Mention that it is optimized for writes Log Structured Merge Trees are used in database storage engines * Used in Lucene, LevelDB, HBase, SQLite4, Apache Cassandra, Tarantool (mail.ru) * the data is kept in multiple different structures, each is optimized for different storages  (memory/disk); * data is synchronized between the two structures efficiently, in batches
  32. Mention that it is optimized for writes Log Structured Merge Trees are used in database storage engines * Used in Lucene, LevelDB, HBase, SQLite4, Apache Cassandra, Tarantool (mail.ru) * the data is kept in multiple different structures, each is optimized for different storages  (memory/disk); * data is synchronized between the two structures efficiently, in batches
  33. Mention that it is optimized for writes Log Structured Merge Trees are used in database storage engines * Used in Lucene, LevelDB, HBase, SQLite4, Apache Cassandra, Tarantool (mail.ru) * the data is kept in multiple different structures, each is optimized for different storages  (memory/disk); * data is synchronized between the two structures efficiently, in batches
  34. Mention that it is optimized for writes Log Structured Merge Trees are used in database storage engines * Used in Lucene, LevelDB, HBase, SQLite4, Apache Cassandra, Tarantool (mail.ru) * the data is kept in multiple different structures, each is optimized for different storages  (memory/disk); * data is synchronized between the two structures efficiently, in batches
  35. Mention that it is optimized for writes Log Structured Merge Trees are used in database storage engines * Used in Lucene, LevelDB, HBase, SQLite4, Apache Cassandra, Tarantool (mail.ru) * the data is kept in multiple different structures, each is optimized for different storages  (memory/disk); * data is synchronized between the two structures efficiently, in batches
  36. Reading requires searching ALL SSTables because  Since we store data mutation operations, we may have an insert followed by delete operation, where delete would cancel the originally inserted record
  37. Compaction of SSTables is O(n*S*log(S)) operation  (n -> amount of SSTables, S is count of key/values in each) Note:  * the compaction is usually asynchronous * multi-level compaction can be implemented * But it adds strain to system resources! (Memory, CPUs, Storage)
  38. Mention that it is optimized for writes Log Structured Merge Trees are used in database storage engines * Used in Lucene, LevelDB, HBase, SQLite4, Apache Cassandra, Tarantool (mail.ru) * the data is kept in multiple different structures, each is optimized for different storages  (memory/disk); * data is synchronized between the two structures efficiently, in batches
  39. Usually, there are different types of compaction strategies are possible For example - LSM compaction strategies in Cassandra Mention  this is a solution!
  40. Since we are talking about storage related algorithms, There is another thing to consider!
  41. ACID – if a database supports it, it’s basically – You will lose data only if your server burns down So, what’s to talk about them? Except that not all databases support them…
  42. Mention that WAL is used to implement Atomicity and Durability Explain how it works (in general terms)
  43. Any database that uses WAL benefits from good no-cache performance
  44. We can test for no-cache writes using tools like ATTO Disk Benchmark
  45. There is a big difference between regular write and a “no-cache” write! * Note that the larger the write batch, the more efficient it gets  - Mention the metaphor of moving people in cars vs moving in them buses * Note the discrepancy - for 1MB writes 7.95 GB/s vs. 313.22 MB/s
  46. Need to EXPLICITLY say what kind of query we are doing: Select ALL ORDERS that were sent to Paris or Lyon AND ordered after a *certain* day
  47. we want to be able to query Orders by Shipping city and country
  48. Let’s see an example WHY it is important…
  49. We do index scan, so everything is good…
  50. This is not so good! Because collection scans have linear complexity!
  51. Thus, we can only query by first field OR by both fields
  52. Thus, we can only query by first field OR by both fields
  53. Beware of collection scans!
  54. Since the index is TYPICALLY stored as a tree (can't think of a case where it isn't a tree!) - ROUGHLY we would have O(log(n)) complexity
  55. don't forget to mention: searching an index is O(log(n)) EXCEPT for special cases like hash indexes or TRIE 2) the mentioned complexity DEPENDS on a query plan of course 3) M + N is complexity of intersection between two query results
  56. RDBMS optimizations – partial denormalization? Merge tables?
  57. Independent: all data required to process a document is stored within the document itself
  58. Well, not exactly indexing, but still… MongoDB map/reduce - find how many blogs one author has with its first name and last name. Note that such operation will do allocations and will take CPU resources It needs to reflect business logic AND be as efficient as possible
  59. Explain about RavenDB indexes that are defined via LINQ The same thing with RavenDB indexes. Such index is flexible but it will do allocations and will take CPU resources
  60. Discrepancy over request/response latency and time spent on server + total response size (3MB response took 3 seconds to download)
  61. The solution was to specify server-side projection  So ONLY needed information is sent over the wire
  62. Return only name and cuisine fields
  63. Ask audience who knows about query futures