To understand how to make your application fast, it's important to understand what makes the database fast. We will take a detailed look at how to think about performance, and how different choices in schema design affect your cluster's performance depending on the storage engines used and the physical resources available.
10. Is It Fast?
• In context of crossing the bridge, fast means:
– how long will it take one car
– how many cars can do it "at the same time"
11. Is It Fast?
Facts & Info
Opened to traffic
Upper level: October 25, 1931
Lower level: August 29, 1962
Bus Station opened: January 17, 1963
Length of bridge between anchorages: 4,760 feet
Width of bridge: 119 feet
Width of roadway: 90 feet
Height of tower above water: 604 feet
Water clearance at midspan: 212 feet
Number of toll lanes:
Upper level: 12
Lower level: 10
Palisades Interstate Parkway: 7*
* E-ZPass only overnight
2013 Traffic Volumes
Total New York-bound (eastbound) traffic: 49,402,245 vehicles
40. Unbounded growth
Deeply nested arrays
Really large documents
Schema Anti-Patterns: over-normalizing
you are over-normalizing if you are
doing JOINS in your application
instead of "finds"
88. Benchmark your own application
Use realistic workload
Use real data
Measure throughput and latency
Editor's Notes
What is fast? Before we can agree what our topic is, we have to literally define what fast means for you. For your application, for your users, for your stakeholders.
What's fast in one context may not be fast in another context. Let me give you an example.
For those unfamiliar with this area, here were my options: the Holland Tunnel, the Lincoln Tunnel, and the George Washington Bridge.
By far the most scenic is the George Washington Bridge, the world's busiest motor vehicle bridge, and twice as long as any previous suspension bridge when its design was finalized in 1923. Construction started in 1927, and the bridge first opened to traffic in 1931; in 1932, more than 5.5 million vehicles used the original six-lane roadway. Two center lanes were added in 1946, increasing capacity by a third. The six lanes of the lower roadway were completed in 1962, bringing the bridge to the 14 lanes it has today. So let me ask you this:
is the George Washington Bridge fast? Well, that's a bit of a non sequitur as a question in a vacuum, isn't it? The bridge cannot be fast; it's not even going anywhere! But we all have context here. So what matters when I ask this question is whether it's a fast way to get from NJ to NY.
*For me*, to get to NY "fast" meant to get across the Hudson River as quickly (and painlessly) as possible.
Given the speed limit, which is 45 MPH, let's just say that driving across the GW Bridge takes about one minute. But we wouldn't measure the GWB's capacity by how long it took me; we'd measure it by how many cars can make use of it: 50M just from NJ to NY.
Back to your application. In a vacuum, it's neither slow nor fast. When your stakeholders say "fast application," what we mean is that it performs whatever it is that it does for the end-user quickly. Just as I'm not the only car on the GWB, there is never just one end-user; we want the application to perform quickly and consistently for all end-users. For the user, what matters is a fast response time; for you, what matters is how many users can use it simultaneously.
How many users or operations we can process at any given time, or in a given period of time, is what we call throughput. So latency == how long something takes; throughput == how many you can process "in parallel." You'd be surprised how often they get confused for one another...
One of the reasons that latency and throughput get conflated when talking about performance is that they are closely related. You can easily see in the single-threaded case that your latency directly impacts your throughput: the higher (WORSE) the latency, the lower the throughput. And sometimes the lower the throughput, the higher the latency... which happens when, for example, your latency across the bridge is caused by delays at the toll booths,
THIS IS BECAUSE EACH PHYSICAL RESOURCE CAN ONLY ACCOMMODATE A FIXED NUMBER OF CLIENTS.
because everyone has to wait. So they get worse together: increasing latency can reduce your throughput, and decreasing throughput increases latency. That is undeniable; I'm sure we've all experienced it. A slightly less intuitive concept is that increasing throughput capacity may or may not reduce your latency. It depends on how much of the latency is inherent in doing the operation itself, and how much is caused by waiting due to... well, lack of throughput.
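To make the latency/throughput arithmetic concrete, here's a minimal sketch; all numbers are hypothetical, chosen to echo the bridge example:

```python
# A minimal sketch of the latency/throughput relationship described above.

def single_threaded_throughput(latency_s: float) -> float:
    """Ops per second when requests are processed strictly one at a time."""
    return 1.0 / latency_s

def parallel_throughput(latency_s: float, lanes: int) -> float:
    """With N independent 'lanes' (toll booths, threads), throughput scales,
    but each individual trip still takes latency_s seconds."""
    return lanes / latency_s

# Worse (higher) latency means lower throughput in the single-threaded case:
assert single_threaded_throughput(0.050) < single_threaded_throughput(0.010)

# Adding lanes raises throughput without shortening any one trip:
crossing = 60.0  # roughly one minute across the bridge
print(parallel_throughput(crossing, 14), "cars/second with 14 lanes")
```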
Adding more lanes without adding more toll booths will *not* help with either throughput or latency. [click] Adding more toll booths will likely reduce the time across the bridge... but only to a point, because no matter how many toll booths or lanes you add, the laws of physics (and speed limit laws) will make it hard to reduce the duration of our trip across the Hudson to less than about a minute.
Speed of light: it's not just a good idea, it's the law!
Why does all of this matter, and how does it tie into your application design decisions? Well, just like getting across the Hudson requires a working vehicle, an open road, and a variety of other favorable conditions, your application comprises many components, and all of them must be working together optimally to get the best possible performance to your end-user. Focusing on speeding up the wrong component (not the bottleneck) will be useless... SYSTEM COMPONENTS
Any one of them can slow down each user as they go through it, and that will increase latency, reducing your throughput significantly. You know, the opposite of a "fast application." Components can be split into two groups:
Physical components (resources) and conceptual components: your algorithms, data structures, schema, indexes, choice of storage engine, choice of OS and file system. All those choices affect how your physical resources will be used up. So if you don't design your application well, you will unnecessarily exhaust some of these limited physical resources, causing your application to perform worse than it might with an optimal design. [PAUSE] Of course, physical components must be properly sized and tuned. File system tuning, OS tuning: we don't have time to get into the specifics here, but there's lots of information available, so just keep in mind that if you don't follow best practices in configuring your file system, it's a little bit like trying to drive across the GW Bridge with four flat tires. Let's just agree that's a bad idea? [PAUSE] Two big components we will focus on in detail are the parts of the DB:
[ schema / indexes ] [ STORAGE ENGINE ]
Schema design is the building block of your application, and getting it right is essential to making your application's DB requests efficient. We do that by structuring your data in a way that your application can easily read and write. This will minimize the resources used while minimizing the latency of each request.
Tailoring your schema design to fit your read and write patterns is like using the right tool for the job. Good schema design will always take data locality into account: co-locating data that you tend to access at the same time into the same documents. Now, that's a rule of thumb; there are definitely ways to take this too far. An important counterpoint is: don't store data in the document that you tend not to need immediately.
Imagine you have to get 50 people across the George Washington Bridge. Would you use a car and make over a dozen trips? Or would you use a much slower-moving bus and get the job done in a single trip? [PAUSE] On the other hand, if you have one passenger, you might get better gas mileage if you take a car rather than the bus. If you are making lots of trips
to fetch all the data you need for a single operation, that's an anti-pattern in schema design that we recognize as over-normalization. On the other hand, getting far more data each time than you need is usually a sign of the opposite problem; let's call it over-embedding. "ANTI-PATTERNS"
Signs you might be over-embedding: your documents tend to grow unbounded (you keep pushing more values into arrays, though you don't usually need them all); you have deeply nested arrays within arrays but usually need to work with only a small number of their elements (NOT ALWAYS); your documents are really large.
[PAUSE] Some of the signs you might be over-normalizing:
One sign you might be over-normalizing: [CLICK] you keep implementing joins in your application for every "query".
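The contrast between the app-side join and the single find can be sketched with plain dicts standing in for MongoDB collections; the `posts`/`comments` data here is entirely hypothetical:

```python
# Over-normalized: comments live in their own "collection", so rendering one
# post requires a join in the application -- one lookup per comment.
posts = {1: {"_id": 1, "title": "Is It Fast?", "comment_ids": [10, 11]}}
comments = {10: {"_id": 10, "text": "Great talk"},
            11: {"_id": 11, "text": "Take the train"}}

def render_post_normalized(post_id):
    post = posts[post_id]                                       # find #1
    body = [comments[c]["text"] for c in post["comment_ids"]]   # finds #2..N
    return post["title"], body

# Embedded: comments are co-located with the post, so one find does it all.
posts_embedded = {1: {"_id": 1, "title": "Is It Fast?",
                      "comments": [{"text": "Great talk"},
                                   {"text": "Take the train"}]}}

def render_post_embedded(post_id):
    post = posts_embedded[post_id]                              # a single find
    return post["title"], [c["text"] for c in post["comments"]]

# Same result, very different number of round trips:
assert render_post_normalized(1) == render_post_embedded(1)
```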
Other signs you may run into trouble with your schema in the future: you haven't considered the relative SLAs of reads vs. writes. Usually, if we architect our system to make one of those faster, it's at the cost of the other; more on that when we come to indexes. So knowing up front which one you can afford to be a bit slower (higher latency) will help you make these trade-offs correctly.
Another one: you have lots of different types of documents in the same collection. That's usually a sign of trouble. [PAUSE]
You have lots of different types of values in the same field across a collection (sometimes a string, sometimes a date, sometimes a number). [PAUSE] That brings us to the BIGGEST warning sign: your queries can't use indexes efficiently:
Your queries can't use indexes efficiently: unanchored or case-insensitive regexes; you need dozens of indexes for a single collection; or, worst of all, you have no idea what indexes you might possibly need on a collection. Which brings us to the other biggest determining factor of whether your application will be fast:
I wouldn't be exaggerating if I told you that when our support team is dealing with a customer whose application is "slow," over 90% of the time the indexes are suboptimal or outright missing for some high percentage of the slow operations! And this is in spite of the fact that we constantly harp on how important indexing is to good performance; and of course *all* databases require indexing to work well, right? Let me show you how BAD life is with no indexes:
Here is my bridge analogy extended to such systems: imagine that every morning a bus, let's say NJ Transit, picks up passengers at bus stops and then heads across the GWB. How would it impact the "latency" of the trip if we threw away the schedule and the signed bus stops, and the bus just drove down every street to see if any of the people who wanted to go to NY were there? I don't imagine that would work very well. YOUR APP = query: { } [PAUSE] And yet users deploy applications into production without proper indexes in place, frequently because they didn't test properly: they didn't benchmark their application's performance. (More about that at the end.)
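The "drive down every street" behavior is a collection scan; a rough sketch with plain Python standing in for the database (the data is hypothetical) makes the difference visible:

```python
# 100,000 documents; roughly 1 in 100 riders is headed for NY.
docs = [{"_id": i, "dest": "NY" if i % 100 == 0 else "NJ"}
        for i in range(100_000)]

# "Collection scan": check every document -- driving down every street.
def find_without_index(dest):
    return [d for d in docs if d["dest"] == dest]

# "Index": a precomputed map from field value to documents -- the scheduled
# bus stop; we go straight to the matching riders.
index = {}
for d in docs:
    index.setdefault(d["dest"], []).append(d)

def find_with_index(dest):
    return index.get(dest, [])

# Both return the same 1,000 documents, but the scan examined all 100,000.
assert find_without_index("NY") == find_with_index("NY")
```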
I'm sure you are all excited to hear about how awesome WiredTiger is, and it is! But of course: the right tool for the job and all that. There are a couple of important differences between MMAP and WT that I want you to understand so you can take advantage of the strengths of each.
The most easily seen difference: WT has on-disk compression; MMAP does not. MMAP does X. WT does Y. Will it help with RAM? Yes: prefix index compression.
Index prefix compression: 7X (1/7th), 20% or less! / 40% / 3%
We have our own application, Evergreen, our continuous integration build system that runs thousands of tests and has TBs of log files. It was doing fine with MMAP, but with 10x compression in WT we are now able to keep 10x as much run history! There's a talk tomorrow afternoon about it.
If disk is a big limiting factor for your application, AND your data is highly compressible, AND you have CPU cycles available, then WT FTW!
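"Highly compressible" is easy to demonstrate: repetitive, structured records like log lines or time-series metrics compress several-fold. Here `zlib` stands in for a storage engine's block compressor, and the log-like documents are made up:

```python
import json
import zlib

# Hypothetical log-like documents with lots of repeated field names/values.
docs = [{"level": "INFO", "test": "suite_%d" % (i % 50), "status": "pass"}
        for i in range(1000)]
raw = json.dumps(docs).encode()
packed = zlib.compress(raw)

ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes ({ratio:.1f}x)")

# Repetitive data routinely compresses many times over; the cost is CPU.
assert ratio > 5
assert zlib.decompress(packed) == raw
```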
The interesting, complex part: CONCURRENCY impacts both latency and throughput. A lot has been said over the years about MMAP's low-granularity concurrency. It's like having relatively few toll booths in front of the GWB: it can be a limiting factor. But for the actual execution of the operation, MMAP is "faster," i.e., lower latency. WiredTiger has very fine-grained concurrency; in fact, it doesn't do "document-level *locking*" at all: it uses clever lock-free algorithms to achieve a high degree of concurrency. But related to that, the latency of a single operation is higher than with MMAP. WT ^thruput ^latency
Why would the granularity of locking impact latency this way? Imagine the GWB lanes again... MMAP is like having one toll booth (or one per lane): once you pay the toll, you *know* you are the only car in that lane, so you can go as fast as possible.
WiredTiger, well, I'm stretching the metaphor a little here, but imagine that there are no toll booths. Everyone has E-ZPass or FasTrak or whatever, and you drive to your lane, BUT you might find yourself in contention with another car for that lane, and then one of you has to stop and try again. So first, you can't drive quite so fast, because you have to be able to notice another car in your lane in time to stop; and second, if you do hit contention, you have to stop and try again. WRITE CONFLICTS. NO BLIND WRITES.
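The "stop and try again" behavior is essentially optimistic concurrency. Here's a toy sketch of the general technique (versioned read, prepare off to the side, retry on conflict); this illustrates the pattern, not WiredTiger's actual internals:

```python
def optimistic_update(doc, change, max_retries=10):
    """Apply change(value) -> new_value to doc, retrying on version conflict."""
    for attempt in range(max_retries + 1):
        seen_version = doc["version"]
        new_value = change(doc["value"])      # do the work off to the side
        if doc["version"] == seen_version:    # nobody else took our "lane"
            doc["value"] = new_value
            doc["version"] += 1               # publish a new version
            return attempt                    # how many retries it took
    raise RuntimeError("too much contention")

# Single-threaded, so the first attempt always wins here; under real
# contention, concurrent writers would bump the version and force retries.
doc = {"version": 0, "value": 41}
retries = optimistic_update(doc, lambda v: v + 1)
assert (doc["value"], retries) == (42, 0)
```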
So when is this a big win for WiredTiger? Well, you have to have: (a) multiple threads! With too few threads, you aren't winning big from the clever algorithms. (b) The threads have to be contending on the same collection (otherwise MMAP's collection-level lock isn't the bottleneck). (c) The threads must NOT all be contending on a single document (if they are, then, well, you see). (d) CPU available. But (e) you must not have significantly more threads than you have "lanes," in this case CPU cores. Here are some "benchmarks":
Not contending on the same document!!! ...and contending. (Uniform, latest, zipfian.)
You must not have significantly more threads than you have "lanes," in this case CPU cores. If you have a huge number of threads all trying to do active work on a small number of cores, you will waste a huge amount of resources on context switching instead of actually doing work, plus you'll have more threads contending on the same documents. This is true even for read-heavy loads. That's concurrency and multithreading.
So please don't do single-threaded benchmarking of WT and then ask how come it's not as fast as you heard. But don't benchmark 500 threads on a 4-core laptop either!
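A benchmark harness along these lines measures both numbers we care about, per-operation latency and overall throughput, at a thread count matched to your cores. The `fake_operation` below is a made-up stand-in; you'd plug in your real workload against real data:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_operation():
    time.sleep(0.001)   # stand-in for a real query against real data

def benchmark(op, n_ops=200, n_threads=4):
    latencies = []      # list.append is thread-safe in CPython
    def timed():
        t0 = time.perf_counter()
        op()
        latencies.append(time.perf_counter() - t0)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for _ in range(n_ops):
            pool.submit(timed)
    elapsed = time.perf_counter() - start   # pool waits for all ops to finish
    avg_latency = sum(latencies) / len(latencies)
    return avg_latency, n_ops / elapsed     # seconds per op, ops per second

avg_latency, throughput = benchmark(fake_operation)
print(f"avg latency {avg_latency*1000:.2f} ms, throughput {throughput:.0f} ops/s")
```

Match `n_threads` to your core count, per the point above; measuring with 1 thread or with 500 on a laptop tells you about the harness, not the engine.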
The other significant differentiator is the write pattern. I'm not talking about compressing data on disk and using disk IOPS more judiciously than MMAP; I'm talking about write amplification. There is a big difference in how writes are done during updates: MMAP does "in place" updates; WT does "copy on write" on all updates. Let me illustrate using a document rather than a bridge.
Here's a time-series document for a particular hour, with minutes and seconds. If you make an update to this document, MMAP will overwrite the existing document with the new value.
Back to the original document: WiredTiger will rewrite the current document (or, more technically, the internal page that contains that document) as a new version of that page. This of course enables whoever was reading that page to keep reading the previous version, which gets recycled once everyone who was using it is done with it. USE CASE: Think about the use case where you have a very high number of documents that are nonetheless a small portion of your total data, being updated extremely frequently, over and over again.
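A toy sketch of the two write patterns, with dicts standing in for on-disk pages and a made-up time-series document: in-place mutates the one shared copy, while copy-on-write publishes a fresh version and leaves readers of the old one untouched.

```python
import copy

# In-place (MMAP-style): one shared copy, mutated directly.
page_inplace = {"hour": "09", "minutes": {"00": 3}}
page_inplace["minutes"]["00"] += 1      # readers and writers share one copy

# Copy-on-write (WT-style): writers never touch the version readers hold.
page_v1 = {"hour": "09", "minutes": {"00": 3}}
reader_view = page_v1                   # a reader is using version 1
page_v2 = copy.deepcopy(page_v1)        # write amplification: whole new copy
page_v2["minutes"]["00"] += 1           # the update lands in the new version

assert reader_view["minutes"]["00"] == 3   # reader still sees the old value
assert page_v2["minutes"]["00"] == 4       # new readers get the new version
```

The deep copy is the write amplification: every counter increment rewrites the whole page, which is exactly why the heavy in-place-update workload below favors MMAP.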
I'm talking, of course, about a system like the MMS monitoring component, which receives a large number of performance metrics and updates counters inside documents that don't change except for these numbers being incremented for the duration of whatever the document represents. Here, with a schema heavily optimized to make sure updates happen in place, performance is better with MMAP even though it uses up more disk space (and RAM).
And this brings me to the most important point I'm going to make: all the generalizations are just that. No matter what I told you here today, no matter what you read on the internet, the only way to know for sure how fast your application is, with your carefully selected schema and your carefully selected indexes, is to stress test and measure it. The examples I used are both applications we run in-house, which we benchmarked with both storage engines under different configurations and physical resources to make the most appropriate choices; you should do the same. Oh, and if you happen to be going back to Jersey tonight and you want predictable latency,
do yourself a favor, and take the train.
Thank you!