To understand how to make your application fast, it's important to understand what makes the database fast. We will take a detailed look at how to think about performance, and how different choices in schema design affect your cluster's performance depending on the storage engines used and the physical resources available.
10. Is It Fast?
• In context of crossing the bridge, fast means:
– how long will it take one car
– how many cars can do it "at the same time"
11. Is It Fast?
Facts & Info
Opened to traffic
Upper level: October 25, 1931
Lower level: August 29, 1962
Bus Station opened: January 17, 1963
Length of bridge between anchorages: 4,760 feet
Width of bridge: 119 feet
Width of roadway: 90 feet
Height of tower above water: 604 feet
Water clearance at midspan: 212 feet
Number of toll lanes:
Upper level: 12
Lower level: 10
Palisades Interstate Parkway: 7*
* E-ZPass only overnight
2013 Traffic Volumes
Total New York-bound (eastbound) traffic: 49,402,245 vehicles
40. Unbounded growth
Deeply nested arrays
Really large documents
Schema Anti-Patterns: over-normalizing
you are over-normalizing if you are
doing JOINS in your application
instead of "finds"
88. Benchmark your own application
Use realistic workload
Use real data
Measure throughput and latency
Editor's Notes
What is fast? Before we can agree what our topic is, we have to literally define what fast means for you. For your application, for your users, for your stakeholders.
What's fast in one context may not be fast in another context. Let me give you an example.
For those unfamiliar with this area, here were my options: the Holland Tunnel, the Lincoln Tunnel, and the George Washington Bridge.
By far the most scenic is the George Washington Bridge, the world's busiest motor vehicle bridge, and twice as long as any previous suspension bridge when its design was finalized in 1923. Construction started in 1927, and the bridge first opened to traffic in 1931; in 1932, more than 5.5 million vehicles used the original six-lane roadway. Two center lanes were added in 1946, increasing capacity by a third. The six lanes of the lower roadway were completed in 1962, bringing the bridge to the 14 lanes it has today. So let me ask you this:
is the George Washington Bridge fast? Well, that's a bit of a non sequitur as a question in a vacuum, isn't it? The bridge cannot be fast; it's not even going anywhere! But we all have context here. So what matters when I ask this question is whether it's a fast way to get from NJ to NY.
*For me*, to get to NY "fast" meant to get across the Hudson River as quickly (and painlessly) as possible.
Given the speed limit, which is 45 MPH, let's just say that driving across the GW Bridge takes about one minute. But we wouldn't measure the GWB's capacity by how long it took me; we'd measure it by how many cars can make use of it: 50M just from NJ to NY.
Back to your application. In a vacuum, it's neither slow nor fast. When your stakeholders say "fast application," what we mean is that it performs whatever it is that it does for the end-user quickly. Just as I'm not the only car on the GWB, there is never just one end-user; we want the application to perform quickly and consistently for all end-users. For the user, what matters is a fast response time; for you, what matters is how many users can use it simultaneously.
How many users or operations we can process at any given time, or in a given period of time, is what we call throughput. So latency == how long something takes; throughput == how many you can process "in parallel." You'd be surprised how often they get confused for one another...
One of the reasons that latency and throughput get conflated when talking about performance is that they are closely related. You can easily see in the single-threaded case that your latency directly impacts your throughput: the higher (WORSE) the latency, the lower the throughput. And sometimes the lower the throughput, the higher the latency... which happens when, for example, your latency across the bridge is caused by delays at the toll booths,
THIS IS BECAUSE EACH PHYSICAL RESOURCE CAN ONLY ACCOMMODATE A FIXED NUMBER OF CLIENTS.
because everyone has to wait. So they get worse together: increasing latency can reduce your throughput, and decreasing throughput increases latency. That is undeniable; I'm sure we've all experienced it. A slightly less intuitive concept is that increasing throughput capacity may or may not reduce your latency. It depends on how much of the latency is inherent in doing the operation itself, and how much is caused by waiting due to... well, lack of throughput.
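To make the latency/throughput arithmetic concrete, here's a minimal sketch; all numbers are hypothetical, chosen to echo the bridge example:

```python
# A minimal sketch of the latency/throughput relationship described above.

def single_threaded_throughput(latency_s: float) -> float:
    """Ops per second when requests are processed strictly one at a time."""
    return 1.0 / latency_s

def parallel_throughput(latency_s: float, lanes: int) -> float:
    """With N independent 'lanes' (toll booths, threads), throughput scales,
    but each individual trip still takes latency_s seconds."""
    return lanes / latency_s

# Worse (higher) latency means lower throughput in the single-threaded case:
assert single_threaded_throughput(0.050) < single_threaded_throughput(0.010)

# Adding lanes raises throughput without shortening any one trip:
crossing = 60.0  # roughly one minute across the bridge
print(parallel_throughput(crossing, 14), "cars/second with 14 lanes")
```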
Adding more lanes without adding more toll booths will *not* help with either throughput or latency. [click] Adding more toll booths will likely reduce the time across the bridge... but only to a point, because no matter how many toll booths or lanes you add, the laws of physics (and speed limit laws) will make it hard to reduce the duration of our trip across the Hudson to less than about a minute.
Speed of light: it's not just a good idea, it's the law!
Why does all of this matter, and how does it tie into your application design decisions? Well, just like getting across the Hudson requires a working vehicle, an open road, and a variety of other favorable conditions, your application comprises many components, and all of them must be working together optimally to get the best possible performance to your end-user. Focusing on speeding up the wrong component (not the bottleneck) will be useless... SYSTEM COMPONENTS
Any one of them can slow down each user as they go through it, and that will increase latency, reducing your throughput significantly. You know, the opposite of a "fast application." Components can be split into two groups:
Physical components (resources) and conceptual components: your algorithms, data structures, schema, indexes, choice of storage engine, choice of OS and file system. All those choices affect how your physical resources will be used up. So if you don't design your application well, you will unnecessarily exhaust some of these limited physical resources, causing your application to perform worse than it might with an optimal design. [PAUSE] Of course, physical components must be properly sized and tuned. File system tuning, OS tuning: we don't have time to get into the specifics here, but there's lots of information available, so just keep in mind that if you don't follow best practices in configuring your file system, it's a little bit like trying to drive across the GW Bridge with four flat tires. Let's just agree that's a bad idea? [PAUSE] Two big components we will focus on in detail are the parts of the DB:
[ schema / indexes ] [ STORAGE ENGINE ]
Schema design is the building block of your application, and getting it right is essential to making your application's DB requests efficient. We do that by structuring your data in a way that your application can easily read and write. This will minimize the resources used while minimizing the latency of each request.
Tailoring your schema design to fit your read and write patterns is like using the right tool for the job. Good schema design will always take data locality into account: co-locating data that you tend to access at the same time into the same documents. Now, that's a rule of thumb; there are definitely ways to take this too far. An important counterpoint is: don't store data in the document that you tend not to need immediately.
Imagine you have to get 50 people across the George Washington Bridge. Would you use a car and make over a dozen trips? Or would you use a much slower-moving bus and get the job done in a single trip? [PAUSE] On the other hand, if you have one passenger, you might get better gas mileage if you take a car rather than the bus. If you are making lots of trips
to fetch all the data you need for a single operation, that's an anti-pattern in schema design that we recognize as over-normalization. On the other hand, getting far more data each time than you need is usually a sign of the opposite problem; let's call it over-embedding. "ANTI-PATTERNS"
Signs you might be over-embedding: your documents tend to grow unbounded (you keep pushing more values into arrays, though you don't usually need them all); you have deeply nested arrays within arrays but usually need to work with only a small number of their elements (NOT ALWAYS); your documents are really large.
[PAUSE] Some of the signs you might be over-normalizing:
One sign you might be over-normalizing: [CLICK] you keep implementing joins in your application for every "query".
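The contrast between the app-side join and the single find can be sketched with plain dicts standing in for MongoDB collections; the `posts`/`comments` data here is entirely hypothetical:

```python
# Over-normalized: comments live in their own "collection", so rendering one
# post requires a join in the application -- one lookup per comment.
posts = {1: {"_id": 1, "title": "Is It Fast?", "comment_ids": [10, 11]}}
comments = {10: {"_id": 10, "text": "Great talk"},
            11: {"_id": 11, "text": "Take the train"}}

def render_post_normalized(post_id):
    post = posts[post_id]                                       # find #1
    body = [comments[c]["text"] for c in post["comment_ids"]]   # finds #2..N
    return post["title"], body

# Embedded: comments are co-located with the post, so one find does it all.
posts_embedded = {1: {"_id": 1, "title": "Is It Fast?",
                      "comments": [{"text": "Great talk"},
                                   {"text": "Take the train"}]}}

def render_post_embedded(post_id):
    post = posts_embedded[post_id]                              # a single find
    return post["title"], [c["text"] for c in post["comments"]]

# Same result, very different number of round trips:
assert render_post_normalized(1) == render_post_embedded(1)
```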
Other signs you may run into trouble with your schema in the future: you haven't considered the relative SLAs of reads vs. writes. Usually, if we architect our system to make one of those faster, it's at the cost of the other; more on that when we come to indexes. So knowing up front which one you can afford to be a bit slower (higher latency) will help you make these trade-offs correctly.
Another one: you have lots of different types of documents in the same collection. That's usually a sign of trouble. [PAUSE]
You have lots of different types of values in the same field across a collection (sometimes a string, sometimes a date, sometimes a number). [PAUSE] That brings us to the BIGGEST warning sign: your queries can't use indexes efficiently:
Your queries can't use indexes efficiently: unanchored or case-insensitive regexes; you need dozens of indexes for a single collection; or, worst of all, you have no idea what indexes you might possibly need on a collection. Which brings us to the other biggest determining factor of whether your application will be fast:
I wouldn't be exaggerating if I told you that when our support team is dealing with a customer whose application is "slow," over 90% of the time the indexes are suboptimal or outright missing for some high percentage of the slow operations! And this is in spite of the fact that we constantly harp on how important indexing is to good performance; and of course *all* databases require indexing to work well, right? Let me show you how BAD life is with no indexes:
Here is my bridge analogy extended to such systems: imagine that every morning a bus, let's say NJ Transit, picks up passengers at bus stops and then heads across the GWB. How would it impact the "latency" of the trip if we threw away the schedule and the signed bus stops, and the bus just drove down every street to see if any of the people who wanted to go to NY were there? I don't imagine that would work very well. YOUR APP = query: { } [PAUSE] And yet users deploy applications into production without proper indexes in place, frequently because they didn't test properly: they didn't benchmark their application's performance. (More about that at the end.)
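The "drive down every street" behavior is a collection scan; a rough sketch with plain Python standing in for the database (the data is hypothetical) makes the difference visible:

```python
# 100,000 documents; roughly 1 in 100 riders is headed for NY.
docs = [{"_id": i, "dest": "NY" if i % 100 == 0 else "NJ"}
        for i in range(100_000)]

# "Collection scan": check every document -- driving down every street.
def find_without_index(dest):
    return [d for d in docs if d["dest"] == dest]

# "Index": a precomputed map from field value to documents -- the scheduled
# bus stop; we go straight to the matching riders.
index = {}
for d in docs:
    index.setdefault(d["dest"], []).append(d)

def find_with_index(dest):
    return index.get(dest, [])

# Both return the same 1,000 documents, but the scan examined all 100,000.
assert find_without_index("NY") == find_with_index("NY")
```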
I'm sure you are all excited to hear about how awesome WiredTiger is, and it is! But of course: the right tool for the job and all that. There are a couple of important differences between MMAP and WT that I want you to understand so you can take advantage of the strengths of each.
The most easily seen difference: WT has on-disk compression; MMAP does not. MMAP does X. WT does Y. Will it help with RAM? Yes: prefix index compression.
Index prefix compression: 7X (1/7th), 20% or less! / 40% / 3%
We have our own application, Evergreen, our continuous integration build system that runs thousands of tests and has TBs of log files. It was doing fine with MMAP, but with 10x compression in WT we are now able to keep 10x as much run history! There's a talk tomorrow afternoon about it.
If disk is a big limiting factor for your application, AND your data is highly compressible, AND you have CPU cycles available, then WT FTW!
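"Highly compressible" is easy to demonstrate: repetitive, structured records like log lines or time-series metrics compress several-fold. Here `zlib` stands in for a storage engine's block compressor, and the log-like documents are made up:

```python
import json
import zlib

# Hypothetical log-like documents with lots of repeated field names/values.
docs = [{"level": "INFO", "test": "suite_%d" % (i % 50), "status": "pass"}
        for i in range(1000)]
raw = json.dumps(docs).encode()
packed = zlib.compress(raw)

ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes ({ratio:.1f}x)")

# Repetitive data routinely compresses many times over; the cost is CPU.
assert ratio > 5
assert zlib.decompress(packed) == raw
```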
The interesting, complex part: CONCURRENCY impacts both latency and throughput. A lot has been said over the years about MMAP's low-granularity concurrency. It's like having relatively few toll booths in front of the GWB: it can be a limiting factor. But for the actual execution of the operation, MMAP is "faster," i.e., lower latency. WiredTiger has very fine-grained concurrency; in fact, it doesn't do "document-level *locking*" at all: it uses clever lock-free algorithms to achieve a high degree of concurrency. But related to that, the latency of a single operation is higher than with MMAP. WT ^thruput ^latency
Why would the granularity of locking impact latency this way? Imagine the GWB lanes again... MMAP is like having one toll booth (or one per lane): once you pay the toll, you *know* you are the only car in that lane, so you can go as fast as possible.
WiredTiger, well, I'm stretching the metaphor a little here, but imagine that there are no toll booths. Everyone has E-ZPass or FasTrak or whatever, and you drive to your lane, BUT you might find yourself in contention with another car for that lane, and then one of you has to stop and try again. So first, you can't drive quite so fast, because you have to be able to notice another car in your lane in time to stop; and second, if you do hit contention, you have to stop and try again. WRITE CONFLICTS. NO BLIND WRITES.
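The "stop and try again" behavior is essentially optimistic concurrency. Here's a toy sketch of the general technique (versioned read, prepare off to the side, retry on conflict); this illustrates the pattern, not WiredTiger's actual internals:

```python
def optimistic_update(doc, change, max_retries=10):
    """Apply change(value) -> new_value to doc, retrying on version conflict."""
    for attempt in range(max_retries + 1):
        seen_version = doc["version"]
        new_value = change(doc["value"])      # do the work off to the side
        if doc["version"] == seen_version:    # nobody else took our "lane"
            doc["value"] = new_value
            doc["version"] += 1               # publish a new version
            return attempt                    # how many retries it took
    raise RuntimeError("too much contention")

# Single-threaded, so the first attempt always wins here; under real
# contention, concurrent writers would bump the version and force retries.
doc = {"version": 0, "value": 41}
retries = optimistic_update(doc, lambda v: v + 1)
assert (doc["value"], retries) == (42, 0)
```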
So when is this a big win for WiredTiger? Well, you have to have: (a) multiple threads! With too few threads, you aren't winning big from the clever algorithms. (b) The threads have to be contending on the same collection (otherwise MMAP's collection-level lock isn't the bottleneck). (c) The threads must NOT all be contending on a single document (if they are, then, well, you see). (d) CPU available. But (e) you must not have significantly more threads than you have "lanes," in this case CPU cores. Here are some "benchmarks":
Not contending on the same document!!! ...and contending. (Uniform, latest, zipfian.)
You must not have significantly more threads than you have "lanes," in this case CPU cores. If you have a huge number of threads all trying to do active work on a small number of cores, you will waste a huge amount of resources on context switching instead of actually doing work, plus you'll have more threads contending on the same documents. This is true even for read-heavy loads. That's concurrency and multithreading.
So please don't do single-threaded benchmarking of WT and then ask how come it's not as fast as you heard. But don't benchmark 500 threads on a 4-core laptop either!
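A benchmark harness along these lines measures both numbers we care about, per-operation latency and overall throughput, at a thread count matched to your cores. The `fake_operation` below is a made-up stand-in; you'd plug in your real workload against real data:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_operation():
    time.sleep(0.001)   # stand-in for a real query against real data

def benchmark(op, n_ops=200, n_threads=4):
    latencies = []      # list.append is thread-safe in CPython
    def timed():
        t0 = time.perf_counter()
        op()
        latencies.append(time.perf_counter() - t0)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for _ in range(n_ops):
            pool.submit(timed)
    elapsed = time.perf_counter() - start   # pool waits for all ops to finish
    avg_latency = sum(latencies) / len(latencies)
    return avg_latency, n_ops / elapsed     # seconds per op, ops per second

avg_latency, throughput = benchmark(fake_operation)
print(f"avg latency {avg_latency*1000:.2f} ms, throughput {throughput:.0f} ops/s")
```

Match `n_threads` to your core count, per the point above; measuring with 1 thread or with 500 on a laptop tells you about the harness, not the engine.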
The other significant differentiator is the write pattern. I'm not talking about compressing data on disk and using disk IOPS more judiciously than MMAP; I'm talking about write amplification. There is a big difference in how writes are done during updates: MMAP does "in place" updates; WT does "copy on write" on all updates. Let me illustrate using a document rather than a bridge.
Here's a time-series document for a particular hour, with minutes and seconds. If you make an update to this document, MMAP will overwrite the existing document with the new value.
Back to the original document: WiredTiger will rewrite the current document (or, more technically, the internal page that contains that document) as a new version of that page. This of course enables whoever was reading that page to keep reading the previous version, which gets recycled once everyone who was using it is done with it. USE CASE: Think about the use case where you have a very high number of documents that are nonetheless a small portion of your total data, being updated extremely frequently, over and over again.
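A toy sketch of the two write patterns, with dicts standing in for on-disk pages and a made-up time-series document: in-place mutates the one shared copy, while copy-on-write publishes a fresh version and leaves readers of the old one untouched.

```python
import copy

# In-place (MMAP-style): one shared copy, mutated directly.
page_inplace = {"hour": "09", "minutes": {"00": 3}}
page_inplace["minutes"]["00"] += 1      # readers and writers share one copy

# Copy-on-write (WT-style): writers never touch the version readers hold.
page_v1 = {"hour": "09", "minutes": {"00": 3}}
reader_view = page_v1                   # a reader is using version 1
page_v2 = copy.deepcopy(page_v1)        # write amplification: whole new copy
page_v2["minutes"]["00"] += 1           # the update lands in the new version

assert reader_view["minutes"]["00"] == 3   # reader still sees the old value
assert page_v2["minutes"]["00"] == 4       # new readers get the new version
```

The deep copy is the write amplification: every counter increment rewrites the whole page, which is exactly why the heavy in-place-update workload below favors MMAP.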
I'm talking, of course, about a system like the MMS monitoring component, which receives a large number of performance metrics and updates counters inside documents that don't change except for these numbers being incremented for the duration of whatever the document represents. Here, with a schema heavily optimized to make sure updates happen in place, performance is better with MMAP even though it uses up more disk space (and RAM).
And this brings me to the most important point I'm going to make: all the generalizations are just that. No matter what I told you here today, no matter what you read on the internet, the only way to know for sure how fast your application is, with your carefully selected schema and your carefully selected indexes, is to stress test and measure it. The examples I used are both applications we run in-house, which we benchmarked with both storage engines under different configurations and physical resources to make the most appropriate choices; you should do the same. Oh, and if you happen to be going back to Jersey tonight and you want predictable latency,
do yourself a favor, and take the train.
Thank you!