SlideShare ist ein Scribd-Unternehmen logo
1 von 14
Downloaden Sie, um offline zu lesen
MongoDB	
  Sharding	
  

How	
  queries	
  work	
  in	
  sharded	
  environments	
  
One	
  small	
  server.	
  	
  We	
  want	
  more	
  
capacity.	
  	
  What	
  to	
  do?	
  
TradiAonally,	
  we	
  would	
  scale	
  
verAcally	
  with	
  a	
  bigger	
  box.	
  
With	
  sharding	
  we	
  instead	
  scale	
  
horizontally	
  to	
  achieve	
  the	
  same	
  
computaAonal/storage/memory	
  
footprint	
  from	
  smaller	
  servers.	
  




                           m=10	
  
We	
  will	
  show	
  the	
  verAcally	
  scale	
  db	
  and	
  the	
  horizontally	
  scaled	
  db	
  
side-­‐by-­‐side	
  for	
  comparison.	
  
A	
  sharded	
  MongoDB	
  collecAon	
  has	
  a	
  shard	
  key.	
  	
  The	
  collecAon	
  is	
  
parAAoned	
  in	
  an	
  order-­‐preserving	
  manner	
  using	
  this	
  key.	
  	
  In	
  this	
  
example	
  a	
  is	
  our	
  shard	
  key:	
  

{	
  a	
  :	
  …,	
  b	
  :	
  …,	
  c	
  :	
  …	
  }	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  a	
  is	
  declared	
  shard	
  key	
  for	
  the	
  collec0on	
  
{	
  a	
  :	
  …,	
  b	
  :	
  …,	
  c	
  :	
  …	
  }	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  a	
  is	
  shard	
  key	
  

Metadata	
  is	
  maintained	
  on	
  chunks	
  which	
  are	
  represented	
  by	
  
shard	
  key	
  ranges.	
  	
  Each	
  chunk	
  is	
  assigned	
  to	
  a	
  parAcular	
  shard.	
  
                                                                                                          Range	
                        Shard	
  
                                                                                                          a	
  in	
  [-­‐∞,2000)	
       2	
  
                                                                                                          a	
  in	
  [2000,2100)	
       8	
  
                                                                                                          a	
  in	
  [2100,5500)	
       3	
  
                                                                                                          …	
                            …	
  
                                                                                                          a	
  in	
  [88700,	
  ∞)	
     0	
  



When	
  a	
  chunk	
  becomes	
  too	
  large,	
  MongoDB	
  automaAcally	
  splits	
  
it,	
  and	
  the	
  balancer	
  will	
  later	
  migrate	
  chunks	
  as	
  necessary.	
  
{	
  a	
  :	
  …,	
  b	
  :	
  …,	
  c	
  :	
  …	
  }	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  a	
  is	
  shard	
  key	
  

find(	
  {	
  a	
  :	
  {	
  $gt	
  :	
  333,	
  $lt	
  :	
  400	
  }	
  )	
  

                                                                                                          Range	
                        Shard	
  
                                                                                                          a	
  in	
  [-­‐∞,2000)	
       2	
  
                                                                                                          a	
  in	
  [2000,2100)	
       8	
  
                                                                                                          a	
  in	
  [2100,5500)	
       3	
  
                                                                                                          …	
                            …	
  
                                                                                                          a	
  in	
  [88700,	
  ∞)	
     0	
  



The	
  mongos	
  process	
  routes	
  a	
  query	
  to	
  the	
  correct	
  shard(s).	
  	
  For	
  
the	
  query	
  above,	
  all	
  data	
  possibly	
  relevant	
  is	
  on	
  shard	
  2,	
  so	
  the	
  
query	
  is	
  sent	
  to	
  that	
  node	
  only,	
  and	
  processed	
  there.	
  
{	
  a	
  :	
  …,	
  b	
  :	
  …,	
  c	
  :	
  …	
  }	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  a	
  is	
  declared	
  shard	
  key	
  

find(	
  {	
  a	
  :	
  {	
  $gt	
  :	
  333,	
  $lt	
  :	
  2012	
  }	
  )	
  

                                                                                              Range	
                                  Shard	
  
                                                                                              a	
  in	
  [-­‐∞,2000)	
                 2	
  
                                                                                              a	
  in	
  [2000,2100)	
                 8	
  
                                                                                              a	
  in	
  [2100,5500)	
                 3	
  
                                                                                              …	
                                      …	
  
                                                                                              a	
  in	
  [88700,	
  ∞)	
               0	
  



SomeAmes	
  a	
  query	
  range	
  might	
  span	
  more	
  than	
  one	
  shard,	
  but	
  
very	
  few	
  in	
  total.	
  This	
  is	
  reasonably	
  efficient.	
  
{	
  a	
  :	
  …,	
  b	
  :	
  …,	
  c	
  :	
  …	
  }	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  non-­‐shard	
  key	
  query,	
  no	
  index	
  

find(	
  {	
  b	
  :	
  99	
  }	
  )	
  

                                                                                       Range	
                                       Shard	
  
                                                                                       a	
  in	
  [-­‐∞,2000)	
                      2	
  
                                                                                       a	
  in	
  [2000,2100)	
                      8	
  
                                                                                       a	
  in	
  [2100,5500)	
                      3	
  
                                                                                       …	
                                           …	
  
                                                                                       a	
  in	
  [88700,	
  ∞)	
                    0	
  



Queries	
  not	
  involving	
  the	
  shard	
  key	
  will	
  be	
  sent	
  to	
  all	
  shards	
  as	
  a	
  
“scader/gather”	
  operaAon.	
  	
  This	
  is	
  someAmes	
  ok.	
  	
  Here	
  on	
  both	
  
our	
  tradiAonal	
  machine	
  and	
  the	
  shards,	
  we	
  do	
  a	
  table	
  scan	
  -­‐-­‐	
  
equally	
  expensive	
  (roughly)	
  on	
  both.	
  
{	
  a	
  :	
  …,	
  b	
  :	
  …,	
  c	
  :	
  …	
  }	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Sca8er	
  /	
  gather	
  with	
  secondary	
  index	
  

ensureIndex({b:1})	
  
find(	
  {	
  b	
  :	
  99	
  }	
  )	
  
                                                                                 Range	
                                   Shard	
  
                                                                                 a	
  in	
  [-­‐∞,2000)	
                  2	
  
                                                                                 a	
  in	
  [2000,2100)	
                  8	
  
                                                                                 a	
  in	
  [2100,5500)	
                  3	
  
                                                                                 …	
                                       …	
  
                                                                                 a	
  in	
  [88700,	
  ∞)	
                0	
  



Once	
  again	
  a	
  query	
  with	
  a	
  shard	
  key	
  results	
  in	
  a	
  scader/gather	
  
operaAon.	
  	
  However	
  at	
  each	
  shard,	
  we	
  can	
  use	
  the	
  {b:1}	
  index	
  to	
  
make	
  the	
  operaAon	
  efficient	
  for	
  that	
  shard.	
  	
  We	
  have	
  a	
  lidle	
  extra	
  
overhead	
  over	
  the	
  verAcal	
  configuraAon	
  for	
  the	
  communicaAons	
  
effort	
  from	
  mongos	
  to	
  each	
  shard	
  -­‐-­‐	
  not	
  too	
  bad	
  if	
  number	
  of	
  
shards	
  is	
  small	
  (10)	
  but	
  quite	
  substanAal	
  for	
  say,	
  a	
  1000	
  shard	
  
system.	
  
{	
  a	
  :	
  …,	
  b	
  :	
  …,	
  c	
  :	
  …	
  }	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Non-­‐shard	
  key	
  query,	
  secondary	
  index	
  

find(	
  {	
  b	
  :	
  99,	
  a	
  :	
  100	
  }	
  )	
  

                                                                               Range	
                                    Shard	
  
                                                                               a	
  in	
  [-­‐∞,2000)	
                   2	
  
                                                                               a	
  in	
  [2000,2100)	
                   8	
  
                                                                               a	
  in	
  [2100,5500)	
                   3	
  
                                                                               …	
                                        …	
  
                                                                               a	
  in	
  [88700,	
  ∞)	
                 0	
  



The	
  a	
  term	
  involves	
  the	
  shard	
  key	
  and	
  allows	
  mongos	
  to	
  
intelligently	
  route	
  the	
  query	
  to	
  shard	
  2.	
  	
  Once	
  the	
  query	
  reaches	
  
shard	
  2,	
  the	
  {	
  b	
  :	
  1	
  }	
  index	
  can	
  be	
  used	
  to	
  efficiently	
  process	
  the	
  
query.	
  
When	
  sorAng	
  is	
  specified,	
  the	
  relevant	
  shards	
  sort	
  locally,	
  and	
  
            then	
  mongos	
  merges	
  the	
  results.	
  	
  Thus	
  the	
  mongos	
  resource	
  
            usage	
  is	
  not	
  terribly	
  high.	
  



                                        client	
  



                                         Adam	
  
                                          Bob	
  
                                         David	
  
                                          Julie	
  
                                          Sue	
  
                                         Time	
  
                                         Zack	
  




                                       mongos	
  


 Bob	
  
David	
                                   Sue	
                     Adam	
  
Julie	
                                   Tim	
                     Zack	
  
When	
  using	
  replicaAon	
  (typically	
  a	
  replica	
  set),	
  we	
  simply	
  have	
  
more	
  than	
  one	
  node	
  per	
  shard.	
  

(arrows	
  below	
  indicate	
  replica0on,	
  tradi0onal	
  vs.	
  sharded	
  
environments)	
  

Weitere ähnliche Inhalte

Was ist angesagt?

Oracle db architecture
Oracle db architectureOracle db architecture
Oracle db architecture
Simon Huang
 
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
DataStax
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQL
Morgan Tocker
 

Was ist angesagt? (20)

Indexing
IndexingIndexing
Indexing
 
Oracle db architecture
Oracle db architectureOracle db architecture
Oracle db architecture
 
Mongo DB Presentation
Mongo DB PresentationMongo DB Presentation
Mongo DB Presentation
 
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
 
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
 
MongoDB 101
MongoDB 101MongoDB 101
MongoDB 101
 
MongoDB Database Replication
MongoDB Database ReplicationMongoDB Database Replication
MongoDB Database Replication
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pages
 
Structured query language(sql)ppt
Structured query language(sql)pptStructured query language(sql)ppt
Structured query language(sql)ppt
 
12. oracle database architecture
12. oracle database architecture12. oracle database architecture
12. oracle database architecture
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQL
 
InnoDB MVCC Architecture (by 권건우)
InnoDB MVCC Architecture (by 권건우)InnoDB MVCC Architecture (by 권건우)
InnoDB MVCC Architecture (by 권건우)
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
 
SQLServer Database Structures
SQLServer Database Structures SQLServer Database Structures
SQLServer Database Structures
 
Oracle statistics by example
Oracle statistics by exampleOracle statistics by example
Oracle statistics by example
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQL
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
Indexing with MongoDB
Indexing with MongoDBIndexing with MongoDB
Indexing with MongoDB
 
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
 

Andere mochten auch

MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
MongoDB
 
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor ManagementMongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB
 
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
MongoDB
 

Andere mochten auch (20)

Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About Sharding
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
 
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor ManagementMongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
 
Choosing a Shard key
Choosing a Shard keyChoosing a Shard key
Choosing a Shard key
 
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
 
Eclipse Paho - MQTT and the Internet of Things
Eclipse Paho - MQTT and the Internet of ThingsEclipse Paho - MQTT and the Internet of Things
Eclipse Paho - MQTT and the Internet of Things
 
Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...
Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...
Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...
 
How to monitor MongoDB
How to monitor MongoDBHow to monitor MongoDB
How to monitor MongoDB
 
Efficient Pagination Using MySQL
Efficient Pagination Using MySQLEfficient Pagination Using MySQL
Efficient Pagination Using MySQL
 
Pagination Done the Right Way
Pagination Done the Right WayPagination Done the Right Way
Pagination Done the Right Way
 
Open Source IoT at Eclipse
Open Source IoT at EclipseOpen Source IoT at Eclipse
Open Source IoT at Eclipse
 
BigData_TP5 : Neo4J
BigData_TP5 : Neo4JBigData_TP5 : Neo4J
BigData_TP5 : Neo4J
 
BigData_TP2: Design Patterns dans Hadoop
BigData_TP2: Design Patterns dans HadoopBigData_TP2: Design Patterns dans Hadoop
BigData_TP2: Design Patterns dans Hadoop
 
BigData_TP4 : Cassandra
BigData_TP4 : CassandraBigData_TP4 : Cassandra
BigData_TP4 : Cassandra
 
BigData_Chp5: Putting it all together
BigData_Chp5: Putting it all togetherBigData_Chp5: Putting it all together
BigData_Chp5: Putting it all together
 
BigData_TP1: Initiation à Hadoop et Map-Reduce
BigData_TP1: Initiation à Hadoop et Map-ReduceBigData_TP1: Initiation à Hadoop et Map-Reduce
BigData_TP1: Initiation à Hadoop et Map-Reduce
 
BigData_TP3 : Spark
BigData_TP3 : SparkBigData_TP3 : Spark
BigData_TP3 : Spark
 
BigData_Chp3: Data Processing
BigData_Chp3: Data ProcessingBigData_Chp3: Data Processing
BigData_Chp3: Data Processing
 
BigData_Chp2: Hadoop & Map-Reduce
BigData_Chp2: Hadoop & Map-ReduceBigData_Chp2: Hadoop & Map-Reduce
BigData_Chp2: Hadoop & Map-Reduce
 

Mehr von MongoDB

Mehr von MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Kürzlich hochgeladen

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Kürzlich hochgeladen (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

How queries work with sharding

  • 1. MongoDB  Sharding   How  queries  work  in  sharded  environments  
  • 2. One  small  server.    We  want  more   capacity.    What  to  do?  
  • 3. TradiAonally,  we  would  scale   verAcally  with  a  bigger  box.  
  • 4. With  sharding  we  instead  scale   horizontally  to  achieve  the  same   computaAonal/storage/memory   footprint  from  smaller  servers.   m=10  
  • 5. We  will  show  the  verAcally  scale  db  and  the  horizontally  scaled  db   side-­‐by-­‐side  for  comparison.  
  • 6. A  sharded  MongoDB  collecAon  has  a  shard  key.    The  collecAon  is   parAAoned  in  an  order-­‐preserving  manner  using  this  key.    In  this   example  a  is  our  shard  key:   {  a  :  …,  b  :  …,  c  :  …  }                    a  is  declared  shard  key  for  the  collec0on  
  • 7. {  a  :  …,  b  :  …,  c  :  …  }                    a  is  shard  key   Metadata  is  maintained  on  chunks  which  are  represented  by   shard  key  ranges.    Each  chunk  is  assigned  to  a  parAcular  shard.   Range   Shard   a  in  [-­‐∞,2000)   2   a  in  [2000,2100)   8   a  in  [2100,5500)   3   …   …   a  in  [88700,  ∞)   0   When  a  chunk  becomes  too  large,  MongoDB  automaAcally  splits   it,  and  the  balancer  will  later  migrate  chunks  as  necessary.  
  • 8. {  a  :  …,  b  :  …,  c  :  …  }                    a  is  shard  key   find(  {  a  :  {  $gt  :  333,  $lt  :  400  }  )   Range   Shard   a  in  [-­‐∞,2000)   2   a  in  [2000,2100)   8   a  in  [2100,5500)   3   …   …   a  in  [88700,  ∞)   0   The  mongos  process  routes  a  query  to  the  correct  shard(s).    For   the  query  above,  all  data  possibly  relevant  is  on  shard  2,  so  the   query  is  sent  to  that  node  only,  and  processed  there.  
  • 9. {  a  :  …,  b  :  …,  c  :  …  }                    a  is  declared  shard  key   find(  {  a  :  {  $gt  :  333,  $lt  :  2012  }  )   Range   Shard   a  in  [-­‐∞,2000)   2   a  in  [2000,2100)   8   a  in  [2100,5500)   3   …   …   a  in  [88700,  ∞)   0   SomeAmes  a  query  range  might  span  more  than  one  shard,  but   very  few  in  total.  This  is  reasonably  efficient.  
  • 10. {  a  :  …,  b  :  …,  c  :  …  }                    non-­‐shard  key  query,  no  index   find(  {  b  :  99  }  )   Range   Shard   a  in  [-­‐∞,2000)   2   a  in  [2000,2100)   8   a  in  [2100,5500)   3   …   …   a  in  [88700,  ∞)   0   Queries  not  involving  the  shard  key  will  be  sent  to  all  shards  as  a   “scader/gather”  operaAon.    This  is  someAmes  ok.    Here  on  both   our  tradiAonal  machine  and  the  shards,  we  do  a  table  scan  -­‐-­‐   equally  expensive  (roughly)  on  both.  
  • 11. {  a  :  …,  b  :  …,  c  :  …  }                    Sca8er  /  gather  with  secondary  index   ensureIndex({b:1})   find(  {  b  :  99  }  )   Range   Shard   a  in  [-­‐∞,2000)   2   a  in  [2000,2100)   8   a  in  [2100,5500)   3   …   …   a  in  [88700,  ∞)   0   Once  again  a  query  with  a  shard  key  results  in  a  scader/gather   operaAon.    However  at  each  shard,  we  can  use  the  {b:1}  index  to   make  the  operaAon  efficient  for  that  shard.    We  have  a  lidle  extra   overhead  over  the  verAcal  configuraAon  for  the  communicaAons   effort  from  mongos  to  each  shard  -­‐-­‐  not  too  bad  if  number  of   shards  is  small  (10)  but  quite  substanAal  for  say,  a  1000  shard   system.  
  • 12. {  a  :  …,  b  :  …,  c  :  …  }                    Non-­‐shard  key  query,  secondary  index   find(  {  b  :  99,  a  :  100  }  )   Range   Shard   a  in  [-­‐∞,2000)   2   a  in  [2000,2100)   8   a  in  [2100,5500)   3   …   …   a  in  [88700,  ∞)   0   The  a  term  involves  the  shard  key  and  allows  mongos  to   intelligently  route  the  query  to  shard  2.    Once  the  query  reaches   shard  2,  the  {  b  :  1  }  index  can  be  used  to  efficiently  process  the   query.  
  • 13. When  sorAng  is  specified,  the  relevant  shards  sort  locally,  and   then  mongos  merges  the  results.    Thus  the  mongos  resource   usage  is  not  terribly  high.   client   Adam   Bob   David   Julie   Sue   Time   Zack   mongos   Bob   David   Sue   Adam   Julie   Tim   Zack  
  • 14. When  using  replicaAon  (typically  a  replica  set),  we  simply  have   more  than  one  node  per  shard.   (arrows  below  indicate  replica0on,  tradi0onal  vs.  sharded   environments)