Scaling and Sharding
       Eliot Horowitz
       @eliothorowitz
        MongoBoston
     September 20, 2010
Scaling

• Data size only goes up
• Operations/sec only go up
• Vertical scaling is limited
• Hard to scale vertically in the cloud
• Can scale wider than higher
Traditional Horizontal Scaling

• read only slaves
• caching
• custom partitioning code
Newer Scaling

• relational database clustering
• consistent hashing (Dynamo)
• range based partitioning (BigTable/PNUTS)
MongoDB Sharding

• Scale horizontally for data size, index size,
  write and consistent read scaling
• Distribute databases, collections or
  objects in a collection
• Auto-balancing, migrations, management
  happen with no downtime
• Choose how you partition data
• Can convert from single master to sharded
  system with no downtime
• Same features as non-sharding single
  master
• Fully consistent
Range Based



• collection is broken into chunks by range
• chunks default to 200 MB or 100,000
  objects
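
A minimal sketch of range-based chunk lookup, assuming a numeric shard key. The chunk table, its ranges and the shard names are made up for illustration and are not taken from the talk; a real cluster keeps this metadata on the config servers.

```javascript
// Hypothetical chunk table for a collection sharded on a numeric key.
// Each chunk covers the half-open range [min, max).
const chunks = [
  { min: -Infinity, max: 100, shard: "shard0" },
  { min: 100, max: 500, shard: "shard1" },
  { min: 500, max: Infinity, shard: "shard2" },
];

// Return the chunk whose range contains the given shard key value.
function findChunk(chunkTable, key) {
  return chunkTable.find((c) => key >= c.min && key < c.max);
}

console.log(findChunk(chunks, 42).shard);
console.log(findChunk(chunks, 100).shard);
```

Because the ranges are contiguous and non-overlapping, every key value maps to exactly one chunk, and so to exactly one shard.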
Architecture

  Shards:          mongod   mongod   mongod   ...
                   mongod   mongod   mongod
  Config Servers:  mongod   mongod   mongod
  Routers:         mongos   mongos   ...
  Client:          client
User profiles

• Partition by user_id
• Secondary indexes on location, dates, etc...
• Reads/writes know which shard to hit
User Activity Stream

• Shard by user_id
• Loading a user’s stream hits a single shard
• Writes are distributed across all shards
• Can index on activity for deleting
Photos


• Can shard by photo_id for best read/write
  distribution
• Secondary index on tags, date
Logging

Possible Shard Keys


• date
• machine, date
• logger name
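
The difference between these candidate keys can be illustrated with a small simulation. It is only a sketch: the split points, machine names and timestamps are invented, and the chunk lookup is a simplified stand-in for what a router does. It shows why the notes for this talk say not to shard by date: new log entries always have the newest dates, so they all land in the last chunk.

```javascript
// Compare two compound keys (arrays) field by field.
function compareKeys(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Index of the chunk holding `key`, given sorted split points.
// Chunk i starts at splitPoints[i - 1] (inclusive) and ends at splitPoints[i] (exclusive).
function chunkIndex(splitPoints, key) {
  let i = 0;
  while (i < splitPoints.length && compareKeys(key, splitPoints[i]) >= 0) i++;
  return i;
}

// Count how many writes land in each chunk.
function writesPerChunk(splitPoints, keys) {
  const counts = new Array(splitPoints.length + 1).fill(0);
  for (const k of keys) counts[chunkIndex(splitPoints, k)]++;
  return counts;
}

// Six new log entries with increasing timestamps from three hypothetical machines.
const machines = ["a", "b", "c"];
const entries = [];
for (let t = 1000; t < 1006; t++) {
  entries.push({ machine: machines[t % 3], date: t });
}

// Shard key { date }: split points come from older data, so new writes are all past the last one.
const byDate = writesPerChunk([[300], [600]], entries.map((e) => [e.date]));

// Shard key { machine, date }: split points fall between machines.
const byMachineDate = writesPerChunk(
  [["b", 0], ["c", 0]],
  entries.map((e) => [e.machine, e.date])
);

console.log(byDate, byMachineDate);
```

With the date key all six writes hit one chunk, while the (machine, date) key spreads them over three.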
Config Servers

• 3 of them
• changes are made with 2 phase commit
• if any are down, metadata goes read only
• system is online as long as 1 of the 3 is up
Shards


• Can be master, master/slave or replica sets
• Replica sets give sharding + full
  auto-failover
• Regular mongod processes
mongos

• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on appserver so no extra network
  traffic
Writes

• Inserts: require shard key, routed
• Removes: routed and/or scattered
• Updates: routed or scattered
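
The routed/scattered distinction can be sketched as a check on whether an operation pins the shard key. This is a simplified illustration, not the mongos implementation: `targetFor` and `checkInsert` are hypothetical helpers, the shard key is assumed to be `user_id`, and only exact-match filters are considered.

```javascript
// Hypothetical shard key for the collection.
const shardKey = ["user_id"];

// "routed" if the filter names every shard key field, otherwise "scattered".
function targetFor(filter, keyFields) {
  const hasFullKey = keyFields.every((f) =>
    Object.prototype.hasOwnProperty.call(filter, f)
  );
  return hasFullKey ? "routed" : "scattered";
}

// An insert must carry the shard key so that it can be routed.
function checkInsert(doc, keyFields) {
  if (targetFor(doc, keyFields) !== "routed") {
    throw new Error("insert is missing the shard key");
  }
  return "routed";
}

console.log(targetFor({ user_id: 7 }, shardKey));
console.log(targetFor({ location: "Boston" }, shardKey));
console.log(checkInsert({ user_id: 7, name: "example" }, shardKey));
```

A remove or update whose filter includes `user_id` goes to one shard; one that filters only on another field has to be sent to all of them.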
Queries

• by shard key: routed
• sorted by shard key: routed in order
• by non-shard key: scatter gather
• sorted by non-shard key: distributed merge
  sort
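
A distributed merge sort can be sketched as follows, assuming each shard returns its own results already sorted on the requested field. The `mergeSorted` helper and the sample documents are illustrative only; they show the merge step a router would need, not the actual mongos code.

```javascript
// Merge result lists that are each already sorted by `field` into one sorted list.
function mergeSorted(perShardResults, field) {
  const positions = perShardResults.map(() => 0);
  const merged = [];
  for (;;) {
    // Pick the shard whose next unread document has the smallest value.
    let best = -1;
    for (let s = 0; s < perShardResults.length; s++) {
      const pos = positions[s];
      if (pos >= perShardResults[s].length) continue; // this shard is exhausted
      if (
        best === -1 ||
        perShardResults[s][pos][field] <
          perShardResults[best][positions[best]][field]
      ) {
        best = s;
      }
    }
    if (best === -1) return merged; // every shard is exhausted
    merged.push(perShardResults[best][positions[best]]);
    positions[best]++;
  }
}

// Hypothetical per-shard results for a query sorted by `date`.
const shardA = [{ id: 1, date: 3 }, { id: 4, date: 9 }];
const shardB = [{ id: 2, date: 1 }, { id: 3, date: 7 }];

const mergedResults = mergeSorted([shardA, shardB], "date");
console.log(mergedResults.map((d) => d.date));
```

The router only ever compares the head of each shard's result list, so it does not need to re-sort the combined result.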
Operations

• split: breaking a chunk into 2
• migrate: move a chunk from 1 shard to
  another
• balancing: moving chunks automatically to
  keep system in balance
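
The three operations can be sketched on an in-memory chunk table. This is a toy model under stated assumptions: chunks are plain objects with numeric ranges, `migrate` only reassigns ownership (a real migration also copies the data), and the balancing rule (move chunks until the per-shard counts differ by at most one) is an assumption made for the example, not the balancer's documented policy.

```javascript
// Split one chunk [min, max) into two at a point strictly inside the range.
function split(chunk, at) {
  if (!(at > chunk.min && at < chunk.max)) {
    throw new Error("split point outside chunk");
  }
  return [
    { min: chunk.min, max: at, shard: chunk.shard },
    { min: at, max: chunk.max, shard: chunk.shard },
  ];
}

// Migrate: hand a chunk over to another shard.
function migrate(chunk, toShard) {
  return { ...chunk, shard: toShard };
}

// Balance: move chunks from the fullest shard to the emptiest one
// until the chunk counts differ by at most 1.
function balance(chunkTable, shards) {
  const result = chunkTable.slice();
  for (;;) {
    const counts = new Map(shards.map((s) => [s, 0]));
    for (const c of result) counts.set(c.shard, counts.get(c.shard) + 1);
    const sorted = [...counts.entries()].sort((a, b) => a[1] - b[1]);
    const [emptiest, low] = sorted[0];
    const [fullest, high] = sorted[sorted.length - 1];
    if (high - low <= 1) return result;
    const idx = result.findIndex((c) => c.shard === fullest);
    result[idx] = migrate(result[idx], emptiest);
  }
}

// Start with one chunk on shard0, split it into four, then balance over two shards.
const halves = split({ min: 0, max: 400, shard: "shard0" }, 200);
const quarters = [...split(halves[0], 100), ...split(halves[1], 300)];
const balanced = balance(quarters, ["shard0", "shard1"]);
console.log(balanced);
```

Splitting never moves data between shards; only migration does, which is why the two steps are separate operations.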
Setting it Up
• Start servers

• add shards:
    db.runCommand( { addshard : "10.1.1.5" } )

• turn on partitioning:
    db.runCommand( { enablesharding : "test" } )

• shard a collection:
    db.runCommand( { shardcollection : "test.data" ,
    key : { num : 1 } } )
Download MongoDB
http://www.mongodb.org

and let us know what you think
@eliothorowitz
@mongodb

10gen is hiring!
http://www.10gen.com/jobs


Editor's notes

  1. What is scaling? Well - hopefully for everyone here.
  2. Scaling isn't new; sharding isn't either; manual re-balancing is painful at best.
  3. Replica sets for inconsistent read scaling
  4. Don't shard by date