MongoDB Sharding - MongoBoston 2010
1. Scaling and Sharding
Eliot Horowitz
@eliothorowitz
MongoBoston
September 20, 2010
2. Scaling
• Data size only goes up
• Operations/sec only go up
• Vertical scaling is limited
• Hard to scale vertically in the cloud
• Can scale wider than higher
4. Newer Scaling
• relational database clustering
• consistent hashing (Dynamo)
• range based partitioning (BigTable/PNUTS)
5. MongoDB Sharding
• Scale horizontally for data size, index size,
write and consistent read scaling
• Distribute databases, collections, or the
objects in a collection
• Auto-balancing, migrations, management
happen with no downtime
6.
• Choose how you partition data
• Can convert from single master to sharded
system with no downtime
• Same features as non-sharding single
master
• Fully consistent
7. Range Based
• collection is broken into chunks by range
• chunks default to 200 MB or 100,000
objects
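To make the range-based idea concrete, here is a minimal sketch of how a key could be mapped to its chunk. The chunk boundaries, the shard names, and the findChunk helper are all hypothetical and only illustrate the lookup; this is not MongoDB's internal code.

```javascript
// Hypothetical chunk table for a collection sharded on `num`.
// Each chunk covers the half-open range [min, max); -Infinity and
// Infinity stand in for the open ends of the key space.
const exampleChunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardA" },
];

// Binary search for the chunk whose range contains `key`.
// Assumes the chunks are sorted by range and cover every possible key.
function findChunk(chunks, key) {
  let lo = 0;
  let hi = chunks.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const c = chunks[mid];
    if (key < c.min) hi = mid - 1;
    else if (key >= c.max) lo = mid + 1;
    else return c;
  }
  return null; // only reached if the chunk table has a gap
}

console.log(findChunk(exampleChunks, 150).shard);
```

Because the ranges are contiguous and ordered, a shard can own several non-adjacent chunks (shardA above) without changing the lookup.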
9. User profiles
• Partition by user_id
• Secondary indexes on location, dates, etc.
• Reads/writes know which shard to hit
10. User Activity Stream
• Shard by user_id
• Loading a user’s stream hits a single shard
• Writes are distributed across all shards
• Can index on activity for deleting
11. Photos
• Can shard by photo_id for best read/write
distribution
• Secondary index on tags, date
13. Config Servers
• 3 of them
• changes are made with two-phase commit
• if any are down, metadata goes read-only
• system is online as long as 1 of the 3 is up
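The availability rule in the bullets above can be written as a small function. This is only a sketch of the rule as stated on the slide; the metadataState name and its return strings are made up for illustration.

```javascript
// Derive the state of the sharding metadata from the number of
// config servers that are currently reachable (hypothetical helper).
function metadataState(upCount, total = 3) {
  if (upCount <= 0) return "offline";      // no config server reachable
  if (upCount < total) return "read-only"; // metadata cannot be changed
  return "read-write";                     // all config servers are up
}

console.log(metadataState(2));
```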
14. Shards
• Can be master, master/slave or replica sets
• Replica sets give sharding + full auto-failover
• Regular mongod processes
15. mongos
• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on appserver so no extra network
traffic
17. Queries
• By shard key: routed
• sorted by shard key: routed in order
• by non shard key: scatter gather
• sorted by non shard key: distributed merge
sort
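The four query cases can be sketched as two steps: pick the shards to visit, then combine their results. The routing table, the shard names, and both helpers below are hypothetical, and for simplicity the sketch assumes each shard owns one contiguous key range; it illustrates the idea rather than what mongos actually runs.

```javascript
// Hypothetical routing table: each chunk covers user_id in [min, max).
const routingTable = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 2000, shard: "shard1" },
  { min: 2000, max: Infinity, shard: "shard2" },
];

// Return the shards a query must visit. `keyRange` is { min, max }
// (both inclusive) on the shard key, or null when the query does not
// use the shard key at all.
function targetShards(table, keyRange) {
  const hits = keyRange === null
    ? table // scatter gather: every chunk is a candidate
    : table.filter((c) => keyRange.min < c.max && keyRange.max >= c.min);
  // The table is in key order, so the shards come back in key order,
  // which is what a sort on the shard key needs.
  return [...new Set(hits.map((c) => c.shard))];
}

// Distributed merge sort, router side: each shard returns its own
// results already sorted by `field`; repeatedly take the smallest head.
function mergeSorted(lists, field) {
  const cursors = lists.map(() => 0);
  const out = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < lists.length; i++) {
      if (cursors[i] >= lists[i].length) continue; // this shard is drained
      if (best === -1 ||
          lists[i][cursors[i]][field] < lists[best][cursors[best]][field]) {
        best = i;
      }
    }
    if (best === -1) return out; // every shard is drained
    out.push(lists[best][cursors[best]++]);
  }
}

console.log(targetShards(routingTable, { min: 1500, max: 1500 }));
console.log(targetShards(routingTable, null));
```

A lookup by shard key touches one shard, while a query on any other field has to ask every shard and merge what comes back.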
18. Operations
• split: breaking a chunk into 2
• migrate: move a chunk from 1 shard to
another
• balancing: moving chunks automatically to
keep system in balance
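The three operations can be illustrated with a toy model. Everything here is an assumption made for the sketch: the chunk objects, the helper names, and especially the balancing policy (move one chunk when the fullest shard holds at least `threshold` more chunks than the emptiest), which is not taken from the talk.

```javascript
// split: break one chunk { min, max, shard, count } into two at `splitKey`.
// `leftCount` is the number of objects below splitKey, supplied by the caller.
function splitChunk(chunk, splitKey, leftCount) {
  return [
    { min: chunk.min, max: splitKey, shard: chunk.shard, count: leftCount },
    { min: splitKey, max: chunk.max, shard: chunk.shard,
      count: chunk.count - leftCount },
  ];
}

// Count how many chunks each shard currently owns.
function chunksPerShard(chunkList, shardNames) {
  const counts = Object.fromEntries(shardNames.map((s) => [s, 0]));
  for (const c of chunkList) counts[c.shard] += 1;
  return counts;
}

// balancing: one round of a hypothetical policy. Returns true if a
// chunk was migrated, false if the shards are already close enough.
function balanceOnce(chunkList, shardNames, threshold = 2) {
  const counts = chunksPerShard(chunkList, shardNames);
  const sorted = [...shardNames].sort((a, b) => counts[a] - counts[b]);
  const from = sorted[sorted.length - 1]; // shard with the most chunks
  const to = sorted[0];                   // shard with the fewest chunks
  if (counts[from] - counts[to] < threshold) return false;
  const victim = chunkList.find((c) => c.shard === from);
  victim.shard = to; // migrate: the range stays the same, only the owner changes
  return true;
}

const demoChunks = [
  { min: -Infinity, max: 0, shard: "s0", count: 10 },
  ...splitChunk({ min: 0, max: 100, shard: "s0", count: 50 }, 40, 20),
  { min: 100, max: Infinity, shard: "s0", count: 5 },
];
while (balanceOnce(demoChunks, ["s0", "s1"])) { /* repeat until balanced */ }
console.log(chunksPerShard(demoChunks, ["s0", "s1"]));
```

Note that a split never moves data by itself; it only creates smaller units that the balancer can later migrate.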
19. Setting it Up
• Start servers
• add shards: db.runCommand( { addshard : "10.1.1.5" } )
• turn on partitioning:
db.runCommand( { enablesharding : "test" } )
• shard a collection:
db.runCommand( { shardcollection : "test.data" , key : { num : 1 } } )
20. Download MongoDB
http://www.mongodb.org
and let us know what you think
@eliothorowitz @mongodb
10gen is hiring!
http://www.10gen.com/jobs
Editor's notes
What is scaling?
Well - hopefully for everyone here.
scaling isn’t new
sharding isn’t
manual re-balancing is painful at best
Replica Sets for inconsistent read scaling