25. Explain
• explain() returns information about how a query was executed:
– the indexes used for the query (if any)
– timing statistics
– the number of documents scanned
db.census.find({city:"CHICAGO"}).explain()
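To make the three items above concrete, here is a toy sketch in plain JavaScript. It is not MongoDB's implementation, and the field names in the returned object (indexUsed, millis, docsScanned, matched) are made up for illustration rather than taken from real explain() output. It only shows why an index changes the number of documents scanned.

```javascript
// Toy model only: this is not MongoDB's implementation or its real explain() output.
const census = [
  { city: "CHICAGO", pop: 100 },
  { city: "BOSTON", pop: 50 },
  { city: "CHICAGO", pop: 75 },
  { city: "DENVER", pop: 60 },
];

// Hypothetical index on one field: maps each value to positions in the collection.
function buildIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, pos) => {
    if (!index.has(doc[field])) index.set(doc[field], []);
    index.get(doc[field]).push(pos);
  });
  return index;
}

// Runs an equality query and reports the kinds of information the slide lists.
function explainFind(docs, field, value, index) {
  const start = Date.now();
  let scanned = 0;
  const matches = [];
  if (index) {
    // Index lookup: only the documents the index points at are touched.
    for (const pos of index.get(value) || []) {
      scanned += 1;
      matches.push(docs[pos]);
    }
  } else {
    // No index: every document in the collection is examined.
    for (const doc of docs) {
      scanned += 1;
      if (doc[field] === value) matches.push(doc);
    }
  }
  return {
    indexUsed: index ? field : null,
    millis: Date.now() - start,
    docsScanned: scanned,
    matched: matches.length,
  };
}

const withoutIndex = explainFind(census, "city", "CHICAGO", null);
const withIndex = explainFind(census, "city", "CHICAGO", buildIndex(census, "city"));
console.log(withoutIndex.docsScanned, withIndex.docsScanned); // 4 2
```

Both calls find the same two documents; only the amount of scanning differs, which is the thing to look for when reading explain output.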
37. Sharding
• The process of splitting data up and storing
different portions of the data on different
machines
• Automatic vs. manual
• Chunks
– Shard Key
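As a rough illustration of how a shard key relates to chunks, the sketch below assumes range-based chunks over a string shard key. The chunk boundaries and shard names are invented for the example, and the empty string and "\uffff" are simple stand-ins for open-ended lower and upper bounds.

```javascript
// Toy model only; chunk boundaries and shard names are made up for illustration.
// Each chunk covers shard-key values in [min, max) and lives on one shard.
const chunks = [
  { min: "", max: "G", shard: "shardA" },
  { min: "G", max: "P", shard: "shardB" },
  { min: "P", max: "\uffff", shard: "shardC" },
];

// Find the chunk whose range contains the given shard-key value.
function chunkFor(shardKeyValue) {
  return chunks.find((c) => shardKeyValue >= c.min && shardKeyValue < c.max);
}

console.log(chunkFor("CHICAGO").shard); // shardA
console.log(chunkFor("NEW YORK").shard); // shardB
console.log(chunkFor("SEATTLE").shard); // shardC
```

The point of the sketch is that the shard key value alone decides where a document lives, which is why the choice of shard key matters.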
39. Sharding
• Server types:
– Shard
• Holds a subset of a collection’s data
• Can be a single mongod server or a replica set
– mongos
• Router process: routes requests to the shards and aggregates their responses
• Does not store any data
– Config server
• Stores cluster configuration: which data is on which shard
• Start these in reverse order: config server, then mongos, then shards
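The division of labor between the three server types can be sketched with plain JavaScript objects. This is a toy model under stated assumptions, not how mongos is implemented: the names configServer, shards and route are hypothetical, the shard key is assumed to be city, and the data is invented.

```javascript
// Toy model only: plain objects stand in for the three server types.
// "Config server": knows which shard-key ranges live on which shard.
const configServer = {
  chunks: [
    { min: "", max: "M", shard: "shard1" },
    { min: "M", max: "\uffff", shard: "shard2" },
  ],
};

// "Shards": each holds a subset of the collection's data.
const shards = {
  shard1: [{ city: "BOSTON", pop: 50 }, { city: "CHICAGO", pop: 100 }],
  shard2: [{ city: "NEW YORK", pop: 200 }, { city: "SEATTLE", pop: 80 }],
};

// "mongos": stores no data; it looks up targets, forwards the query, and merges results.
function route(query) {
  let targets;
  if (query.city !== undefined) {
    // Query includes the shard key: only the owning shard is contacted.
    const chunk = configServer.chunks.find(
      (c) => query.city >= c.min && query.city < c.max
    );
    targets = [chunk.shard];
  } else {
    // No shard key in the query: ask every shard and merge the answers.
    targets = Object.keys(shards);
  }
  const matches = (doc) =>
    Object.entries(query).every(([key, value]) => doc[key] === value);
  const results = targets.flatMap((name) => shards[name].filter(matches));
  return { targets, results };
}

const targeted = route({ city: "CHICAGO" });
const broadcast = route({ pop: 80 });
console.log(targeted.targets.join(","), targeted.results.length); // shard1 1
console.log(broadcast.targets.join(","), broadcast.results.length); // shard1,shard2 1
```

A query on the shard key touches one shard, while a query on any other field has to be sent to all of them and merged by the router.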
Key-value pairs
Data types
Talk about the id
How would you map these key-value pairs to a row in a table?
Documents can contain other documents
I can have arrays of arrays
How do I put this into a single row?
How would you do this in 3rd normal form?
Not just denormalized: multi-dimensional denormalization
Isn’t that wrong?
This is all stored in the same place on disk
Would need joins to do this in an RDBMS
Our performance improvements
Analogous except where multi-dimensional
Strings, long numbers, dates, arrays and embedded objects
Can be any unique value for id, if desired
Go over relational side first
Start it up and shut it down
Port 27017, 28017
Ctrl-C
Project
1 – include
0 – exclude
AND is default for query criteria
$unset removes the key altogether
No parameter removes all documents
Empty collection
drop_collection removes the collection itself
Slow!
Zealots
A long-lived, proven technology
Gives you incredible flexibility up front (but you may not need it)
YAGNI
Question the ORM assumption
ORM mappers like Hibernate or Ruby on Rails ActiveRecord
Fundamental mismatch between OO and relational (handling sets of data, dealing with inheritance)
It’s not just about being able to do something quick and dirty and paying for it later. You may never need the RDBMS flexibility. It is not just a poor man’s database
No need to join – everything is already all together. Joins are slow; how much slower when distributed?
Violates 3rd normal form. Yes. Create your schema based upon your application’s needs, not general flexibility
- Cannot debit one record and credit another in a single transaction
Only within a single document (which can have sub-documents)
- Distributed joins and two-phase commits – slow!