Presentation given to a room full of web designers and developers at DevCon5 in NYC. The presentation introduces concepts critical to understanding how databases work and explains why that understanding matters even for those who typically work much higher in the stack.
Topics covered:
* History of web development
* Picking the right data model
* Modern database interaction
* How modern databases scale
The presentation emphasizes MongoDB, but it also covers other technologies such as relational databases (MySQL, Oracle, MSSQL) and key-value stores (Cassandra, Riak, DynamoDB).
6. If you don't know where you've come from, you don't know where you are. - James Burke
7. Late 90's
๏ Web is born
๏ Web development mostly done in Perl or C
๏Everyone is a webmaster
๏Relational databases
8. Early 2000's
๏ Web growth redefines scale
๏ JavaScript avoided
๏ Dynamic languages come of age
๏ LAMP
๏ Everyone is a PHP programmer
๏ Relational databases
9. Mid 2000's
๏ Social re-redefines scale
๏ Multimedia rules
๏ Heavy caching (memcache) required: LAM(m)P
๏ Frameworks (Ruby on Rails) with heavy database abstractions en vogue
๏ Everyone is an OO programmer
๏ Relational databases*
11. Conditions
๏ Web users exponentially increasing
๏ Excessive layering causes applications to be slower
๏ Social (dynamic data) limits use of caching crutch
๏ Cost per byte decreasing rapidly
๏ Data growing in size & complexity
19. Key Value
๏ One-dimensional storage
๏ Single value is a blob
๏ Query on key only
๏ Some support secondary indexes
๏ No schema
๏ Value cannot be updated, only replaced
Key Blob
Cassandra, Redis, Memcached, Riak, DynamoDB
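The key-value properties above can be sketched with a plain JavaScript `Map` standing in for a store. The `kv` name, the `place:1` key, and the sample data are illustrative assumptions, not any product's API. Because the value is an opaque blob, lookups go through the key, and changing one field means replacing the whole value.

```javascript
// In-memory stand-in for a key-value store; this is not a real client API.
const kv = new Map();

// The store treats the value as an opaque blob (here, a JSON string).
kv.set("place:1", JSON.stringify({ name: "10gen HQ", city: "New York" }));

// Query on key only.
const blob = kv.get("place:1");

// To change one field, the application decodes the blob, edits it,
// and writes the whole value back under the same key.
const decoded = JSON.parse(blob);
decoded.zip = "10036";
kv.set("place:1", JSON.stringify(decoded));

// Without a secondary index, "all places in New York" means scanning
// and decoding every value on the application side.
const inNewYork = [...kv.values()]
  .map((v) => JSON.parse(v))
  .filter((p) => p.city === "New York");
```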
20. Relational
๏ Query on any field
๏ In-place updates
๏ Two-dimensional storage
๏ Each field contains a single value
๏ Very structured schema (table)
๏ Normalization process requires many tables, joins, and indexes, and results in poor data locality
Primary Key
Oracle, MSSQL, MySQL, PostgreSQL, DB2
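As a rough illustration of the relational points above, the sketch below models two normalized tables as arrays of flat rows. The table and column names (`places`, `placeTags`, `place_id`) are hypothetical. Since each field holds a single value, a multi-valued attribute such as tags moves to its own table and is read back with a join.

```javascript
// Two normalized "tables" as arrays of flat rows; each field holds one value.
const places = [
  { id: 1, name: "10gen HQ", city: "New York" },
];
const placeTags = [
  { place_id: 1, tag: "business" },
  { place_id: 1, tag: "awesome" },
];

// Tags live in a separate table, so reading a place together with its tags
// takes a join from the tag rows back to the place's primary key.
function tagsFor(placeId) {
  return placeTags
    .filter((row) => row.place_id === placeId)
    .map((row) => row.tag);
}

const joined = places.map((p) => ({ ...p, tags: tagsFor(p.id) }));
```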
21. Document
๏ N-dimensional storage
๏ Each field can contain 0, 1, many, or embedded values
๏ Query on any field & level
๏ Flexible schema
๏ Inline updates
๏ Embedding related data has optimal data locality, requires fewer indexes, and has better performance
_id
MongoDB, CouchDB, RethinkDB
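The document model above can be sketched with a nested JavaScript object. The `collection` array and the embedded `address` layout are assumptions made for illustration, not a database API. Related data is embedded in one document, so a query can reach into nested fields and arrays without a join.

```javascript
// One document holds the place and its related data together.
const place = {
  _id: 1,
  name: "10gen HQ",
  tags: ["business", "awesome"],                // many values in one field
  address: { city: "New York", zip: "10036" },  // embedded document
};

// Plain array standing in for a collection of documents.
const collection = [place];

// Query on a nested field and on an array field, with no join.
const matches = collection.filter(
  (doc) => doc.address.city === "New York" && doc.tags.includes("business")
);

// Inline update: change one part of the document and leave the rest untouched.
matches[0].tags.push("nyc");
```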
27. MongoDB speaks your language
๏ Drivers in 14+ languages
๏ Interface is natural and idiomatic for each language
๏ Document natively maps to map/hash/object/array/dict/struct
28. Start with an object (or array, hash, dict, etc)
place1 = {
    name : "10gen HQ",
    address : "229 W 43rd St. 5th Floor",
    city : "New York",
    zip : "10036",
    tags : [ "business", "awesome" ]
}
43. Scalability Needs
๏ Data is highly available
๏ Data is consistent
๏ Performant (caching unnecessary)
44. Different Approaches
๏ MultiMaster
๏ Peer to peer
๏ Has conflicts
๏ Ring based approach combines high availability and distribution
๏ Complex application logic

๏ Single Master
๏ Consistent
๏ Slaves have delayed writes
๏ High availability
๏ No scalable solution

๏ Single Master
๏ Consistent
๏ Secondaries have delayed writes
๏ High availability
๏ Range based distribution
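The ring based approach mentioned above can be sketched as follows. This is a toy model under stated assumptions: the ring size, the node names and positions, and the hash function are all hypothetical and chosen for readability, and real systems use stronger hashes and usually many positions per node.

```javascript
// Ring-based placement: nodes sit at positions on a ring, and a key belongs
// to the first node at or after the key's position, wrapping around at the end.
const RING_SIZE = 360; // hypothetical ring size

// Toy hash for illustration only; always returns a value in [0, RING_SIZE).
function position(key) {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.codePointAt(0)) % RING_SIZE;
  }
  return h;
}

// Hypothetical nodes with fixed ring positions, sorted ascending.
const ring = [
  { node: "A", at: 90 },
  { node: "B", at: 210 },
  { node: "C", at: 330 },
];

// Find the owner of a ring position.
function ownerAt(p) {
  const hit = ring.find((n) => n.at >= p);
  return (hit || ring[0]).node; // past the last node, wrap to the first
}

// Find the owner of a key by hashing it onto the ring.
function ownerOf(key) {
  return ownerAt(position(key));
}
```

Because every peer can compute `ownerOf` locally, no single master is needed to route a key, which is how this style of design combines distribution with availability.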
45. MongoDB: built to scale
๏ Intelligent replication
๏ Automatic partitioning of data (user configurable)
๏ Horizontal Scale
๏ Targeted Queries
๏ Parallel Processing
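A minimal sketch of how range based partitioning enables targeted queries, assuming a toy in-memory model. The shard names, the zip ranges, and the helper functions are all hypothetical, the second sample document is made up, and none of this is MongoDB's actual implementation or API.

```javascript
// Hypothetical shards, each owning a contiguous range of zip codes [min, max).
// A null max means the range is unbounded above.
const shards = [
  { name: "shard0", min: "00000", max: "40000", docs: [] },
  { name: "shard1", min: "40000", max: "80000", docs: [] },
  { name: "shard2", min: "80000", max: null, docs: [] },
];

// Route by the partition key (zip). String comparison is enough here
// because the zips are fixed-width digit strings.
function shardFor(zip) {
  return shards.find((s) => zip >= s.min && (s.max === null || zip < s.max));
}

function insert(doc) {
  shardFor(doc.zip).docs.push(doc);
}

// Targeted query: the filter includes the partition key, so one shard is asked.
function findByZip(zip) {
  return shardFor(zip).docs.filter((d) => d.zip === zip);
}

// Scatter-gather query: no partition key in the filter, so every shard
// is asked and the partial results are merged.
function findByName(name) {
  return shards.flatMap((s) => s.docs.filter((d) => d.name === name));
}

insert({ name: "10gen HQ", zip: "10036" });
insert({ name: "Example West", zip: "94105" }); // made-up sample document
```

In a real deployment the per-shard work in the scatter-gather case could run in parallel; the sketch runs it sequentially for simplicity.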