A SoCal CodeCamp talk about two NoSQL databases, the crowd favorite MongoDB and the up-and-comer RavenDB. Which one is better at what?
This talk offers a compare-and-contrast view of both database systems and speaks to the strengths and weaknesses of each. It's a 300-level talk and is not meant for people who are totally unfamiliar with NoSQL and document database systems.
3. What do RavenDB and MongoDB do that is similar?
WHAT'S IN COMMON?
4. Fundamentals
No Schema
No Impedance Mismatch
Expose Data as JSON
{ "Address" : "123 Anywhere St.", "City" : "Springfield", "PostalCode" : 99999 }
5. Queries
Documents Stored as Collections
Indexing for Deep Properties
MapReduce Support
Map(k1, v1) → list(k2, v2)
Reduce(k2, list(v2)) → list(v3)
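The Map and Reduce signatures above can be sketched in plain Python. This is a minimal illustration of the general technique, not the API of either database; the sample documents and the names map_doc, reduce_values, and map_reduce are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample documents, shaped like the JSON on the earlier slide.
docs = [
    {"City": "Springfield", "PostalCode": 99999},
    {"City": "Springfield", "PostalCode": 99998},
    {"City": "Shelbyville", "PostalCode": 88888},
]

def map_doc(key, doc):
    """Map(k1, v1) -> list(k2, v2): emit one (city, 1) pair per document."""
    return [(doc["City"], 1)]

def reduce_values(key, values):
    """Reduce(k2, list(v2)) -> list(v3): collapse the counts for one city."""
    return [sum(values)]

def map_reduce(documents):
    # Group every emitted value by its intermediate key, then reduce each group.
    grouped = defaultdict(list)
    for doc_id, doc in enumerate(documents):
        for k2, v2 in map_doc(doc_id, doc):
            grouped[k2].append(v2)
    return {k2: reduce_values(k2, v2s) for k2, v2s in grouped.items()}

counts = map_reduce(docs)
print(counts)
```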
8. Fundamentals
RavenDB | MongoDB
Built with C# | Built with C++
Data saved as JSON | Data saved as BSON
Uses Lucene.NET for indexing | Uses B-Trees for indexing
Uses Esent for storage | Uses memory-mapped files for storage
9. Writing to a Database
RavenDB | MongoDB
Batch transaction support | Single row transactions*
Optimistic concurrency | No concurrency management
ACID | Granular write & safety control
10. Reading from a Database
RavenDB | MongoDB
Cross-collection query support | No cross-collection queries
Server-side DbRef Resolution* | Client-side DbRef Resolution
No ad-hoc queries on subsets | Supports ad-hoc queries on subsets
Supports full-text queries | No support for full-text queries
11. Indexes
RavenDB | MongoDB
Supports static indexes | Supports static indexes
Supports ad-hoc indexes | No ad-hoc index support
Multi-map indexes | No multi-map indexes
Indexing performed in background | Indexing is configurable to background or foreground
12. MapReduce
RavenDB | MongoDB
M/R done as indexes | M/R done as queries
No M/R pipeline | Supports incremental M/R
M/R is calculated in background | M/R is calculated in real time
13. Replication and Scaling
RavenDB | MongoDB
Master-Slave Replication | Master-Slave Replication
Master-Master Replication | No Master-Master Replication
Manual & Auto Sharding | Manual & Auto Sharding
Mix and Match Replication / Shards | Replica sets
14. Ecosystem
RavenDB | MongoDB
Limited driver ecosystem | Rich driver ecosystem
Little documentation and examples | Lots of documentation and examples
Depends on Windows | Can run anywhere
OSS and Commercial Licenses ($$) | Free / OSS License
15. Extras
RavenDB | MongoDB
Supports Triggers | No Trigger Support
Multi-tenant | Multi-tenant
Supports OAuth / Basic Auth / Anonymous* | Supports Basic Auth
REST API | BSON over TCP
Both MongoDB and RavenDB support the modern toolsets we expect from production database technologies, like support for sharding, replication, and for producing stand-alone static backup images at any time.
Anders Hejlsberg and Bjarne Stroustrup (inventors of C# and C++ respectively).
Jayson Werth (baseball player for the Phillies) and a Wyoming bison scratching itself against a rock.
Lucene vs. B-tree for indexing.
Esent vs. memory-mapped files for the storage engine.
RavenDB supports transaction scope for batch insert operations, whereas in MongoDB transactions (i.e. operations which are atomic and can be rolled back) apply only at the single-item level. However, it is worth noting that Mongo does support batch writes, even though they cannot be wrapped inside a transaction scope.
In terms of concurrency management, MongoDB has none, whereas RavenDB implements a form of optimistic concurrency: before a record is written, the system checks whether that document has been modified by another transaction in the meantime, and if that's the case the transaction simply rolls back. There are no locks when this happens, so the underlying database doesn't do any lock escalation or row / table locking.
In terms of write safety, RavenDB is fully ACID compliant: every write is committed to disk before the operation returns true to the client. In MongoDB, write safety is more granular. By default writes are not safe; they're written to an in-memory buffer first, and MongoDB tells the client that everything is fine even though the data is not yet committed to disk. Mongo eventually writes the data to disk asynchronously in the background. You can configure this in Mongo, though, so that writes commit to disk before returning "true" to the client.
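The optimistic concurrency check described above can be sketched as a version comparison at write time. This is an in-memory illustration under stated assumptions, not RavenDB's client API; DocumentStore, ConcurrencyError, and the integer version counter are hypothetical stand-ins.

```python
class ConcurrencyError(Exception):
    """Raised when a document changed since the caller last read it."""

class DocumentStore:
    """Hypothetical in-memory store used only to illustrate the check."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, document)

    def load(self, doc_id):
        # Version 0 means "this document does not exist yet".
        return self._docs.get(doc_id, (0, None))

    def save(self, doc_id, document, expected_version):
        current_version, _ = self._docs.get(doc_id, (0, None))
        # No locks are taken: the write is rejected if someone else got there first.
        if current_version != expected_version:
            raise ConcurrencyError(
                f"{doc_id}: expected v{expected_version}, found v{current_version}"
            )
        self._docs[doc_id] = (current_version + 1, document)
        return current_version + 1

store = DocumentStore()
store.save("users/1", {"City": "Springfield"}, expected_version=0)

# Two clients read the same version of the document.
version_a, doc_a = store.load("users/1")
version_b, doc_b = store.load("users/1")

# Client A writes first and succeeds.
store.save("users/1", {**doc_a, "City": "Shelbyville"}, expected_version=version_a)

# Client B's write is based on a stale version and is rejected.
try:
    store.save("users/1", {**doc_b, "City": "Capital City"}, expected_version=version_b)
    conflict = False
except ConcurrencyError:
    conflict = True
```

The rejected client would normally reload the document and retry its change against the new version.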
Buffet, representing cross-collection query support, and a fat guy eating a gargantuan cheeseburger, representing single-collection-only queries.
In RavenDB cross-collection queries are done in a couple of different ways: MultiMap indexes are one, and transform indexes are another, where collections can be cross-referenced on the server.
MakerBot vs. Ikea.
In Raven, you can have the server render whatever type of data object or view you want, as long as it's derived from one or more of the indexes or collections inside RavenDB. This means you can use transforms to take cross-collection references and resolve them server side. In Mongo, you're stuck with the Ikea model of "some assembly required": any relationships between documents in different collections have to be resolved by the calling client application. The 10gen folks have called this an "application join."
Some random book and album covers.
RavenDB does not support ad-hoc queries on subsets of fields, meaning that when you ask "give me back just these parts of the documents that match these criteria in collection A," you still get the entire document back on the client and not just the fields you requested. Mongo can run these types of queries and return only the subset of fields you requested.
Lastly, full-text indexing. Because RavenDB is built on top of Lucene.NET (a powerful indexing library), it can run full-text search right out of the box. MongoDB does not currently support full-text indexing or queries.
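To make the "application join" and the field-subset query concrete, here is a small sketch that works on in-memory lists. The collections, field names, and the helpers application_join and project are invented for illustration; a real client would fetch the two collections with separate queries through its driver.

```python
# Hypothetical collections, as a client might hold them after two separate queries.
posts = [
    {"_id": 1, "title": "Hello", "author_id": 10},
    {"_id": 2, "title": "World", "author_id": 11},
]
authors = [
    {"_id": 10, "name": "Ann", "email": "ann@example.com"},
    {"_id": 11, "name": "Bob", "email": "bob@example.com"},
]

def application_join(posts, authors):
    """Resolve each post's author reference on the client side."""
    authors_by_id = {author["_id"]: author for author in authors}
    return [{**post, "author": authors_by_id.get(post["author_id"])} for post in posts]

def project(doc, fields):
    """Return only the requested subset of fields from a document."""
    return {field: doc[field] for field in fields if field in doc}

joined = application_join(posts, authors)
titles_only = [project(post, ["title"]) for post in posts]

print(joined[0]["author"]["name"])
print(titles_only)
```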
Both MongoDB and RavenDB support static indexes, which include indexes on deep properties and covered indexes.
Interestingly, RavenDB supports the notion of a dynamic index created on the fly during production operations. These indexes are temporary and are not as powerful or flexible as static indexes, but they mean that if your client application misses creating an index for an important set of operations, Raven will still find a way to provide it with the speed and timeliness of queries covered by static indexes.
Raven also supports the notion of a multi-map index, where documents from multiple different collections are reduced down into a common subset. This is really useful for building search indexes that need to cover multiple types of documents (like blog posts, comments, and tags). MongoDB does not support multi-map indexes.
Lastly, in terms of how indexing is performed: this is always a background operation in RavenDB, so technically you can end up with "stale indexes" in the course of consuming indexed data from a client. There are ways to wait for a non-stale index, but these are typically not used in production applications. In MongoDB indexing behavior is configurable: it can either be a blocking foreground operation (done at the same time as a document insert) or it can run in the background like RavenDB.
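The multi-map idea can be sketched as several map functions, one per collection, that all emit entries of the same shape into one search index. This is a toy in-memory version, not RavenDB's index definition syntax; the collections and the names map_post, map_comment, and build_index are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical documents from two different collections.
posts = [{"id": "posts/1", "title": "Scaling document databases"}]
comments = [{"id": "comments/1", "text": "Great notes on scaling"}]

def map_post(post):
    """Map a blog post into the common index entry shape."""
    return {"doc_id": post["id"], "text": post["title"]}

def map_comment(comment):
    """Map a comment into the same common shape."""
    return {"doc_id": comment["id"], "text": comment["text"]}

def build_index(sources):
    """Build one term -> document ids index from (documents, map_fn) pairs."""
    index = defaultdict(set)
    for documents, map_fn in sources:
        for doc in documents:
            entry = map_fn(doc)
            for term in entry["text"].lower().split():
                index[term].add(entry["doc_id"])
    return index

index = build_index([(posts, map_post), (comments, map_comment)])
hits = sorted(index["scaling"])
print(hits)
```

A single search for "scaling" returns both the post and the comment, because both collections were mapped into the same index.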
In RavenDB, MapReduce is defined as an index and is precalculated in the background, whereas in MongoDB MapReduce is run as a query against the existing dataset in real time. One of the advantages of Mongo's model is that it can support a MapReduce pipeline, where multiple MapReduce operations may need to be performed before the final result we're interested in can be delivered. As far as I know, this isn't typically feasible in RavenDB without a lot of LINQ magic or making inline calls to load data from other MapReduce indexes.
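The difference between the two models can be sketched as an aggregate that is maintained as documents arrive versus one that is recomputed on every request. This is a simplification under stated assumptions: the class CountByCityIndex and the function count_by_city_query are hypothetical, and the index here is updated synchronously, whereas the notes describe RavenDB doing this work in the background.

```python
from collections import Counter

class CountByCityIndex:
    """Hypothetical precalculated aggregate, updated whenever a document is inserted."""

    def __init__(self):
        self.totals = Counter()

    def on_insert(self, doc):
        # Only the new document is processed; earlier results are reused.
        self.totals[doc["City"]] += 1

def count_by_city_query(documents):
    """Query-time model: scan the whole dataset every time the result is requested."""
    return Counter(doc["City"] for doc in documents)

docs = []
index = CountByCityIndex()
for doc in [{"City": "Springfield"}, {"City": "Shelbyville"}, {"City": "Springfield"}]:
    docs.append(doc)
    index.on_insert(doc)

# Both models agree on the answer; they differ in when the work is done.
print(index.totals == count_by_city_query(docs))
```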
Boring chart showing master-slave replication.
Master-slave replication is handy in environments where you need some read-only replicas, and both MongoDB and RavenDB support it.
Phil Mickelson winning the Masters (again) vs. sad Tiger Woods. It should be noted that Tiger actually has 4 Masters wins to Phil Mickelson's 3, but hasn't won the Masters since 2005.
Master-master replication is not supported in most NoSQL databases, but it is supported in RavenDB. The scenario where it's handy is when you have all of your databases running behind a load balancer and don't want to expose the network topology to the client. In master-master you essentially have mirrored databases that are both writeable at any given time.
The most popular way to scale out in the NoSQL universe is sharding, where the contents of the database are partitioned into physically separate databases spread across different servers, the idea being to balance the load for large collections across multiple machines. Both Raven and Mongo support user-defined sharding strategies as well as automatic sharding, where the database management system decides on the most balanced distribution of shards across the cluster.
Lastly, in terms of how you replicate shards, RavenDB allows you to mix and match replication with sharding, but this is done somewhat manually. Mongo has a more elegant solution known as the replica set, where shards are replicated across the set of servers that are running, so if one server goes down at any given time the replica set can re-balance and not lose any data.
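As a rough picture of what a sharding strategy does, the sketch below assigns each document to a server by hashing its id. Hash-modulo placement is just one simple strategy chosen for illustration; it is not claimed to be what RavenDB or MongoDB use, and the shard names and the function shard_for are hypothetical.

```python
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical server names

def shard_for(doc_id, shards=SHARDS):
    """Pick a shard deterministically from a hash of the document id."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]

doc_ids = ["users/1", "users/2", "users/3", "users/4"]
placement = {doc_id: shard_for(doc_id) for doc_id in doc_ids}

for doc_id, shard in placement.items():
    print(doc_id, "->", shard)
```

The same id always maps to the same shard, which is what lets a client or router find a document later without asking every server.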
RavenDB loses to Mongo big time on ecosystem. It has a very limited set of drivers compared to Mongo, it has less documentation and fewer examples, and it only runs on Windows. Couple that with the fact that it costs $500 per server to run in a commercial application, and it has a much less appealing ecosystem than Mongo, which is much more widely adopted and has a more favorable cost model for small companies.
That being said, the RavenDB team often makes exceptions for early-stage startups, students, and non-profits in terms of licensing costs, and it's free to start developing with RavenDB. You only need to pay for a license once you deploy your commercial application to market.