Charles Nurse gives an introduction to NoSQL databases. He discusses the CAP theorem and how it relates to consistency, availability, and partition tolerance. He also explains MapReduce and how it is used to aggregate large amounts of distributed data. Finally, he summarizes different types of NoSQL databases, such as key-value stores, column-oriented databases, and document databases, and provides an example with RavenDB, a .NET-based document database.
4. Look Mom - NoSQL
Intro to NoSQL Databases
• Driven by the demands of “Big Data”
Google
Facebook
Amazon
• Huge amounts of data
Distributed Environment
Availability
• CAP Theorem
CAP Theorem
• The CAP Theorem states:
“It is impossible for a distributed computer system to simultaneously provide
all three of the following guarantees”
• Consistency
• Availability
• Partition Tolerance
CAP Theorem
• Consistency
All nodes in a distributed system see the same data at the same time
eCommerce
Weapons Systems
• Availability
Every request receives a response indicating whether it succeeded or failed
• Partition Tolerance
The system continues to operate despite arbitrary message loss or failures
of part of the system
CAP Theorem
• Relational Databases emphasise Consistency, so either
Availability or Partition Tolerance will suffer
• NoSQL Databases emphasise Availability and Partition
Tolerance
Eventual Consistency
Google searches do not need to show documents created in the last
few seconds
A Facebook News Feed does not need to show updates from the last few
seconds
Map Reduce
• NoSQL databases support distributed systems
• Map Reduce helps aggregate data using a pair of functions
Map function
Maps input data into its final form
Can be executed in parallel on each system
Reduce function
Operates on the results of the Map functions
Executed repeatedly until the final result is obtained
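The pair of functions described above can be sketched in a few lines. This is a minimal, hypothetical word-count example in Python (chosen for brevity; the deck's own examples use C#/LINQ): the map step could run independently on each node, and the reduce step produces output in the same shape as its input so it can be re-applied to its own results.

```python
# Hypothetical example: each "node" holds one chunk of text.
chunks = ["nosql big data", "big data stores", "nosql stores"]

def map_chunk(chunk):
    # Map: transform raw input into (key, value) pairs.
    return [(word, 1) for word in chunk.split()]

def reduce_pairs(pairs):
    # Reduce: aggregate values by key; the output has the same shape
    # as the input, so reduce can be applied repeatedly.
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0) + value
    return list(totals.items())

# Each map runs independently (sequentially here for simplicity).
mapped = [map_chunk(c) for c in chunks]
# Reduce each node's output, then reduce the combined partials once more.
partials = [reduce_pairs(m) for m in mapped]
final = reduce_pairs([pair for p in partials for pair in p])
print(dict(final))  # {'nosql': 2, 'big': 2, 'data': 2, 'stores': 2}
```

Note that `reduce_pairs` is applied both per node and to the merged partial results, which is exactly why its input and output formats must match.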
Map Reduce (Example)
• Blog Documents
{
  "type": "post",
  "name": "Raven's Map/Reduce functionality",
  "blog_id": 1342,
  "post_id": 29293921,
  "tags": ["raven", "nosql"],
  "post_content": "<p>...</p>",
  "comments": [{
    "source_ip": "124.2.21.2",
    "author": "martin",
    "text": "..."
  }]
}
• Map
from post in docs.posts
select new {
  post.blog_id,
  comments_length = comments.length
};
• Reduce
from agg in results
group agg by agg.key into g
select new {
  agg.blog_id,
  comments_length = g.Sum(x=>x.comments_length)
};
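The same comments-per-blog aggregation can be sketched outside of LINQ. Below is a hedged Python equivalent with hypothetical sample documents (the field names mirror the slide's JSON; `map_posts` and `reduce_results` are illustrative names, not RavenDB API):

```python
# Hypothetical blog documents, mirroring the slide's JSON shape.
docs = [
    {"type": "post", "blog_id": 1342, "post_id": 1,
     "comments": [{"author": "martin"}, {"author": "ayende"}]},
    {"type": "post", "blog_id": 1342, "post_id": 2,
     "comments": [{"author": "martin"}]},
    {"type": "post", "blog_id": 7, "post_id": 3, "comments": []},
]

def map_posts(posts):
    # Map: emit {blog_id, comments_length} for each post.
    return [{"blog_id": p["blog_id"], "comments_length": len(p["comments"])}
            for p in posts]

def reduce_results(results):
    # Reduce: group by blog_id and sum; output format matches input format.
    totals = {}
    for r in results:
        totals[r["blog_id"]] = totals.get(r["blog_id"], 0) + r["comments_length"]
    return [{"blog_id": b, "comments_length": n} for b, n in totals.items()]

print(reduce_results(map_posts(docs)))
```

As in the LINQ version, the reduce output has the same keys as its input, so it can be fed back into another reduce pass.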
RavenDB
• Document Database
Using JSON
• Built in .NET
• LINQ Support
• Full-text Search
Built on Lucene
• Two versions
Server
Embedded
Consistency

A service that is consistent operates fully or not at all. Gilbert and Lynch use the word "atomic" instead of consistent in their proof, which makes more sense technically because, strictly speaking, consistent is the C in ACID as applied to the ideal properties of database transactions and means that data will never be persisted that breaks certain pre-set constraints. But if you consider it a pre-set constraint of distributed systems that multiple values for the same piece of data are not allowed, then I think the leak in the abstraction is plugged (plus, if Brewer had used the word atomic, it would be called the AAP theorem and we'd all be in hospital every time we tried to pronounce it).

In the book-buying example you can add the book to your basket, or fail. Purchase it, or not. You can't half-add or half-purchase a book. There's one copy in stock and only one person will get it the next day. If both customers can continue through the order process to the end (i.e. make payment), the lack of consistency between what's in stock and what's in the system will cause an issue. Maybe not a huge issue in this case - someone's either going to be bored on vacation or spilling soup - but scale this up to thousands of inconsistencies and give them a monetary value (e.g. trades on a financial exchange where there's an inconsistency between what you think you've bought or sold and what the exchange record states) and it's a huge issue.

We might solve consistency by utilising a database. At the correct moment in the book order process the number of War and Peace books-in-stock is decremented by one. When the other customer reaches this point, the cupboard is bare and the order process will alert them to this without continuing to payment. The first operates fully, the second not at all.

Databases are great at this because they focus on ACID properties and give us Consistency by also giving us Isolation, so that when Customer One is reducing books-in-stock by one, and simultaneously increasing books-in-basket by one, any intermediate states are isolated from Customer Two, who has to wait a few milliseconds while the data store is made consistent.

Availability

Availability means just that - the service is available (to operate fully or not, as above). When you buy the book you want to get a response, not some browser message about the web site being uncommunicative. Gilbert & Lynch, in their proof of the CAP Theorem, make the good point that availability most often deserts you when you need it most - sites tend to go down at busy periods precisely because they are busy. A service that's available but not being accessed is of no benefit to anyone.

Partition Tolerance

If your application and database run on one box then (ignoring scale issues and assuming all your code is perfect) your server acts as a kind of atomic processor in that it either works or doesn't (i.e. if it has crashed it's not available, but it won't cause data inconsistency either).

Once you start to spread data and logic around different nodes there's a risk of partitions forming. A partition happens when, say, a network cable gets chopped and Node A can no longer communicate with Node B. With the kind of distribution capabilities the web provides, temporary partitions are a relatively common occurrence and, as I said earlier, they're also not that rare inside global corporations with multiple data centres.

Gilbert & Lynch defined partition tolerance as:

"No set of failures less than total network failure is allowed to cause the system to respond incorrectly"

and noted Brewer's comment that a one-node partition is equivalent to a server crash, because if nothing can connect to it, it may as well not be there.
Map/Reduce is just a pair of functions operating over a list of data. In C#, LINQ gives us a great way to express this that makes it very easy to understand and work with. Let us say that we want to be able to get a count of comments per blog. We can do that using the Map/Reduce queries shown on the slide.

There are a couple of things to note here:
- The first query is the map query: it maps the input document into the final format.
- The second query is the reduce query: it operates over a set of results and produces an answer.
- The reduce query must return its result in the same format in which it received it; why will be explained shortly.
- The first value in the result is the key, which is what we are aggregating on (think of the group by clause in SQL).
We have some blog posts in a multi-blogger environment (WordPress, Blogger, etc.), distributed over 4 systems. For simplicity we have distributed them equally. We want to find the total number of comments each blog has.
The next step is to start reducing the results. In real Map/Reduce algorithms, we partition the original input and work toward the final result. In this case, imagine that the output of the first step was divided into groups of 3 (so 4 groups overall), and then the reduce query was applied to each, giving us:
You can see why it is called reduce: for every batch, we apply a sum by blog_id to get a new Total Comments value. We started with 11 rows, and we ended up with just 10. That is where it gets interesting, because we are still not done; we can still reduce the data further.

This is what we do in the third step, reducing the data further still. That is why the input and output formats of the reduce query must match: we will feed the output of several reduce queries in as the input of a new one. You can also see that we have now moved from having 10 rows to having just 8.
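The batched reduction described above can be demonstrated concretely. This is a small Python sketch with made-up (blog_id, comments_length) rows: the map output is reduced in batches, the partial results are reduced again, and the answer matches a single-pass reduce.

```python
# Simulated map output: (blog_id, comments_length) rows from many posts.
rows = [(1, 3), (2, 1), (1, 2), (3, 4), (2, 2), (1, 1), (3, 1), (2, 5)]

def reduce_batch(batch):
    # Sum comments_length per blog_id; same (key, value) shape in and out,
    # which is what lets us feed reduce output back into reduce.
    totals = {}
    for blog_id, n in batch:
        totals[blog_id] = totals.get(blog_id, 0) + n
    return sorted(totals.items())

# Reduce in batches of 3, then reduce the partial results once more.
batches = [rows[i:i + 3] for i in range(0, len(rows), 3)]
partials = [reduce_batch(b) for b in batches]
rereduced = reduce_batch([row for p in partials for row in p])

# Reducing everything in one pass gives the identical answer.
assert rereduced == reduce_batch(rows)
print(rereduced)  # [(1, 6), (2, 8), (3, 5)]
```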
And now we are done; we can't reduce the data any further because all the keys are unique.

There is another interesting property of Map/Reduce. Let us say that I just added a comment to a post; that would obviously invalidate the results of the query, right? Well, yes, but not all of them. Assuming that I added a comment to the post whose id is 10, what would I need to do to recalculate the right result?

- Map Doc #10 again
- Reduce Step 2, Batch #3 again
- Reduce Step 3, Batch #1 again
- Reduce Step 4
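The incremental recomputation idea can be sketched as follows. This hypothetical Python example (document ids and counts are invented) re-maps only the changed document and then re-runs the reduce over the cached map outputs, rather than re-mapping everything:

```python
# Hypothetical documents: doc_id -> {blog_id, comment count}.
docs = {10: {"blog_id": 3, "comments": 2},
        11: {"blog_id": 3, "comments": 1},
        12: {"blog_id": 5, "comments": 4}}

def map_doc(doc):
    # Map one document to a (blog_id, comments_length) row.
    return (doc["blog_id"], doc["comments"])

def reduce_rows(rows):
    # Sum comments per blog_id.
    totals = {}
    for blog_id, n in rows:
        totals[blog_id] = totals.get(blog_id, 0) + n
    return sorted(totals.items())

mapped = {doc_id: map_doc(d) for doc_id, d in docs.items()}  # cached map output
before = reduce_rows(mapped.values())

# A comment is added to doc #10: re-map only that document...
docs[10]["comments"] += 1
mapped[10] = map_doc(docs[10])
# ...then re-run the affected reduce steps over the cached outputs.
after = reduce_rows(mapped.values())

print(before, after)  # [(3, 3), (5, 4)] [(3, 4), (5, 4)]
```

In a real system only the reduce batches that contain doc #10's output would be recomputed; the sketch collapses those batches into a single reduce for brevity.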
Google, one of the main pioneers of the trend to NoSQL databases due to the sheer volume of data they store, uses a system (Bigtable) that stores data in a column-oriented way. This compares with typical relational systems, which are row-oriented.

Each unit of data can be thought of as a set of key/value pairs, with the unit being identified by a "primary key". Bigtable calls this the "row key". The units of data are sorted and ordered on the basis of this row key. So far this isn't really much different from a table in a relational model that has a primary key field and a clustered index.

What makes the store "column-oriented" is that the various pieces of information that define the "record" of data can be divided into groups of columns, or column families. For example, if we are saving information about a person, we may define first_name and last_name fields, which can be grouped in a name column family. Likewise we could define street_address, city and zip_code, which can be grouped in an address column family, and sex and age, which can be grouped in a profile column family.

We now have 3 column families, or buckets of information. In a column-oriented store, column families are typically defined at configuration or startup, but the individual columns need not be pre-defined. Within each bucket, only key/value pairs are stored: the row key identifies the unit of data, while the column key identifies the column family (bucket) and the individual column within it.

Like many NoSQL databases, there is not really a concept of NULL data. New columns can be added at any time, as each is just another key/value pair in the bucket. While data that relates to the same row key will often be stored in a contiguous fashion, this setup allows data to be partitioned across multiple computer nodes.
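The column-family layout described above can be modelled with nested dictionaries. This is a hedged Python sketch, not Bigtable's actual API; the row key, family names, and `put` helper are all hypothetical, using the person example from the text:

```python
# Hypothetical Bigtable-style store: row_key -> {family: {column: value}}.
store = {}

def put(row_key, family, column, value):
    # Only columns that actually have values are stored (no NULL markers).
    store.setdefault(row_key, {}).setdefault(family, {})[column] = value

put("person:42", "name", "first_name", "Ada")
put("person:42", "name", "last_name", "Lovelace")
put("person:42", "address", "city", "London")  # street_address, zip_code simply absent
put("person:42", "profile", "age", 36)

# New columns can be added at any time -- just another key/value pair
# in the bucket; no schema change is needed.
put("person:42", "profile", "nickname", "countess")

print(store["person:42"]["name"])  # {'first_name': 'Ada', 'last_name': 'Lovelace'}
```

The sparse-data property falls out naturally: a missing column is simply a key that was never written, not a stored NULL.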
A relational database assumes that each column defined in the table schema will have a value for each row that is present in the table. NULL values are usually represented with a special marker (e.g. \N). The primary key and column identifier are implicitly associated with each cell based on its physical position within the layout. The following diagram illustrates how a relational database table might be laid out on disk.
Hypertable (and Bigtable) takes its design from the Log-Structured Merge Tree. It flattens the table structure into a sorted list of key/value pairs, each one representing a cell in the table. The key includes the full row and column identifier, which means each cell carries complete addressing information. Cells that are NULL are simply not included in the list, which makes this design particularly well suited for sparse data. The following diagram illustrates how Hypertable stores table data on disk. Though there can be a fair amount of redundancy in the row keys and column identifiers, Hypertable employs key-prefix and block data compression, which considerably mitigates this problem.
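The flattening described above can be illustrated in a few lines. This hypothetical Python sketch turns a small table (with NULLs represented as `None`) into the sorted list of fully addressed cells, dropping the NULLs entirely:

```python
# Hypothetical table: row_key -> {column: value}, with None standing in for NULL.
table = {
    "row1": {"a": "x", "b": None, "c": "y"},
    "row2": {"a": None, "b": "z", "c": None},
}

# Flatten into a sorted list of (row_key, column, value) cells.
# Each cell carries its full addressing information in the key,
# and NULL cells are simply not stored -- good for sparse data.
cells = sorted(
    (row_key, column, value)
    for row_key, columns in table.items()
    for column, value in columns.items()
    if value is not None
)
print(cells)  # [('row1', 'a', 'x'), ('row1', 'c', 'y'), ('row2', 'b', 'z')]
```

The redundancy in the repeated row keys is visible here too, which is the cost that key-prefix and block compression are meant to offset.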
Document databases are not document management systems. A document in this case is a loosely structured set of key/value pairs, typically represented using JSON (JavaScript Object Notation), not a word-processing document or spreadsheet.

One core benefit for object-oriented developers is that we can think of a document as mapping to an object, including any contained collections/objects, although in reality what we mean here are objects that are considered to be "aggregate roots". Document databases treat a document as a whole, rather than splitting it into its constituent key/value pairs.
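The "document as a whole" idea can be shown with a tiny sketch. This hypothetical Python example stores an aggregate (a post with its nested comments) as one JSON unit rather than splitting it across tables; the document shape borrows from the earlier blog example:

```python
import json

# Hypothetical aggregate root: a post together with its nested comments.
post = {
    "type": "post",
    "blog_id": 1342,
    "comments": [{"author": "martin", "text": "..."}],
}

# A document database persists and retrieves the aggregate as a single unit.
stored = json.dumps(post)      # the whole document, nested comments included
loaded = json.loads(stored)

assert loaded == post  # round-trips as one document, no joins needed
```

Contrast this with a relational design, where the same data would typically be normalised into separate posts and comments tables joined by a foreign key.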