2. A NoSQL (originally referring to "non SQL" or "non
relational") database provides a mechanism
for storage and retrieval of data that is modeled in
means other than the tabular relations used
in relational databases.
Such databases have existed since the late 1960s,
but did not obtain the "NoSQL" moniker until a surge
of popularity in the early twenty-first century triggered
by the needs of Web 2.0 companies such
as Facebook, Google, and Amazon.
NoSQL databases are increasingly used in big
data and real-time web applications. NoSQL systems
are also sometimes called "Not only SQL" to
emphasize that they may support SQL-like query
languages.
3. Motivations for this approach include:
simplicity of design; simpler horizontal
scaling to clusters of machines, which
is a problem for relational databases;
and finer control over availability.
4. A more flexible data model
Higher scalability
Superior performance
No expensive JOIN operations or
complex, multi-record transactions
5. Column: Accumulo, Cassandra, Druid, HBase, Vertica, SAP HANA
Document: Apache CouchDB, ArangoDB, Clusterpoint, Couchbase, Cosmos DB, HyperDex, IBM Domino, MarkLogic, MongoDB, OrientDB, Qizx, RethinkDB
Key-value: Aerospike, ArangoDB, Couchbase, Dynamo, Redis
Graph: AllegroGraph, ArangoDB, InfiniteGraph, Apache Giraph, MarkLogic, Neo4j, OrientDB
Multi-model: ArangoDB, Couchbase, FoundationDB, InfinityDB, MarkLogic, OrientDB
8. The biggest difference between non-relational
databases lies in the ability to query data efficiently.
Document databases provide the richest query
functionality, which allows them to address a wide
variety of operational and real-time analytics
applications.
Key-value stores and wide column stores provide a
single means of accessing data: by primary key. This
can be fast, but they offer very limited query
functionality and may impose additional development
costs and application-level requirements to support
anything more than basic query patterns.
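As a rough illustration of this difference, the sketch below models both access styles with plain Python structures. The sample records and the helper names `kv_get` and `find` are hypothetical and do not correspond to any particular product's API.

```python
# Toy in-memory data; names and values are made up for illustration.
users = {
    "u1": {"name": "Ann", "city": "Oslo", "age": 31},
    "u2": {"name": "Bob", "city": "Lima", "age": 45},
    "u3": {"name": "Cy", "city": "Oslo", "age": 27},
}


def kv_get(store, key):
    """Key-value style access: the primary key is the only way in."""
    return store.get(key)


def find(collection, **criteria):
    """Document style access: match on any field inside the documents."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


print(kv_get(users, "u2")["name"])
print([d["name"] for d in find(users.values(), city="Oslo")])
```

With only `kv_get` available, answering "who lives in Oslo?" would require the application to scan every record or maintain its own lookup table, which is the kind of extra development cost mentioned above.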
9. Whereas relational databases store data in
rows and columns, document databases store
data in documents. These documents typically
use a JSON-like structure.
Documents provide an intuitive and natural
way to model data that is closely aligned with
object-oriented programming:
each document is effectively an object.
Documents contain one or more Fields, where
each Field contains a typed value, such as a
string, date, binary, or array.
Examples: MongoDB and CouchDB.
11. MongoDB has the largest commercial
backing;
the largest and most active community;
support teams spread across the world
providing 24x7 coverage;
user groups in most major cities;
extensive documentation.
12. Term | Equivalent in RDBMS | Description
collection | table | A grouping of MongoDB documents. A collection contains documents, which in turn contain fields, which are key-value pairs.
cursor | cursor | Pointer to the result set of a query.
database | database | Container of collections.
document | row | In MongoDB, data is stored in documents.
field | column | A column denotes a set of data values; in MongoDB these are known as fields.
embedded documents | joins | Related data is normally stored in a single collection and nested using embedded documents, so joins are rarely needed in MongoDB.
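The contrast between joins and embedded documents can be sketched with plain Python data. This is an illustration only: the customer and order records, and the helper `orders_for`, are hypothetical, and no MongoDB driver is involved.

```python
# Relational style: two "tables" linked by a foreign key (hypothetical data).
customers = [{"id": 1, "name": "Ann"}]
orders = [
    {"id": 10, "customer_id": 1, "item": "lamp"},
    {"id": 11, "customer_id": 1, "item": "desk"},
]


def orders_for(customer_name):
    """Emulate a JOIN: match orders to the customer via customer_id."""
    ids = {c["id"] for c in customers if c["name"] == customer_name}
    return [o["item"] for o in orders if o["customer_id"] in ids]


# Document style: the orders are embedded in the customer document.
customer_doc = {
    "_id": 1,
    "name": "Ann",
    "orders": [{"item": "lamp"}, {"item": "desk"}],
}

print(orders_for("Ann"))
print([o["item"] for o in customer_doc["orders"]])
```

Both lines print the same items, but the document version reads them from a single record instead of matching rows across two structures.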
17. MongoDB is well suited for big data and
mobile and social infrastructure.
MongoDB provides replication and high
availability.
MongoDB is used by companies such as
Foursquare, SourceForge, The New York
Times, Lexis, and Orange Digital.
19. CouchDB is a JSON document-oriented database
written in Erlang.
It is a highly concurrent database designed to be
easily replicated horizontally across numerous
devices and to be fault tolerant.
It is part of the NoSQL generation of databases.
It is an open-source Apache Software Foundation project.
It allows applications to store JSON documents via its
RESTful interface.
It makes use of map/reduce to index and query the
database.
CouchDB is a database designed to run on the
internet of today.
22. Data types: string, float, integer, boolean, array, object, null
A document example:
{
"Subject": "I like Plankton",
"Author": "Rusty",
"PostedDate": "2006-08-15T17:30:12-04:00",
"Tags": [
"plankton",
"baseball",
"decisions"
],
"Body": "I decided today that I don't like baseball. I like plankton."
}
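To show how these value types appear once a document is parsed, here is a small sketch using Python's standard `json` module. The document below is a hypothetical variation on the example above, extended so that every type occurs at least once.

```python
import json

# A hypothetical document exercising each JSON value type.
raw = """
{
  "title": "I like Plankton",
  "rating": 4.5,
  "views": 120,
  "published": true,
  "tags": ["plankton", "baseball"],
  "author": {"name": "Rusty"},
  "editor": null
}
"""

doc = json.loads(raw)
for field, value in doc.items():
    # Print each field name next to the Python type it was parsed into.
    print(field, type(value).__name__)
```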
23. Create a document - cURL
curl -H 'Content-Type: application/json' -X PUT \
  http://127.0.0.1:5984/my_database/001 \
  -d '{"Name": "john", "age": "23", "Designation": "Designer"}'
Viewing all documents - cURL
curl -X GET http://127.0.0.1:5984/mycouchshop/_all_docs
Creating a simple map function - Fauxton
function (doc) {
  // Index only product documents that have a name.
  if (doc.type === "product" && doc.name) {
    emit(doc.name, doc);
  }
}
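To make the role of the map function more concrete, here is a minimal Python sketch that mimics what happens when a view is built: the function runs once per document, and the emitted key/value pairs are kept sorted by key. This is only a simulation under simplified assumptions (in-memory documents, string keys). The helper `build_view` and the sample documents are hypothetical and not part of CouchDB.

```python
def map_products(doc, emit):
    """Python counterpart of the JavaScript map function above."""
    if doc.get("type") == "product" and doc.get("name"):
        emit(doc["name"], doc)


def build_view(docs, map_fn):
    """Run map_fn over every document and return rows sorted by key."""
    rows = []
    for doc in docs:
        map_fn(doc, lambda key, value: rows.append((key, value)))
    return sorted(rows, key=lambda row: row[0])


docs = [
    {"_id": "1", "type": "product", "name": "teapot"},
    {"_id": "2", "type": "customer", "name": "Rusty"},
    {"_id": "3", "type": "product", "name": "kettle"},
    {"_id": "4", "type": "product"},  # no name, so nothing is emitted
]

view = build_view(docs, map_products)
print([key for key, _ in view])
```

Only the two named products end up in the view, which is why a query against it can skip customers and incomplete records entirely.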
24. JSON Documents - Everything stored in CouchDB boils
down to a JSON document.
RESTful Interface - From creation to replication to
data insertion, every management and data task in
CouchDB can be done via HTTP.
N-Master Replication - You can make use of an
unlimited number of 'masters', making for some very
interesting replication topologies.
Built for Offline - CouchDB can replicate to devices
(like Android phones) that can go offline and handle
data sync for you when the device is back online.
Replication Filters - You can filter precisely the data
you wish to replicate to different nodes.
Browser-Based GUI - CouchDB provides Futon
(succeeded by Fauxton in CouchDB 2.0), a browser-based
GUI for managing your data, permissions, and configuration.
25. Feature | CouchDB | MongoDB
Data model | Document model | Document model
Interface | HTTP/REST | Binary protocol and custom protocol over TCP/IP
Object storage | Database contains documents. | Database contains collections, and a collection contains documents.
Query method | Map/Reduce (JavaScript) | Map/Reduce (JavaScript); object-based query language
Replication | Master-master replication | Master-slave replication
Consistency | Eventually consistent | Strongly consistent
Written in | Erlang | C++
26. From a data model perspective, key-
value stores are the most basic type of
non-relational database. Every item in
the database is stored as an attribute
name, or key, together with its value. The
value, however, is entirely opaque to the
system; data can only be queried by the
key.
27. In a wide-column store, each record can vary in the number of columns that are
stored. Columns can be grouped together for access in
column families, or columns can be spread across multiple
column families. Data is retrieved by primary key per
column family.
Applications: Key-value stores and wide-column stores
are useful for a narrow set of applications that only query
data by a single key value. The appeal of these systems is
their performance and scalability, which can be highly
optimized due to the simplicity of the data access patterns
and opacity of the data itself.
Examples: Riak and Redis (Key-Value)
HBase and Cassandra (Wide Column).
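The wide-column layout described on this slide can be pictured as nested maps: row key, then column family, then columns. The sketch below is a toy model in plain Python. The table contents and the helper `read` are hypothetical and say nothing about how HBase or Cassandra store data internally.

```python
# row key -> column family -> {column name: value}; all names are hypothetical.
table = {
    "user:1": {
        "profile": {"name": "Ann", "email": "ann@example.com"},
        "activity": {"last_login": "2024-01-05"},
    },
    "user:2": {
        "profile": {"name": "Bob"},  # fewer columns than user:1
    },
}


def read(table, row_key, family):
    """Fetch the columns of one column family for one primary key."""
    return table.get(row_key, {}).get(family, {})


print(sorted(read(table, "user:1", "profile")))
print(sorted(read(table, "user:2", "profile")))
print(read(table, "user:2", "activity"))
```

The two rows carry different numbers of columns, and a missing column family simply yields an empty result rather than a schema error.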
29. Apache Cassandra is a highly scalable,
high-performance, distributed NoSQL database.
Cassandra is designed to handle huge
amounts of data across many commodity
servers, providing high availability without a
single point of failure.
Cassandra has a distributed architecture
capable of handling huge amounts
of data. Data is placed on different
machines with a replication factor greater
than one to attain high availability without a
single point of failure.
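The idea of placing each piece of data on several machines can be sketched as a hash ring: a key is hashed to a position, and the next few distinct nodes clockwise hold its replicas. The code below is a deliberately simplified illustration, not Cassandra's actual partitioner or replication strategy. The node names, the use of MD5, and the function `replicas_for` are assumptions made for the example.

```python
import bisect
import hashlib


def position(value):
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(key, nodes, replication_factor):
    """Return the nodes that would store `key`, walking the ring clockwise."""
    ring = sorted((position(node), node) for node in nodes)
    positions = [pos for pos, _ in ring]
    # First node at or after the key's position, wrapping around the ring.
    start = bisect.bisect_right(positions, position(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]
owners = replicas_for("user:42", nodes, replication_factor=3)
print(owners)  # three distinct nodes; which ones depends on the hash values
```

With a replication factor of 3 in this sketch, the key is still held by two other machines if any one of its owners fails.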
30. Cassandra is highly scalable and high
performance, with tunable consistency. Cassandra is a
column-oriented database.
Cassandra provides easy data distribution.
Cassandra follows the distribution design of
Amazon's Dynamo, and its data model
design is based on Google's Bigtable.
Cassandra was initially created at
Facebook for inbox search and is now
used by some of the biggest
companies, such as Facebook, Twitter, eBay,
Netflix, Cisco, and Rackspace.
31. A column is the basic unit in a wide-
column database and consists of a key
and value pair. For example, a column
might have the key “name” and the
value could be a string representing a
name.
32. Messaging
High-speed applications
Social Media Analytics
Product Catalogs and retailing
35. Keyspaces
The outermost container, which holds the data corresponding to an application.
Column
A column is the basic unit in a wide-column database and consists of a key and
value pair.
Column families
A column family is a container for a collection of rows. Each row contains
ordered columns.
Super columns
A super column contains many key-value pairs.
Indexes
queries
36. CQL data types: bigint, blob, boolean, decimal, double,
float, frozen, int, inet, list,
map, set, text, timestamp, varchar
39. Graph databases use graph structures
with nodes, edges and properties to
represent data. In essence, data is
modeled as a network of relationships
between specific elements.
Applications: navigating social network connections,
network topologies, or supply chains.
Examples: Neo4j and Giraph.
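A small sketch of the node/edge/property model, and of navigating connections in it, is given below in plain Python. It is an illustration under simple assumptions (an in-memory directed graph, breadth-first traversal). The people, the relationships, and the helper `within_hops` are hypothetical and unrelated to the Neo4j or Giraph APIs.

```python
from collections import deque

# Nodes carry properties; edges are directed relationships between nodes.
# All names are hypothetical.
nodes = {
    "ann": {"city": "Oslo"},
    "bob": {"city": "Lima"},
    "cy": {"city": "Oslo"},
    "dee": {"city": "Rome"},
}
edges = {
    "ann": ["bob"],
    "bob": ["cy"],
    "cy": ["dee"],
    "dee": [],
}


def within_hops(start, max_hops):
    """Breadth-first traversal: everyone reachable in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in edges[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                queue.append((neighbour, dist + 1))
    return found


reachable = within_hops("ann", 2)
print(reachable)
print([nodes[name]["city"] for name in reachable])
```

Queries of the "friends of friends" kind follow edges directly from node to node, which is the access pattern graph databases are built around.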