4. prepared for: 20/20 Companies | 8.22.2010
NoSQL Introduction
- NoSQL, also read as "Not Only SQL", is an approach to data management and
  database design that is useful for very large sets of distributed data.
- A NoSQL database provides a mechanism for storing and retrieving data that
  is modeled by means other than the tabular relations used in traditional
  relational databases.
- NoSQL covers a number of approaches and projects that implement database
  models differing significantly from those of traditional relational database
  management systems, where data is accessed with SQL. In NoSQL, the schema
  can be described with different data structures: hash tables, arrays,
  trees, etc.
- The term "NoSQL" was first used in the late 1990s.
- There are now about 150 kinds of NoSQL databases (nosql-database.org).
Advantages of using NoSQL
- Quicker development times: thanks to aggregates, you write to the database
  once instead of multiple times for individual entities.
- Easy to scale out, whereas relational databases are meant for a single server.
- Avoids having to create tables with custom columns like custom1, custom2,
  etc., because most NoSQL databases are schema-less.
- Avoids tables with large numbers of NULL values (sparse tables).
- It's More Scalable
- It's Flexible
- It's Administrator-Friendly
- It's Cost-Effective and Open-Source
- The Cloud's the Limit
Disadvantages of using NoSQL
- It Has a Very Narrow Focus
- Standardization and Open Source
- Performance and Scaling > Consistency
- A General Lack of Maturity
- It Doesn't Play Nice with Analytics
- No relations between tables (collections in MongoDB).
- No stored procedures in MongoDB (a NoSQL database).
- Most administration depends on scripting (bash, Perl, etc.) in a Linux
  environment.
- GUI tools for accessing the database are not widely available.
What is MongoDB?
- It is an open-source, cross-platform database written in C++.
- It is a document-oriented database that provides high performance, high
  availability, and easy scalability.
- It stores data as documents.
- It does not support SQL; it supports a rich, ad hoc query language of its own.
- It works on the concepts of collection and document.
Several organizations have migrated to MongoDB and benefited from it:
Organization                Migrated From    Application
MTV Networks                Multiple RDBMS   Centralized content management
Cisco                       Multiple RDBMS   Analytics, social networking
Foursquare                  PostgreSQL       Social, mobile networking platforms
Salesforce Marketing Cloud  RDBMS            Social marketing, analytics
Orange Digital              MySQL            Content management
edmunds.com                 Oracle           Billing, online advertising, user data
Basic Overview
Database: Database is a physical container for
collections. Each database gets its own set of
files on the file system. A single MongoDB
server typically has multiple databases.
Collection: Collection is a group of MongoDB
documents. It is the equivalent of an RDBMS
table. Collections do not enforce a schema.
Documents within a collection can have different
fields.
Document: A document is a set of key-value
pairs. Documents have dynamic schema.
Dynamic schema means that documents in the
same collection do not need to have the same
set of fields or structure, and common fields in a
collection's documents may hold different types
of data.
Why MongoDB?
- Schema-less: MongoDB is a document database in which one collection holds
  different documents. The number of fields, content, and size of a document
  can differ from one document to another.
- You can set an index on any attribute of a MongoDB record (such as
  FirstName="Sameer" or Address="8 Gandhi Road"), so that records can be
  sorted and ordered quickly by that attribute.
- MongoDB supports various programming languages, including C, C# and .NET,
  C++, Erlang, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, and Scala
  (via Casbah).
- MongoDB supports rich queries to fetch data.
- Data is stored in the form of JSON-style documents.
- No complex joins.
- MongoDB is easy to install.
Features
- Document-oriented
- Ad hoc queries
- Indexing
- Replication
- Load balancing
- File storage
- Aggregation
- Server-side JavaScript execution
- Capped collections
- HTTP Server for REST API
How does MongoDB work?
The "mongod" process acts as the database server.
This process attaches itself to a directory given by the --dbpath option
(default dbpath: "/data/db").
It then starts listening on a port given by the --port option
(default port: 27017; 28017 for the HTTP interface).
Example:
> .\mongod.exe --dbpath "C:\data\db" --port 28080
Behind MongoDB - JSON / BSON
• JSON: JSON (JavaScript Object Notation) is a lightweight data-interchange
  format. It is easy for humans to read and write, and easy for machines to
  parse and generate. It is built on two structures: key-value pairs and
  ordered lists of values.
• BSON: Binary JSON is a binary-encoded serialization of JSON-like documents.
  It supports the embedding of documents and arrays within other documents
  and arrays. BSON also contains extensions that allow representation of data
  types that are not part of the JSON spec.
Characteristics:
- Lightweight: minimum overhead
- Traversable
- Efficient: encoding and decoding
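The point that BSON can represent data types outside the JSON spec can be illustrated in plain JavaScript, without a MongoDB server. This is only a minimal sketch, assuming a Node.js-style runtime; the variable names are illustrative. It shows that a Date value does not survive a JSON round trip as a date, which is the kind of gap BSON's additional types are meant to close.

```javascript
// A document with a date field, similar to the comment dates used in MongoDB examples.
const original = { user: "user1", dateCreated: new Date(Date.UTC(2011, 1, 20, 2, 15)) };

// JSON has no date type, so the Date is serialized as an ISO-8601 string.
const asJson = JSON.stringify(original);

// Parsing gives back a string, not a Date: the type information is lost.
const roundTripped = JSON.parse(asJson);

const typeBefore = original.dateCreated instanceof Date ? "Date" : typeof original.dateCreated;
const typeAfter = roundTripped.dateCreated instanceof Date ? "Date" : typeof roundTripped.dateCreated;

console.log(asJson);
console.log(typeBefore, "->", typeAfter);
```

A binary format with a dedicated date type can store the same field without turning it into text, so the reader gets a date back rather than a string.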
Sample MongoDB Document
The example below shows the document structure of a blog site, which is simply a set of comma-separated key-value pairs.
{
  _id: ObjectId(7df78ad8902c),
  title: 'MongoDB Overview',
  description: 'MongoDB is no sql database',
  by: 'tutorials point',
  url: 'http://www.tutorialspoint.com',
  tags: ['mongodb', 'database', 'NoSQL'],
  likes: 100,
  comments: [
    {
      user: 'user1',
      message: 'My first comment',
      dateCreated: new Date(2011,1,20,2,15),
      like: 0
    },
    {
      user: 'user2',
      message: 'My second comments',
      dateCreated: new Date(2011,1,25,7,45),
      like: 5
    }
  ]
}
Note:
_id is a 12-byte value, written in hexadecimal, that ensures the uniqueness
of every document. You can provide _id when inserting a document. If you
don't, MongoDB generates a unique id for the document. Of these 12 bytes,
the first 4 hold the current timestamp, the next 3 the machine id, the next
2 the process id of the MongoDB server, and the remaining 3 are a simple
incrementing value.
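The byte layout described in the note can be sketched as a small parser. This is an illustration only: `parseObjectId` is a hypothetical helper, not a MongoDB API, and the id string is made up. The layout follows the note above; newer MongoDB versions are understood to replace the machine and process fields with a random value, so treat the middle fields as specific to the layout described here.

```javascript
// Split a 24-character hex ObjectId string into the fields described in the note.
// Layout assumed here (from the note): 4-byte timestamp, 3-byte machine id,
// 2-byte process id, 3-byte counter.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters (12 bytes)");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // seconds since the Unix epoch
  return {
    timestamp: new Date(seconds * 1000),
    machineId: hex.slice(8, 14),
    processId: parseInt(hex.slice(14, 18), 16),
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

// Hypothetical id, made up for this example.
const parts = parseObjectId("4d5f3c00a1b2c30d4e00002a");
console.log(parts.timestamp.toISOString(), parts.machineId, parts.processId, parts.counter);
```

Because the timestamp comes first, ids generated later compare greater than earlier ones at one-second granularity, which is why sorting by _id roughly follows insertion order.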
Relationship of RDBMS terminology with MongoDB
The table below shows how RDBMS terminology maps to MongoDB.
RDBMS            MongoDB
Database         Database
Table            Collection
Tuple/Row        Document
Column           Field
Table Join       Embedded Documents
Primary Key      Primary Key (default key _id provided by MongoDB itself)
Database Server and Client
mysqld/Oracle    mongod
mysql/sqlplus    mongo
MongoDB Configuration
• Authentication - Who are you in MongoDB?
  - Application user
  - Administrator
  - Backup job
  - Monitoring agents
• Authorization - What can you do in MongoDB?
  - CRUD operations
  - Configure the database
  - Manage sharding/replication
  - User management
Manage User and Roles
• MongoDB employs Role-Based Access Control (RBAC) to determine access
  for users.
• MongoDB stores user and role details in the admin database.
• Each application and user of a MongoDB system should map to a distinct
  application or administrator.
• MongoDB provides built-in roles such as:
  - Database user roles
    - read
    - readWrite
  - Database administration roles
    - dbAdmin
    - dbOwner
    - userAdmin
Create a system user administrator
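use admin
// Superuser with the built-in "root" role (choose a strong password in practice)
db.createUser({
  user: "root",
  pwd: "root",
  roles: ["root"]
})
// User administrator who can manage users and roles on any database
db.createUser({
  user: "siteUserAdmin",
  pwd: "password",
  roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
})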
Collection : Filter Query
Comparison Operators
$eq    Matches values that are equal to a specified value.
$gt    Matches values that are greater than a specified value.
$gte   Matches values that are greater than or equal to a specified value.
$lt    Matches values that are less than a specified value.
$lte   Matches values that are less than or equal to a specified value.
$ne    Matches all values that are not equal to a specified value.
$in    Matches any of the values specified in an array.
$nin   Matches none of the values specified in an array.
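To make the comparison operators concrete, here is a minimal in-memory sketch that evaluates filters of the form { field: { $op: value } } against plain JavaScript objects. It is not how MongoDB implements matching: `comparisonOps` and `matches` are hypothetical names, the sample data is made up, and details such as missing fields and cross-type comparison are deliberately ignored.

```javascript
// One predicate per comparison operator from the table above.
const comparisonOps = {
  $eq: (value, arg) => value === arg,
  $gt: (value, arg) => value > arg,
  $gte: (value, arg) => value >= arg,
  $lt: (value, arg) => value < arg,
  $lte: (value, arg) => value <= arg,
  $ne: (value, arg) => value !== arg,
  $in: (value, arg) => arg.includes(value),
  $nin: (value, arg) => !arg.includes(value),
};

// Returns true when doc satisfies every { field: { $op: arg } } condition in the filter.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, condition]) =>
    Object.entries(condition).every(([op, arg]) => comparisonOps[op](doc[field], arg))
  );
}

// Hypothetical sample data.
const users = [
  { user_id: "abc123", age: 55, status: "A" },
  { user_id: "bcd001", age: 45, status: "A" },
  { user_id: "cde002", age: 19, status: "D" },
];

const over25 = users.filter((u) => matches(u, { age: { $gt: 25 } }));
const inRange = users.filter((u) => matches(u, { age: { $gte: 19, $lt: 50 } }));
const notDeleted = users.filter((u) => matches(u, { status: { $nin: ["D"] } }));

console.log(over25.map((u) => u.user_id));
console.log(inRange.map((u) => u.user_id));
console.log(notDeleted.map((u) => u.user_id));
```

Several operators on the same field combine as a range, as in the `inRange` filter, where both conditions must hold.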
Collection : Filter Query
Logical Operators
$or    Joins query clauses with a logical OR; returns all documents that match
       the conditions of either clause.
$and   Joins query clauses with a logical AND; returns all documents that
       match the conditions of both clauses.
$not   Inverts the effect of a query expression; returns documents that do
       not match the query expression.
$nor   Joins query clauses with a logical NOR; returns all documents that fail
       to match both clauses.
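The logical operators can be sketched the same way, as a small recursive evaluator over in-memory objects. This is an illustration under simplifying assumptions: `matchesField` and `matches` are hypothetical helpers, only plain equality and $gt are supported as field conditions, and the sample data is made up.

```javascript
// Evaluate one field condition: a plain value (equality), { $gt: n }, or { $not: condition }.
function matchesField(fieldValue, condition) {
  if (condition !== null && typeof condition === "object") {
    if ("$not" in condition) return !matchesField(fieldValue, condition.$not);
    if ("$gt" in condition) return fieldValue > condition.$gt;
  }
  return fieldValue === condition;
}

// Evaluate a query that may combine clauses with $and, $or and $nor.
function matches(doc, query) {
  return Object.entries(query).every(([key, value]) => {
    if (key === "$and") return value.every((clause) => matches(doc, clause));
    if (key === "$or") return value.some((clause) => matches(doc, clause));
    if (key === "$nor") return !value.some((clause) => matches(doc, clause));
    return matchesField(doc[key], value);
  });
}

// Hypothetical sample data.
const users = [
  { user_id: "abc123", age: 55, status: "A" },
  { user_id: "bcd001", age: 45, status: "A" },
  { user_id: "cde002", age: 19, status: "D" },
];

const clauses = [{ status: "D" }, { age: { $gt: 50 } }];
const ids = (query) => users.filter((u) => matches(u, query)).map((u) => u.user_id);

const either = ids({ $or: clauses });
const both = ids({ $and: [{ status: "A" }, { age: { $gt: 50 } }] });
const neither = ids({ $nor: clauses });
const notOver25 = ids({ age: { $not: { $gt: 25 } } });

console.log(either, both, neither, notOver25);
```

Note that $not wraps a condition on a single field, while $and, $or and $nor take an array of whole clauses.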
Collection : Filter Query
Element Operators
$exists     Matches documents that have the specified field.
$type       Selects documents if a field is of the specified type.
Array Operators
$all        Matches arrays that contain all elements specified in the query.
$elemMatch  Selects documents if an element in the array field matches all the
            specified $elemMatch conditions.
$size       Selects documents if the array field is a specified size.
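As a rough illustration of $exists and the array operators, the sketch below expresses each one as a plain JavaScript predicate over in-memory documents. The helper names (`exists`, `all`, `size`, `elemMatch`) and the sample posts are assumptions made for this example, and $elemMatch conditions are passed as a function rather than as a query document to keep the code short.

```javascript
// Hypothetical blog posts, loosely following the sample document shown earlier.
const posts = [
  {
    title: "MongoDB Overview",
    tags: ["mongodb", "database", "NoSQL"],
    comments: [{ user: "user1", like: 0 }, { user: "user2", like: 5 }],
  },
  { title: "Draft", tags: ["mongodb"] },
];

// $exists: the document has the field (flag = true) or lacks it (flag = false).
const exists = (doc, field, flag) => (field in doc) === flag;

// $all: the array field contains every listed element.
const all = (doc, field, wanted) =>
  Array.isArray(doc[field]) && wanted.every((w) => doc[field].includes(w));

// $size: the array field has exactly n elements.
const size = (doc, field, n) => Array.isArray(doc[field]) && doc[field].length === n;

// $elemMatch: at least one array element satisfies all conditions (here, a predicate).
const elemMatch = (doc, field, predicate) =>
  Array.isArray(doc[field]) && doc[field].some(predicate);

const titles = (test) => posts.filter(test).map((p) => p.title);

const withComments = titles((p) => exists(p, "comments", true));
const taggedBoth = titles((p) => all(p, "tags", ["mongodb", "NoSQL"]));
const singleTag = titles((p) => size(p, "tags", 1));
const likedByUser2 = titles((p) =>
  elemMatch(p, "comments", (c) => c.user === "user2" && c.like > 0)
);

console.log(withComments, taggedBoth, singleTag, likedByUser2);
```

The $elemMatch case is the one worth noticing: both conditions must hold for the same array element, not for two different comments.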
Collection : Projection
Projection Operators
$           Projects the first element in an array that matches the query
            condition.
$elemMatch  Projects the first element in an array that matches the specified
            $elemMatch condition.
$meta       Projects the document's score assigned during $text operation.
$slice      Limits the number of elements projected from an array. Supports
            skip and limit slices.
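The skip and limit behavior of $slice can be sketched on a plain array. `sliceProjection` is a hypothetical helper written for this example, assuming the commonly documented semantics: a positive number keeps the first n elements, a negative number keeps the last n, and a [skip, limit] pair skips elements before taking up to limit of them (a negative skip counts from the end).

```javascript
// In-memory sketch of a $slice-style projection on one array field.
function sliceProjection(array, spec) {
  if (Array.isArray(spec)) {
    const [skip, limit] = spec;
    // A negative skip counts back from the end of the array.
    const start = skip >= 0 ? skip : Math.max(array.length + skip, 0);
    return array.slice(start, start + limit);
  }
  // Positive n: first n elements. Negative n: last |n| elements.
  return spec >= 0 ? array.slice(0, spec) : array.slice(spec);
}

// Hypothetical comment ids.
const comments = ["c1", "c2", "c3", "c4", "c5"];

const firstTwo = sliceProjection(comments, 2);
const lastTwo = sliceProjection(comments, -2);
const skipOneTakeTwo = sliceProjection(comments, [1, 2]);
const fromThirdLast = sliceProjection(comments, [-3, 2]);

console.log(firstTwo, lastTwo, skipOneTakeTwo, fromThirdLast);
```

This is the usual way to page through a long embedded array, such as a post's comments, without returning the whole array.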
Aggregation Framework
• Aggregations are operations that process data records and return computed
  results.
• MongoDB provides a rich set of aggregation operations that examine and
  perform calculations on the data sets.
• Running data aggregation on the mongod instance simplifies application
  code and limits resource requirements.
Aggregation : Demo
Zip document
{
  "_id": "10280",
  "city": "NEW YORK",
  "state": "NY",
  "pop": 5574,
  "loc": [ -74.016323, 40.710537 ]
}
Report:
• States with Populations above 10 Million
• Average City Population by State
• Largest and Smallest Cities by State
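The three reports can be sketched as group-and-reduce steps over an in-memory array, which mirrors the staged way an aggregation works without needing a server. This is an illustration only: apart from the first document, the zip entries are made up, and the population threshold is scaled down from the slide's 10 million to suit the toy data.

```javascript
// Zip documents in the shape shown above. Only the first comes from the slide;
// the others are hypothetical values made up for this sketch.
const zips = [
  { _id: "10280", city: "NEW YORK", state: "NY", pop: 5574 },
  { _id: "10001", city: "NEW YORK", state: "NY", pop: 1000 },
  { _id: "12201", city: "ALBANY", state: "NY", pop: 400 },
  { _id: "90001", city: "LOS ANGELES", state: "CA", pop: 3000 },
  { _id: "94101", city: "SAN FRANCISCO", state: "CA", pop: 2000 },
];

// Stage 1: sum population per (state, city), since a city can span several zip codes.
const cityTotals = new Map();
for (const { state, city, pop } of zips) {
  const key = `${state}|${city}`;
  const entry = cityTotals.get(key) || { state, city, pop: 0 };
  entry.pop += pop;
  cityTotals.set(key, entry);
}

// Stage 2: group the city totals by state.
const byState = new Map();
for (const entry of cityTotals.values()) {
  if (!byState.has(entry.state)) byState.set(entry.state, []);
  byState.get(entry.state).push(entry);
}

// Stage 3: compute total, average, smallest and largest city for each state.
const report = {};
for (const [state, cities] of byState) {
  const sorted = [...cities].sort((a, b) => a.pop - b.pop);
  const totalPop = cities.reduce((sum, c) => sum + c.pop, 0);
  report[state] = {
    totalPop,
    avgCityPop: totalPop / cities.length,
    smallestCity: sorted[0].city,
    largestCity: sorted[sorted.length - 1].city,
  };
}

// The slide uses a 10 million threshold; this toy data set needs a much smaller one.
const THRESHOLD = 6000;
const bigStates = Object.keys(report).filter((s) => report[s].totalPop > THRESHOLD);

console.log(report, bigStates);
```

Summing per city before averaging matters: averaging the raw zip documents would give the average zip-code population, not the average city population.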
SQL and the corresponding MongoDB statements
Statement: Create
  MySQL:    CREATE TABLE users ( id INT NOT NULL AUTO_INCREMENT,
            user_id Varchar(30), age Number, status char(1),
            PRIMARY KEY (id) )
  MongoDB:  db.users.insert( { user_id: "abc123", age: 55, status: "A" } )
            OR
            db.createCollection("users")
Statement: Alter
  MySQL:    ALTER TABLE users ADD join_date DATETIME
  MongoDB:  db.users.update( { }, { $set: { join_date: new Date() } },
            { multi: true } )
Statement: Drop Column
  MySQL:    ALTER TABLE users DROP COLUMN join_date
  MongoDB:  db.users.update( { }, { $unset: { join_date: "" } },
            { multi: true } )
Statement: Drop Table
  MySQL:    DROP TABLE users
  MongoDB:  db.users.drop()
Statement: Insert
  MySQL:    INSERT INTO users(user_id, age, status) VALUES ("bcd001", 45, "A")
  MongoDB:  db.users.insert( { user_id: "bcd001", age: 45, status: "A" } )
Statement: Select
  MySQL:    SELECT user_id, status FROM users WHERE status = "A"
  MongoDB:  db.users.find( { status: "A" }, { user_id: 1, status: 1, _id: 0 } )
Statement: Update
  MySQL:    UPDATE users SET status = "C" WHERE age > 25
  MongoDB:  db.users.update( { age: { $gt: 25 } }, { $set: { status: "C" } },
            { multi: true } )
Statement: Delete
  MySQL:    DELETE FROM users WHERE status = "D"
  MongoDB:  db.users.remove( { status: "D" } )
Schema Design - Patterns
• What is the priority?
  - High consistency
  - High read performance
  - High write performance
• How does the application access and manipulate data?
  - Read/Write Ratio
  - Types of Queries / Updates
  - Data life-cycle and growth
  - Analytics (MapReduce, Aggregation)
Schema Design - Example
• One to One Relationship
  - Relationships are often embedded
  - Optimized read performance
  - Document provides a holistic representation of objects with
    embedded entities
• One to Many Relations
  - De-normalization
  - Provides data locality using Referencing or Embedding
• Many to Many Relations
  - Referencing and indexing
Data Modeling in MongoDB
There are two tools that allow applications to represent these relationships: references and embedded documents.
References: References store the relationships between data by including
links or references from one document to another. Applications can resolve
these references to access the related data.
Embedded Data: Embedded documents capture relationships between data by
storing related data in a single document structure. These denormalized
data models allow applications to retrieve and manipulate related data in a
single database operation.
Referenced data model:
User Document:
{
  _id: <ObjectId1>,
  username: "user1"
}
Contact Document:
{
  _id: <ObjectId2>,
  user_id: <ObjectId1>,
  phone: "1234567890",
  email: "user@example.com"
}
Access Document:
{
  _id: <ObjectId3>,
  user_id: <ObjectId1>,
  level: 5,
  group: "Dev"
}
Embedded data model:
{
  _id: <ObjectId1>,
  username: "user1",
  contact: {
    phone: "1234567890",
    email: "user@example.com"
  },
  access: {
    level: 5,
    group: "Dev"
  }
}
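The practical difference between the two models can be sketched with in-memory arrays standing in for collections. This is an assumption-laden illustration: the numeric ids replace the ObjectId placeholders, and `loadUserWithReferences` and `loadEmbeddedUser` are hypothetical helpers that count array lookups where a real application would issue queries.

```javascript
// Referenced model: three separate "collections" linked by user_id.
// Numeric ids stand in for the ObjectId placeholders used above.
const users = [{ _id: 1, username: "user1" }];
const contacts = [{ _id: 2, user_id: 1, phone: "1234567890", email: "user@example.com" }];
const accesses = [{ _id: 3, user_id: 1, level: 5, group: "Dev" }];

// The application resolves the references itself: one lookup per collection.
function loadUserWithReferences(userId) {
  const user = users.find((u) => u._id === userId);
  const contact = contacts.find((c) => c.user_id === userId);
  const access = accesses.find((a) => a.user_id === userId);
  return { username: user.username, contact, access, lookups: 3 };
}

// Embedded model: related data lives inside the user document.
const embeddedUsers = [
  {
    _id: 1,
    username: "user1",
    contact: { phone: "1234567890", email: "user@example.com" },
    access: { level: 5, group: "Dev" },
  },
];

// Everything comes back with a single lookup.
function loadEmbeddedUser(userId) {
  const user = embeddedUsers.find((u) => u._id === userId);
  return { username: user.username, contact: user.contact, access: user.access, lookups: 1 };
}

const referenced = loadUserWithReferences(1);
const embedded = loadEmbeddedUser(1);
console.log(referenced.lookups, embedded.lookups, embedded.contact.email);
```

Both functions return the same information; the embedded model gets it in one read, while the referenced model keeps contact and access data independently updatable at the cost of extra lookups.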
Replication
• A replica set in MongoDB is a group of mongod processes that maintain
  the same data set.
• Replica sets provide redundancy and high availability, and are the basis
  for all production deployments.
Sharding
• Sharding is the process of storing data records across multiple machines
  and is MongoDB's approach to meeting the demands of data growth.
• With sharding, you add more machines to support data growth and the
  demands of read and write operations.
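The idea of spreading records across machines can be sketched with a toy router that assigns each document to a shard by hashing a key. This is not MongoDB's actual mechanism, which is more involved: `hashKey`, `shardFor` and `distribute` are hypothetical helpers, the hash function is an arbitrary choice, and the documents are made up.

```javascript
// Simple deterministic string hash (djb2-style). An assumption for this sketch,
// not the hash MongoDB uses.
function hashKey(key) {
  let hash = 5381;
  for (const ch of String(key)) {
    hash = (hash * 33 + ch.codePointAt(0)) >>> 0; // keep an unsigned 32-bit integer
  }
  return hash;
}

// Map a shard key value to one of shardCount shards.
function shardFor(key, shardCount) {
  return hashKey(key) % shardCount;
}

// Place each document on the shard chosen by its user_id.
function distribute(docs, shardCount) {
  const shards = Array.from({ length: shardCount }, () => []);
  for (const doc of docs) {
    shards[shardFor(doc.user_id, shardCount)].push(doc);
  }
  return shards;
}

// Hypothetical documents.
const docs = [
  { user_id: "abc123", age: 55 },
  { user_id: "bcd001", age: 45 },
  { user_id: "cde002", age: 19 },
  { user_id: "def003", age: 31 },
];

const shards = distribute(docs, 3);
console.log(shards.map((s) => s.map((d) => d.user_id)));
```

The same key always routes to the same shard, so a read for one user_id touches one machine, while writes for different keys can land on different machines.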