

Presented by:
Partha Pratim Das
5th Semester
M.Tech. in Computer Sc. And Applications

Registration no.:053792 of 2006-07
Roll no.: 97/CSA/111002
No SQL

Feb 19, 2014

1
• Introduction to NoSQL
• SQL vs. NoSQL
• Architecture of NoSQL
• ACID vs. BASE
• Examples of NoSQL databases
• NoSQL vs. SQL
• Conclusion





Database: an organized collection of interrelated data.
Database Management System (DBMS): a software package that controls the creation, maintenance, and use of a database in a convenient and efficient way.
◦ We interact with a DBMS through a structured language
◦ Examples: Oracle, IBM DB2, MS Access, MySQL, FoxPro

Relational DBMS: A relational database is a collection of data items organized as a set of formally described tables from which data can be accessed easily. A relational database is created using the relational model. The software used to manage a relational database is called a relational database management system (RDBMS).









Structured Query Language (SQL)
• Special-purpose programming language designed for managing data in a relational DBMS.
• Originally based on relational algebra and tuple relational calculus.
• SQL's scope includes data insertion, update, and deletion; schema creation and modification; and data access control.
• It is static and heavily used in databases.
• The most widely used database language.
• The query is the most important operation in SQL.
Example:
SELECT *
FROM Book
WHERE price > 100.00
ORDER BY title;
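
The query above can be tried against an in-memory SQLite database. This is only a minimal sketch: the slide shows the query but not the table, so the `Book` columns (`title`, `price`) and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory database; the schema and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (title TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Book (title, price) VALUES (?, ?)",
    [("Graph Databases", 120.0), ("SQL Basics", 80.0), ("Big Data", 150.0)],
)

# The query from the slide: books costing more than 100.00, sorted by title.
rows = conn.execute(
    "SELECT * FROM Book WHERE price > 100.00 ORDER BY title"
).fetchall()
conn.close()

print(rows)
```

The query is declarative: it states which rows are wanted and in what order, and leaves the access path to the database engine.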

Stands for Not Only SQL
• Class of non-relational data storage systems
• Usually do not require a fixed table schema or the concept of joins
• All NoSQL offerings relax one or more of the ACID properties
◦ Atomicity, Consistency, Isolation, Durability (ACID)

“NoSQL” = “Not Only SQL” = not only using a traditional relational DBMS

• Alternative to traditional relational DBMS
• Flexible schema
• Quicker/cheaper to set up
• Massive scalability
• Relaxed consistency → higher performance and availability
* No declarative query language → more programming
* Relaxed consistency → fewer guarantees

Not every problem can be solved by a traditional relational database system alone.
• Handles huge databases
• Redundancy: data is reasonably safe on commodity hardware
• Very flexible queries using map/reduce
• Rapid development (no fixed schema)
• Very fast for common use cases


• Inspired by distributed data storage problems
• Scales easily by adding servers
• Not suited to all problem types, but very well suited to certain large problem types
• High-write situations (e.g., activity tracking or timeline rendering for millions of users)
• Many uses of relational databases are very simple (e.g., fetch by primary key, then update)



Clients know how to:
◦ Send items to servers (consistent hashing)
◦ React when a server fails
◦ Fetch keys from servers
◦ Weight servers according to their capacity

Servers know how to:
◦ Store the items they receive
◦ Expire them from the cache
◦ No inter-server communication: servers are unaware of each other
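
The client-side behaviour listed above (consistent hashing, weighting, reacting to a failed server) can be sketched as a small hash ring. This is an illustrative sketch, not the code of any particular cache client: the `HashRing` class, the server names, the use of MD5, and the 100 ring points per unit of weight are all assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring (MD5 is used only to spread keys)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Client-side consistent hashing with per-server weights (hypothetical)."""

    def __init__(self, points_per_weight: int = 100):
        self.points_per_weight = points_per_weight
        self._ring = []  # sorted list of (point, server) pairs

    def add_server(self, server: str, weight: int = 1) -> None:
        # A heavier server gets more points, hence a larger share of the keys.
        for i in range(weight * self.points_per_weight):
            bisect.insort(self._ring, (_hash(f"{server}#{i}"), server))

    def remove_server(self, server: str) -> None:
        # Called by the client once it decides that a server has failed.
        self._ring = [(p, s) for (p, s) in self._ring if s != server]

    def server_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("no servers available")
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing()
for name in ("cache-a", "cache-b", "cache-c"):
    ring.add_server(name)

keys = [f"user:{n}" for n in range(1000)]
before = {k: ring.server_for(k) for k in keys}
ring.remove_server("cache-b")
after = {k: ring.server_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

# Only keys that lived on the failed server are reassigned.
print(all(before[k] == "cache-b" for k in moved))
```

Because every client computes the same ring, the servers never need to talk to each other, which matches the "no inter-server communication" point.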


• An RDBMS tries to ensure the ACID properties
• NoSQL stores typically do not guarantee full ACID, which allows them to be much faster
• We don't need ACID everywhere
• NoSQL follows the BASE properties

• Basically available: the store appears to work most of the time.
• Soft state: stores don't have to be write-consistent, nor do different replicas have to be mutually consistent all the time.
• Eventual consistency: stores exhibit consistency at some later point (e.g., lazily at read time).
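
The idea of becoming consistent "lazily at read time" can be illustrated with a toy store that acknowledges a write after updating one replica and repairs the others on read. This is a sketch under stated assumptions: the `Replica` and `Store` classes, the single logical clock, and the read-repair strategy are illustrative and do not describe any specific product.

```python
class Replica:
    """One copy of the data; keeps a (version, value) pair per key."""

    def __init__(self):
        self.data = {}

    def put(self, key, version, value):
        current = self.data.get(key)
        # Keep only the newest version seen so far.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def get(self, key):
        return self.data.get(key)


class Store:
    """Toy eventually consistent store (hypothetical, single process)."""

    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.clock = 0

    def write(self, key, value):
        # Soft state: the write is acknowledged after updating one replica.
        self.clock += 1
        self.replicas[0].put(key, self.clock, value)

    def read(self, key):
        # Read repair: find the newest version and push it to stale replicas.
        found = [r.get(key) for r in self.replicas]
        versions = [pair for pair in found if pair is not None]
        if not versions:
            return None
        newest = max(versions, key=lambda pair: pair[0])
        for replica in self.replicas:
            replica.put(key, *newest)
        return newest[1]


store = Store()
store.write("cart", ["book"])
stale = store.replicas[2].get("cart")   # replica 2 has not seen the write yet
value = store.read("cart")              # the read triggers the repair
converged = all(r.get("cart") == (1, ["book"]) for r in store.replicas)
print(stale, value, converged)
```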




Simple web application with not much traffic

◦ Application server, database server all on one machine

More traffic comes in
• Application server
• Database server

Even more traffic comes in
• Load balancer
• Application server ×2
• Database server

Even more traffic comes in
• Load balancer ×N: easy
• Application server ×N: easy
• Database server ×N: hard for SQL databases

SQL Slowdown

Not linear!

NoSQL Scaling
Need more storage?
• Add more servers!
Need higher performance?
• Add more servers!
Need better reliability?
• Add more servers!



You can scale SQL databases (Oracle, MySQL, SQL Server…)
◦ This will cost you dearly
◦ If you don't have a lot of money, you will reach limits quickly

You can scale NoSQL databases
◦ Very easy horizontal scaling
◦ Lots of open-source solutions
◦ Scaling is one of the basic design goals, so it is handled well
◦ Scaling forces trade-offs, such as having to use map/reduce for queries

• Almost infinite horizontal scaling
• Very fast
• Performance doesn't deteriorate (much) with growth
• No fixed table schemas
• No join operations
• Ad-hoc queries difficult or impossible
• Structured storage
• Almost everything happens in RAM


• Key-Value Stores
• Column Family
• Document Databases
• Graph Databases







• Lineage: Amazon's Dynamo paper and distributed hash tables.
• Data model: a global collection of key-value pairs
• Example systems
◦ Google BigTable, Amazon Dynamo, Cassandra, Voldemort, HBase, etc.
• Implementation: efficiency, scalability, fault tolerance, load balancing
◦ Records distributed to nodes based on key
◦ Replication (R = 2F + 1, where F is the number of faults to be tolerated)
◦ Single-record transactions, “eventual consistency”
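
The replication formula can be checked with a little arithmetic. The slide does not say why 2F + 1 replicas are used; the sketch below assumes the common reading that a strict majority of replicas must still be reachable after F failures. The function names are hypothetical.

```python
def replicas_needed(faults: int) -> int:
    """Replication factor from the slide: R = 2F + 1."""
    return 2 * faults + 1


def majority(replicas: int) -> int:
    """Smallest group that is more than half of the replicas."""
    return replicas // 2 + 1


table = []
for f in range(1, 4):
    r = replicas_needed(f)
    survivors = r - f  # replicas still up after F failures
    table.append((f, r, survivors >= majority(r)))

print(table)
```

With F = 1 this gives three replicas, of which two survive a single failure and still form a majority.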

• Lineage: inspired by Lotus Notes.
• Data model: collections of documents, where each document is itself a collection of key-value pairs.
• Examples: CouchDB, MongoDB, Riak
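
The document data model described above can be pictured with plain dictionaries. This is a minimal sketch, not the API of CouchDB or MongoDB: the `find` helper, the field names, and the sample documents are assumptions for illustration.

```python
import json

# A "collection" of documents; note that the two documents have different fields.
collection = [
    {"_id": 1, "name": "Alice", "email": "alice@example.com"},
    {"_id": 2, "name": "Bob", "tags": ["admin"], "address": {"city": "Kolkata"}},
]


def find(docs, **criteria):
    """Return documents whose top-level fields match all the given criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


matches = find(collection, name="Bob")
# Documents map naturally to JSON, which document stores commonly exchange.
serialized = json.dumps(matches[0], sort_keys=True)
print(serialized)
```

No schema was declared up front: the second document simply carries extra, nested fields.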


Basic Building Blocks of Column Family Storage

• Lineage: draws from Euler and graph theory.
• Data model: nodes and relationships, both of which can hold key-value pairs
• Examples: AllegroGraph, InfoGrid, Neo4j


Property Graph:

• It contains nodes and relationships
• Nodes contain properties (key-value pairs)
• Relationships are named and directed, and always
have a start and end node
• Relationships can also contain properties
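
The four rules above translate into a small data structure. This is a sketch only, not the API of Neo4j or any other graph database: the `Node`, `Relationship`, and `PropertyGraph` names and the sample data are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)  # key-value pairs


@dataclass
class Relationship:
    name: str            # relationships are named...
    start: Node          # ...and directed: always a start node
    end: Node            # and an end node
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, **properties):
        node = Node(node_id, properties)
        self.nodes[node_id] = node
        return node

    def relate(self, start, name, end, **properties):
        rel = Relationship(name, start, end, properties)
        self.relationships.append(rel)
        return rel

    def outgoing(self, node, name):
        """Follow relationships of a given name in their direction only."""
        return [r.end for r in self.relationships
                if r.start is node and r.name == name]


graph = PropertyGraph()
alice = graph.add_node(1, name="Alice")
bob = graph.add_node(2, name="Bob")
knows = graph.relate(alice, "KNOWS", bob, since=2010)

friends = [n.properties["name"] for n in graph.outgoing(alice, "KNOWS")]
print(friends, knows.properties)
```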





MapReduce: Google's framework for processing highly distributable problems across huge datasets using a large number of computers

Let's define "large number of computers"
◦ Cluster if all of them have the same hardware
◦ Grid otherwise (if !Cluster, for old-style programmers)

Process split into two phases
◦ Map
• Take the input, partition it, and delegate the parts to other machines
• Other machines can repeat the process, leading to a tree structure
• Each machine returns its results to the machine that gave it the task

◦ Reduce
• Collect the results from the machines you gave tasks to
• Combine the results and return them to the requester

◦ Slower than sequential data processing, but massively parallel
◦ Can sort a petabyte of data in a few hours
◦ Input, Map, Shuffle, Reduce, Output
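
The Input, Map, Shuffle, Reduce, Output sequence can be followed in a single-process word count. This is a sketch of the programming model only, not of Google's or Hadoop's implementation; in a real deployment each phase runs on many machines. The function names and the sample documents are assumptions.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


# Input: each string stands in for a split handled by a different machine.
documents = ["NoSQL scales out", "SQL scales up", "NoSQL is not only SQL"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

# Output
print(counts)
```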

• Hadoop / HBase
• MemcacheDB
• Cassandra
• Voldemort
• Amazon SimpleDB
• Hypertable
• MongoDB
• CouchDB
• Redis
• Cloudata
• IBM Lotus/Domino



Cassandra
◦ Facebook (original developer, used it until late 2010)
◦ Twitter
◦ Digg
◦ Reddit
◦ Rackspace
◦ Cisco

BigTable
◦ Google (open-source counterpart is HBase)

MongoDB
◦ Foursquare
◦ Craigslist
◦ Bit.ly
◦ SourceForge
◦ GitHub

Cassandra
• Written in: Java
• Protocol: custom, binary (Thrift)
• Tunable trade-offs for distribution and replication (N, R, W)
• Querying by column or by range of keys
• BigTable-like features: columns, column families
• Writes are much faster than reads (!)
◦ Constant write time regardless of database size
• Map/reduce possible with Apache Hadoop
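
The (N, R, W) trade-off can be made concrete. The slide gives only the letters, so the sketch below assumes the usual Dynamo-style meaning: N replicas per record, R replicas consulted on a read, and W replicas that must acknowledge a write. Under that assumption a read is guaranteed to touch at least one replica holding the latest write exactly when R + W > N; the brute-force check confirms the rule for small N. The function names are hypothetical.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Rule of thumb: read and write sets must intersect when R + W > N."""
    return r + w > n


def always_intersect(n: int, r: int, w: int) -> bool:
    """Brute force: does every possible read set share a node with every write set?"""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


n = 3
agreement = all(
    quorums_overlap(n, r, w) == always_intersect(n, r, w)
    for r in range(1, n + 1)
    for w in range(1, n + 1)
)
print(agreement)
print(quorums_overlap(3, 2, 2), quorums_overlap(3, 1, 1))
```

Choosing small R and W favours latency and availability; choosing them so that R + W > N favours consistency.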









• Cassandra is an open-source DBMS from the Apache Software Foundation.
• Cassandra provides a structured key-value store with tunable consistency.
• Cassandra is a distributed storage system for managing structured data that is designed to scale to a very large size across many commodity servers, with no single point of failure.
• It is a NoSQL solution that was initially developed by Facebook and powered their Inbox Search feature until late 2010.














HBase
• Written in: Java
• Main point: billions of rows × millions of columns
• Modeled after BigTable
• Map/reduce with Hadoop
• Query predicate push-down via server-side scan and get filters
• Optimizations for real-time queries
• A high-performance Thrift gateway
• HTTP interface supports XML, Protobuf, and binary
• Cascading, Hive, and Pig source and sink modules
• No single point of failure
• While Hadoop streams data efficiently, it has overhead for starting map/reduce jobs. HBase is a column-oriented key/value store and allows low-latency reads and writes.
• Random access performance is comparable to MySQL
















CouchDB
Written in: Erlang
Main point: DB consistency, ease of use
Bi-directional (!) replication, continuous or ad-hoc, with
conflict detection, thus, master-master replication. (!)
MVCC - write operations do not block reads
Previous versions of documents are available
Crash-only (reliable) design
Needs compacting from time to time
Views: embedded map/reduce
Formatting views: lists & shows
Server-side document validation possible
Authentication possible
Real-time updates via _changes (!)
Attachment handling
CouchApps (standalone JS apps)







Hadoop: an Apache project
A framework that allows for the distributed processing of large
data sets across clusters of computers
Designed to scale up from single servers to thousands of
machines
Designed to detect and handle failures at the application layer,
instead of relying on hardware for it
Created by Doug Cutting, who named it after his son's toy
elephant
Hadoop-related Apache projects
◦ Cassandra
◦ HBase
◦ Pig
◦ Hive (formerly a Hadoop subproject, now a top-level Apache project)











Scales to hundreds or thousands of computers, each with
several processor cores
Designed to efficiently distribute large amounts of work across
a set of machines
Hundreds of gigabytes of data constitute the low end of
Hadoop-scale
Built to process "web-scale" data on the order of hundreds of
gigabytes to terabytes or petabytes
Uses Java, but allows streaming so other languages can
easily send and accept data items to/from Hadoop



Uses a distributed file system (HDFS)
◦ Designed to hold very large amounts of data (terabytes or even petabytes)
◦ Files are stored redundantly across multiple machines to ensure durability against failure and high availability to highly parallel applications
◦ Data is organized into directories and files
◦ Files are divided into blocks (64 MB by default) and distributed across nodes

The design of HDFS is based on the design of the Google File System
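
The block arithmetic can be worked through for one file. This is a simplified sketch, not the HDFS placement policy: the 64 MB block size comes from the slide, while the replication factor of 3, the round-robin placement, the node names, and the function names are assumptions for illustration.

```python
import math

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB default block size, as stated on the slide


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block a file is divided into."""
    count = math.ceil(file_size / block_size)
    return [min(block_size, file_size - i * block_size) for i in range(count)]


def place_blocks(block_count, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (simple round-robin)."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block: [nodes[(block + k) % len(nodes)] for k in range(replication)]
        for block in range(block_count)
    }


blocks = split_into_blocks(200 * 1024 * 1024)  # a 200 MB file
placement = place_blocks(len(blocks), ["node1", "node2", "node3", "node4"])

print(len(blocks), blocks[-1] // (1024 * 1024))
print(placement[0])
```

A 200 MB file becomes three full 64 MB blocks plus one 8 MB block, and losing any single node still leaves two copies of every block.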

Hive: a petabyte-scale data warehouse system for Hadoop
• Easy data summarization and ad-hoc queries
• Query the data using a SQL-like language called HiveQL
• The Hive compiler generates map-reduce jobs for most queries


• NoSQL is a great problem solver if you need it
• Choose your NoSQL platform carefully, as each is designed for a specific purpose
• Get used to map/reduce
• It's not a sin to use NoSQL alongside a (yes)SQL database




THANK
YOU..!!

NoSQL Seminer

  • 1.  Presented by: Partha Pratim Das 5th Semester M.Tech. in Computer Sc. And Applications Registration no.:053792 of 2006-07 Roll no.: 97/CSA/111002 No SQL Feb 19, 2014 1
  • 2. Introduction to NOSQL  SQL v/s NoSQL  Architecture of NoSQL  ACID v/s BASE  Examples of NOSQL databases  NOSQL vs SQL  Conclusion  No SQL Feb 19, 2014 2
  • 3.   Database – is a organized collection of inter-related data. Data base Management System (DBMS)- is a software package with computer program that controls the creation , maintenance & use of a database in a convenient and efficient way. ◦ for DBMS , we use structured language to interact with it ◦ Ex. Oracle , IBM DB2 , Ms Access , MySQL , FoxPro etc.  Relational DBMS - A relational database is a collection of data items organized as a set of formally described tables from which data can be accessed easily. A relational database is created using the relational model. The software used in a relational database is called a relational database management system (RDBMS). No SQL Feb 19, 2014 3
  • 4.         Structured Query Language Special purpose programming language designed for managing data in Relational DBMS. Originaly based upon relational algebra & tuple relation calculus. SQl’s scope include data insertion, updation & deletion, schema creation and modification , data access control. It is static and strongly used in database. Most widely used database language. Query is the most important operation in SQL. Ex. SELECT * FROM Book WHERE price > 100.00 ORDER BY title; No SQL Feb 19, 2014 4
  • 5. Stands for Not Only SQL  Class of non-relational data storage systems  Usually do not require a fixed table schema or the concept of joins  All NOSQL offerings relax one or more of the ACID properties .  ◦ Atomicity , Consistency , Isolation , Durability ( ACID )  “NOSQL” = “Not Only SQL” = Not Only using traditional relational DBMS No SQL Feb 19, 2014 5
  • 6. Advantages: alternative to traditional relational DBMS; flexible schema; quicker/cheaper to set up; massive scalability; relaxed consistency → higher performance and availability. Drawbacks: no declarative query language → more programming; relaxed consistency → fewer guarantees.
  • 7. Not every problem can be solved by a traditional relational database system alone. NoSQL handles huge databases; redundancy keeps data fairly safe on commodity hardware; very flexible queries using map/reduce; rapid development (no fixed schema); very fast for common use cases.
  • 8. Inspired by distributed data storage problems. Scales easily by adding servers. Not suited to all problem types, but very well suited to certain large problem types, such as high-write situations (e.g. activity tracking or timeline rendering for millions of users). Many relational use cases are really very simple operations (e.g. fetch by primary key with update).
  • 10. Clients know how to: send items to servers (consistent hashing); react when a server fails; fetch keys from servers; weight the distribution by server capacity. Servers know how to: store the items they receive; expire them from the cache. There is no inter-server communication: each server is unaware of the others.
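The client-side logic described above can be sketched as a small consistent-hash ring. This is an illustrative sketch, not the code of any particular client library: the `HashRing` class, the server names, the use of MD5, and the 100 virtual points per server are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Hypothetical client-side ring: maps keys to servers, no server coordination."""

    def __init__(self, servers, points_per_server=100):
        self.points_per_server = points_per_server
        self._ring = []  # sorted list of (hash position, server name)
        for server in servers:
            self.add(server)

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def add(self, server, weight=1):
        # A higher weight places more points on the ring, so the server gets more keys.
        for i in range(self.points_per_server * weight):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server):
        # Called by the client when it detects that a server has failed.
        self._ring = [point for point in self._ring if point[1] != server]

    def lookup(self, key):
        if not self._ring:
            raise LookupError("no servers available")
        # First point clockwise from the key's position, wrapping around the ring.
        index = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.lookup("user:42"))
ring.remove("cache-b")  # only keys that lived on cache-b move elsewhere
print(ring.lookup("user:42"))
```

The point of the design is that removing or adding a server only remaps the keys that belonged to that server, which is why the servers themselves never need to talk to each other.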
  • 11. An RDBMS tries to ensure the ACID properties. NoSQL stores do not guarantee full ACID, which is a large part of why they can be much faster. We don't need ACID everywhere. NoSQL follows the BASE properties.
  • 12. Basic availability: the store appears to work most of the time. Soft state: stores don't have to be write-consistent, nor do different replicas have to be mutually consistent all the time. Eventual consistency: stores exhibit consistency at some later point (e.g., lazily at read time).
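"Lazily at read time" can be illustrated with a toy read-repair scheme. The sketch below is a single-process simulation under strong simplifying assumptions: the `Store` class is hypothetical, replicas are plain dictionaries, and a global counter stands in for the version tracking a real distributed store would need.

```python
class Store:
    """Toy replicated store: a write lands on one replica, reads repair the rest."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]  # key -> (version, value)
        self.clock = 0  # assumption: one logical clock instead of real versioning

    def write(self, key, value, replica_index=0):
        # Soft state: only one replica sees the write immediately.
        self.clock += 1
        self.replicas[replica_index][key] = (self.clock, value)

    def read_local(self, key, replica_index):
        # Reads a single replica and may therefore return stale or missing data.
        entry = self.replicas[replica_index].get(key)
        return entry[1] if entry else None

    def read_repair(self, key):
        # Collect all known versions, pick the newest, and copy it everywhere.
        entries = [replica[key] for replica in self.replicas if key in replica]
        if not entries:
            return None
        newest = max(entries, key=lambda entry: entry[0])
        for replica in self.replicas:
            replica[key] = newest
        return newest[1]


store = Store()
store.write("cart:7", ["book"])
print(store.read_local("cart:7", 1))  # replica 1 has not seen the write yet
print(store.read_repair("cart:7"))    # the read brings all replicas up to date
print(store.read_local("cart:7", 1))
```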
  • 13. Simple web application with not much traffic: application server and database server all on one machine.
  • 14. More traffic comes in: application server; database server. Even more traffic comes in: load balancer; application server ×2; database server.
  • 15. Even more traffic comes in: load balancer × N (easy); application server × N (easy); database server × N (hard for SQL databases).
  • 16. SQL slowdown: not linear!
  • 17. NoSQL scaling. Need more storage? Add more servers! Need higher performance? Add more servers! Need better reliability? Add more servers!
  • 18. You can scale SQL databases (Oracle, MySQL, SQL Server…): this will cost you dearly; if you don't have a lot of money, you will reach limits quickly. You can scale NoSQL databases: very easy horizontal scaling; lots of open-source solutions; scaling is one of the basic design incentives, so it is well handled; scaling is also the source of the trade-offs that force you to use map/reduce.
  • 19. Almost infinite horizontal scaling; very fast; performance doesn't deteriorate (much) with growth; no fixed table schemas; no join operations; ad-hoc queries difficult or impossible; structured storage; almost everything happens in RAM.
  • 21. Key-value stores; column-family stores; document databases; graph databases.
  • 23. Lineage: Amazon's Dynamo paper and distributed hash tables. Data model: a global collection of key-value pairs. Example systems: Google BigTable, Amazon Dynamo, Cassandra, Voldemort, HBase, etc. Implementation: efficiency, scalability, fault tolerance, load balancing; records distributed to nodes based on key; replication (R = 2F + 1, where F stands for fault tolerance); single-record transactions, "eventual consistency".
  • 24. Lineage: inspired by Lotus Notes. Data model: collections of documents, where each document is itself a collection of key-value pairs. Examples: CouchDB, MongoDB, Riak.
  • 27. Lineage: draws from Euler and graph theory. Data model: nodes and relationships, both of which can hold key-value pairs. Examples: AllegroGraph, InfoGrid, Neo4j.
  • 28. Property graph: it contains nodes and relationships; nodes contain properties (key-value pairs); relationships are named and directed, and always have a start and end node; relationships can also contain properties.
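The property graph model maps naturally onto two small record types. The sketch below is a plain in-memory illustration, not the API of any graph database; the `Node` and `Relationship` classes, the `outgoing` helper, and the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    properties: dict = field(default_factory=dict)  # key-value pairs


@dataclass
class Relationship:
    name: str    # relationships are named...
    start: Node  # ...and directed: every one has a start node
    end: Node    # and an end node
    properties: dict = field(default_factory=dict)  # relationships hold properties too


def outgoing(node, relationships, name):
    """Follow relationships called `name` that start at `node`."""
    return [r.end for r in relationships if r.start is node and r.name == name]


alice = Node(1, {"name": "Alice"})
bob = Node(2, {"name": "Bob"})
knows = Relationship("KNOWS", alice, bob, {"since": 2010})

friends = outgoing(alice, [knows], "KNOWS")
print([n.properties["name"] for n in friends])
```

Because the relationship is directed, traversing `KNOWS` from `bob` returns nothing unless a second relationship is added in the other direction.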
  • 31. Map/reduce: Google's framework for processing highly distributable problems across huge datasets using a large number of computers. A "large number of computers" is a cluster if all of them have the same hardware, and a grid otherwise. The process is split into two phases. Map: take the input, partition it, and delegate the parts to other machines; other machines can repeat the process, leading to a tree structure; each machine returns its results to the machine that gave it the task.
  • 32. Reduce: collect the results from the machines you gave tasks to; combine the results and return them to the requester. Slower than sequential data processing, but massively parallel. Can sort a petabyte of data in a few hours. Stages: input, map, shuffle, reduce, output.
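The input, map, shuffle, reduce, output stages can be illustrated with the classic word-count example. This is a single-process sketch: in a real framework each chunk would be mapped on a different machine, and the function names and sample text here are assumptions for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    # Map: turn one partition of the input into (key, value) pairs.
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    # Shuffle: group all values that share a key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine the grouped values into one result per key.
    return key, sum(values)


def word_count(chunks):
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


# Input: two partitions that could be handed to two different machines.
counts = word_count(["no sql no joins", "sql and no schema"])
print(counts)
```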
  • 34. Cassandra: Facebook (original developer, used it until late 2010), Twitter, Digg, Reddit, Rackspace, Cisco. BigTable: Google (open-source version is HBase). MongoDB: Foursquare, Craigslist, Bit.ly, SourceForge, GitHub.
  • 35. Cassandra. Written in: Java. Protocol: custom, binary (Thrift). Tunable trade-offs for distribution and replication (N, R, W). Querying by column or by range of keys. BigTable-like features: columns, column families. Writes are much faster than reads (!): constant write time regardless of database size. Map/reduce possible with Apache Hadoop.
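The (N, R, W) trade-off is usually read as follows in Dynamo-style stores: N is the number of replicas of each item, W the number of replicas that must acknowledge a write, and R the number consulted on a read. When R + W > N, every read set overlaps every write set, so a read sees the latest acknowledged write. The helper below is only a sketch of that arithmetic; the function name is hypothetical and it is not part of any Cassandra API.

```python
def quorums_overlap(n, r, w):
    """Return True if every read quorum must intersect every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


# N=3 with quorum reads and writes: overlapping, so reads see the latest write.
print(quorums_overlap(3, 2, 2))
# N=3 with R=1, W=1: fastest setting, but a read may miss a recent write.
print(quorums_overlap(3, 1, 1))
```

Lower R and W give faster, more available operations at the price of weaker consistency, which is the tuning knob the slide refers to.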
  • 36. Cassandra is an open-source DBMS from the Apache Software Foundation. It provides a structured key-value store with tunable consistency. Cassandra is a distributed storage system for managing structured data that is designed to scale to a very large size across many commodity servers, with no single point of failure. It is a NoSQL solution that was initially developed by Facebook and powered their Inbox Search feature until late 2010.
  • 37. HBase. Written in: Java. Main point: billions of rows × millions of columns. Modeled after BigTable. Map/reduce with Hadoop. Query predicate push-down via server-side scan and get filters. Optimizations for real-time queries. A high-performance Thrift gateway. HTTP supports XML, Protobuf, and binary. Cascading, Hive, and Pig source and sink modules. No single point of failure. While Hadoop streams data efficiently, it has overhead for starting map/reduce jobs. HBase is a column-oriented key/value store and allows low-latency reads and writes. Random access performance is like MySQL.
  • 38. CouchDB. Written in: Erlang. Main point: DB consistency, ease of use. Bi-directional (!) replication, continuous or ad hoc, with conflict detection; thus, master-master replication (!). MVCC: write operations do not block reads. Previous versions of documents are available. Crash-only (reliable) design. Needs compacting from time to time. Views: embedded map/reduce. Formatting views: lists and shows. Server-side document validation possible. Authentication possible. Real-time updates via _changes (!). Attachment handling. CouchApps (standalone JS apps).
  • 39. Hadoop: an Apache project. A framework that allows for the distributed processing of large data sets across clusters of computers. Designed to scale up from single servers to thousands of machines. Designed to detect and handle failures at the application layer, instead of relying on hardware for it. Created by Doug Cutting, who named it after his son's toy elephant. Hadoop-related Apache projects: Cassandra, HBase, Pig. Hive was a Hadoop subproject, but is now a top-level Apache project.
  • 40. Hadoop scales to hundreds or thousands of computers, each with several processor cores. Designed to efficiently distribute large amounts of work across a set of machines. Hundreds of gigabytes of data constitute the low end of Hadoop scale. Built to process "web-scale" data, on the order of hundreds of gigabytes to terabytes or petabytes. Uses Java, but allows streaming so that other languages can easily send and accept data items to and from Hadoop.
  • 41. Uses a distributed file system (HDFS): designed to hold very large amounts of data (terabytes or even petabytes); files are stored redundantly across multiple machines to ensure durability under failure and high availability to highly parallel applications; data is organized into directories and files; files are divided into blocks (64 MB by default) and distributed across nodes. The design of HDFS is based on the design of the Google File System.
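As a quick worked example of the block layout, the sketch below counts how many blocks a file occupies with the 64 MB block size quoted above. The replication factor of 3 is an assumption that does not appear on the slide (it is the usual HDFS default), and the function name is hypothetical.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the default block size quoted on the slide
REPLICATION = 3                # assumption: typical HDFS default, not from the slide


def block_count(file_size_bytes, block_size=BLOCK_SIZE):
    """Number of blocks needed for a file; the last block may be partly filled."""
    return -(-file_size_bytes // block_size)  # integer ceiling division


one_gib = 1024 ** 3
blocks = block_count(one_gib)
print(blocks)                # blocks for a 1 GiB file
print(blocks * REPLICATION)  # block replicas spread across the cluster
```

A 1 GiB file therefore becomes 16 blocks, and with three copies of each, 48 block replicas that HDFS can place on different nodes.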
  • 42. Hive: a petabyte-scale data warehouse system for Hadoop. Easy data summarization and ad-hoc queries. Query the data using a SQL-like language called HiveQL. The Hive compiler generates map/reduce jobs for most queries.
  • 43. NoSQL is a great problem solver if you need it. Choose your NoSQL platform carefully, as each is designed for a specific purpose. Get used to map/reduce. It's not a sin to use NoSQL alongside a (yes)SQL database.
  • 44. References: Ian Robinson, Jim Webber, and Emil Eifrem, Graph Databases; http://www.mongodb.com/learn/nosql; http://www.couchbase.com/nosql-database; http://en.wikipedia.org/wiki/Apache_Cassandra; http://en.wikipedia.org/wiki/SQL; http://en.wikipedia.org/wiki/NoSQL; www.slideshare.com