rICh Morrow, quicloud LLC
 This talk is essentially the first couple
chapters of “NoSQL Distilled” (Sadalage,
Fowler)
 Highly recommend this book!
 App development productivity
 Fixes “impedance mismatch”
 Large scale
 Happily handles the “three Vs” of “big data”
▪ Volume
▪ Velocity
▪ Variety
You’ve always needed a “backing store”
 …could be files
 great for a single user or application
 …could be databases
 great for multiple users/applications
 …and on the DB side, could be:
 Application Database (used by single app)
 Integration Database (used by several apps)
 Concurrency
 Simple problem, very tough to solve
 Application Datastores
 One app, many users
 Integration Datastores
 One set of data, many apps, lots of potential for
headbanging
{
"id": "1001",
"firstName": "Ann",
"lastName": "Williams",
"age": 55,
"purchasedItems":
{
"0321290533": {qty, price… },
"0321601912": {qty, price… },
"0131495054": {qty, price… }
},
"paymentDetails":
{ cc info… },
"address":
{
"street": "1234 Park",
"city": "San Francisco",
"state": "CA",
"zip": "94102"
}
}
1 object = 10, 20, 100? tables. Ugh…
Your code has one structure, but your RDBMS stores it in another…
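To make the mismatch concrete, here is a minimal Python sketch of splitting one in-memory object into rows for separate tables. The three table names, the row layouts, and the quantities are assumptions made up for illustration; a real schema would likely need many more tables.

```python
# Illustrative only: table names, row layouts, and qty values are made up.
customer = {
    "id": "1001",
    "firstName": "Ann",
    "lastName": "Williams",
    "purchasedItems": {
        "0321290533": {"qty": 1},
        "0321601912": {"qty": 2},
    },
    "address": {"city": "San Francisco", "state": "CA"},
}


def to_rows(agg):
    """Split one aggregate into rows for three hypothetical tables."""
    return {
        "customers": [(agg["id"], agg["firstName"], agg["lastName"])],
        "addresses": [(agg["id"], agg["address"]["city"], agg["address"]["state"])],
        "order_items": [
            (agg["id"], isbn, item["qty"])
            for isbn, item in agg["purchasedItems"].items()
        ],
    }


rows = to_rows(customer)
for table, table_rows in rows.items():
    print(table, table_rows)
```

Reading the object back means joining those tables again, which is the translation work an ORM layer ends up doing.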
A great "all purpose" storage + query tool
 ACID compliant
 Supports many users
 Supports many apps
 3NF stores data efficiently
 Disk wasn't always cheap
 Fast and tunable
 Introduced a common interface (SQL)
 Which every vendor then quickly “broke”
 Impedance mismatch
 Many teams build (then have to maintain) custom
ORM or SOA proxies
 Weren't built to be distributed
 Google, Amazon, et al. hit hard walls on RDBMS
capabilities
 Often required expensive, proprietary hardware
 Ooops, I sharded myself!
 Additional complexity
 Cross shard joins now extremely expensive
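A rough Python sketch of why sharding adds complexity. The shard count, the helper names (`shard_for`, `put`, `get`, `find_by_city`), and the sample records are all assumptions for illustration, not taken from any particular database. A lookup by the shard key touches one shard; a query on anything else has to visit every shard.

```python
import hashlib

NUM_SHARDS = 4  # assumed cluster size, for illustration only
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(key: str) -> int:
    """Pick a shard from a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def put(key, value):
    shards[shard_for(key)][key] = value


def get(key):
    """Lookup by shard key: exactly one shard is consulted."""
    return shards[shard_for(key)].get(key)


def find_by_city(city):
    """Query on a non-key field: every shard must be scanned."""
    return sorted(
        key
        for shard in shards
        for key, value in shard.items()
        if value["city"] == city
    )


put("1001", {"city": "San Francisco"})
put("1002", {"city": "Denver"})
put("1003", {"city": "San Francisco"})
print(get("1002"))
print(find_by_city("San Francisco"))
```

A join across shards has the same shape as `find_by_city`, only with network hops between machines, which is where the expense comes from.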
 Velocity
 Faster responses required
 Volume
 100s of TB, even PB, now common
 “Web Scale” can mean 100s of thousands of
concurrent transactions
 Both of those increasing rapidly
 Variety
 Mixed structure, semi-structured, unstructured
 Bigtable paper (by Google)
 Heavily influenced the “Columnar” branch of NoSQL
 Dynamo paper (by Amazon)
 Heavily influenced the “Key-Value” branch of NoSQL
 The Dynamo paper is NOT the same thing as DynamoDB!!!
Design considerations:
 Distributed from the start
 Clusters of inexpensive commodity hardware are cheaper &
more fault tolerant at scale
 Relaxed and/or tunable C&A (from CAP theorem)
 Deal with unheard-of volume & velocity
 Schemaless (bye bye impedance mismatch)
 Consistency
 How consistent the data looks to 2 or more
viewers
 “Eventual” consistency possible (and common)!
 Availability
 Responsiveness of the system
 Partition Tolerance
 How well does the system respond to partition
failures?
 This is normally “untunable”, unlike the C&A
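One common way the consistency dial gets tuned in Dynamo-style stores is a read/write quorum: with N replicas, a write waits for W of them and a read asks R of them, and reads are guaranteed to see the latest write only when R + W > N. The sketch below is a toy model under that assumption; the `Replica` class and the worst-case choice of which replicas get contacted are illustrative, not any product's implementation.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None


def write(replicas, w, value, version):
    """Apply the write to the first `w` replicas only; the rest lag behind."""
    for replica in replicas[:w]:
        replica.version = version
        replica.value = value


def read(replicas, r):
    """Ask the last `r` replicas (worst case) and return the newest value seen."""
    contacted = replicas[-r:]
    newest = max(contacted, key=lambda rep: rep.version)
    return newest.value


N = 3
replicas = [Replica() for _ in range(N)]
write(replicas, w=1, value="blue", version=1)

stale = read(replicas, r=1)  # R + W = 2, not greater than N: the read can miss the write
fresh = read(replicas, r=3)  # R + W = 4, greater than N: the read must overlap the write
print(stale, fresh)
```

Lower R and W give faster, more available operations at the price of possibly stale reads, which is the "eventual" consistency mentioned above.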
 Because “Cloud” and “Big Data” were just not
confusing enough people in IT
 "Not ONLY SQL" - incredibly unfortunate
"little o"
 Name born out of a Bay Area meetup in 2009
 …and regretted / derided ever since
Fancy term for “multiple datastores”
 ...you're already doing it
 Browser side cache
 Memcache
 Query cache
 OLAP systems
 ...just add NoSQL
 Tell your RDBMS not to worry – it will (probably)
still live a long, happy life
 Generally Open Source
 Schemaless
 Easily change schema or do 'schema on read'
 Cluster-oriented
 With the exception of Graph DBs
 Generally favor "Web Scale" over ACID
 Generally better for APPLICATION Databases
 Aggregate data models
 Let you treat a group of data as a unit
 Again, graph DBs are an exception here…
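A small Python sketch of 'schema on read': records of different shapes sit side by side in the store, and the reader decides how to interpret them. The field names (including the older `first_name` spelling) and the second record are made up for illustration.

```python
# Two stored records with different shapes; nothing enforced a schema on write.
records = [
    {"id": "1001", "firstName": "Ann", "age": 55},
    {"id": "1002", "first_name": "Bob"},  # older field name, no age stored
]


def read_customer(raw):
    """Interpret a stored record at read time, tolerating older shapes."""
    return {
        "id": raw["id"],
        "firstName": raw.get("firstName", raw.get("first_name")),
        "age": raw.get("age"),  # None when the writer never stored it
    }


customers = [read_customer(r) for r in records]
print(customers)
```

The schema has not disappeared; it has moved out of the database and into the application code that reads the data.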
 Key-Value
 Fast lookup on a single “hashed” key
 Document
 Each “Document” self-defines its own structure
 Columnar (or Column-Family)
 Great for “sparse” data (millions of columns)
 Graph [bit of a black sheep in the NoSQL family]
 Specialized to crawl graph relations like social
networks, resource flows, etc.
 Less popular at the moment, but gaining steam fast
 Can only look up by (normally a single) Key
 Extremely fast for that key
 Value can be anything
 Example: DynamoDB, Riak
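A toy key-value store in Python, as a sketch of the access pattern only. The `KeyValueStore` class and the key names are hypothetical and do not reflect the DynamoDB or Riak APIs. The store treats every value as opaque bytes, so the only question it can answer is "what is stored under this key?".

```python
import json


class KeyValueStore:
    """In-memory stand-in: values are opaque bytes, lookup is by key only."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str):
        return self._data.get(key)


store = KeyValueStore()
# The value can be anything: serialized JSON, an image, a pickled object...
store.put("session:42", json.dumps({"user": "ann", "cart": ["0321290533"]}).encode("utf-8"))
store.put("avatar:ann", b"\x89PNG placeholder bytes")

session = json.loads(store.get("session:42"))
print(session["user"])
```

Finding "all sessions for user ann" is not possible here without scanning every value, which is the trade made for very fast single-key lookups.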
 Document can contain anything
 JSON is extremely popular
 But can also be XML, CSV, semi-structured,
unstructured, custom… literally anything
 Can query on aggregates inside of document
 Can even index on aggregates
 Can retrieve part of the document
 Extremely memory intensive
 Example: MongoDB, CouchDB
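The three capabilities listed above (query inside the document, index on a nested field, retrieve part of a document) can be sketched in plain Python. The helper names `get_path`, `find`, `build_index`, and `project`, the dotted-path convention, and the second sample document are assumptions for illustration, not the MongoDB or CouchDB API.

```python
from collections import defaultdict

docs = {
    "1001": {"firstName": "Ann", "address": {"city": "San Francisco", "state": "CA"}},
    "1002": {"firstName": "Bob", "nickname": "bobby",
             "address": {"city": "Denver", "state": "CO"}},
}


def get_path(doc, path):
    """Follow a dotted path such as 'address.city'; return None if absent."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(path, value):
    """Query on a field inside the document."""
    return [doc_id for doc_id, doc in docs.items() if get_path(doc, path) == value]


def build_index(path):
    """Index on a nested field: value -> list of document ids."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        index[get_path(doc, path)].append(doc_id)
    return index


def project(doc_id, paths):
    """Retrieve only part of a document."""
    return {path: get_path(docs[doc_id], path) for path in paths}


print(find("address.state", "CA"))
print(build_index("address.city")["Denver"])
print(project("1002", ["firstName", "nickname"]))
```

Note that the two documents do not share the same fields, and nothing in the store objects to that.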
 Great for “sparse” data (populated columns vary
greatly between rows)
 Group columns into families
 Think of it as a “two level” aggregate
 First level “key” is rowID or aggregate of interest
 2nd level values are the columns
 You can visualize the data as row or column-
oriented
 Example: HBase, Cassandra
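The "two level" aggregate can be pictured as a nested map: row key, then column family, then columns. This Python sketch uses made-up family names (`profile`, `orders`) and sample rows; it illustrates the shape of the data, not how HBase or Cassandra store it on disk.

```python
# row key -> column family -> column -> value (all names are illustrative)
rows = {
    "1001": {
        "profile": {"firstName": "Ann", "lastName": "Williams"},
        "orders": {"0321290533": "qty=1"},
    },
    "1002": {
        "profile": {"firstName": "Bob"},  # sparse: no lastName, no orders family
    },
}


def get_family(row_key, family):
    """Row-oriented view: all columns of one family for one row."""
    return rows.get(row_key, {}).get(family, {})


def column_view(family, column):
    """Column-oriented view: one column's value across every row that has it."""
    return {
        row_key: families[family][column]
        for row_key, families in rows.items()
        if column in families.get(family, {})
    }


print(get_family("1001", "profile"))
print(column_view("profile", "lastName"))
```

Rows that lack a column simply do not store it, which is why sparse data fits this model well.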
 Built to efficiently crawl & search graph trees
 Social Networks
 Resource flows
 “people of interest”
 Don’t run well on clusters
 Example: Neo4j (and not much else right now)
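The kind of question a graph store is built for, such as "who is within two hops of Ann?", can be sketched as a breadth-first search over an adjacency list. The `friends` data and the `within_hops` helper are made up for illustration; this is plain Python, not the Neo4j API.

```python
from collections import deque

# Tiny illustrative social graph as an adjacency list.
friends = {
    "ann": ["bob", "cara"],
    "bob": ["ann", "dev"],
    "cara": ["ann"],
    "dev": ["bob", "eli"],
    "eli": ["dev"],
}


def within_hops(start, max_hops):
    """Breadth-first search: map each reachable person to their hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in friends.get(person, []):
            if neighbor not in distance:
                distance[neighbor] = distance[person] + 1
                queue.append(neighbor)
    del distance[start]
    return distance


print(within_hops("ann", 2))
```

In a relational schema each extra hop is another self-join, whereas here it is one more step along links that are already stored with the node.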
 RDBMS were not designed with many of today’s
problems in mind
 NoSQL DBs were built from the ground up to deal
with these “three Vs” issues
 NoSQL can either replace or (more commonly)
supplement existing RDBMS functions
 Move hot tables out to DynamoDB
 Write a greenfield app from ground up with only a NoSQL
datastore
 Consistency & Availability are often tunable
 Many flavors exist & each has its own best use
cases
 Research heavily before deciding upon a platform
 Thanks!

No sql distilled-distilled
