Overview of
NoSQL
...motivation, technologies, should you
care?
Overview
● Evolution of/motivation for NoSQL
  databases
● Characterization of NoSQL databases
● Classification of NoSQL databases
● Popularity/usage of NoSQL systems
A brief history of NoSQL
● Originally coined in 1998 by Strozzi for his
  lightweight relational database that had no SQL interface
   ○ easy to use, free, text based data storage, easy
     manipulation of contents of db
● Reintroduced by Evans (Rackspace) in 2009
  for a conference on open source distributed
  databases
   ○ in response to increase in interest in non-RDBMS
     solutions
      ■ bringing together Cassandra, Mongo, Couch, etc.
● Has grown as a movement over the last 3 years
Current status
● Significant buzz within community in 2010
  ○ initial development of technology
  ○ pioneer deployments
  ○ lots of meetups/conferences/birds-of-a-feather sessions
● Many key technologies evolved in late 2010 and
  2011
  ○ more large deployments for some technologies
  ○ small companies with no legacy systems basing
    operations on NoSQL
Current status
● 2012
  ○   buzz/hype is fading
  ○   technology continues to mature
  ○   increased number of deployments
  ○   skills sought in job market
NoSQL - a negative
definition
● NoSQL simply defined by being
  non-relational
  ○ diverse set of technologies fall into NoSQL camp
● Motivations mixed
  ○   open source
  ○   scale - TB, PB - particularly for read/write latency
  ○   increased flexibility over RDBMS systems
  ○   ability to work with raw data
  ○   ACID not always the most appropriate design choice
       ■ analytics data is excellent example
● Results in many different NoSQL
  technologies
Typical characteristics
● Don't use SQL!
● Open Source
● Intended to deliver performance
  ○ in some dimension
● JOIN typically not supported
  ○ avoids the performance hit
● Consistency often relaxed
  ○ eventual consistency
● More flexibility in schema
  ○ if schema used at all!
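
The "eventual consistency" point can be illustrated with a toy model. This is only a sketch: the `Replica` class and the last-write-wins rule based on timestamps are assumptions made for illustration, not how any particular NoSQL system resolves conflicts.

```python
# Toy model of eventual consistency: two replicas accept writes independently
# and converge once they exchange state. Last-write-wins (by timestamp) is an
# assumption of this sketch; real systems use a variety of strategies.

class Replica:
    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Keep the incoming value only if it is newer than what we hold.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Pull every entry from the other replica, keeping the newest version.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


a, b = Replica(), Replica()
a.write("user:1", "alice", timestamp=1)
b.write("user:1", "alicia", timestamp=2)

stale = a.read("user:1")  # replica a has not yet seen the newer write
a.sync_from(b)
b.sync_from(a)
converged = (a.read("user:1"), b.read("user:1"))
```

Before the sync, a read from replica `a` returns the older value; after both replicas exchange state they agree. That window of disagreement is what is traded for availability and latency.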
Diversity of NoSQL
databases
● 122 separate technologies listed on
  http://nosql-database.org/
  ○ mix of commercial, open source and some
    in between
● Vary in many dimensions:
  ○ architecture
  ○ interfaces
     ■ api/languages
  ○ internal data storage
  ○ distribution mechanisms
     ■ redundancy, reliability
  ○ usage - deployments & support community
  ○ maturity
Classification of NoSQL
systems
●   Column based solutions
●   Document store solutions
●   Key/Value solutions
●   Graph based solutions
●   Less significantly:
    ○ XML databases
    ○ Object databases
    ○ Multivalue databases
Column based solutions
● Structured data
  ○ similar to classical tables
● Generally much more flexible
  ○ no rigorous schema necessary
  ○ can typically add columns in ad hoc fashion
    ■ often without explicitly declaring column
● However, can result in very different usage
  ○ eg can have millions of columns associated with
    given row
● Examples: Hadoop/HBase, Cassandra,
  Hypertable, SimpleDB
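
A minimal sketch of the column-based data model, using a plain in-memory dictionary as a stand-in. The `table`, `put` and `get_row` names are hypothetical and do not correspond to the API of HBase, Cassandra or any other listed system.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a column-family table:
# row key -> {column name -> value}. Columns are never declared up front.
table = defaultdict(dict)

def put(row_key, column, value):
    table[row_key][column] = value

def get_row(row_key):
    return dict(table[row_key])

put("user:42", "name", "Ada")
put("user:42", "email", "ada@example.org")
# A different row can carry entirely different columns.
put("user:43", "name", "Grace")
put("user:43", "last_login", "2012-05-01")

# Wide-row usage: one column per event, so a single row may grow to
# a very large number of columns.
for i in range(1000):
    put("events:user:42", f"event:{i:06d}", "click")

wide_row_size = len(get_row("events:user:42"))
```

The last loop shows the "very different usage" pattern: the row key identifies an entity and the columns themselves act as an ordered list of entries.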
Document based solutions
● Less structured data
  ○ DB composed of 'documents' containing arbitrary
    data
    ■ usually containing longer form content eg CMS
● Documents contain some structure to
  support query/search/filter, etc
● Somewhat less emphasis on a key
  ○ can be autogenerated
● Quite unlike classical databases
● Examples: MongoDB, CouchDB
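
The document model can be sketched in a few lines. This is an illustration under stated assumptions: `collection`, `insert` and `find` are hypothetical helpers, not the MongoDB or CouchDB API, and the exact-match filter is far simpler than a real query language.

```python
import uuid

collection = {}  # document id -> document (a plain dict)

def insert(doc):
    # The key is autogenerated; callers rarely need to choose it themselves.
    doc_id = uuid.uuid4().hex
    collection[doc_id] = doc
    return doc_id

def find(**criteria):
    # Return documents whose fields match all given criteria exactly.
    return [
        doc for doc in collection.values()
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

# Documents in one collection need not share the same fields.
insert({"type": "article", "title": "Intro to NoSQL", "tags": ["db"]})
insert({"type": "article", "title": "Cassandra notes", "author": "sean"})
insert({"type": "comment", "text": "Nice overview"})

articles = find(type="article")
```

Each document carries just enough structure (here the `type` and `author` fields) to support filtering, while the rest of its content stays free-form.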
Key/value stores
● DBs inspired by memcache
   ○ simple, fast key/value stores
● Attempt to retain most of DB in memory
   ○ fast response times
● Different designs for scalability
   ○ single node/multi node
● Much emphasis on the keys in this type of
  DB
● Write usually overwrites entire previous entry
● Examples: Redis, Couchbase/Membase,
  DynamoDB, Riak
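
A small sketch of the "write overwrites the entire previous entry" behaviour. The `store`, `put` and `get` names are hypothetical, and serialising values as JSON bytes is an assumption made to mimic a store that treats values as opaque.

```python
import json

store = {}  # in-memory key -> opaque bytes

def put(key, value):
    # The store sees only bytes; it knows nothing about fields in the value.
    store[key] = json.dumps(value).encode("utf-8")

def get(key):
    raw = store.get(key)
    return None if raw is None else json.loads(raw.decode("utf-8"))

put("session:abc", {"user": "ada", "cart": ["book"]})
# Writing the same key replaces the whole value; the "cart" field is gone.
put("session:abc", {"user": "ada"})
after_overwrite = get("session:abc")

# To change one field, read the entry, modify it, and write it all back.
entry = get("session:abc")
entry["cart"] = ["book", "pen"]
put("session:abc", entry)
```

Because the value is opaque to the store, all lookups go through the key, which is why key design gets so much emphasis in this category.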
Graph based solutions
● Obviously different from previous categories
  ○ Focus specifically on graphs
● Queries supported are graph-specific
  ○ eg get nodes related to specified node
● Typically support for solving standard graph
  problems
  ○ eg shortest path, general graph traversal
● Can deliver very significant performance gains
  over non-graph-specific solutions
  ○ for graph problems!
● Examples: Neo4j
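
The two query types mentioned above, related nodes and shortest path, can be sketched with an adjacency list and breadth-first search. The sample graph and function names are made up for illustration; a graph database such as Neo4j exposes this through its own query interface rather than code like this.

```python
from collections import deque

# Hypothetical undirected graph stored as an adjacency list.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}

def neighbours(node):
    # "Get nodes related to the specified node."
    return graph.get(node, [])

def shortest_path(start, goal):
    # Breadth-first search: finds a path with the fewest edges.
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbours(node):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None  # goal is not reachable from start

path = shortest_path("alice", "erin")
```

Expressing the same traversal in a relational database typically needs repeated self-joins, which is where graph-specific storage tends to pay off.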
It's a noisy space...
● Very many candidate technologies
● Relatively small number of real-world
  solutions
● Difference between the classifications above is
  one of emphasis...
   ○ column based and document based arrive at
     semi-structured sweet spot from opposite ends of
     spectrum
● ...although this results in different preferred
  use cases...
   ○ document based solution better for document
     problems, eg CMS
Common techniques used
● Hashing techniques used to map data to
  nodes in cluster
● Internode communication via Gossip
● Common replication techniques
● Thrift is used in a few cases
● MapReduce often used to search over
  distributed system
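
One common way to hash data onto cluster nodes is a consistent hash ring. The sketch below assumes that technique; the slide does not say which hashing scheme a given system uses, and the `HashRing` class, node names and virtual-node count are all hypothetical.

```python
import bisect
import hashlib

def _hash(value):
    # Map a string to a large integer position on the ring.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at several points ("virtual nodes")
        # so that keys spread more evenly.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # A key belongs to the first ring point after its hash, wrapping around.
        index = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[index][1]

keys = [f"user:{i}" for i in range(1000)]
three = HashRing(["node-a", "node-b", "node-c"])
four = HashRing(["node-a", "node-b", "node-c", "node-d"])

# Count how many keys change owner when a fourth node joins the cluster.
moved = sum(1 for k in keys if three.node_for(k) != four.node_for(k))
```

The point of the ring is that adding a node only moves the keys the new node takes over; keys never shuffle between the existing nodes, unlike a plain `hash(key) % node_count` scheme.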
Comparison (oldish)...
Horses for courses...
● SQL is a perfectly good solution for many
  problems
  ○ tried and tested
● Some problems require an alternative solution
  ○ typically driven by scale and/or flexibility
● NoSQL offers (many) alternatives
  ○ although relatively easy to identify realistic options
● Column based approaches good for mostly
  structured data with enhanced flexibility
● Document based approaches good for
  document oriented problems
...so let's dive into one
NoSQL database...
● Cassandra...
