NoSQL Roundup

There is a lot of confusion out there about the various kinds of NoSQL, and NewSQL, technologies. Document stores, graph databases, columnar databases, and the list goes on. This confusion has led to a good deal of less-than-optimal deployments, pain, and, ultimately, antipathy. In this talk, Dan will walk us through a high-level explanation of the various NoSQL technologies available to us and how they work, and provide some dos and don'ts for their implementation.

NoSQL Roundup
- DAN FIELDS (@DANIELSFIELDS)

What if NoSQL?
Map/Reduce (HBase, CouchDB)
Query DSL (MongoDB, ElasticSearch, ReThink)
Key-Value (Redis, Membase)
Other (Cypher, Pig Latin)
Uh…Structured Query Language…?

SQL-like
N1QL (Couchbase)
SPARQL (RDF – AllegroGraph, BlazeGraph)
CQL (Cassandra)
…and so on…

No, seriously, what the heck is “NoSQL”?

NoSQL solves specific problems
Horizontal scalability
Availability
Schema updates
Performance
Data gets very large
Data gets very wide
High volatility
CPU-bound operations

“NoSQL” is not a monolith
Key-value
Document
Graph
Inverted Index
Object
RDF (triplestore/quadstore)
Columnar

Key-value
Description: Data is modeled as key-value pairs.
Pros:
• Simplicity
• In-memory
• Flexibility
• Easy to partition
Cons:
• Limited query capabilities
• Limited ability to build out complex data relationships
Examples:
• Redis
• Couchbase
• Membase
• DynamoDB
• Riak

Document
Description: Data is represented as a “document,” and is serialized in a hierarchical data format.
Pros:
• Flexibility
Cons:
• Limited ability to build out complex data relationships
Examples:
• ArangoDB
• Couchbase
• CouchDB
• DynamoDB
• ElasticSearch
• MongoDB
• RethinkDB
• Riak

Graph
Description: Based on graph theory. Data is represented as edges and vertices.
Pros:
• Data relationships are stored with the data itself at the logical level
Cons:
• Complexity
Examples:
• Apache Giraph
• ArangoDB
• BlazeGraph
• InfiniteGraph
• Neo4j

Inverted Index
Description: An index that builds relationships between the contents of documents. Full-text indexes, and probability searches.
Pros:
• Flexibility
• Robust querying
Cons:
• Mutations are slow
• Requires a lot of storage
• Probability searches may not be what you want.
Examples:
• ElasticSearch
• Solr

Object
Description: Models data as you would model it in an object-oriented language.
Pros:
• Automatic schema updates
• Simplicity
• Lightweight
Cons:
• Lack of standards
• Lack of tooling
Examples:
• db4o
• JADE
• ObjectDB
• Perst
• Zope

RDF (triplestore/quadstore)
Description: A type of graph database where vertices and edges are represented as semantic expressions.
Pros:
• Extremely scalable
• Engines are incredibly fast
Cons:
• Complexity
Examples:
• AllegroGraph
• BlazeGraph
• MarkLogic
• Oracle NoSQL

Columnar
Description: Tabular data is stored by column instead of by row. A single table typically consists of many files.
Pros:
• Supports wide tables (100s of columns)
• Aggregates on pico-scale data are fast
Cons:
• Write performance
• Complexity
• Mutations are a no-no
• Cannot query by row
Examples:
• Accumulo
• Cassandra
• Druid
• HBase
• Vertica

Other considerations
Multi-modal
Approach to CAP theorem
Memory management
Durability
NewSQL

Thank you!
Dan Fields
Technologist, Liberty Mutual Insurance
Twitter: @danielsfields
GitHub: dsfields
LinkedIn: /in/danielsfields

NoSQL Roundup

1. NoSQL Roundup - DAN FIELDS (@DANIELSFIELDS)

2. “What is NoSQL”?

8. What if NoSQL? Map/Reduce (HBase, CouchDB) Query DSL (MongoDB, ElasticSearch, ReThink) Key-Value (Redis, Membase) Other (Cypher, Pig Latin) Uh…Structured Query Language…?

9. “No”SQL

10. “N.O.”SQL ”Not Only” SQL?

11. SQL-like N1QL (Couchbase) SPARQL (RDF – AllegroGraph, BlazeGraph) CQL (Cassandra) …and so on…

12. No, seriously, what the heck is “NoSQL”?

14. So, then NoSQL === NoRel?

16. NoSQL === (NoSQL !== RDBMS)

17. But why NoSQL?

18. NoSQL === (NoSQL !== RDBMS)

27. NoSQL solves specific problems Horizontal scalability Availability Schema updates Performance Data gets very large Data gets very wide High volatility CPU-bound operations

28. “NoSQL” is not a monolith Key-value Document Graph Inverted Index Object RDF (triplestore/quadstore) Columnar

29. Key-value. Description: Data is modeled as key-value pairs. Pros: Simplicity; In-memory; Flexibility; Easy to partition. Cons: Limited query capabilities; Limited ability to build out complex data relationships. Examples: Redis, Couchbase, Membase, DynamoDB, Riak.

30. Document. Description: Data is represented as a “document,” and is serialized in a hierarchical data format. Pros: Flexibility. Cons: Limited ability to build out complex data relationships. Examples: ArangoDB, Couchbase, CouchDB, DynamoDB, ElasticSearch, MongoDB, RethinkDB, Riak.

31. Graph. Description: Based on graph theory. Data is represented as edges and vertices. Pros: Data relationships are stored with the data itself at the logical level. Cons: Complexity. Examples: Apache Giraph, ArangoDB, BlazeGraph, InfiniteGraph, Neo4j.

32. Inverted Index. Description: An index that builds relationships between the contents of documents. Full-text indexes, and probability searches. Pros: Flexibility; Robust querying. Cons: Mutations are slow; Requires a lot of storage; Probability searches may not be what you want. Examples: ElasticSearch, Solr.

33. Object. Description: Models data as you would model it in an object-oriented language. Pros: Automatic schema updates; Simplicity; Lightweight. Cons: Lack of standards; Lack of tooling. Examples: db4o, JADE, ObjectDB, Perst, Zope.

34. RDF (triplestore/quadstore). Description: A type of graph database where vertices and edges are represented as semantic expressions. Pros: Extremely scalable; Engines are incredibly fast. Cons: Complexity. Examples: AllegroGraph, BlazeGraph, MarkLogic, Oracle NoSQL.

35. Columnar. Description: Tabular data is stored by column instead of by row. A single table typically consists of many files. Pros: Supports wide tables (100s of columns); Aggregates on pico-scale data are fast. Cons: Write performance; Complexity; Mutations are a no-no; Cannot query by row. Examples: Accumulo, Cassandra, Druid, HBase, Vertica.

41. Other considerations Multi-modal Approach to CAP theorem Memory management Durability NewSQL

42. Thank you! Dan Fields Technologist, Liberty Mutual Insurance Twitter: @danielsfields GitHub: dsfields LinkedIn: /in/danielsfields

Editor's notes

The goal of this talk is to provide a definition for the term “NoSQL,” and to give a high-level overview of the various kinds of NoSQL databases.
What NoSQL is and is not has caused confusion. There are a lot of options, and NoSQL is not a monolith. The name implies that these are databases that do not use SQL to query data.
Well, then how the heck are we getting data? There are a number of alternatives.
Map performs filtering and sorting. Reduce performs aggregates (counts, summations, unions, etc.).
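As a rough illustration of the note above, here is a minimal in-process sketch of the two phases. The `orders` records, their field names, and the helper names `map_phase` and `reduce_phase` are hypothetical; real engines such as HBase or CouchDB distribute this work across nodes and expose their own APIs.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample records; field names are illustrative only.
orders = [
    {"customer": "ann", "status": "paid", "total": 30},
    {"customer": "bob", "status": "open", "total": 15},
    {"customer": "ann", "status": "paid", "total": 20},
    {"customer": "bob", "status": "paid", "total": 5},
]


def map_phase(records):
    """Filter to paid orders and emit (key, value) pairs sorted by key."""
    pairs = [(r["customer"], r["total"]) for r in records if r["status"] == "paid"]
    return sorted(pairs)


def reduce_phase(pairs):
    """Group values by key, then aggregate each group into a sum."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


totals = reduce_phase(map_phase(orders))
print(totals)
```

The map step decides which records matter and how they are keyed; the reduce step only ever sees values that share a key, which is what makes the aggregation easy to spread over many machines.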
Queries are objects built using a domain-specific language. Different keys and structures result in different filter criteria and logical conjunctions/disjunctions.
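To make the idea of a query-as-object concrete, the sketch below evaluates a toy query DSL against plain dictionaries. The operator names (`$and`, `$or`, `$gt`, `$lt`) are only loosely inspired by the style of document-store filters; the functions `compare` and `matches` and the sample data are hypothetical and are not any product's API.

```python
def compare(value, op, operand):
    """Apply one comparison operator from the toy DSL."""
    if value is None:
        return False
    if op == "$gt":
        return value > operand
    if op == "$lt":
        return value < operand
    raise ValueError(f"unsupported operator: {op}")


def matches(doc, query):
    """Return True when doc satisfies every clause of the query object."""
    for key, condition in query.items():
        if key == "$and":        # logical conjunction of sub-queries
            ok = all(matches(doc, sub) for sub in condition)
        elif key == "$or":       # logical disjunction of sub-queries
            ok = any(matches(doc, sub) for sub in condition)
        elif isinstance(condition, dict):  # operator clause, e.g. {"$lt": 30}
            ok = all(compare(doc.get(key), op, arg) for op, arg in condition.items())
        else:                    # plain equality
            ok = doc.get(key) == condition
        if not ok:
            return False
    return True


people = [
    {"name": "Bob", "age": 35, "city": "Boston"},
    {"name": "Fred", "age": 52, "city": "Dover"},
    {"name": "Sarah", "age": 28, "city": "Boston"},
]

# "Lives in Boston AND (is younger than 30 OR is named Bob)"
query = {"city": "Boston", "$or": [{"age": {"$lt": 30}}, {"name": "Bob"}]}
names = [p["name"] for p in people if matches(p, query)]
print(names)
```

Top-level keys are implicitly AND-ed together, while the structure nested under `$or` expresses the disjunction, which mirrors how the shape of the object carries the logic.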
Gigantic hashtables. Simple hash lookup.
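A minimal sketch of the "gigantic hashtable" idea, including why such stores are easy to partition: the key alone determines which partition holds the value. The class `PartitionedStore` and its methods are hypothetical, and the partitions here are just in-process dictionaries standing in for separate nodes.

```python
import hashlib


class PartitionedStore:
    """Toy key-value store that spreads keys over a fixed number of partitions."""

    def __init__(self, partitions=4):
        # Each dict stands in for a separate node in a real cluster.
        self._partitions = [{} for _ in range(partitions)]

    def _partition_for(self, key):
        # A stable hash of the key picks the partition; no lookup table needed.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._partitions)
        return self._partitions[index]

    def put(self, key, value):
        self._partition_for(key)[key] = value

    def get(self, key, default=None):
        return self._partition_for(key).get(key, default)

    def size(self):
        return sum(len(p) for p in self._partitions)


store = PartitionedStore()
store.put("session:42", {"user": "bob"})
store.put("session:43", {"user": "sarah"})
print(store.get("session:42"))
```

Note what is missing: there is no way to ask for "all sessions belonging to bob" without scanning every partition, which is the limited-query trade-off that comes with this model.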
Completely different languages for expressing queries.
And of course there’s…SQL. But it’s called “NoSQL”!
Are we just being trolled here? As it turns out, SQL is a pretty good tool. It makes it easy to build highly complex queries and database mutations. Collectively, there is a lot of knowledge about SQL within the development community.
The term “NoSQL” has morphed from “there is no SQL” to “not only SQL.”
We have this broad classification of databases that are “no SQL” technologies but also use SQL. It’s a bit confusing. What’s the difference between these and something like MySQL or PostgreSQL?
What we start to find is that many of these technologies approach the modeling of data in dramatically different ways. There is often no construct of a “relationship” as a first-class citizen in the database.
Not exactly. The primitives for defining objects in a graph database are called vertices and edges. Edges are the relationships between vertices.
Really, NoSQL is what it is not. That is, a database that is not an RDBMS.
Perhaps at this point you may be saying, “yo, Dan, imma let you finish, but PostgreSQL is the greatest database of all time!”
If you find that an RDBMS solves your problems, then use it. NoSQL is not an “alternative” to using RDBMS databases. Nor is NoSQL a panacea for data storage.
The rise of these NoSQL database technologies has been about solving specific problems.
Generally, scaling a traditional RDBMS means adding more disks, CPUs, and memory. This can get costly, and has limitations. Being able to simply add new nodes to a cluster to scale out a single database instance is compelling.
With RDBMS, if our database goes down, we have to fail over to a replica. Some NoSQL databases approach this problem by running hot-hot replicas. Traffic is simply routed away from an unhealthy node.
If you have to change a database table, this can be a complicated exercise, especially if you need to automate it as part of your continuous deployment pipeline. Some NoSQL technologies approach this by going “schemaless.”
Generally, RDBMS databases are pretty fast. There are situations where they start to break down.
Indexes begin to get difficult to manage and query.
Many RDBMS systems have a number of optimizations built into them that help them store data efficiently. Often the trade-off is a limit on the number of columns in a table.
Lots of data mutations means lots of recomputing of indexes, and index fragmentation.
CPUs tend to be the most limited resource on a server. Lots of CPU-intensive operations can result in the CPU getting pegged, which means your database is no longer available.
Again, NoSQL databases solve specific problems. Consequently, there are a number of types of databases that often get associated with the “NoSQL” moniker. The different types are defined by the various ways a database will internally model data. They all have their pros and cons. Let’s take a deeper look.
Giant hashtable. Often in-memory. Typically used for caching. Couchbase has additional methods of querying. Redis has data structures. DynamoDB has range keys.
Typically requires a fair amount of denormalization.
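As a hedged illustration of what denormalization means here, the sketch below folds two normalized, relational-style collections into a single hierarchical document. The data, the `_id` field, and the helper `to_document` are hypothetical.

```python
import json

# Normalized, relational-style data: orders point at customers by ID.
customers = {1: {"name": "Bob", "city": "Boston"}}
orders = [
    {"id": 100, "customer_id": 1, "total": 30},
    {"id": 101, "customer_id": 1, "total": 12},
]


def to_document(customer_id):
    """Embed a customer's orders inside the customer record itself."""
    customer = customers[customer_id]
    embedded = [
        {"id": o["id"], "total": o["total"]}
        for o in orders
        if o["customer_id"] == customer_id
    ]
    return {"_id": customer_id, **customer, "orders": embedded}


doc = to_document(1)
print(json.dumps(doc, indent=2))
```

Reading the customer together with their orders becomes a single lookup, at the cost of duplicating or restructuring data that a relational schema would keep in separate tables.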
The advantage of a graph database over an RDBMS is the ability to represent and query deeply-nested hierarchies.
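The sketch below shows, under simplifying assumptions, how data stored as vertices and edges makes a deeply nested hierarchy easy to walk. The edge list, the `manages` label, and the helper names are hypothetical; a real graph database would express this in its own query language.

```python
from collections import deque

# Each edge is (source vertex, label, target vertex).
edges = [
    ("ceo", "manages", "vp_eng"),
    ("ceo", "manages", "vp_sales"),
    ("vp_eng", "manages", "lead_a"),
    ("lead_a", "manages", "dev_1"),
    ("lead_a", "manages", "dev_2"),
]


def build_adjacency(edge_list, label):
    """Index edges with the given label by their source vertex."""
    adjacency = {}
    for source, edge_label, target in edge_list:
        if edge_label == label:
            adjacency.setdefault(source, []).append(target)
    return adjacency


def descendants(adjacency, start):
    """Breadth-first traversal returning every vertex reachable from start."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbor in adjacency.get(vertex, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


reports = build_adjacency(edges, "manages")
print(descendants(reports, "ceo"))
```

The traversal does not need to know how deep the hierarchy goes, whereas the equivalent relational query typically needs one self-join per level or a recursive query.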
Typically associated with full-text indexes. Can be used as a document database. ElasticSearch offers additional, non-probabilistic queries.
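A minimal sketch of an inverted index: instead of mapping documents to their words, it maps each word to the documents that contain it. The sample documents and the function names are hypothetical, and the search is a plain boolean match; it does not attempt the relevance scoring that engines such as ElasticSearch or Solr perform.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


docs = {
    1: "Redis is a key-value store",
    2: "Elasticsearch builds an inverted index",
    3: "Solr also builds an inverted index for full-text search",
}
index = build_index(docs)
print(search(index, "inverted index"))
```

The sketch also hints at the cons on the slide: every token of every document is stored again in the index, and changing a document means updating the entry for each of its tokens.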
This is a type of database that does not get a lot of publicity. The original “NoSQL” database (Strozzi).
“Humans are animals” / “Bob is human” / “Bob has hair” / “Bob is 35” / “Bob knows Fred” / “Sarah has hair.” Engines are mind-bogglingly fast. BlazeGraph running on GPU servers has been clocked at 50 billion edge traversals per second. The query language is SPARQL.
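The statements in the note above can be sketched as subject-predicate-object triples with a simple pattern matcher. The predicate names (`subClassOf`, `is`, `has`, `age`, `knows`) and the `match` helper are assumptions made for illustration; an actual triplestore would be queried with SPARQL.

```python
# Each fact is a (subject, predicate, object) triple.
triples = [
    ("human", "subClassOf", "animal"),
    ("Bob", "is", "human"),
    ("Bob", "has", "hair"),
    ("Bob", "age", 35),
    ("Bob", "knows", "Fred"),
    ("Sarah", "has", "hair"),
]


def match(facts, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for ts, tp, to in facts
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# "Who has hair?"
who_has_hair = [s for s, _, _ in match(triples, p="has", o="hair")]
# "What do we know about Bob?"
bob_facts = match(triples, s="Bob")
print(who_has_hair)
print(bob_facts)
```

Every query is a pattern with some positions fixed and others left open, which is roughly the shape that a SPARQL basic graph pattern takes as well.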
NO MUTATIONS! Often used for time series. Typically stored in a tabular format.
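A small sketch of the column-oriented layout, assuming every row has the same fields. The sample time-series rows and the helper `to_columns` are hypothetical; real columnar stores add compression, encoding, and on-disk file layouts that are not modeled here.

```python
# Row-oriented view: one record per reading.
rows = [
    {"ts": 1, "sensor": "a", "temp": 20.0},
    {"ts": 2, "sensor": "b", "temp": 22.0},
    {"ts": 3, "sensor": "a", "temp": 21.0},
]


def to_columns(records):
    """Pivot a list of rows into one list per column (assumes uniform keys)."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate touches only the one column it needs.
mean_temp = sum(columns["temp"]) / len(columns["temp"])
print(mean_temp)

# Rebuilding a single row means reading one position from every column.
row_1 = {name: values[1] for name, values in columns.items()}
print(row_1)
```

The aggregate reads a single contiguous list, while reconstructing or changing one row touches every column, which lines up with the slide's points about fast aggregates and costly mutations and row access.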
This is a lot of information, so here’s some more!
You may have noticed some databases appear on multiple NoSQL type lists. These are called multi-modal databases. They model data internally in multiple forms to overcome the limitations of individual models.
Consistency, Availability, Partition-tolerance. For example, CP gives you immediate consistency by running hot-cold replicas. This impacts availability because of warmup time for replicas. AP is the opposite.
Some in-memory databases do not support larger-than-memory datasets with persistence (Redis).
Eventually durable.
There is another category worth mentioning here called NewSQL.
