Is multi-model the future of NoSQL?
Max Neunhöffer
Big Data Science Meetup, 15 March 2015
www.arangodb.com
Max Neunhöffer
I am a mathematician
“Earlier life”: Research in Computer Algebra
(Computational Group Theory)
Always juggled with big data
Now: working in database development, NoSQL, ArangoDB
I like:
research,
hacking,
teaching,
tickling the highest performance out of computer systems.
1
Document and Key/Value Stores
Document store
A document store stores sets of documents, which usually
means JSON data; these sets are called collections. The
database has access to the contents of the documents.
each document in the collection has a unique key
secondary indexes possible, leading to more powerful queries
different documents in the same collection: structure can vary
no schema is required for a collection
database normalisation can be relaxed
Key/value store
Opaque values, only key lookup without secondary indexes:
⇒ high performance and perfect scalability
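The difference between the two store types can be made concrete with a minimal sketch, assuming plain in-memory dictionaries stand in for the storage engine. The class and method names (`KeyValueStore`, `DocumentCollection`, `by_field`) are hypothetical and not taken from any product.

```python
import json
from collections import defaultdict


class KeyValueStore:
    """Values are opaque bytes: the store can only look them up by key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value: bytes):
        self._data[key] = value

    def get(self, key) -> bytes:
        return self._data[key]


class DocumentCollection:
    """Documents are parsed JSON, so the store can index their contents."""

    def __init__(self, indexed_fields=()):
        self._docs = {}
        # One secondary index per field: field value -> set of document keys.
        self._indexes = {f: defaultdict(set) for f in indexed_fields}

    def insert(self, key, document: dict):
        if key in self._docs:
            raise KeyError(f"duplicate key: {key}")
        self._docs[key] = document
        for field, index in self._indexes.items():
            if field in document:  # no schema: the field may be absent
                index[document[field]].add(key)

    def get(self, key) -> dict:
        return self._docs[key]

    def by_field(self, field, value):
        """Secondary-index lookup; returns matching documents sorted by key."""
        keys = self._indexes[field].get(value, set())
        return [self._docs[k] for k in sorted(keys)]


kv = KeyValueStore()
kv.put("session:42", json.dumps({"cart": ["p1"]}).encode())

products = DocumentCollection(indexed_fields=["category"])
products.insert("p1", {"name": "kettle", "category": "kitchen", "watts": 2000})
products.insert("p2", {"name": "novel", "category": "books"})  # different shape
products.insert("p3", {"name": "pan", "category": "kitchen"})

kitchen_names = [d["name"] for d in products.by_field("category", "kitchen")]
```

The key/value store never inspects its values, which is what keeps it simple to partition by key; the document collection pays for its richer queries by maintaining the index on every insert.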
2
Graph databases
Graph database
A graph database stores a labelled graph. Vertices and
edges can be documents. Graphs are good to model
relations.
graphs often describe data very naturally (e.g. the Facebook
friendship graph)
graphs can be stored using tables; however, graph queries
then notoriously lead to expensive joins
there are interesting and useful graph algorithms like “shortest
path” or “neighbourhood”
need a good query language to reap the benefits
horizontal scalability is troublesome
graph databases vary widely in scope and usage, no standard
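The two algorithms named above, "shortest path" and "neighbourhood", can be sketched in a few lines of Python. This is an illustration on an unweighted, undirected friendship graph with made-up names, not the implementation of any particular graph database.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Turn a list of undirected (a, b) edges into an adjacency map."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def neighbourhood(adjacency, start, depth):
    """All vertices reachable from `start` in at most `depth` steps."""
    seen = {start}
    frontier = {start}
    for _ in range(depth):
        frontier = {n for v in frontier for n in adjacency.get(v, ())} - seen
        seen |= frontier
    return seen - {start}


def shortest_path(adjacency, source, target):
    """Breadth-first search; returns one shortest path as a list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        if vertex == target:
            path = []
            while vertex is not None:
                path.append(vertex)
                vertex = parents[vertex]
            return path[::-1]
        for neighbour in sorted(adjacency.get(vertex, ())):
            if neighbour not in parents:
                parents[neighbour] = vertex
                queue.append(neighbour)
    return None


friendships = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "erin")]
graph = build_adjacency(friendships)
friends_of_friends = neighbourhood(graph, "alice", 2)
path = shortest_path(graph, "alice", "dave")
```

With a table-based layout, each step of `neighbourhood` would correspond to one more self-join on the edge table, which is where the cost mentioned above comes from.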
3
Polyglot Persistence
Idea
Use the right data model for each part of a system.
For an application, persist
an object or structured data as a JSON document,
a hash table in a key/value store,
relations between objects in a graph database,
a homogeneous array in a relational DBMS.
If the table has many empty cells or inhomogeneous rows, use
a column-oriented database.
Take scalability needs into account!
4
A typical Use Case — an Online Shop
We need to hold
customer data: usually homogeneous, but still variations
⇒ use a relational DB: MySQL
product data: even for a specialised business quite
inhomogeneous
⇒ use a document store:
shopping carts: need very fast lookup by session key
⇒ use a key/value store:
order and sales data: relate customers and products
⇒ use a document store:
recommendation engine data: links between different entities
⇒ use a graph database:
5
Polyglot Persistence is nice, but ...
Consequence: One needs multiple database systems in the persistence
layer of a single project!
Polyglot persistence introduces some friction through
data synchronisation,
data conversion,
increased installation and administration effort,
more training needs.
Wouldn’t it be nice ...
... to enjoy the benefits without the disadvantages?
6
The Multi-Model Approach
Multi-model database
A multi-model database combines a document store with a
graph database and is at the same time a key/value store.
Vertices are documents in a vertex collection,
edges are documents in an edge collection.
a single, common query language for all three data models
is able to compete with specialised products on their turf
allows for polyglot persistence using a single database
queries can mix the different data models
can replace an RDBMS in many cases
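To illustrate how one query can mix the three models, here is a minimal Python sketch using the online shop from earlier. Vertex collections are dictionaries of documents and the edge collection is a list of edge documents. The `_from` and `_to` attribute names follow ArangoDB's convention for edge documents; the sample data and the function `also_bought` are hypothetical, and a real system would express this in its query language rather than in application code.

```python
# Vertex collections: plain document collections keyed by a unique key.
customers = {
    "c1": {"name": "Ann", "city": "Cologne"},
    "c2": {"name": "Ben", "city": "Berlin"},
}
products = {
    "p1": {"title": "Tent", "category": "outdoor", "price": 120},
    "p2": {"title": "Stove", "category": "outdoor", "price": 40},
    "p3": {"title": "Novel", "category": "books", "price": 10},
}

# Edge collection: each edge is itself a document with endpoints and attributes.
bought = [
    {"_from": "c1", "_to": "p1", "quantity": 1},
    {"_from": "c2", "_to": "p1", "quantity": 2},
    {"_from": "c2", "_to": "p2", "quantity": 1},
    {"_from": "c2", "_to": "p3", "quantity": 3},
]


def also_bought(product_key, category):
    """Products in `category` bought by customers who bought `product_key`."""
    # Graph step 1: follow edges backwards from the product to its buyers.
    buyers = {e["_from"] for e in bought if e["_to"] == product_key}
    # Graph step 2: follow edges forwards from those buyers to other products.
    candidates = {e["_to"] for e in bought if e["_from"] in buyers} - {product_key}
    # Document step: filter on document contents, then sort by an attribute.
    matches = [products[k] for k in candidates if products[k]["category"] == category]
    return sorted(matches, key=lambda doc: doc["price"])


cart_owner = customers["c1"]  # key/value access: lookup by key only
recommendations = also_bought("p1", "outdoor")
```

Because vertices and edges live in the same system as ordinary documents, no synchronisation or conversion between separate stores is needed between the graph steps and the document step.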
7
Use case: Aircraft fleet management
One of our customers uses ArangoDB to
store each part, component, unit or aircraft as a document
model containment as a graph
thus can easily find all parts of some component
keep track of maintenance intervals
perform queries orthogonal to the graph structure
thereby getting good efficiency for all needed queries
8
Use case: Family tree management
For genealogy, the natural object is a family tree.
data naturally comes as a (directed) graph
many queries are traversals or shortest path
but not all, for example:
“all people with name James” in a family tree, sorted by birthday
“all family members who studied at Berkeley”, sorted by
number of children
quite often, queries mixing the different models are useful
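The example queries above can be sketched in Python on a tiny, made-up family tree. The people, field names (`born`, `studied_at`) and function names are all hypothetical. `named` is a pure document query, while `family_alumni` first traverses the graph and then filters and sorts on document contents.

```python
from collections import defaultdict

# Person documents (hypothetical sample data); dates are ISO strings,
# so lexicographic order equals chronological order.
people = {
    "mary": {"name": "Mary", "born": "1920-05-01", "studied_at": "Berkeley"},
    "james1": {"name": "James", "born": "1945-03-12", "studied_at": "Berkeley"},
    "anna": {"name": "Anna", "born": "1948-07-30", "studied_at": None},
    "james2": {"name": "James", "born": "1970-11-02", "studied_at": "MIT"},
    "lucy": {"name": "Lucy", "born": "1972-01-15", "studied_at": "Berkeley"},
}

# Directed edges parent -> child.
parent_of = [("mary", "james1"), ("mary", "anna"), ("james1", "james2"), ("james1", "lucy")]

children = defaultdict(list)
for parent, child in parent_of:
    children[parent].append(child)


def descendants(key):
    """Graph traversal: all descendants of `key`."""
    result, stack = set(), list(children.get(key, ()))
    while stack:
        person = stack.pop()
        if person not in result:
            result.add(person)
            stack.extend(children.get(person, ()))
    return result


def named(name):
    """Document query: people with a given name, sorted by birthday."""
    hits = [k for k, doc in people.items() if doc["name"] == name]
    return sorted(hits, key=lambda k: people[k]["born"])


def family_alumni(root, school):
    """Mixed query: traverse the tree, filter on a field, sort by child count."""
    family = descendants(root) | {root}
    alumni = [k for k in family if people[k]["studied_at"] == school]
    return sorted(alumni, key=lambda k: (len(children.get(k, ())), k))


jameses = named("James")
berkeley = family_alumni("mary", "Berkeley")
```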
9
Recently: Key/Value stores adding other models
Riak (by Basho), originally a key/value store, adds support for
documents with their 2.0 version (late 2014)
Redis (sponsored by Pivotal), originally an in-memory
key/value store, has over time added more data types and
more complex operations
FoundationDB (by FoundationDB) is a key/value store, but is
now marketed as a multi-model database by adding additional
layers on top
OrientDB (by Orient Technologies) started as an object
database and nowadays calls itself a multi-model database
10
Recently: DataStax acquired Aurelius
In February 2015, DataStax (which sells a commercialised version of the
column-oriented database Cassandra) announced the acquisition of Aurelius, the
company behind TitanDB (a distributed graph database on top of
Cassandra).
In their own words:
“Bringing Graph Database Technology To Cassandra.”
“Will deliver massively scalable, always-on graph database
technology.”
“Will simplify the adoption of leading NoSQL technologies to
support multi-model use case environments.”
11
Recently: MongoDB 3.0 adds pluggable DB engine
MongoDB is one of the most popular document stores.
In February 2015, they announced their 3.0 version, to be released
in March, featuring
a pluggable storage engine layer
transparent on-disk compression
etc.
This indicates their interest in supporting more data models than “just
documents”.
It will be very interesting indeed to see if and how they extend their
query language ...
12
ArangoDB is a multi-model database (document store & graph database),
is open source and free (Apache 2 license),
offers convenient queries (via HTTP/REST and AQL),
including joins between different collections,
configurable consistency guarantees using transactions,
memory efficient by shape detection,
uses JavaScript throughout (Google’s V8 built into server),
API extensible by JS code in the Foxx Microservice Framework,
offers many drivers for a wide range of languages,
is easy to use with web front end and good documentation,
and enjoys good community as well as professional support.
13
Configurable consistency
ArangoDB offers
atomic and isolated CRUD operations for single documents,
transactions spanning multiple documents and multiple
collections,
snapshot semantics for complex queries,
very secure, durable storage using append-only writes and
multiple stored revisions,
all this for documents as well as for graphs.
In the near future, ArangoDB will
implement complete MVCC semantics to allow for lock-free
concurrent transactions
and offer the same ACID semantics even with sharding.
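The combination of append-only storage, multiple revisions and snapshot reads can be illustrated with a toy Python model. This is only a sketch of the general idea under simplifying assumptions (single process, everything in memory); it says nothing about how ArangoDB implements these guarantees, and `AppendOnlyStore` with its methods is a hypothetical name.

```python
class AppendOnlyStore:
    """Append-only log of document revisions with snapshot reads."""

    def __init__(self):
        self._log = []  # entries: (revision, key, document or None for delete)
        self._revision = 0

    def commit(self, writes: dict) -> int:
        """Apply several writes atomically: all become visible at one revision."""
        for key, doc in writes.items():
            if doc is not None and not isinstance(doc, dict):
                # Validate everything before appending anything (all-or-nothing).
                raise TypeError(f"document for {key!r} must be a dict or None")
        self._revision += 1
        for key, doc in writes.items():
            self._log.append((self._revision, key, doc))
        return self._revision

    def snapshot(self, revision=None) -> dict:
        """State as of `revision` (default: latest); later writes are ignored."""
        if revision is None:
            revision = self._revision
        state = {}
        for rev, key, doc in self._log:
            if rev > revision:
                break  # the log is ordered by revision
            if doc is None:
                state.pop(key, None)
            else:
                state[key] = doc
        return state


store = AppendOnlyStore()
r1 = store.commit({"acct:a": {"balance": 100}, "acct:b": {"balance": 0}})
# A transfer touches two documents but commits as a single revision.
r2 = store.commit({"acct:a": {"balance": 70}, "acct:b": {"balance": 30}})

before = store.snapshot(r1)
after = store.snapshot()
```

Because old revisions are never overwritten, a long-running query can keep reading `snapshot(r1)` while later commits proceed, and no reader ever sees only one half of the transfer.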
14
Extensible through JavaScript and Foxx
The HTTP API of ArangoDB
can be extended by user-defined JavaScript code
that is executed in the DB server for high performance.
This is formalised by the Foxx microservice framework,
which allows complex, user-defined APIs to be implemented with
direct access to the DB engine.
Very flexible and secure authentication schemes can be
implemented conveniently by the user in JavaScript.
Because JavaScript runs everywhere (in the DB server as well
as in the browser), one can use the same libraries in the
back-end and in the front-end.
⇒ implement your own microservices
15
The Future of NoSQL: My Observations
I observe
two decades ago, the most versatile solutions eventually
dominated the relational DB market
(Oracle, MySQL, PostgreSQL),
the rise of the polyglot persistence idea,
a trend towards multi-model databases,
specialised products broadening their scope,
even relational systems adding support for JSON documents,
DevOps gaining influence (Docker phenomenon).
16
The Future of NoSQL: My Predictions
In 5 years' time ...
the default approach will be to use a multi-model database,
the big vendors will all add other data models,
the NoSQL solutions will conquer a sizable portion
of what is now dominated by the relational model,
specialised products will only survive if they find a niche.
17
Links
https://www.arangodb.com
https://github.com/ArangoDB/guesser
18
