Polyglot Persistence NoSQL 3-in-1 Database: Graph DB, Key/Value & Document Store

Polyglot Persistence is a hugely popular trend in the NoSQL database domain, because it obviously makes sense to use "the right data model" for each specific part of an architecture or application. Traditionally, this means that one has to use different persistence tools for different parts of a larger system, which creates some friction in the form of data conversion and synchronisation between these different tools.

The idea of a "multi-model database" recently emerged, which is a document store, a graph database and a key/value store combined in one program. Therefore it is able to cover a lot of use cases which otherwise would need multiple different database systems, all in one tool with a single and coherent API and query language. All this is not just a utopian dream - there is an actual working implementation available, and you will learn how to use it during this presentation!

In this talk I will explain the motivation behind the multi-model approach and its consequences for Polyglot Persistence, discuss its advantages and limitations, and offer predictions about the NoSQL database market in five years' time.

  1. Polyglot Persistence, NoSQL 3-in-1 Database: Graph DB, Key/Value & Document Store
     Database Month New York, 11 November 2014
     Max Neunhöffer, www.arangodb.com
  2. Max Neunhöffer
     • I am a mathematician.
     • “Earlier life”: research in Computer Algebra (Computational Group Theory).
     • Always juggled with big data.
     • Now: working in database development, NoSQL, ArangoDB.
     • I like: research, hacking, teaching, tickling the highest performance out of computer systems.
  3. ArangoDB GmbH
     • triAGENS GmbH has offered consulting services since 2004:
       • software architecture
       • project management
       • software development
       • business analysis
     • A lot of experience with specialised database systems.
     • Have done NoSQL before the term was coined at all.
     • 2011/2012: an idea emerged to build the database one had wished to have all those years!
     • Development of ArangoDB as open source software since 2012.
     • ArangoDB GmbH: spin-off to take care of ArangoDB (2014).
  4. A typical Project: a Web Shop
     • The Specification Workshop (need recommendation engine, need statistics, etc.)
     • The Developers get to work . . . (tables, relations, normalisation, schemas, queries, front-ends, etc.)
     • HANDOVER (Why can I not . . . ? This is unusable!)
  5. Solution: Agile Approach and Domain Driven Design
     These days, many use (or try to use):
     • agile methods (Scrum, sprints, rapid prototyping) with continuous feedback from product owners to developers, promising fewer surprises in deployment and high flexibility.
     • Domain Driven Design (Eric Evans, 2004):
       • identify a Domain (area in which software is applied)
       • make a Model (abstract description of situation)
       • use a Ubiquitous Language (that all team members speak)
       • clearly define the Context in which the model applies.
     Model your data as close to the domain as possible. Example: object oriented programming.
  6. Fundamental Problem: need a ubiquitous Language
     Listening to team members, you hear completely different things:
     • Product Managers talk about
       • customers “browsing” through the shop,
       • powerful search for products (with the “good ones” up),
       • “useful” recommendations.
     • Developers talk about
       • tables, normalisation, queries and joins,
       • secondary indexes, front-end pages,
       • object oriented, model view controller, responsive design.
     ⇒ both groups think the others are morons
  7. The problem is rooted very deeply
     • functionality not gathered methodically + “obvious” functions are missing
     • no common language + misunderstandings about details
  8. NoSQL: Richer Data Models are closer to the Domain
     Some terms used by Evans as part of the ubiquitous language:
     • Entity: has an identity and mutable state (e.g. a person)
     • Value object: is identified by its attributes and immutable (e.g. an address)
     • Aggregate: is a combination of entities and value objects into one transactional unit (e.g. a customer with its orders)
     • Association: is a relation between entities and value objects, can have attributes, usually immutable
     Consequences
     • These terms coming from the Domain must be present in the Design.
     • The whole team must understand the same thing when talking about them.
  9. Polyglot Persistence
     Idea: Use the right data model for each part of a system.
     For an application, persist
     • an object or structured data as a JSON document,
     • a hash table in a key/value store,
     • relations between objects in a graph database,
     • a homogeneous array in a relational DBMS.
     • If the table has many empty cells or inhomogeneous rows, use a column-based database.
     Take scalability needs into account!
  10. Document and Key/Value Stores
      Document store: A document store stores a set of documents, which usually means JSON data; these sets are called collections. The database has access to the contents of the documents.
      • each document in the collection has a unique key
      • secondary indexes possible, leading to more powerful queries
      • different documents in the same collection: structure can vary
      • no schema is required for a collection
      • database normalisation can be relaxed
      Key/value store: Opaque values, only key lookup without secondary indexes:
      ⇒ high performance and perfect scalability
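The difference between the two store types can be sketched in a few lines of Python. This is a toy in-memory illustration, not how any real database is implemented; the class names `KeyValueStore` and `DocumentCollection` and the sample data are hypothetical.

```python
import json


class KeyValueStore:
    """Values are opaque bytes: the store can only look them up by key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value: bytes):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentCollection:
    """The store can see inside documents, so it can index attributes."""

    def __init__(self):
        self._docs = {}
        self._indexes = {}  # attribute name -> {value -> set of keys}

    def ensure_index(self, attribute):
        # Build a secondary index over the documents stored so far.
        index = {}
        for key, doc in self._docs.items():
            if attribute in doc:
                index.setdefault(doc[attribute], set()).add(key)
        self._indexes[attribute] = index

    def insert(self, key, doc):
        if key in self._docs:
            raise KeyError(f"duplicate key: {key}")
        self._docs[key] = doc
        # Keep every existing secondary index up to date.
        for attribute, index in self._indexes.items():
            if attribute in doc:
                index.setdefault(doc[attribute], set()).add(key)

    def find_by(self, attribute, value):
        if attribute in self._indexes:
            keys = self._indexes[attribute].get(value, set())
            return [self._docs[k] for k in sorted(keys)]
        # Without an index, fall back to a full scan.
        return [d for d in self._docs.values() if d.get(attribute) == value]


customers = DocumentCollection()
customers.ensure_index("city")
customers.insert("c1", {"name": "Ada", "city": "London"})
customers.insert("c2", {"name": "Bob", "city": "Paris", "vip": True})  # structure may vary
customers.insert("c3", {"name": "Cy", "city": "London"})

carts = KeyValueStore()
carts.put("session-42", json.dumps({"items": ["p1", "p7"]}).encode())

london_names = [d["name"] for d in customers.find_by("city", "London")]
```

The key/value store never interprets the cart payload, which is what makes it simple to distribute; the document collection pays for its richer queries by maintaining indexes on insert.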
  11. Graph Databases
      Graph database: A graph database stores a labelled graph. Vertices and edges are documents. Graphs are good to model relations.
      • graphs often describe data very naturally (e.g. the Facebook friendship graph)
      • graphs can be stored using tables, however, graph queries notoriously lead to expensive joins
      • there are interesting and useful graph algorithms like “shortest path” or “neighbourhood”
      • need a good query language to reap the benefits
      • horizontal scalability is troublesome
      • graph databases vary widely in scope and usage, no standard
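As an illustration of the “shortest path” algorithm mentioned above, here is a minimal breadth-first search over an unweighted friendship graph. It is a generic textbook sketch in Python, not the implementation of any particular graph database; the function name and the sample graph are made up for the example.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # vertex -> vertex we reached it from
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph.get(vertex, ()):
            if neighbour in previous:
                continue  # already visited
            previous[neighbour] = vertex
            if neighbour == goal:
                # Walk the predecessor chain back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}

path = shortest_path(friends, "alice", "erin")
```

Each step only follows the edges of the current vertex. Expressing the same traversal over relational tables would need one self-join per hop, which is the cost the slide refers to.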
  12. A typical Use Case: an Online Shop
      We need to hold
      • customer data: usually homogeneous, but still variations
        ⇒ use a document store
      • product data: even for a specialised business quite inhomogeneous
        ⇒ use a document store
      • shopping carts: need very fast lookup by session key
        ⇒ use a key/value store
      • order and sales data: relate customers and products
        ⇒ use a document store
      • recommendation engine data: links between different entities
        ⇒ use a graph database
  13. Polyglot Persistence is nice, but . . .
      Consequence: One needs multiple database systems in the persistence layer of a single project!
      Polyglot persistence introduces some friction through
      • data synchronisation,
      • data conversion,
      • increased installation and administration effort,
      • more training needs.
      Wouldn’t it be nice . . . to enjoy the benefits without the disadvantages?
  14. The Multi-Model Approach
      Multi-model database: A multi-model database combines a document store with a graph database and a key/value store. Vertices are documents in a vertex collection, edges are documents in an edge collection.
      • a single, common query language for all three data models
      • is able to compete with specialised products on their turf
      • allows for polyglot persistence using a single database
      • queries can mix the different data models
      • can replace an RDBMS in many cases
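To make “vertices are documents in a vertex collection, edges are documents in an edge collection” concrete, the following Python sketch models both with plain dictionaries and runs one query that mixes a graph step with a document filter. The `_from`/`_to` attribute names follow ArangoDB's convention for edge documents; everything else (collection contents, the `recommend` helper) is hypothetical and only illustrates the idea.

```python
# Vertex collections: ordinary documents addressed by key.
customers = {"c1": {"name": "Ada"}, "c2": {"name": "Bob"}}
products = {
    "p1": {"title": "Tent", "price": 120},
    "p2": {"title": "Stove", "price": 35},
    "p3": {"title": "Kayak", "price": 900},
}

# Edge collection: each edge is itself a document with its own attributes.
bought = [
    {"_from": "customers/c1", "_to": "products/p1", "quantity": 1},
    {"_from": "customers/c1", "_to": "products/p2", "quantity": 2},
    {"_from": "customers/c2", "_to": "products/p1", "quantity": 1},
    {"_from": "customers/c2", "_to": "products/p3", "quantity": 1},
]


def recommend(product_id, max_price):
    """Titles of products bought by buyers of product_id, capped by price."""
    # Graph step 1: who bought this product?
    buyers = {e["_from"] for e in bought if e["_to"] == product_id}
    # Graph step 2: what else did those customers buy?
    candidates = {
        e["_to"] for e in bought
        if e["_from"] in buyers and e["_to"] != product_id
    }
    # Document step: filter on an attribute of the product documents.
    result = []
    for pid in sorted(candidates):
        doc = products[pid.split("/", 1)[1]]
        if doc["price"] <= max_price:
            result.append(doc["title"])
    return result


cheap = recommend("products/p1", 100)
```

Because the edges and the vertices live in the same system, the traversal and the attribute filter run in one place, with no data conversion or synchronisation between separate stores.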
  15. A Map of the NoSQL Landscape
      [Figure: map of the NoSQL landscape with the labels Map/reduce, Column Stores, Extensibility, Documents, Massively distributed, Graphs, Structured Data, Key/Value, Operational DBs, Analytic DBs, Complex queries]
  16. ArangoDB
      • is a multi-model database (document store & graph database),
      • is open source and free (Apache 2 license),
      • offers convenient queries (via HTTP/REST and AQL), including joins between different collections,
      • offers strong consistency guarantees using transactions,
      • is memory efficient by shape detection,
      • uses JavaScript throughout (Google’s V8 built into server),
      • has an API extensible by JavaScript code in the Foxx framework,
      • offers many drivers for a wide range of languages,
      • is easy to use with web front end and good documentation,
      • and enjoys good community as well as professional support.
  17. A Map of the NoSQL Landscape
      [Figure: map of the NoSQL landscape with the labels Map/reduce, Column Stores, Extensibility, Documents, Massively distributed, Graphs, Structured Data, Key/Value, Operational DBs, Analytic DBs, Complex queries]
  18. The ArangoDB Territory
      [Figure: the NoSQL landscape map with the labels Map/reduce, Column Stores, Extensibility, Documents, Massively distributed, Graphs, Structured Data, Key/Value, Operational DBs, Analytic DBs, Complex queries]
  19. Strong Consistency
      ArangoDB offers
      • atomic and isolated CRUD operations for single documents,
      • transactions spanning multiple documents and multiple collections,
      • snapshot semantics for complex queries,
      • very secure durable storage using append only and storing multiple revisions,
      • all this for documents as well as for graphs.
      In the (near) future, ArangoDB will
      • offer the same ACID semantics even with sharding,
      • implement complete MVCC semantics to allow for lock-free concurrent transactions.
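What “transactions spanning multiple documents and multiple collections” buys the application can be shown with a toy all-or-nothing transaction in Python. This is only a sketch of the atomicity guarantee, assuming a copy-then-commit scheme; it is not how ArangoDB implements transactions, and the `Transaction` class and the shop data are invented for the example.

```python
import copy


class Transaction:
    """Apply a multi-collection change completely or not at all."""

    def __init__(self, collections):
        self._collections = collections  # name -> dict of documents

    def run(self, action):
        # Work on a private copy; the originals stay untouched on failure.
        working = copy.deepcopy(self._collections)
        action(working)  # may raise, in which case nothing is committed
        for name in self._collections:
            self._collections[name] = working[name]


db = {"stock": {"p1": {"count": 1}}, "orders": {}}


def place_order(order_id, product_id):
    def action(collections):
        # Touches two collections: both changes must land together.
        collections["orders"][order_id] = {"product": product_id}
        item = collections["stock"][product_id]
        if item["count"] == 0:
            raise ValueError("out of stock")
        item["count"] -= 1
    return action


tx = Transaction(db)
tx.run(place_order("o1", "p1"))      # succeeds: order stored, stock reduced
try:
    tx.run(place_order("o2", "p1"))  # fails: the half-written order is discarded
except ValueError:
    pass
```

After the failed second call, the `orders` collection contains only the first order, even though the action had already written the second order into its working copy before the stock check failed.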
  20. Replication and Sharding: horizontal scalability
      Right now, ArangoDB provides
      • easy setup of (asynchronous) replication, which allows read access parallelisation (master/slaves setup),
      • sharding with automatic data distribution to multiple servers.
      Very soon, ArangoDB will feature
      • fault tolerance by automatic failover and synchronous replication in cluster mode,
      • zero administration by a self-repairing and self-balancing cluster architecture.
  21. Powerful query language: AQL
      The built-in Arango Query Language AQL allows
      • complex, powerful and convenient queries,
      • with transaction semantics,
      • allowing joins,
      • with user-definable functions (in JavaScript).
      AQL is independent of the driver used and offers protection against injections by design.
      For Version 2.3, we are reengineering the AQL query engine:
      • use a C++ implementation for high performance,
      • optimise distributed queries in the cluster.
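One way a query language can resist injection is through bind parameters, which AQL supports with the `@name` syntax. The slide does not say which mechanism it has in mind, so the Python sketch below is an assumption; it only builds query text and sends nothing to a server, and the `customers` collection and both helper functions are hypothetical.

```python
def customers_in_city_unsafe(city):
    # String concatenation: user input becomes part of the query text.
    return 'FOR c IN customers FILTER c.city == "' + city + '" RETURN c'


def customers_in_city(city):
    # Bind parameter: the query text is constant and the value travels
    # separately, so it can never be parsed as AQL.
    query = "FOR c IN customers FILTER c.city == @city RETURN c"
    return query, {"city": city}


# Input crafted to turn the filter into `"" || true || ""`.
malicious = '" || true || "'

unsafe_text = customers_in_city_unsafe(malicious)
safe_text, bind_vars = customers_in_city(malicious)
```

With concatenation the crafted input rewrites the filter so that it matches every customer. With the bind parameter the same input is just an unusual city name that matches nothing.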
  22. Extensible through JavaScript and Foxx
      The HTTP API of ArangoDB
      • can be extended by user-defined JavaScript code,
      • that is executed in the DB server for high performance.
      This is formalised by the Foxx framework, which
      • allows implementing complex, user-defined APIs with direct access to the DB engine.
      • Very flexible and secure authentication schemes can be implemented conveniently by the user in JavaScript.
      • Because JavaScript runs everywhere (in the DB server as well as in the browser), one can use the same libraries in the back-end and in the front-end.
      ⇒ implement your own micro services
