7. Key Value Stores
• Most are based on Dynamo: Amazon's Highly
Available Key-Value Store
• Data Model:
– Global key-value mapping
– Big scalable HashMap
– Highly fault tolerant (typically)
• Projects:
8. Key Value Stores
• Pros:
– Simple data model
– Scalable
• Cons
– Create your own “foreign keys”
– Poor for complex data
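The "create your own foreign keys" point can be illustrated with a minimal sketch. It assumes a plain in-memory map standing in for the store; the class name, the method names, and the key layout ("post:<id>:author", "<userKey>:name") are hypothetical and not taken from any particular product.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a key-value store: one global map.
class KeyValueSketch {
    private final Map<String, String> store = new HashMap<>();

    void put(String key, String value) {
        store.put(key, value);
    }

    String get(String key) {
        return store.get(key);
    }

    // The store knows nothing about relationships, so the application
    // follows its own "foreign key": a value that holds another key.
    String authorNameOfPost(String postId) {
        String authorKey = get("post:" + postId + ":author");
        return authorKey == null ? null : get(authorKey + ":name");
    }
}
```

Each extra hop between records is another lookup written by hand in application code, which is why this model gets awkward for complex data.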
9. Column Databases
• Most Based on BigTable: Google’s Distributed
Storage System for Structured Data
• Data Model:
– A big table, with column families
– Map Reduce for querying/processing
• Projects:
10. Column Databases
• Pros:
– Supports Semi-Structured Data
– Naturally Indexed (columns)
– Scalable
• Cons
– Poor for interconnected data
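A rough way to picture "a big table, with column families" is a nested sorted map: row key, then column family, then column name. The sketch below is an in-memory illustration under that assumption only; the class and method names are hypothetical and it does not reflect how BigTable or any specific project stores data.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: row key -> column family -> column name -> value.
class ColumnFamilyTableSketch {
    private final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    void put(String rowKey, String family, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(column, value);
    }

    String get(String rowKey, String family, String column) {
        Map<String, Map<String, String>> families = rows.get(rowKey);
        if (families == null) return null;
        Map<String, String> columns = families.get(family);
        return columns == null ? null : columns.get(column);
    }

    // Rows may carry different columns within the same family,
    // which is what makes the model semi-structured.
    int columnCount(String rowKey, String family) {
        Map<String, Map<String, String>> families = rows.get(rowKey);
        if (families == null || !families.containsKey(family)) return 0;
        return families.get(family).size();
    }
}
```

Nothing in this structure links one row to another, which matches the "poor for interconnected data" point.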
11. Document Databases
• Data Model:
– A collection of documents
– A document is a key value collection
– Index-centric, lots of map-reduce
• Projects :
12. Document Databases
• Pros:
– Simple, powerful data model
– Scalable
• Cons
– Poor for interconnected data
– Query model limited to keys and indexes
– Map reduce for larger queries
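The "query model limited to keys and indexes" point can be sketched as follows. This is a minimal in-memory illustration, assuming insert-only documents and a single indexed field; all names are hypothetical and it is not the API of any document database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a document collection with one secondary index.
class DocumentCollectionSketch {
    private final Map<String, Map<String, Object>> documents = new HashMap<>();
    private final Map<Object, List<String>> index = new HashMap<>();
    private final String indexedField;

    DocumentCollectionSketch(String indexedField) {
        this.indexedField = indexedField;
    }

    // Insert-only for simplicity: re-inserting an id would not clean the index.
    void insert(String id, Map<String, Object> document) {
        documents.put(id, document);
        Object value = document.get(indexedField);
        if (value != null) {
            index.computeIfAbsent(value, v -> new ArrayList<>()).add(id);
        }
    }

    // Lookup by key: a document is itself a key-value collection.
    Map<String, Object> findById(String id) {
        return documents.get(id);
    }

    // Lookup by the indexed field; any other field would need a full scan.
    List<String> findIdsByIndexedValue(Object value) {
        return index.getOrDefault(value, new ArrayList<>());
    }
}
```

Queries that fit neither the key nor an index have to touch every document, which is where map-reduce comes in for larger queries.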
14. Graph Databases
• Pros:
– Powerful data model, as general as RDBMS
– Connected data locally indexed
– Easy to query
• Cons
– Sharding (lots of people working on this)
• Scales UP reasonably well
– Requires rewiring your brain
22. GraphDB Overview
Data is more connected:
• Text (content)
• HyperText (added pointers)
• RSS (joined those pointers)
• Blogs (added pingbacks)
• Tagging (grouped related data)
• RDF (described connected data)
• GGG (content + pointers + relationships +
descriptions)
23. GraphDB Overview
Data is less structured:
• If you tried to collect all the data of every
movie ever made, how would you model
it?
• Actors, Characters, Locations, Dates, Costs,
Ratings, Showings, Ticket Sales, etc.
25. What is Graph
• An abstract representation of a set of
objects where some pairs are connected by
links.
Object (Vertex, Node)
Link (Edge, Arc, Relationship)
26. Different Kinds of Graphs
• Undirected Graph
• Directed Graph
• Pseudo Graph
• Multi Graph
• Hyper Graph
27. More Kinds of Graphs
• Weighted Graph
• Labeled Graph
• Property Graph
29. What is a Graph DB?
• A database with an explicit graph structure
• Each node knows its adjacent nodes
• As the number of nodes increases, the cost
of a local step (or hop) remains the same
• Plus an Index for lookups
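The idea that each node knows its adjacent nodes, with an index used only to find a starting point, can be sketched as below. This is an in-memory illustration with hypothetical class names; it is not Neo4j's API or storage format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: each node holds direct references to its
// relationships, so a hop never consults a global structure.
class GraphNodeSketch {
    final Map<String, Object> properties = new HashMap<>();
    final List<GraphEdgeSketch> outgoing = new ArrayList<>();

    GraphEdgeSketch connectTo(GraphNodeSketch target, String type) {
        GraphEdgeSketch edge = new GraphEdgeSketch(this, target, type);
        outgoing.add(edge);
        return edge;
    }

    // One local step: the cost depends on this node's own relationships,
    // not on the total number of nodes in the graph.
    List<GraphNodeSketch> neighbors(String type) {
        List<GraphNodeSketch> result = new ArrayList<>();
        for (GraphEdgeSketch edge : outgoing) {
            if (edge.type.equals(type)) result.add(edge.end);
        }
        return result;
    }
}

// Relationships are directed, typed, and can carry properties of their own.
class GraphEdgeSketch {
    final GraphNodeSketch start;
    final GraphNodeSketch end;
    final String type;
    final Map<String, Object> properties = new HashMap<>();

    GraphEdgeSketch(GraphNodeSketch start, GraphNodeSketch end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }
}

// The index is used only to locate a starting node by its "name" property.
class NameIndexSketch {
    private final Map<String, GraphNodeSketch> byName = new HashMap<>();

    void add(GraphNodeSketch node) {
        byName.put((String) node.properties.get("name"), node);
    }

    GraphNodeSketch lookup(String name) {
        return byName.get(name);
    }
}
```

A typical use would be one index lookup to find the start node, followed by any number of `neighbors` calls that follow references directly.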
32. What is Neo4j?
• A Java-based graph database
• Property Graph
• Full ACID (atomicity, consistency, isolation, durability)
• High Availability (with Enterprise Edition)
• 32 Billion Nodes, 32 Billion Relationships,
64 Billion Properties
• Embedded Server
• REST API
33. What is Neo4j?
• Both nodes and relationships can have metadata.
• Integrated pattern-matching-based query language (“Cypher”).
• Also the “Gremlin” graph traversal language can be used.
• Indexing of nodes and relationships. (Lucene)
• Nice self-contained web admin.
• Advanced path-finding with multiple algorithms.
• Optimized for reads.
• Has transactions (in the Java API)
• Scriptable in Groovy
• Online backup, advanced monitoring and High Availability are
AGPL/commercial licensed
34. Neo4j is good for :
• Highly connected data (social networks)
• Recommendations (e-commerce)
• Path Finding (how do I know you?)
• A* (Least Cost path)
• Data First Schema (bottom-up, but you still
need to design)
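The "how do I know you?" kind of path finding can be sketched with a plain breadth-first search over an undirected "knows" graph. This is a generic illustration, not one of Neo4j's built-in algorithms and not A*; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: shortest chain of acquaintances via breadth-first search.
class PathFinderSketch {
    private final Map<String, List<String>> adjacency = new HashMap<>();

    // "Knows" is treated as undirected, so both directions are stored.
    void addKnows(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Returns the shortest chain from start to goal, or an empty list if none exists.
    List<String> shortestPath(String start, String goal) {
        Map<String, String> cameFrom = new HashMap<>(); // visited person -> predecessor
        Deque<String> queue = new ArrayDeque<>();
        cameFrom.put(start, null);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(goal)) {
                // Walk the predecessors back to the start to rebuild the path.
                LinkedList<String> path = new LinkedList<>();
                for (String step = goal; step != null; step = cameFrom.get(step)) {
                    path.addFirst(step);
                }
                return path;
            }
            for (String next : adjacency.getOrDefault(current, Collections.emptyList())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

Breadth-first search finds the path with the fewest hops; a least-cost search such as A* would additionally need edge weights and a heuristic.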
37. If you’ve ever
• Joined more than 7 tables together
• Modeled a graph in a table
• Written a recursive CTE
• Tried to write some crazy stored procedure
with multiple recursive self and inner joins
You should use Neo4j
38. rewiring your brain
Relational model (three tables):
Language        LanguageCountry   Country
language_code   language_code     country_code
language_name   country_code      country_name
word_count      primary           flag_uri
Graph model (two nodes, one relationship):
Language (name, code, word_count)
  -[IS_SPOKEN_IN (as_primary)]->
Country (name, code, flag_uri)
40. rewiring your brain
Country
name
flag_uri
language_name
number_of_words
yes_in_language
no_in_language
currency_code
Country Language
name name
flag_uri number_of_words
SPEAKS
yes
no
Currency
code
name
41. show me the code!
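import org.neo4j.graphdb.*;
import org.neo4j.kernel.EmbeddedGraphDatabase;

// PresentationTypes is assumed to be an enum implementing RelationshipType.
GraphDatabaseService graphDb = new EmbeddedGraphDatabase("var/neo4j");
Transaction tx = graphDb.beginTx(); // writes must run inside a transaction
try {
    Node david = graphDb.createNode();
    Node andreas = graphDb.createNode();
    david.setProperty("name", "David Montag");
    andreas.setProperty("name", "Andreas Kollegger");
    Relationship presentedWith =
        david.createRelationshipTo(andreas, PresentationTypes.PRESENTED_WITH);
    presentedWith.setProperty("date", System.currentTimeMillis());
    tx.success(); // mark the transaction for commit
} finally {
    tx.finish();
    graphDb.shutdown();
}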
45. console.neo4j.org
Try it right now:
start n=node(*) match n-[r:LOVES]->m return n, type(r), m
Notice the two nodes in red; they are your result set.
47. Spring-Data-Neo4J
• Focus on Spring Data Neo4j
• VMware is collaborating with Neo Technology, the
company behind the Neo4j graph database.
• Improved programming model: Annotation-based
programming model for applications with rich
domain models
• Cross-store persistence: Extend existing JPA
application with NoSQL persistence
51. Spring-Data-Neo4J
@RelationshipEntity
public class Role {
@StartNode private Actor actor;
@EndNode private Movie movie;
private String roleName;
}
@NodeEntity
public class Actor {
@RelatedToVia(type = "ACTS_IN")
private Iterable<Role> roles;
}
An undirected graph is one in which edges have no orientation: the edge (a, b) is identical to the edge (b, a).
A directed graph, or digraph, is an ordered pair D = (V, A), where V is a set of vertices and A is a set of ordered pairs of vertices (arcs).
A pseudo graph is a graph that allows loops (edges that join a node to itself).
A multi graph allows multiple edges between the same pair of nodes.
A hyper graph allows an edge to join more than two nodes.
Best used: For graph-style, rich or complex, interconnected data. Neo4j is quite different from the others in this sense.
For example: social relations, public transport links, road maps, network topologies.