SlideShare ist ein Scribd-Unternehmen logo
1 von 66
Downloaden Sie, um offline zu lesen
Copyright © ArangoDB GmbH, 2016
Handling Billions Of Edges in
a Graph Database
Michael Hackstein
@mchacki
Copyright © ArangoDB GmbH, 2016
‣ Schema-free Objects (Vertices)
‣ Relations between them (Edges)
‣ Edges have a direction
{
name: "alice",
age: 32
}
{
name: "dancing"
}
{
name: "bob",
age: 35,
size: 1,73m
}
{
name: "reading"
}
{
name: "fishing"
}
hobby
hobby
hobby
hobby
Copyright © ArangoDB GmbH, 2016
‣ Schema-free Objects (Vertices)
‣ Relations between them (Edges)
‣ Edges have a direction
‣ Edges can be queried in both directions
‣ Easily query a range of edges (2 to 5)
‣ Undefined number of edges (1 to *)
‣ Shortest Path between two vertices
{
name: "alice",
age: 32
}
{
name: "dancing"
}
{
name: "bob",
age: 35,
size: 1,73m
}
{
name: "reading"
}
{
name: "fishing"
}
hobby
hobby
hobby
hobby
Copyright © ArangoDB GmbH, 2016
‣ Give me all friends of Alice
Bob
Charly DaveAlice
Eve Frank
Copyright © ArangoDB GmbH, 2016
‣ Give me all friends of Alice
Bob
Charly DaveCharly
Bob
DaveAlice
Eve Frank
Copyright © ArangoDB GmbH, 2016
‣ Give me all friends-of-friends of Alice
Alice
Bob
Charly
Eve
Dave
Frank
Copyright © ArangoDB GmbH, 2016
‣ Give me all friends-of-friends of Alice
Alice
Bob
Charly
Eve
Dave
FrankEve Frank
Copyright © ArangoDB GmbH, 2016
‣ What is the linking path between Alice and Eve
Alice
Bob
Charly
Eve
Dave
Frank
Copyright © ArangoDB GmbH, 2016
‣ What is the linking path between Alice and Eve
Alice
Bob
Charly
Eve
Dave
FrankBobEve
Alice
Copyright © ArangoDB GmbH, 2016
‣ Which Trainstations can I reach if I am allowed to drive a distance of at most 6
stations on my ticket
You are
here
Copyright © ArangoDB GmbH, 2016
‣ Which Trainstations can I reach if I am allowed to drive a distance of at most 6
stations on my ticket
You are
here
Copyright © ArangoDB GmbH, 2016
Friend
‣ Give me all users that share two hobbies with Alice
Alice
Hobby1 Hobby2
Copyright © ArangoDB GmbH, 2016
FriendFriend
‣ Give me all users that share two hobbies with Alice
Alice
Hobby1 Hobby2
Copyright © ArangoDB GmbH, 2016
Alice
Product
ProductFriend
‣ Give me all products that at least one of my friends has bought together with the
products I already own, ordered by how many friends have bought it and the
products rating, but only 20 of them.
has_bought
has_bought
has_boughtis_friend
Copyright © ArangoDB GmbH, 2016
Alice
Product
ProductFriend Product
‣ Give me all products that at least one of my friends has bought together with the
products I already own, ordered by how many friends have bought it and the
products rating, but only 20 of them.
has_bought
has_bought
has_boughtis_friend
Copyright © ArangoDB GmbH, 2016
Copyright © ArangoDB GmbH, 2016
‣ Give me all users which have an age attribute between 21 and 35.
Copyright © ArangoDB GmbH, 2016
‣ Give me all users which have an age attribute between 21 and 35.
‣ Give me the age distribution of all users
Copyright © ArangoDB GmbH, 2016
‣ Give me all users which have an age attribute between 21 and 35.
‣ Give me the age distribution of all users
‣ Group all users by their name
Copyright © ArangoDB GmbH, 2016
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
S
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
‣ We collect all edges on S
B
C
A
S
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
‣ We collect all edges on S
‣ We apply filters on edges
B
C
A
SS
A
B
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
‣ We collect all edges on S
‣ We apply filters on edges
‣ We iterate down one of the new vertices (A)
D
E
B
C
A
SS
A
B
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
‣ We collect all edges on S
‣ We apply filters on edges
‣ We iterate down one of the new vertices (A)
‣ We apply filters on edges
D
E
B
C
A
SS
E
A
B
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
‣ We collect all edges on S
‣ We apply filters on edges
‣ We iterate down one of the new vertices (A)
‣ We apply filters on edges
‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E
D
E
B
C
A
SS
E
A
B
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
‣ We collect all edges on S
‣ We apply filters on edges
‣ We iterate down one of the new vertices (A)
‣ We apply filters on edges
‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E
‣ Go back to the next unfinished vertex (B)
D
E
B
C
A
SS
E
A
B
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
‣ We collect all edges on S
‣ We apply filters on edges
‣ We iterate down one of the new vertices (A)
‣ We apply filters on edges
‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E
‣ Go back to the next unfinished vertex (B)
‣ We iterate down on (B)
D
E F
B
C
A
SS
E
A
B
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
‣ We collect all edges on S
‣ We apply filters on edges
‣ We iterate down one of the new vertices (A)
‣ We apply filters on edges
‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E
‣ Go back to the next unfinished vertex (B)
‣ We iterate down on (B)
‣ We apply filters on edges
D
E F
B
C
A
SS
E
A
F
B
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
‣ We collect all edges on S
‣ We apply filters on edges
‣ We iterate down one of the new vertices (A)
‣ We apply filters on edges
‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E
‣ Go back to the next unfinished vertex (B)
‣ We iterate down on (B)
‣ We apply filters on edges
‣ The next vertex (F) is in desired depth. Return
the path S -> B -> F
D
E F
B
C
A
SS
E
A
F
B
Copyright © ArangoDB GmbH, 2016
‣ Once:
‣ Find the start vertex
‣ For every depth:
‣ Find all connected edges
‣ Filter non-matching edges
‣ Find connected vertices
‣ Filter non-matching vertices
Depends on indexes: Hash:
Edge-Index or Index-Free:
Linear in edges:
Depends on indexes: Hash:
Linear in vertices:
Only one pass:
O
1
1
n
n * 1
n
3n
Copyright © ArangoDB GmbH, 2016
‣ Linear sounds evil?
‣ NOT linear in All Edges O(E)
‣ Only Linear in relevant Edges n < E
‣ Traversals solely scale with their result size
‣ They are not effected at all by total amount of data
‣ BUT: Every depth increases the exponent: O((3n)d
)
‣ "7 degrees of separation": (3n)6
< E < (3n)7
Copyright © ArangoDB GmbH, 2016
‣ MULTI-MODEL database
‣ Stores Key Value, Documents, and Graphs
‣ All in one core
‣ Query language AQL
‣ Document Queries
‣ Graph Queries
‣ Joins
‣ All can be combined in the same statement
‣ ACID support including Multi Collection Transactions
+ +
Copyright © ArangoDB GmbH, 2016
FOR user IN users
RETURN user
Copyright © ArangoDB GmbH, 2016
FOR user IN users
FILTER user.name == "alice"
RETURN user
Alice
Copyright © ArangoDB GmbH, 2016
FOR user IN users
FILTER user.name == "alice"
FOR product IN OUTBOUND user has_bought
RETURN product
Alice
Copyright © ArangoDB GmbH, 2016
FOR user IN users
FILTER user.name == "alice"
FOR product IN OUTBOUND user has_bought
RETURN product
Alice
has_bought
TV
Copyright © ArangoDB GmbH, 2016
FOR user IN users
FILTER user.name == "alice"
FOR recommendation, action, path IN 3 ANY user has_bought
FILTER path.vertices[2].age <= user.age + 5
AND path.vertices[2].age >= user.age - 5
FILTER recommendation.price < 25
LIMIT 10
RETURN recommendation
Alice
has_bought
TV
Copyright © ArangoDB GmbH, 2016
FOR user IN users
FILTER user.name == "alice"
FOR recommendation, action, path IN 3 ANY user has_bought
FILTER path.vertices[2].age <= user.age + 5
AND path.vertices[2].age >= user.age - 5
FILTER recommendation.price < 25
LIMIT 10
RETURN recommendation
Alice
has_bought
TV
has_bought
playstation.price < 25
PlaystationBob
alice.age - 5 <= bob.age &&
bob.age <= alice.age + 5
has_bought
Copyright © ArangoDB GmbH, 2016
‣ Many graphs have "Supernodes"
‣ Vertices with many inbound and/or outbound edges
‣ Traversing over them is expensive
‣ Often you only need a subset of edges
Bob Alice
Copyright © ArangoDB GmbH, 2016
‣ Remember Complexity? O((3n)d
)
‣ Filtering of non-matching edges is linear for every depth
‣ Index all edges based on their vertices and arbitrary other attributes
‣ Find initial set of edges in identical time
‣ Less / No post-filtering required
‣ This decreases the n significantly
Alice
Copyright © ArangoDB GmbH, 2016
‣ We have the rise of big data
‣ Store everything you can
‣ Dataset easily grows beyond one machine
‣ This includes graph data!
Copyright © ArangoDB GmbH, 2016
‣ Distribute graph on several machines (sharding)
‣ How to query it now?
‣ No global view of the graph possible any more
‣ What about edges between servers?
‣ In a shared environment network most of the time is the bottleneck
‣ Reduce network hops
‣ Vertex-Centric Indexes again help with super-nodes
‣ But: Only on a local machine
First let's do
the cluster thingy
Copyright © ArangoDB GmbH, 2016
‣ ArangoDB is the first fully certified
database including the persistence
primitives for DC/OS
‣ ArangoDB's cluster resource
management is based on Apache
Mesos
‣ Later this year we will also support
Kubernetes
Copyright © ArangoDB GmbH, 2016
‣ ArangoDB can run clusters without it
‣ Setup Requires manual effort (can be scripted)
‣ This works:
‣ Automatic Failover (Follower takes over if leader dies)
‣ Rebalancing of shards
‣ Everything inside of ArangoDB
‣ This is based on Mesos:
‣ Complete self healing
‣ Automatic restart of ArangoDBs (on new machines)
➡ We suggest you have someone on call
Demo Time
DC/OS
Now distribute
the graph
Copyright © ArangoDB GmbH, 2016
‣ Only parts of the graph on every machine
‣ Neighboring vertices may be on different machines
‣ Even edges could be on other machines than their vertices
‣ Queries need to be executed in a distributed way
‣ Result needs to be merged locally
Copyright © ArangoDB GmbH, 2016
Random Distribution
‣ Advantages:
‣ every server takes an equal portion of
data
‣ easy to realize
‣ no knowledge about data required
‣ always works
‣ Disadvantages:
‣ Neighbors on different machines
‣ Probably edges on other machines than
their vertices
‣ A lot of network overhead is required for
querying
Copyright © ArangoDB GmbH, 2016
Random Distribution
‣ Advantages:
‣ every server takes an equal portion of
data
‣ easy to realize
‣ no knowledge about data required
‣ always works
‣ Disadvantages:
‣ Neighbors on different machines
‣ Probably edges on other machines than
their vertices
‣ A lot of network overhead is required for
querying
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ Used by most other graph databases
‣ Every vertex maintains two lists of it's edges (IN and OUT)
‣ Do not use an index to find edges
‣ How to shard this?
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ Used by most other graph databases
‣ Every vertex maintains two lists of it's edges (IN and OUT)
‣ Do not use an index to find edges
‣ How to shard this?
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ Used by most other graph databases
‣ Every vertex maintains two lists of it's edges (IN and OUT)
‣ Do not use an index to find edges
‣ How to shard this?
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ Used by most other graph databases
‣ Every vertex maintains two lists of it's edges (IN and OUT)
‣ Do not use an index to find edges
‣ How to shard this?
????
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ ArangoDB uses an hash-based EdgeIndex (O(1) - lookup)
‣ The vertex is independent of it's edges
‣ It can be stored on a different machine
‣ Used by most other graph databases
‣ Every vertex maintains two lists of it's edges (IN and OUT)
‣ Do not use an index to find edges
‣ How to shard this?
????
Copyright © ArangoDB GmbH, 2016
Domain Based Distribution
‣ Many Graphs have a natural distribution
‣ By country/region for People
‣ By tags for Blogs
‣ By category for Products
‣ Most edges in same group
‣ Rare edges between groups
Copyright © ArangoDB GmbH, 2016
Domain Based Distribution
‣ Many Graphs have a natural distribution
‣ By country/region for People
‣ By tags for Blogs
‣ By category for Products
‣ Most edges in same group
‣ Rare edges between groups
Copyright © ArangoDB GmbH, 2016
Domain Based Distribution
‣ Many Graphs have a natural distribution
‣ By country/region for People
‣ By tags for Blogs
‣ By category for Products
‣ Most edges in same group
‣ Rare edges between groups
ArangoDB Enterprise Edition
uses Domain Knowledge

for short-cuts
Copyright © ArangoDB GmbH, 2016
SmartGraphs - How it works
DB Server 1 DB Server 2 DB Server n
Coordinator
Foxx
Coordinator
Foxx
Copyright © ArangoDB GmbH, 2016
SmartGraphs - How it works
DB Server 1 DB Server 2 DB Server n
Coordinator
Foxx
Coordinator
Foxx
Copyright © ArangoDB GmbH, 2016
SmartGraphs - How it works
DB Server 1 DB Server 2 DB Server n
Coordinator
Foxx
Coordinator
Foxx
Sneak Preview
SmartGraphs
Copyright © ArangoDB GmbH, 2016
Copyright © ArangoDB GmbH, 2016
Database
document write transactions
per virtual CPU/second
ArangoDB 1,730
Aerospike 2,500
Cassandra 965
Couchbase 1,375
FoundationDB 750
*) Source: https://mesosphere.com/blog/2015/11/30/arangodb-benchmark-dcos/
Copyright © ArangoDB GmbH, 2016
‣ Further questions?
‣ Follow us on twitter: @arangodb
‣ Join our slack: slack.arangodb.com
‣ Follow me on twitter/github: @mchacki
‣ Write me a mail: michael@arangodb.com
Thank You

Weitere ähnliche Inhalte

Andere mochten auch

ArangoDB – A different approach to NoSQL
ArangoDB – A different approach to NoSQLArangoDB – A different approach to NoSQL
ArangoDB – A different approach to NoSQLArangoDB Database
 
Domain driven design @FrOSCon
Domain driven design @FrOSConDomain driven design @FrOSCon
Domain driven design @FrOSConArangoDB Database
 
Polyglot Persistence & Multi Model-Databases at JMaghreb3.0
Polyglot Persistence & Multi Model-Databases at JMaghreb3.0Polyglot Persistence & Multi Model-Databases at JMaghreb3.0
Polyglot Persistence & Multi Model-Databases at JMaghreb3.0ArangoDB Database
 
Extensibility of a database api with js
Extensibility of a database api with jsExtensibility of a database api with js
Extensibility of a database api with jsArangoDB Database
 
Extreme JavaScript Minification and Obfuscation
Extreme JavaScript Minification and ObfuscationExtreme JavaScript Minification and Obfuscation
Extreme JavaScript Minification and ObfuscationSergey Ilinsky
 
Creating data centric microservices
Creating data centric microservicesCreating data centric microservices
Creating data centric microservicesArangoDB Database
 
Microservice-based software architecture
Microservice-based software architectureMicroservice-based software architecture
Microservice-based software architectureArangoDB Database
 
Processing large-scale graphs with Google(TM) Pregel
Processing large-scale graphs with Google(TM) PregelProcessing large-scale graphs with Google(TM) Pregel
Processing large-scale graphs with Google(TM) PregelArangoDB Database
 
Performance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4jPerformance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4jArangoDB Database
 
FOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDBFOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDBArangoDB Database
 
FlinkML - Big data application meetup
FlinkML - Big data application meetupFlinkML - Big data application meetup
FlinkML - Big data application meetupTheodoros Vasiloudis
 
EmilyHauserResumeJuly2016
EmilyHauserResumeJuly2016EmilyHauserResumeJuly2016
EmilyHauserResumeJuly2016Emily Hauser
 
Replication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoveryReplication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoverySteven Francia
 
FlinkML: Large Scale Machine Learning with Apache Flink
FlinkML: Large Scale Machine Learning with Apache FlinkFlinkML: Large Scale Machine Learning with Apache Flink
FlinkML: Large Scale Machine Learning with Apache FlinkTheodoros Vasiloudis
 
Arango DB for rubyists in 10mins
Arango DB for rubyists in 10minsArango DB for rubyists in 10mins
Arango DB for rubyists in 10minsPivorak MeetUp
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Alexandre Rafalovitch
 
Introduction to column oriented databases
Introduction to column oriented databasesIntroduction to column oriented databases
Introduction to column oriented databasesArangoDB Database
 

Andere mochten auch (19)

ArangoDB – A different approach to NoSQL
ArangoDB – A different approach to NoSQLArangoDB – A different approach to NoSQL
ArangoDB – A different approach to NoSQL
 
Domain driven design @FrOSCon
Domain driven design @FrOSConDomain driven design @FrOSCon
Domain driven design @FrOSCon
 
Polyglot Persistence & Multi Model-Databases at JMaghreb3.0
Polyglot Persistence & Multi Model-Databases at JMaghreb3.0Polyglot Persistence & Multi Model-Databases at JMaghreb3.0
Polyglot Persistence & Multi Model-Databases at JMaghreb3.0
 
Extensibility of a database api with js
Extensibility of a database api with jsExtensibility of a database api with js
Extensibility of a database api with js
 
Extreme JavaScript Minification and Obfuscation
Extreme JavaScript Minification and ObfuscationExtreme JavaScript Minification and Obfuscation
Extreme JavaScript Minification and Obfuscation
 
Creating data centric microservices
Creating data centric microservicesCreating data centric microservices
Creating data centric microservices
 
Guacamole
GuacamoleGuacamole
Guacamole
 
Microservice-based software architecture
Microservice-based software architectureMicroservice-based software architecture
Microservice-based software architecture
 
Processing large-scale graphs with Google(TM) Pregel
Processing large-scale graphs with Google(TM) PregelProcessing large-scale graphs with Google(TM) Pregel
Processing large-scale graphs with Google(TM) Pregel
 
Performance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4jPerformance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4j
 
FOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDBFOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDB
 
FlinkML - Big data application meetup
FlinkML - Big data application meetupFlinkML - Big data application meetup
FlinkML - Big data application meetup
 
EmilyHauserResumeJuly2016
EmilyHauserResumeJuly2016EmilyHauserResumeJuly2016
EmilyHauserResumeJuly2016
 
NoSQL meets Microservices
NoSQL meets MicroservicesNoSQL meets Microservices
NoSQL meets Microservices
 
Replication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoveryReplication, Durability, and Disaster Recovery
Replication, Durability, and Disaster Recovery
 
FlinkML: Large Scale Machine Learning with Apache Flink
FlinkML: Large Scale Machine Learning with Apache FlinkFlinkML: Large Scale Machine Learning with Apache Flink
FlinkML: Large Scale Machine Learning with Apache Flink
 
Arango DB for rubyists in 10mins
Arango DB for rubyists in 10minsArango DB for rubyists in 10mins
Arango DB for rubyists in 10mins
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
 
Introduction to column oriented databases
Introduction to column oriented databasesIntroduction to column oriented databases
Introduction to column oriented databases
 

Mehr von ArangoDB Database

ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ArangoDB Database
 
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022ArangoDB Database
 
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022ArangoDB Database
 
ArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at ScaleArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at ScaleArangoDB Database
 
GraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBGraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBArangoDB Database
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale ArangoDB Database
 
Graph Analytics with ArangoDB
Graph Analytics with ArangoDBGraph Analytics with ArangoDB
Graph Analytics with ArangoDBArangoDB Database
 
Getting Started with ArangoDB Oasis
Getting Started with ArangoDB OasisGetting Started with ArangoDB Oasis
Getting Started with ArangoDB OasisArangoDB Database
 
Custom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDBCustom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDBArangoDB Database
 
Hacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsHacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsArangoDB Database
 
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarA Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarArangoDB Database
 
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?ArangoDB Database
 
ArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning MetadataArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning MetadataArangoDB Database
 
ArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at ScaleArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at ScaleArangoDB Database
 
Webinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB OasisWebinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB OasisArangoDB Database
 
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019ArangoDB Database
 
Webinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDBWebinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDBArangoDB Database
 
An introduction to multi-model databases
An introduction to multi-model databasesAn introduction to multi-model databases
An introduction to multi-model databasesArangoDB Database
 
Running complex data queries in a distributed system
Running complex data queries in a distributed systemRunning complex data queries in a distributed system
Running complex data queries in a distributed systemArangoDB Database
 

Mehr von ArangoDB Database (20)

ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
 
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
 
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
 
ArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at ScaleArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at Scale
 
GraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBGraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDB
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
 
Graph Analytics with ArangoDB
Graph Analytics with ArangoDBGraph Analytics with ArangoDB
Graph Analytics with ArangoDB
 
Getting Started with ArangoDB Oasis
Getting Started with ArangoDB OasisGetting Started with ArangoDB Oasis
Getting Started with ArangoDB Oasis
 
Custom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDBCustom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDB
 
Hacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsHacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge Graphs
 
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarA Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
 
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
 
ArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning MetadataArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning Metadata
 
ArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at ScaleArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at Scale
 
Webinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB OasisWebinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB Oasis
 
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
 
3.5 webinar
3.5 webinar 3.5 webinar
3.5 webinar
 
Webinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDBWebinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDB
 
An introduction to multi-model databases
An introduction to multi-model databasesAn introduction to multi-model databases
An introduction to multi-model databases
 
Running complex data queries in a distributed system
Running complex data queries in a distributed systemRunning complex data queries in a distributed system
Running complex data queries in a distributed system
 

Kürzlich hochgeladen

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 

Kürzlich hochgeladen (20)

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Handling billions of edges in a graph database @GoToLondon

  • 1. Copyright © ArangoDB GmbH, 2016 Handling Billions Of Edges in a Graph Database Michael Hackstein @mchacki
  • 2. Copyright © ArangoDB GmbH, 2016 ‣ Schema-free Objects (Vertices) ‣ Relations between them (Edges) ‣ Edges have a direction { name: "alice", age: 32 } { name: "dancing" } { name: "bob", age: 35, size: 1,73m } { name: "reading" } { name: "fishing" } hobby hobby hobby hobby
  • 3. Copyright © ArangoDB GmbH, 2016 ‣ Schema-free Objects (Vertices) ‣ Relations between them (Edges) ‣ Edges have a direction ‣ Edges can be queried in both directions ‣ Easily query a range of edges (2 to 5) ‣ Undefined number of edges (1 to *) ‣ Shortest Path between two vertices { name: "alice", age: 32 } { name: "dancing" } { name: "bob", age: 35, size: 1,73m } { name: "reading" } { name: "fishing" } hobby hobby hobby hobby
  • 4. Copyright © ArangoDB GmbH, 2016 ‣ Give me all friends of Alice Bob Charly DaveAlice Eve Frank
  • 5. Copyright © ArangoDB GmbH, 2016 ‣ Give me all friends of Alice Bob Charly DaveCharly Bob DaveAlice Eve Frank
  • 6. Copyright © ArangoDB GmbH, 2016 ‣ Give me all friends-of-friends of Alice Alice Bob Charly Eve Dave Frank
  • 7. Copyright © ArangoDB GmbH, 2016 ‣ Give me all friends-of-friends of Alice Alice Bob Charly Eve Dave FrankEve Frank
  • 8. Copyright © ArangoDB GmbH, 2016 ‣ What is the linking path between Alice and Eve Alice Bob Charly Eve Dave Frank
  • 9. Copyright © ArangoDB GmbH, 2016 ‣ What is the linking path between Alice and Eve Alice Bob Charly Eve Dave FrankBobEve Alice
  • 10. Copyright © ArangoDB GmbH, 2016 ‣ Which Trainstations can I reach if I am allowed to drive a distance of at most 6 stations on my ticket You are here
  • 11. Copyright © ArangoDB GmbH, 2016 ‣ Which Trainstations can I reach if I am allowed to drive a distance of at most 6 stations on my ticket You are here
  • 12. Copyright © ArangoDB GmbH, 2016 Friend ‣ Give me all users that share two hobbies with Alice Alice Hobby1 Hobby2
  • 13. Copyright © ArangoDB GmbH, 2016 FriendFriend ‣ Give me all users that share two hobbies with Alice Alice Hobby1 Hobby2
  • 14. Copyright © ArangoDB GmbH, 2016 Alice Product ProductFriend ‣ Give me all products that at least one of my friends has bought together with the products I already own, ordered by how many friends have bought it and the products rating, but only 20 of them. has_bought has_bought has_boughtis_friend
  • 15. Copyright © ArangoDB GmbH, 2016 Alice Product ProductFriend Product ‣ Give me all products that at least one of my friends has bought together with the products I already own, ordered by how many friends have bought it and the products rating, but only 20 of them. has_bought has_bought has_boughtis_friend
  • 16. Copyright © ArangoDB GmbH, 2016
  • 17. Copyright © ArangoDB GmbH, 2016 ‣ Give me all users which have an age attribute between 21 and 35.
  • 18. Copyright © ArangoDB GmbH, 2016 ‣ Give me all users which have an age attribute between 21 and 35. ‣ Give me the age distribution of all users
  • 19. Copyright © ArangoDB GmbH, 2016 ‣ Give me all users which have an age attribute between 21 and 35. ‣ Give me the age distribution of all users ‣ Group all users by their name
  • 20. Copyright © ArangoDB GmbH, 2016
  • 21. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) S
  • 22. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) ‣ We collect all edges on S B C A S
  • 23. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) ‣ We collect all edges on S ‣ We apply filters on edges B C A SS A B
  • 24. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) ‣ We collect all edges on S ‣ We apply filters on edges ‣ We iterate down one of the new vertices (A) D E B C A SS A B
  • 25. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) ‣ We collect all edges on S ‣ We apply filters on edges ‣ We iterate down one of the new vertices (A) ‣ We apply filters on edges D E B C A SS E A B
  • 26. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) ‣ We collect all edges on S ‣ We apply filters on edges ‣ We iterate down one of the new vertices (A) ‣ We apply filters on edges ‣ The next vertex (E) is in desired depth. Return the path S -> A -> E D E B C A SS E A B
  • 27. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) ‣ We collect all edges on S ‣ We apply filters on edges ‣ We iterate down one of the new vertices (A) ‣ We apply filters on edges ‣ The next vertex (E) is in desired depth. Return the path S -> A -> E ‣ Go back to the next unfinished vertex (B) D E B C A SS E A B
  • 28. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) ‣ We collect all edges on S ‣ We apply filters on edges ‣ We iterate down one of the new vertices (A) ‣ We apply filters on edges ‣ The next vertex (E) is in desired depth. Return the path S -> A -> E ‣ Go back to the next unfinished vertex (B) ‣ We iterate down on (B) D E F B C A SS E A B
  • 29. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) ‣ We collect all edges on S ‣ We apply filters on edges ‣ We iterate down one of the new vertices (A) ‣ We apply filters on edges ‣ The next vertex (E) is in desired depth. Return the path S -> A -> E ‣ Go back to the next unfinished vertex (B) ‣ We iterate down on (B) ‣ We apply filters on edges D E F B C A SS E A F B
  • 30. Copyright © ArangoDB GmbH, 2016 ‣ We first pick a start vertex (S) ‣ We collect all edges on S ‣ We apply filters on edges ‣ We iterate down one of the new vertices (A) ‣ We apply filters on edges ‣ The next vertex (E) is in desired depth. Return the path S -> A -> E ‣ Go back to the next unfinished vertex (B) ‣ We iterate down on (B) ‣ We apply filters on edges ‣ The next vertex (F) is in desired depth. Return the path S -> B -> F D E F B C A SS E A F B
  • 31. Copyright © ArangoDB GmbH, 2016 ‣ Once: ‣ Find the start vertex ‣ For every depth: ‣ Find all connected edges ‣ Filter non-matching edges ‣ Find connected vertices ‣ Filter non-matching vertices Depends on indexes: Hash: Edge-Index or Index-Free: Linear in edges: Depends on indexes: Hash: Linear in vertices: Only one pass: O 1 1 n n * 1 n 3n
  • 32. Copyright © ArangoDB GmbH, 2016 ‣ Linear sounds evil? ‣ NOT linear in All Edges O(E) ‣ Only Linear in relevant Edges n < E ‣ Traversals solely scale with their result size ‣ They are not effected at all by total amount of data ‣ BUT: Every depth increases the exponent: O((3n)d ) ‣ "7 degrees of separation": (3n)6 < E < (3n)7
  • 33. Copyright © ArangoDB GmbH, 2016 ‣ MULTI-MODEL database ‣ Stores Key Value, Documents, and Graphs ‣ All in one core ‣ Query language AQL ‣ Document Queries ‣ Graph Queries ‣ Joins ‣ All can be combined in the same statement ‣ ACID support including Multi Collection Transactions + +
  • 34. Copyright © ArangoDB GmbH, 2016 FOR user IN users RETURN user
  • 35. Copyright © ArangoDB GmbH, 2016 FOR user IN users FILTER user.name == "alice" RETURN user Alice
  • 36. Copyright © ArangoDB GmbH, 2016 FOR user IN users FILTER user.name == "alice" FOR product IN OUTBOUND user has_bought RETURN product Alice
  • 37. Copyright © ArangoDB GmbH, 2016 FOR user IN users FILTER user.name == "alice" FOR product IN OUTBOUND user has_bought RETURN product Alice has_bought TV
  • 38. Copyright © ArangoDB GmbH, 2016 FOR user IN users FILTER user.name == "alice" FOR recommendation, action, path IN 3 ANY user has_bought FILTER path.vertices[2].age <= user.age + 5 AND path.vertices[2].age >= user.age - 5 FILTER recommendation.price < 25 LIMIT 10 RETURN recommendation Alice has_bought TV
  • 39. Copyright © ArangoDB GmbH, 2016 FOR user IN users FILTER user.name == "alice" FOR recommendation, action, path IN 3 ANY user has_bought FILTER path.vertices[2].age <= user.age + 5 AND path.vertices[2].age >= user.age - 5 FILTER recommendation.price < 25 LIMIT 10 RETURN recommendation Alice has_bought TV has_bought playstation.price < 25 PlaystationBob alice.age - 5 <= bob.age && bob.age <= alice.age + 5 has_bought
  • 40. Copyright © ArangoDB GmbH, 2016 ‣ Many graphs have "Supernodes" ‣ Vertices with many inbound and/or outbound edges ‣ Traversing over them is expensive ‣ Often you only need a subset of edges Bob Alice
  • 41. Copyright © ArangoDB GmbH, 2016 ‣ Remember Complexity? O((3n)d ) ‣ Filtering of non-matching edges is linear for every depth ‣ Index all edges based on their vertices and arbitrary other attributes ‣ Find initial set of edges in identical time ‣ Less / No post-filtering required ‣ This decreases the n significantly Alice
  • 42. Copyright © ArangoDB GmbH, 2016 ‣ We have the rise of big data ‣ Store everything you can ‣ Dataset easily grows beyond one machine ‣ This includes graph data!
  • 43. Copyright © ArangoDB GmbH, 2016 ‣ Distribute graph on several machines (sharding) ‣ How to query it now? ‣ No global view of the graph possible any more ‣ What about edges between servers? ‣ In a shared environment network most of the time is the bottleneck ‣ Reduce network hops ‣ Vertex-Centric Indexes again help with super-nodes ‣ But: Only on a local machine
  • 44. First let's do the cluster thingy
  • 45. Copyright © ArangoDB GmbH, 2016 ‣ ArangoDB is the first fully certified database including the persistence primitives for DC/OS ‣ ArangoDB's cluster resource management is based on Apache Mesos ‣ Later this year we will also support Kubernetes
  • 46. Copyright © ArangoDB GmbH, 2016 ‣ ArangoDB can run clusters without it ‣ Setup Requires manual effort (can be scripted) ‣ This works: ‣ Automatic Failover (Follower takes over if leader dies) ‣ Rebalancing of shards ‣ Everything inside of ArangoDB ‣ This is based on Mesos: ‣ Complete self healing ‣ Automatic restart of ArangoDBs (on new machines) ➡ We suggest you have someone on call
  • 49. Copyright © ArangoDB GmbH, 2016 ‣ Only parts of the graph on every machine ‣ Neighboring vertices may be on different machines ‣ Even edges could be on other machines than their vertices ‣ Queries need to be executed in a distributed way ‣ Result needs to be merged locally
  • 50. Copyright © ArangoDB GmbH, 2016 Random Distribution ‣ Advantages: ‣ every server takes an equal portion of data ‣ easy to realize ‣ no knowledge about data required ‣ always works ‣ Disadvantages: ‣ Neighbors on different machines ‣ Probably edges on other machines than their vertices ‣ A lot of network overhead is required for querying
  • 51. Copyright © ArangoDB GmbH, 2016 Random Distribution ‣ Advantages: ‣ every server takes an equal portion of data ‣ easy to realize ‣ no knowledge about data required ‣ always works ‣ Disadvantages: ‣ Neighbors on different machines ‣ Probably edges on other machines than their vertices ‣ A lot of network overhead is required for querying
  • 52. Copyright © ArangoDB GmbH, 2016 Index-Free Adjacency ‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this?
  • 53. Copyright © ArangoDB GmbH, 2016 Index-Free Adjacency ‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this?
  • 54. Copyright © ArangoDB GmbH, 2016 Index-Free Adjacency ‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this?
  • 55. Copyright © ArangoDB GmbH, 2016 Index-Free Adjacency ‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this? ????
  • 56. Copyright © ArangoDB GmbH, 2016 Index-Free Adjacency ‣ ArangoDB uses an hash-based EdgeIndex (O(1) - lookup) ‣ The vertex is independent of it's edges ‣ It can be stored on a different machine ‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this? ????
  • 57. Copyright © ArangoDB GmbH, 2016 Domain Based Distribution ‣ Many Graphs have a natural distribution ‣ By country/region for People ‣ By tags for Blogs ‣ By category for Products ‣ Most edges in same group ‣ Rare edges between groups
  • 58. Copyright © ArangoDB GmbH, 2016 Domain Based Distribution ‣ Many Graphs have a natural distribution ‣ By country/region for People ‣ By tags for Blogs ‣ By category for Products ‣ Most edges in same group ‣ Rare edges between groups
  • 59. Copyright © ArangoDB GmbH, 2016 Domain Based Distribution ‣ Many Graphs have a natural distribution ‣ By country/region for People ‣ By tags for Blogs ‣ By category for Products ‣ Most edges in same group ‣ Rare edges between groups ArangoDB Enterprise Edition uses Domain Knowledge
 for short-cuts
  • 60. Copyright © ArangoDB GmbH, 2016 SmartGraphs - How it works DB Server 1 DB Server 2 DB Server n Coordinator Foxx Coordinator Foxx
  • 61. Copyright © ArangoDB GmbH, 2016 SmartGraphs - How it works DB Server 1 DB Server 2 DB Server n Coordinator Foxx Coordinator Foxx
  • 62. Copyright © ArangoDB GmbH, 2016 SmartGraphs - How it works DB Server 1 DB Server 2 DB Server n Coordinator Foxx Coordinator Foxx
  • 64. Copyright © ArangoDB GmbH, 2016
  • 65. Copyright © ArangoDB GmbH, 2016 Database document write transactions per virtual CPU/second ArangoDB 1,730 Aerospike 2,500 Cassandra 965 Couchbase 1,375 FoundationDB 750 *) Source: https://mesosphere.com/blog/2015/11/30/arangodb-benchmark-dcos/
  • 66. Copyright © ArangoDB GmbH, 2016 ‣ Further questions? ‣ Follow us on twitter: @arangodb ‣ Join our slack: slack.arangodb.com ‣ Follow me on twitter/github: @mchacki ‣ Write me a mail: michael@arangodb.com Thank You