The Native Graph Advantage
Dr. Jim Webber
Chief Scientist, Neo4j
• A cheeky bit of computer science
• Database architecture from 30,000ft
• Why Neo4j is graph native, and why it matters
• Quantitative performance advantages
• Finish
Overview
Applied to data, native data formats or communication protocols are those
supported by a certain computer hardware or software, with maximal
consistency and minimal amount of additional components.
-- Wikipedia
Native: A Definition
Those who can imagine anything,
can create the impossible.
An Unordered Singly Linked-List
27 1657 5674
A Write-Centric Database?
Client: 27 1657 5674
Impractical Design
Three clients: 27 1657 5674
Every client contends for the write lock in a naïve implementation
CRDTs to the Rescue
Client: 27 1657 5674
Client: 5674 7689 6
Client: 1657 5674 66
Merged: 27 1657 5674 7689 6 66
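The merge on this slide can be sketched as a grow-only set, one of the simplest CRDTs: each client adds values to its own replica without taking a shared lock, and replicas are combined with set union. This is a minimal illustration only; the names `merge`, `CLIENT_A`, `CLIENT_B`, `CLIENT_C` and `MERGED` are made up here, and the slide does not say which CRDT it has in mind.

```python
from functools import reduce

# Hypothetical replicas, one per client, holding the values from the slide.
CLIENT_A = frozenset({27, 1657, 5674})
CLIENT_B = frozenset({5674, 7689, 6})
CLIENT_C = frozenset({1657, 5674, 66})


def merge(left, right):
    """Merge two grow-only set replicas.

    Set union is commutative, associative and idempotent, so replicas can
    be merged in any order, any number of times, and still converge.
    """
    return left | right


# Combine all three replicas; no client ever needed a global write lock.
MERGED = reduce(merge, [CLIENT_A, CLIENT_B, CLIENT_C])
```

Because the merge is order-independent, clients can exchange state whenever convenient and still end up with the same six values.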
Trees
[Figure: the values 27, 1657, 5674, 7689, 6 and 66 arranged as a tree instead of a flat list]
Minimise contention
[Figure: the same tree of 27, 1657, 5674, 7689, 6 and 66, with four clients issuing writes to it at the same time]
• Classic B-trees are a common pattern for on-disk databases
• “Index” in memory, files on leaf nodes on disk
• B+ trees are neat for linear scans! But…
Databases Usually <3 Trees
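To make the linear-scan point concrete, here is a small sketch of why B+ trees suit range queries: the leaves are chained, so a scan locates the first relevant leaf and then walks sideways. It is a simplification under stated assumptions: a single level of separator keys stands in for the in-memory index, and `Leaf`, `build_leaves`, `range_scan` and the fan-out of 2 are illustrative names and values, not anything taken from a real database engine.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:
    keys: List[int]
    next_leaf: Optional["Leaf"] = None  # sibling pointer used for scans


def build_leaves(sorted_keys, fanout=2):
    """Pack sorted keys into leaves and chain each leaf to its right sibling."""
    leaves = [Leaf(sorted_keys[i:i + fanout])
              for i in range(0, len(sorted_keys), fanout)]
    for left, right in zip(leaves, leaves[1:]):
        left.next_leaf = right
    return leaves


def range_scan(leaves, low, high):
    """Return all keys k with low <= k <= high, in sorted order."""
    if not leaves:
        return []
    # The first key of each leaf plays the role of the in-memory index.
    separators = [leaf.keys[0] for leaf in leaves]
    start = max(bisect_right(separators, low) - 1, 0)
    found = []
    leaf = leaves[start]
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return found
            if key >= low:
                found.append(key)
        leaf = leaf.next_leaf  # follow the chain instead of re-descending
    return found


LEAVES = build_leaves(sorted([27, 1657, 5674, 7689, 6, 66]))
```

A call such as `range_scan(LEAVES, 27, 5674)` touches the index once and then reads neighbouring leaves in order, which is the access pattern these structures are tuned for.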
Pick the right tool for the job
• It could be tables or columns or KV or documents…
• Each database is likely very good for that model
• Evolution driven by its primary workload in its primary market
• Any add-on doesn’t benefit from this
• Unloved
• Opportunistic (e.g. “multi model”)
• Models don’t compose easily
All Databases have a native model
Why jump on the graph bandwagon?
Graph Layer
• Take existing data store
• Bolt-on Graph-like API from third-party open source
• Declare victory
Graph Operator
• Take existing data store
• Add graph features into the query language
• Declare victory
Two Non-Native Approaches to Graph
Non-Native Architectures
Graph Layer: Graph API over a Graph Layer over another DBMS (e.g. Column Store)
Graph Operator: another query language plus a Graph Operator over another DBMS (e.g. Document Store)
• No Cypher!
• Denormalization
• Requires convention at user level
• Does not understand graphs
• Cannot prevent dangling relationships/logical corruption/etc
• Engine and store are not designed for graphs
• Graphs are not the motivating workload
• Denormalization only works up to modest limits
• E.g. depth 3
• Operational concerns: schema rigidity, evolution
Graph Layer Drawbacks
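The integrity problem can be illustrated with a toy model. In the sketch below a plain dictionary stands in for the underlying non-graph store, and the graph layer writes nodes and relationships as unrelated keys, so nothing stops a node from being deleted while relationships still point at it. The second class shows the check a graph-aware store can apply (Neo4j, for instance, refuses to delete a node that still has relationships unless they are detached). All class and method names are hypothetical.

```python
class KeyValueGraphLayer:
    """Graph API bolted onto a key-value store (here: a plain dict)."""

    def __init__(self):
        self.store = {}

    def add_node(self, node_id):
        self.store[("node", node_id)] = {}

    def add_edge(self, src, dst):
        self.store[("edge", src, dst)] = {}

    def delete_node(self, node_id):
        # The store only sees an opaque key; it knows nothing about edges.
        del self.store[("node", node_id)]

    def dangling_edges(self):
        """Edges whose start or end node no longer exists."""
        return [key for key in self.store
                if key[0] == "edge"
                and (("node", key[1]) not in self.store
                     or ("node", key[2]) not in self.store)]


class GraphAwareStore:
    """Store that understands relationships and protects them."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()

    def add_node(self, node_id):
        self.nodes.add(node_id)

    def add_edge(self, src, dst):
        if src not in self.nodes or dst not in self.nodes:
            raise ValueError("both endpoints must exist")
        self.edges.add((src, dst))

    def delete_node(self, node_id):
        if any(node_id in edge for edge in self.edges):
            raise ValueError("node still has relationships")
        self.nodes.discard(node_id)
```

With the layered design the corruption is silent and only shows up when someone later traverses the dangling relationship; with the graph-aware design the bad delete is rejected at write time.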
Popular Implementation: Column Store
http://javahungry.blogspot.com/2013/08/hashing-how-hash-map-works-in-java-or.html
• Works by convention only
• Underlying engine cannot enforce integrity
• Data structures and store formats are designed for another job entirely
• Performance concerns
Graph Operator Drawbacks
Popular Implementation: B-Trees!
http://zhangliyong.github.io/posts/2014/02/19/mongodb-index-internals.htm
Do one thing, do it well
OLTP, OLAP, HTAP
[Diagram: a cluster of transactional servers and read-only replicas, with the replicas marked for OLAP. An OLTP application and an OLAP application connect to the cluster through the Neo4j drivers.]
neo4j.conf
causal_clustering.server_groups=olap1
causal_clustering.load_balancing.config.server_policies.OLAP=groups(olap1)
Application code
GraphDatabase.driver( "bolt+routing://server?policy=OLAP" );
[Diagram: OLTP, OLAP and HTAP applications. A Cypher application uses the Cypher query engine together with graph algorithms.]
• Actions
• Insights
Clustering
• Label Propagation
• Union Find / Weakly Connected Components
• Strongly Connected Components
• Triangle-Count / Clustering Coefficient
Centrality
• PageRank
• Betweenness
• Closeness
• Degree
Path Finding
• Breadth-first search
• Depth-first search
• Single-source shortest path
• All-pairs shortest path
• Minimum weight spanning tree
[Diagram: a Cypher application uses the Cypher query engine and the graph algorithms side by side in the same database, e.g. CREATE (:User) next to CALL algo.pageRank]
PageRank
N: MATCH (n) RETURN count(n)
M(p): MATCH (p)<--(q) RETURN q
L(p): MATCH (p)-[r]->() RETURN count(r)
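These three quantities are the inputs to the usual PageRank recurrence, PR(p) = (1 - d)/N + d * sum over q in M(p) of PR(q)/L(q), where N is the node count, M(p) the nodes linking to p, and L(q) the number of outgoing relationships of q. The sketch below runs that recurrence over a small in-memory adjacency map. It is illustrative only and is not how the `algo.pageRank` procedure is implemented; the damping factor of 0.85, the default of 20 iterations, the even redistribution of rank from nodes without outgoing links, and the function name `pagerank` are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=20):
    """Power-iteration PageRank over a dict of node -> list of link targets."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)                                   # N
    inbound = {p: [] for p in nodes}                 # M(p)
    for q, targets in links.items():
        for p in targets:
            inbound[p].append(q)
    out_degree = {p: len(links.get(p, [])) for p in nodes}  # L(p)

    rank = {p: 1.0 / n for p in nodes}
    for _ in range(iterations):
        # Assumption: rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[q] for q in nodes if out_degree[q] == 0)
        rank = {
            p: (1 - damping) / n
               + damping * (sum(rank[q] / out_degree[q] for q in inbound[p])
                            + dangling / n)
            for p in nodes
        }
    return rank


# Tiny hypothetical graph in the (:Page)-[:Link]->(:Page) shape.
EXAMPLE_RANKS = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

Each iteration reads every relationship once, which is why a store that can follow relationships cheaply matters for this workload.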
(:Page)-[:Link]->(:Page)
• 11 million nodes
• 116 million relationships
• 20 iterations
• < 10 seconds
DBPedia
• Combine OLTP and OLAP in the same cluster
• Work on up-to-date data, no complex ETL, warehousing
• Mix with graph algorithms
Neo4j is an HTAP Database
Quantitative Analysis
http://cdn2.hubspot.net/hubfs/145335/25-seo-statistics-for-2015-and-what-you-can-learn-from-them.jpg
• Asymptotic benchmarking effort for native graph tech
• “What can Neo4j do when it’s pushed to its limits?”
• The results are impressive
Pushing Neo4j to the Limits
Traversals
Realistic retail dataset from Amazon
Commodity dual Xeon processor server
Social recommendation (Java procedure) equivalent to:
MATCH (you)-[:BOUGHT]->(something)<-[:BOUGHT]-(other)-[:BOUGHT]->(reco)
WHERE id(you)={id}
RETURN reco
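For readers who do not read Cypher, the pattern above can be sketched as three hops over an in-memory purchase list: from you to the things you bought, back to the other people who bought them, and forward to what those people bought. This is only an illustration of the traversal's logic. It is not the Java procedure used in the benchmark, and the function name `recommend`, the `(person, product)` tuple input and the sample data are invented for the example. Like the Cypher pattern, it yields one hit per matching path, and it does not reuse a BOUGHT relationship within a path.

```python
from collections import Counter, defaultdict


def recommend(purchases, you):
    """Count, per product, the paths you -> something <- other -> reco."""
    bought = defaultdict(set)      # person  -> products they bought
    bought_by = defaultdict(set)   # product -> people who bought it
    for person, product in purchases:
        bought[person].add(product)
        bought_by[product].add(person)

    recos = Counter()
    for something in bought[you]:              # (you)-[:BOUGHT]->(something)
        for other in bought_by[something]:     # (something)<-[:BOUGHT]-(other)
            if other == you:
                continue                       # would reuse your own purchase
            for reco in bought[other]:         # (other)-[:BOUGHT]->(reco)
                if reco == something:
                    continue                   # would reuse other's purchase
                recos[reco] += 1
    return recos


# Hypothetical sample data.
PURCHASES = [
    ("ann", "book"),
    ("bob", "book"), ("bob", "lamp"),
    ("cat", "book"), ("cat", "lamp"), ("cat", "mug"),
    ("dan", "mug"),
]
```

The cost of the query depends on how many relationships hang off the starting node and its neighbours, not on the total size of the dataset, which is the property the benchmark below exercises.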
• Can comfortably handle 1 trillion relationships on a single server
• 24x2TB SSDs, 33TB size on disk.
• Compiled Cypher query
• Random reads
• Sustains over 100k user transactions/sec
• Even with 99.8% page faults because of modest 512GB RAM
Read Scale
• Import Friendster dataset
• 1.8 billion relationships takes around 20 minutes
• That is 1M writes/second!
Write Scale
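As a back-of-envelope check on that figure, dividing the relationship count by the import time gives roughly 1.5 million relationships per second, so the slide's 1M writes/second is a rounded-down number. The helper name below is made up, and the calculation assumes exactly 20 minutes and counts relationships only.

```python
def writes_per_second(records, minutes):
    """Average write rate for an import of `records` taking `minutes`."""
    return records / (minutes * 60)


# 1.8 billion relationships in about 20 minutes, as stated on the slide.
FRIENDSTER_RATE = writes_per_second(1.8e9, 20)
```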
>50M traversals/sec
1,000,000 writes/sec
10^12 Records
Comparison on a ~10M node, ~100M relationship graph
Non-native graph DB: 6 machines, each with 48 VCPUs, 256 GB disk and 256 GB of RAM
Neo4j 3.3: single machine

Workload                          Non-native graph DB   Neo4j 3.3
Count nodes                       201s                  < 1ms
Count outgoing rels               202s                  < 1ms
Count outgoing rels at depth 2    276s                  56s
Count outgoing rels at depth 3    511s                  423s
Group nodes by property val       212s                  25s
Group rels by type                198s                  26s
Count depth 2 knows-likes         324s                  133s
Page Rank                         2571s                 27s
Just one more thing…
Amazing Native Graph Performance
Thanks for coming today!
Drinks on the 13th Floor
@jimwebber

SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
 
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AIDeloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
 
Ingka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignIngka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by Design
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
 

GraphTour - Closing Keynote

Editor's Notes

  1. The Neo4j database fits this definition: a small number of modules each dedicated to some part of graph storage and query. There’s no other DBMS underneath requiring translation into/out from the native world. And that provides serious benefits to the end user.
  2. Science time. A reminder of algorithms and data structures
  3. What are the properties of this list? If you want to add something, it’s easy to just insert. If you want to find something (or find out it’s not there), it’s laborious. It is O(1) for writes = great. It is O(N) for reads = sucky.
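The cost asymmetry in note 3 can be sketched in a few lines of Python. This is a minimal illustration, not anything from the talk: the class and method names (`Cell`, `UnorderedList`, `insert`, `find`) are assumptions, and the values are the ones shown on the slide.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    value: int
    next: Optional["Cell"] = None


class UnorderedList:
    """Unordered singly linked list: insert is O(1), find is O(N)."""

    def __init__(self) -> None:
        self.head: Optional[Cell] = None

    def insert(self, value: int) -> None:
        # Write: a single pointer update, regardless of list size.
        self.head = Cell(value, self.head)

    def find(self, value: int) -> tuple[bool, int]:
        # Read: walk cell by cell; return (found, cells visited).
        visited = 0
        cell = self.head
        while cell is not None:
            visited += 1
            if cell.value == value:
                return True, visited
            cell = cell.next
        return False, visited


items = UnorderedList()
for v in (27, 1657, 5674):
    items.insert(v)

print(items.find(27))    # inserted first, so it now sits in the last cell
print(items.find(9999))  # a miss has to visit every cell
```

The number of cells visited by `find` grows with the list, while `insert` always does the same amount of work.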
  4. Pop it in a box. Put an API on it. Voilà, it’s a database! Yes, it’s a crappy database, but for our purposes it suits. If you squint, it could even be a blockchain. Works great for one client, but…
  5. Works great for one client, but…
  6. Conflict-free Replicated Data Type. A CRDT is a data structure that has well-known merge rules. We can write into several concurrent copies (on different servers) and merge them all later. Great! Because we don’t care about ordering, this is easy peasy, and even a CRDT library can do this for us. But this database is still awful for reads: reads get slower the more data you add. You could even go another route, assume that you rarely read, and do something like large, fast ring buffers. Lots of options. But reading what you’ve written is always expensive.
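One of the simplest CRDTs that matches note 6 is a grow-only set, whose merge rule is plain set union. The sketch below assumes that model (the talk does not say which CRDT it has in mind); the `GSet` class and its method names are illustrative, and the replica contents are the three client lists from the slide.

```python
from functools import reduce


class GSet:
    """Grow-only set: a simple CRDT whose merge rule is set union."""

    def __init__(self, items=()):
        self._items = frozenset(items)

    def add(self, item):
        # Adds are local to a replica; no coordination with other replicas.
        return GSet(self._items | {item})

    def merge(self, other):
        # Union is commutative, associative and idempotent, so replicas
        # can be merged in any order and any number of times.
        return GSet(self._items | other._items)

    def values(self):
        return set(self._items)


replica_a = GSet([27, 1657, 5674])
replica_b = GSet([5674, 7689, 6])
replica_c = GSet([1657, 5674, 66])

merged = reduce(GSet.merge, [replica_a, replica_b, replica_c])
print(sorted(merged.values()))
```

Because ordering does not matter, every replica converges on the same six values whichever way the merges happen. Reads are no cheaper than before, though: finding a value still means searching the whole collection.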
  7. Let’s try again. Binary tree: log(n) for reads, and log(n) for writes because you first have to do a log(n) read to find where to write.
  8. Can read anywhere in principle. Can write across the leading edge of the tree. Contention is generally federated through the structure.
  9. When you’ve got trees, you get lots of logs. That is log(n) lookups, and m log(n) traversal speed for graphs (m hops, each paying a log(n) lookup). This model isn’t a good choice for graph workloads.
  10. Of course Neo4j has some indexes that are tree based, but most of the time we only use them to find starting points in the graph. Traversals in Neo4j are O(1) per hop. Native graph = far fewer log(n) penalties.
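The contrast in notes 9 and 10 can be made concrete by counting work per hop. The sketch below is a toy model, not Neo4j's storage engine: a binary search over a sorted key list stands in for a tree index, direct object references stand in for native adjacency, and all names (`index_lookup`, `Node`, `build_ring`, and the two traverse functions) are assumptions.

```python
def index_lookup(keys, key):
    """Binary search over a sorted key list; returns (position, probes)."""
    lo, hi, probes = 0, len(keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return lo, probes


class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.neighbour = None  # direct reference, set by build_ring


def build_ring(n):
    # Ring of n nodes: node i points at node (i + 1) % n.
    nodes = [Node(i) for i in range(n)]
    for i, node in enumerate(nodes):
        node.neighbour = nodes[(i + 1) % n]
    return nodes


def traverse_with_index(keys, targets, start_id, hops):
    # Non-native stand-in: every hop pays an index lookup to find the next id.
    current, probes = start_id, 0
    for _ in range(hops):
        pos, cost = index_lookup(keys, current)
        probes += cost
        current = targets[pos]
    return current, probes


def traverse_with_references(start, hops):
    # Native stand-in: every hop follows one stored reference.
    current, steps = start, 0
    for _ in range(hops):
        current = current.neighbour
        steps += 1
    return current.node_id, steps


n, hops = 1024, 50
nodes = build_ring(n)
keys = list(range(n))
targets = [(i + 1) % n for i in range(n)]

print(traverse_with_index(keys, targets, 0, hops))   # (end node, index probes)
print(traverse_with_references(nodes[0], hops))      # (end node, pointer hops)
```

Both traversals end on the same node, but the index-based one pays roughly log(n) probes on every hop, so its total cost grows with the size of the data set as well as with the length of the path.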
  11. The linked list is great for writes, less good for reads. B-trees strike a balance between reads and writes. Your design and implementation choices empower you for your native model. Your design and implementation choices limit you for other use cases.
  12. Caveat emptor: buyer beware. Models don’t compose easily. You can make documents from graphs conveniently, but not so much the other way. Non-native: the sea lamprey costs $500k per year to control in NY state! It’s not a native part of the ecosystem.
  13. The graph trend is enormous and outstripping all other models. If you’re a vendor in one of the slower-growing models, you need some graph *story*. Bandwagon jumping.
  14. Some vendors have spotted the enormous graph trend and are simply jumping on the bandwagon. Let’s take a look at their non-native architecture.
  15. Architecture
  16. We’ve seen two approaches in the market where a non-graph vendor has tried to stretch their data structures to graph
  17. Today most non-native graphs have their own APIs: not Cypher, not openCypher. That excludes them from an amazing ecosystem of tools and people, which is to their detriment. I also think that Cypher is by far the best graph query language. By definition, it builds on the learning of earlier languages: SQL, Gremlin, SPARQL.
  18. [Japanese Knotweed] The graph API suffers because most of the data store is focussed on the existing data model. The data structures aren’t designed for graphs, nor are the store formats. Graphs are a hobby, a tick box, something to answer RFPs. Graphs are not the motivating workload. The motivating workload doesn't even have relationships, and therefore the DB engine will not optimise for them. Upper levels try to compensate but generally can only do so for a few hops. How many hops even to traverse your data center? Or your train ride? Or your Mars mission?
  19. A column store provides a nested hashmap data structure: a hashmap of hashmaps. Theoretical O(1) lookups for items seem great! But O(n) in practice because of collisions, and pathologically O(n²) for inserting n objects! It is also not mechanically sympathetic: hashes distribute data to avoid clashes, but performance comes from data locality. Work at disk speed if unoptimized; work at RAM speed if optimized, but then you have to denormalize. Serious limitations (e.g. only queries up to depth 3 are optimised). And then add in network latency for a distributed hash ring.
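The pathological case in note 19 is easy to provoke with Python's built-in `dict`. The sketch below deliberately gives every key the same hash so that each insert has to be compared against the keys already stored. `CollidingKey` and `comparisons_to_insert` are illustrative names, and a real column store's hash ring will behave differently in detail.

```python
class CollidingKey:
    """Key with a constant hash, so every key collides with every other."""

    eq_calls = 0  # class-wide counter of equality checks

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # deliberately pathological

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.name == other.name


def comparisons_to_insert(n):
    """Insert n colliding keys into a dict; return how many __eq__ calls it took."""
    CollidingKey.eq_calls = 0
    table = {}
    for i in range(n):
        table[CollidingKey(i)] = i
    return CollidingKey.eq_calls


for n in (100, 200, 400):
    print(n, comparisons_to_insert(n))
```

Doubling the number of keys roughly quadruples the comparison count, which is the quadratic insert cost the note warns about. With a well-distributed hash the same loop needs almost no comparisons, but only by scattering keys and giving up data locality.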
  20. [Himalayan Balsam] Add a graph lookup operator to the query language. Use some conventions in the existing model to infer linkage that the new operator can use. But no native support for links means slow. The data structures aren’t designed for graphs, nor are the store formats. It also means you need clever workarounds, and you reach the limits of those workarounds quickly. Again: how many hops even to traverse your data center? Or your train ride? Or your Mars mission? And if you disobey those conventions: no graph, and there is nothing to enforce them.
  21. The underlying model knows nothing about links, so: it is not that good for general-purpose graphs because you can’t denormalize for all possible use cases. Deleting documents leaves dangling links (the document engine doesn’t have referential constraints). More generally, the user has to ensure conventions are upheld to make graph features work. It is easy to unintentionally disable graph features when other folks have only a document view of the data. And then add network latency for all lookups. The result: poor performance at modest search depth, difficult governance (the engine does not respect the graph), and poor expressivity for any reasonable graph problem.
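The dangling-link problem from note 21 can be shown with a toy document store. Everything here is hypothetical: the store is a plain dict, and the "links" field is an invented convention of the kind a graph operator might rely on.

```python
# Hypothetical convention: a document's "links" field holds the ids of
# other documents. The store itself attaches no meaning to that field.
store = {
    "alice": {"name": "Alice", "links": ["bob", "carol"]},
    "bob": {"name": "Bob", "links": ["carol"]},
    "carol": {"name": "Carol", "links": []},
}


def delete_document(store, doc_id):
    # The engine only knows about documents, so nothing checks inbound links.
    del store[doc_id]


def dangling_links(store):
    """Return (source, target) pairs whose target document no longer exists."""
    return [
        (src, dst)
        for src, doc in store.items()
        for dst in doc.get("links", [])
        if dst not in store
    ]


delete_document(store, "carol")
print(dangling_links(store))
```

After the delete, both remaining documents still point at "carol". Keeping the graph consistent is left entirely to application code, which is the governance problem the note describes.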
  22. Non-native databases serve two (or more) domains. They always prefer their primary domain: it’s what most of their users need. So while there are CS and engineering considerations, there’s also the notion of doing one thing well that underpins Neo4j. Neo4j supports graph workloads natively, from bottom to top. It is not a document store or a column store; it is a native graph database. Let’s see how we do it.
  23. For us that one thing is graphs. But graphs are useful in a variety of processing contexts.
  24. First of all, Online Transaction Processing: OLTP. What OLTP typically means for a graph is reading or writing a small part of the whole graph.
  25. The second way we see people using graphs is for Online Analytical Processing. Analytics typically means processing much larger sections of the graph, and often, in fact, processing the whole graph. For the last few decades, the trend has been for specialist technology to handle analytic workloads (different systems, different data models maybe), isolated from OLTP systems. Well, now there’s a new trend.
  26. Recently there’s been lots of talk about something called HTAP: *Hybrid* transactional and analytical processing. The idea is that if you could have one system that serves both workloads, you can run your analytics on up-to-date or nearly up-to-date data, so that you can respond to things faster. Also, maybe it’s just not worth the complexity of two totally different systems. What are we doing about this at Neo4j?
  27. Since Neo4j 3.1 the cluster architecture has supported dividing the cluster into different groups. Here I’m showing 5 servers on the left that handle the transactional workload, updating the graph, and a read-only replica, which is useful for read-heavy workloads.
  28. What this gives you is a part of the cluster that is perfect for OLAP workloads. Mostly isolated from the main transactional cluster, work over here won’t impact the transactional workload.
  29. You can also specialize the hardware for each workload: for example, use machines with more RAM or CPU cores for the OLAP workload. How do you use this cluster?
  30. Well, you use the Neo4j Drivers to talk to all the servers in this cluster. If you’ve got an OLAP workload and you want it to go just to the OLAP-specialized machines, you can do this through configuration alone.
  31. When you create a Neo4j Driver in your application, you specify a policy. And on the servers you say what that policy means, which groups it should send queries to, and which servers are in those groups.
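The division of labour in note 31 (the client names a policy, the servers define what it means) can be modelled in a few lines. This is a toy simulation, not the Neo4j driver API or its configuration syntax: the group names, policy names, server names and the `ToyRouter` class are all assumptions made for illustration.

```python
import itertools

# Hypothetical server-side configuration: which servers belong to which
# group, and which groups each named policy is allowed to use.
SERVER_GROUPS = {
    "core": ["core-1", "core-2", "core-3"],
    "olap": ["replica-1", "replica-2"],
}
POLICIES = {
    "default": ["core"],
    "analytics": ["olap"],
}


class ToyRouter:
    """Resolves a policy name to servers and round-robins reads across them."""

    def __init__(self, policy):
        if policy not in POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        servers = [s for group in POLICIES[policy] for s in SERVER_GROUPS[group]]
        self._cycle = itertools.cycle(servers)

    def route_read(self):
        # Pick the next server allowed by this router's policy.
        return next(self._cycle)


oltp = ToyRouter("default")
olap = ToyRouter("analytics")
print([oltp.route_read() for _ in range(3)])
print([olap.route_read() for _ in range(3)])
```

The application only ever mentions a policy name. Moving the analytics workload onto different machines means editing the server-side tables, not the application.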
  32. So that gives us our workload directed to the right servers in the cluster. I don’t really need to have two different applications.
  33. I can have one application doing a mixture of OLTP and OLAP and still have the work routed to the right place. Now let’s look at the work itself in more detail.
  34. This is a model for using Neo4j: you have an application. It sends Cypher queries. They get run by the query engine, which queries the graph model. This model is great, but now we’re adding something else into the picture.
  35. Graph algorithms. They’re firmly on the analytics side of things. They look at a whole graph. You run them and they lead to actions like “this transaction seems fraudulent, you should investigate”, or they lead to insights like “this is the type of customer we do well selling to, we should tune our business around them”.
  36. There are two broad categories of algorithms available with 3.3. Centrality algorithms identify nodes that have significant positions in the network. Clustering algorithms are about detecting groups or clusters of nodes.
  37. So if we want to run these graph algorithms, how do they fit into the picture?
  38. Well, we’ve packaged the algorithms as a set of procedures. This means they sit alongside the Cypher query engine behind exactly the same Cypher interface.
  39. To run one of the algorithms, it’s just a call to the relevant procedure. It works just the same way as running a normal Cypher query. Now let’s have a look at one of the algorithms in more detail.
  40. I’ve picked PageRank because it’s quite well known. PageRank scores the importance of each node according to the importance of the other nodes that link to it. So it’s a kind of recursive definition
  41. Practically, what that means is that you have to iterate: consider all the nodes and all the relationships in the graph, many times over.
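The recursive definition in note 40 and the iteration in note 41 can be sketched as a plain power-iteration loop. This is a textbook formulation, not the implementation behind Neo4j's procedures: the damping factor of 0.85, the handling of nodes without outgoing links, and the toy graph are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=20):
    """Iterative PageRank over a dict of node -> list of outgoing neighbours."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Each iteration touches every node and every relationship once.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node passes its importance on to the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Node with no outgoing links: spread its rank over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
for node, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(node, round(score, 4))
```

Node "d" has no inbound links, so it ends up with the lowest score. Every pass walks the whole graph, which is why efficient traversal matters so much for this kind of workload.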
  42. As the algorithm
  43. Efficiency for graph operations is paramount. You don’t need huge macho clusters to do this.
  44. I think these are incredibly useful building blocks for your next-gen systems. I’m looking forward to seeing the kind of applications that get built with this stuff.
  45. And on a scalability note: Neo4j is light enough to scale down to some really interesting edge compute cases, like Stefan Armbruster’s RasPi cluster! But let’s dig down a bit further. Cypher is at the heart of Neo4j and we’ve heard a lot about it today. I’d like to invite Tobias Lindaaker to the stage to talk about advances in the Cypher runtime that translate into performance advantages for you.
  46. But now let’s reflect on what it means practically to choose graph native technology
  47. So let’s zoom in on the lowest levels: what are the performance advantages of native graph? And what can we do when we really push the envelope, working the machinery as hard as possible? Lots. Our CTO Johan decided to push the machinery to its limit and see what it can do.
  48. Tease Johan.
  49. User transaction means real units of work that are meaningful and valuable to the application. Lots of traversals involved. Not an artificial to-first-byte delivery benchmark. Random reads are the hardest for a database to optimise so this is a truly challenging benchmark.
  50. This is soon to be outdated: our new highly parallel importer will be far faster. For transactional updates, even on my modest laptop I can get several thousand ACID tx/sec online.
  51. You can get so much work done so quickly with numbers like those.
  52. You don’t have to follow me on this path though. You take the blue pill, the story ends. You wake up in your data centre, shoe-horning connected data into those same DBMSs not designed for it. You take the red pill, and stay in graph land. And I show you how deep traversals can go in the real world. We’re taking the red pill.
  53. I saw this on the internet and thought it looked like a neat challenge. We had the DBpedia dataset to hand, which is comparable in size (slightly larger but from the real world: 11M nodes, 116M links). Theirs was synthetic and slightly smaller. The original experiment ran on 288 cores with 1.5TB RAM. Neo4j ran on a single workstation with 128GB RAM for the database in total; thanks to Michael Hunger for running the experiment. That itself is a remarkable illustration of how efficient Neo4j can be. Sure, it’s macho to run 6 large machines, but it’s more sensible not to. *** Describe what’s going on *** then: This is not really a fair comparison. The work undertaken by the non-native store is far higher than the work undertaken by Neo4j. But that’s the whole point! Because Neo4j can optimise for graphs all the way down the stack, we can and have implemented all kinds of shortcuts that databases optimised for tables or columns or keys and values or documents can’t do. If you saw a similar table a year ago: the Neo4j column is even faster now, in some cases 2x faster.
  54. One more thing…
  55. The Neo4j engineering team has done some fantastic stuff in the last couple of years: that’s a 3B node, 18B relationship graph PageRanked with 20 iterations in less than 2 hours using the graph algos. On commodity hardware. Imagine what we can do with Cypher for Apache Spark too! We also measure ourselves on the standard LDBC 100 benchmark. Running since March 2016: “SF100 Read” has improved *~2x* (~2800 tx/s --> ~5000 tx/s); “SF100 Write” has improved *~4x* (~5000 tx/s --> ~20000 tx/s).
  56. It just remains for me to invite you to join us for drinks.
  57. OK, I guess this is more accurate.