SlideShare a Scribd company logo
1 of 28
Mapping Graph Queries to PostgreSQL
Gábor Szárnyas, József Marton,
Márton Elekes, János Benjamin Antal
Budapest DB Meetup — 2018/Nov/13
PROPERTY GRAPH DATABASES
NoSQL family
Data model:
 nodes
 edges
 properties
#1 query approach:
graph pattern matching
RANKINGS: POPULARITY CHANGES PER CATEGORY
CYPHER AND OPENCYPHER
Cypher: query language of the Neo4j graph database.
„Cypher is a declarative, SQL-inspired language for describing
patterns in graphs visually using an ascii-art syntax.”
MATCH
(d:Database)<-[:RELATED_TO]-(:Talk)-[:PRESENTED_AT]->(m:Meetup)
WHERE m.date = 'Tuesday, November 13, 2018'
RETURN d
„The openCypher project aims to deliver a full and open
specification of the industry’s most widely adopted graph
database query language: Cypher.” (late 2015)
OPENCYPHER SYSTEMS
 Increasing adoption
 Relational databases:
o SAP HANA
o AGENS Graph
 Research prototypes:
o Graphflow (Univesity of Waterloo, Canada)
o ingraph (incremental graph engine @ BME)
(Source: Keynote talk @ GraphConnect NYC 2017)
PROPERTY GRAPHS
 Textbook graph: 𝐺 = 𝑉, 𝐸
o Dijkstra, Ford-Fulkerson, etc.
o Homogenous nodes
o Homogeneous edges
 Extensions
o Labelled nodes
o Typed edges
o Properties
 The schema is implicit
 Very intuitive
o Things and connections
Bob
53
Carol
38
A Ltd.
David
47
20182007
Erin
30
name
Person,
Student
Company
COLLEAGUE
FRIEND
WORKS_AT
since
name
age Person
GRAPH VS. RELATIONAL DATABASES
 Graph databases
o Graph-based modelling is intuitive
o Concise query language
 Relational databases
o Most common
o Many legacy systems
o Efficient and mature
 No tools available to bridge the two
o i.e. query data in RDBs as a graph
o first you have to wrangle the graph out of the RDB
Col1 Col2
1 A
2 B
Tables
Graph
:Course
name: 'Phys1'
REQUIRES
:Course
name: 'Phys2'ENROLLED
:Student
name: 'John'
ENROLLED
PROPOSED APPROACH
To get the best of both worlds, map Cypher queries to SQL:
1. Formulate queries in Cypher
2. Execute inside an existing RDB
3. Return results as a graph relation
GRAPH QUERIES IN CYPHER
 Subgraph matching and graph traversals
 Example: Alice’s colleagues and their colleagues
MATCH (p1:Person {name: 'Alice'})-[c:COLL*1..2]-(p2:Person)
RETURN p2.name
Bob
53
Carol
38
A Ltd.
David
47
20182007
Erin
30
name
Person,
Student
Company
COLLEAGUE
FRIEND
WORKS_ATsince
name
age Person
Data and Query Mapping
MAPPING BETWEEN DATA MODELS
Class 2
Objects
:Class1
attr1: String
attr2: int
:Class 2
attr1: String
attr2: int
Graph
:Course
name: 'Phys1'
REQUIRES
:Course
name: 'Phys2'ENROLLED
:Student
name: 'John'
ENROLLED
object-
graph
mapping
object-relational mapping (ORM)
e.g. JPA, Entity Framework
graph-relational mapping
Col1 Col2
1 A
2 B
Tables
DATA MAPPING #1: GENERIC SCHEMA
 Useful for representing schema-free data sets
 Hopelessly slow
Bob
53
Carol
38
A Ltd.
David
47
20182007
Erin
30
DATA MAPPING #2: CONCRETE SCHEMA
QUERY MAPPING
Cypher
query
graph
relational
algebra
SQL
algebra
SQL
query
MATCH (p:Person)
-[:COLL|:FRIEND]-()
…
RETURN p
◯ 𝑃𝑒𝑟𝑠𝑜𝑛
⋈
𝜎 𝑎=5
◯ → ◯
COLL,
FRIEND
𝜋 𝑝
⋈ 𝑐𝑜𝑙1,𝑐𝑜𝑙2
𝜎 𝑎=5
𝜋 𝑝
compiler mapping code
generation
◯ person
◯ → ◯
colleague
◯ → ◯
friend
⋃
SELECT …
FROM …
SELECT …
FROM …
SELECT …
FROM …
SELECT …
FROM …
SELECT …
FROM …
SELECT …
FROM …
SELECT …
FROM …
Gábor Szárnyas, József Marton, Dániel Varró:
Formalising openCypher Graph Queries in Relational Algebra.
ADBIS 2017
A SIMPLE EXAMPLE
MATCH (node)
WITH node.name AS name
RETURN name ◯(𝑛𝑜𝑑𝑒,𝑛𝑜𝑑𝑒.𝑛𝑎𝑚𝑒)
𝜋 𝑛𝑜𝑑𝑒.𝑛𝑎𝑚𝑒→𝑛𝑎𝑚𝑒
𝜋 𝑛𝑎𝑚𝑒
SELECT "name"
FROM
(
SELECT "node.name" AS "name"
FROM
(
SELECT
vertex_id AS "node",
(SELECT value
FROM vertex_property
WHERE parent = vertex_id AND key = 'name') AS "node.name"
FROM vertex
)
)
Cypher query relational algebra tree
ingraph
mapping
CHALLENGES #1
 Variable length paths: union of multiple subqueries
 Unbound: WITH RECURSIVE (fixpoint-based evaluation)
 WITH RECURSIVE was introduced in SQL:1999 but
o PostgreSQL 8.4+ (since 2009)
o SQLite 3.8.3+ (since 2014)
o MySQL 8.0.1+ (since 2017)
MATCH (p1:Person {name: 'Alice'})-[c:COLL*1..2]-(p2:Person)
RETURN p2.name
MATCH (p1:Person {name: 'Alice'})-[c:COLL*]-(p2:Person)
RETURN p2.name
CHALLENGES #2
 Edges are directed
o Undirectedness is modelled in the query
o Union of both directions
MATCH (p1:Person …)-[:FRIEND]-(p2:Person),
(p2)-[:WORKS_AT]->(c:Company)
RETURN p2.name, c.name
Bob
53
Carol
38
A Ltd.
David
47
20182007
Erin
30
name
Person,
Student
Company
COLLEAGUE
FRIEND
WORKS_ATsince
name
age Person
CHALLENGES #3
 Multiple tables as sources
MATCH (p1:Person …)-[:COLL|:FRIEND]-(p2:Person)
RETURN p2.name
Bob
53
Carol
38
David
47
Erin
30
Person,
Student
COLLEAGUE
FRIEND
name
age Person
id name
1 Alice
2 Bob
person
p1 p2
1 2
1 3
colleague
p1 p2
2 1
5 4
friend
name
age
CHALLENGES #1 #2 #3
Simple graph patterns turn to many subqueries
 Querying edges as undirected:
• Enumerate edges in both directions (2 ×)
 Variable length paths (1..L, *):
• Limited: enumerate 1, 2, … , 𝐿 → 𝐿 ×
• Unlimited: use WITH RECURSIVE
 Multiple node labels/edge types:
• Enumerate all source tables (𝑁 ×)
Total: union of 2 × 𝐿 × 𝑁 subqueries in SQL
Bob
53
Carol
38
David
47
Person,
Student
COLLEAGUE
FRIEND
name
age Person
MATCH (p1:Person …)-[:COLL|:FRIEND*1..2]-(p2:Person)
A COMPLEX EXAMPLE
WITH
q0 AS
(-- GetVerticesWithGTop
SELECT
ROW(0, p_personid)::vertex_type AS "_e186#0",
"p_personid" AS "_e186.id#0"
FROM person),
q1 AS
(-- Selection
SELECT * FROM q0 AS subquery
WHERE ("_e186.id#0" = :personId)),
q2 AS
(-- GetEdgesWithGTop
SELECT ROW(0, edgeTable."k_person1id")::vertex_type AS "_e186#0", ROW(0, edgeTable."k_person1id", edgeTable."k_person2id")::edge_type AS
"_e187#0", ROW(0, edgeTable."k_person2id")::vertex_type AS "friend#2",
toTable."p_personid" AS "friend.id#2", toTable."p_firstname" AS "friend.firstName#1", toTable."p_lastname" AS "friend.lastName#2"
FROM "knows" edgeTable
JOIN "person" toTable ON (edgeTable."k_person2id" = toTable."p_personid")),
q3 AS
(-- GetEdgesWithGTop
SELECT ROW(0, edgeTable."k_person1id")::vertex_type AS "friend#2", ROW(0, edgeTable."k_person1id", edgeTable."k_person2id")::edge_type AS
"_e187#0", ROW(0, edgeTable."k_person2id")::vertex_type AS "_e186#0",
fromTable."p_personid" AS "friend.id#2", fromTable."p_firstname" AS "friend.firstName#1", fromTable."p_lastname" AS "friend.lastName#2"
FROM "knows" edgeTable
JOIN "person" fromTable ON (fromTable."p_personid" = edgeTable."k_person1id")),
q4 AS
(-- UnionAll
SELECT "_e186#0", "_e187#0", "friend#2", "friend.id#2", "friend.firstName#1", "friend.lastName#2" FROM q2
UNION ALL
SELECT "_e186#0", "_e187#0", "friend#2", "friend.id#2", "friend.firstName#1", "friend.lastName#2" FROM q3),
q5 AS
(-- EquiJoinLike
SELECT left_query."_e186#0", left_query."_e186.id#0", right_query."friend#2", right_query."friend.id#2", right_query."_e187#0",
right_query."friend.lastName#2", right_query."friend.firstName#1" FROM
q1 AS left_query
INNER JOIN
q4 AS right_query
ON left_query."_e186#0" = right_query."_e186#0"),
q6 AS
(-- GetEdgesWithGTop
SELECT ROW(6, fromTable."m_messageid")::vertex_type AS "message#17", ROW(8, fromTable."m_messageid", fromTable."m_creatorid")::edge_type AS
"_e188#0", ROW(0, fromTable."m_creatorid")::vertex_type AS "friend#2",
fromTable."m_messageid" AS "message.id#2", fromTable."m_content" AS "message.content#2", fromTable."m_ps_imagefile" AS
"message.imageFile#0", fromTable."m_creationdate" AS "message.creationDate#13"
FROM "message" fromTable
WHERE (fromTable."m_c_replyof" IS NULL)),
q7 AS
(-- GetEdgesWithGTop
SELECT ROW(6, fromTable."m_messageid")::vertex_type AS "message#17", ROW(8, fromTable."m_messageid", fromTable."m_creatorid")::edge_type AS
"_e188#0", ROW(0, fromTable."m_creatorid")::vertex_type AS "friend#2",
fromTable."m_messageid" AS "message.id#2", fromTable."m_content" AS "message.content#2", fromTable."m_creationdate" AS
"message.creationDate#13"
FROM "message" fromTable
WHERE (fromTable."m_c_replyof" IS NOT NULL)),
q8 AS
(-- UnionAll
SELECT "message#17", "_e188#0", "friend#2", "message.id#2", "message.content#2", "message.imageFile#0", "message.creationDate#13" FROM q6
UNION ALL
SELECT "message#17", "_e188#0", "friend#2", "message.id#2", "message.content#2", NULL AS "message.imageFile#0", "message.creationDate#13"
FROM q7),
q9 AS
(-- EquiJoinLike
SELECT left_query."_e186#0", left_query."_e186.id#0", left_query."_e187#0", left_query."friend#2", left_query."friend.id#2",
left_query."friend.firstName#1", left_query."friend.lastName#2", right_query."message#17", right_query."message.id#2",
right_query."message.imageFile#0", right_query."_e188#0", right_query."message.creationDate#13", right_query."message.content#2" FROM
q5 AS left_query
INNER JOIN
q8 AS right_query
ON left_query."friend#2" = right_query."friend#2"),
q10 AS
(-- AllDifferent
SELECT * FROM q9 AS subquery
WHERE is_unique(ARRAY[]::edge_type[] || "_e188#0" || "_e187#0")),
q11 AS
(-- Selection
SELECT * FROM q10 AS subquery
WHERE ("message.creationDate#13" <= :maxDate)),
q12 AS
(-- Projection
SELECT "friend.id#2" AS "personId#0", "friend.firstName#1" AS "personFirstName#0", "friend.lastName#2" AS "personLastName#0",
"message.id#2" AS "postOrCommentId#0", CASE WHEN ("message.content#2" IS NOT NULL = true) THEN "message.content#2"
ELSE "message.imageFile#0"
END AS "postOrCommentContent#0", "message.creationDate#13" AS "postOrCommentCreationDate#0"
FROM q11 AS subquery),
q13 AS
(-- SortAndTop
SELECT * FROM q12 AS subquery
ORDER BY "postOrCommentCreationDate#0" DESC NULLS LAST, ("postOrCommentId#0")::BIGINT ASC NULLS FIRST
LIMIT 20)
SELECT "personId#0" AS "personId", "personFirstName#0" AS "personFirstName", "personLastName#0" AS "personLastName", "postOrCommentId#0" AS
"postOrCommentId", "postOrCommentContent#0" AS "postOrCommentContent", "postOrCommentCreationDate#0" AS "postOrCommentCreationDate"
FROM q13 AS subquery
MATCH (:Person {id:$personId})-[:KNOWS]-(friend:Person)<-
[:HAS_CREATOR]-(message:Message)
WHERE message.creationDate <= $maxDate
RETURN
friend.id AS personId,
friend.firstName AS personFirstName,
friend.lastName AS personLastName,
message.id AS postOrCommentId,
CASE exists(message.content)
WHEN true THEN message.content
ELSE message.imageFile
END AS postOrCommentContent,
message.creationDate AS postOrCommentCreationDate
ORDER BY postOrCommentCreationDate DESC, toInteger(postOrCommentId) ASC
LIMIT 20
Benchmarks
BENCHMARKS: LINKED DATA BENCHMARK COUNCIL
LDBC is a non-profit organization dedicated to establishing benchmarks,
benchmark practices and benchmark results for graph data
management software.
The Social Network Benchmark is an industrial and academic initiative,
formed by principal actors in the field of graph-like data management.
LDBC IN A NUTSHELL
Peter Boncz, Thomas Neumann, Orri Erling,
TPC-H Analyzed: Hidden Messages and Lessons Learned
from an Influential Benchmark,
TPCTC 2013
Gábor Szárnyas, József Marton, János Benjamin Antal et al.:
An early look at the LDBC Social Network Benchmark’s BI Workload.
GRADES-NDA at SIGMOD, 2018
Orri Erling et al.,
The LDBC Social Network Benchmark: Interactive Workload,
SIGMOD 2015
PERFORMANCE EXPERIMENTS
 LDBC Interactive workload
 Tools
o PostgreSQL (reference implementation)
o Cypher-to-SQL queries on PostgreSQL
o Semantic database (anonymized)
 Geometric mean of 20+ executions
BENCHMARK RESULTS ON LDBC QUERIES
RELATED PROJECTS
 Cytosm
o Cypher to SQL Mapping
o HP Labs for Vertica
o Project abandoned in 2017
o gTop (graph topology) reused
 Cypher for Apache Spark
o Neo4j’s project
o Executes queries in Spark
o Read-only
Tool Source Target OSS Updates Paths
CAPS Cypher SparkSQL   
Cytosm Cypher Vertica SQL   
Cypher-to-SQL Cypher PostgreSQL   
SUMMARY
 Mapping property graph queries to SQL is challenging
o Similar to ORM
o + edge properties
o + reachability
 Initial implementation: C2S
o Moderate feature coverage
o Poor performance
o Needs some tweaks, e.g. work around CTE optimization fences
Gábor Szárnyas, József Marton, János Maginecz, Dániel Varró:
Incremental View Maintenance on Property Graphs.
arXiv preprint 2018
RELATED RESOURCES
ingraph and C2S github.com/ftsrg/ingraph
Cypher for Apache Spark github.com/opencypher/cypher-for-apache-spark
Cytosm github.com/cytosm/cytosm
LDBC github.com/ldbc/
Thanks for the contributions to the whole ingraph team.

More Related Content

What's hot

Exploratory data analysis
Exploratory data analysis Exploratory data analysis
Exploratory data analysis Peter Reimann
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Neo4j Training Cypher
Neo4j Training CypherNeo4j Training Cypher
Neo4j Training CypherMax De Marzi
 
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...Simplilearn
 
Debunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsDebunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsNeo4j
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jWilliam Lyon
 
Taxonomy Management based on SKOS-XL
Taxonomy Management based on SKOS-XLTaxonomy Management based on SKOS-XL
Taxonomy Management based on SKOS-XLAndreas Blumauer
 
Semantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialSemantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialAdonisDamian
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Exploratory data analysis with PythonDavis David
 
Natural Language parsing.pptx
Natural Language parsing.pptxNatural Language parsing.pptx
Natural Language parsing.pptxsiddhantroy13
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningDatabricks
 
Introduction to Knowledge Graphs
Introduction to Knowledge GraphsIntroduction to Knowledge Graphs
Introduction to Knowledge Graphsmukuljoshi
 
揭开数据虚拟化的神秘面纱
揭开数据虚拟化的神秘面纱揭开数据虚拟化的神秘面纱
揭开数据虚拟化的神秘面纱Denodo
 
Introduction to data science.pptx
Introduction to data science.pptxIntroduction to data science.pptx
Introduction to data science.pptxSadhanaParameswaran
 
Big Data - Breve panoramica
Big Data - Breve panoramicaBig Data - Breve panoramica
Big Data - Breve panoramicaLuca Naso
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2izahn
 

What's hot (20)

Big data ppt
Big data pptBig data ppt
Big data ppt
 
Exploratory data analysis
Exploratory data analysis Exploratory data analysis
Exploratory data analysis
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Graph databases
Graph databasesGraph databases
Graph databases
 
Neo4j Training Cypher
Neo4j Training CypherNeo4j Training Cypher
Neo4j Training Cypher
 
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
 
Debunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsDebunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative Facts
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4j
 
Taxonomy Management based on SKOS-XL
Taxonomy Management based on SKOS-XLTaxonomy Management based on SKOS-XL
Taxonomy Management based on SKOS-XL
 
Audio mining
Audio miningAudio mining
Audio mining
 
Semantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialSemantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorial
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Exploratory data analysis with Python
 
Natural Language parsing.pptx
Natural Language parsing.pptxNatural Language parsing.pptx
Natural Language parsing.pptx
 
RDF data model
RDF data modelRDF data model
RDF data model
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine Learning
 
Introduction to Knowledge Graphs
Introduction to Knowledge GraphsIntroduction to Knowledge Graphs
Introduction to Knowledge Graphs
 
揭开数据虚拟化的神秘面纱
揭开数据虚拟化的神秘面纱揭开数据虚拟化的神秘面纱
揭开数据虚拟化的神秘面纱
 
Introduction to data science.pptx
Introduction to data science.pptxIntroduction to data science.pptx
Introduction to data science.pptx
 
Big Data - Breve panoramica
Big Data - Breve panoramicaBig Data - Breve panoramica
Big Data - Breve panoramica
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2
 

Similar to Mapping Graph Queries to PostgreSQL

Writing a Cypher Engine in Clojure
Writing a Cypher Engine in ClojureWriting a Cypher Engine in Clojure
Writing a Cypher Engine in ClojureGábor Szárnyas
 
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)Ankur Dave
 
GraphFrames: Graph Queries in Spark SQL by Ankur Dave
GraphFrames: Graph Queries in Spark SQL by Ankur DaveGraphFrames: Graph Queries in Spark SQL by Ankur Dave
GraphFrames: Graph Queries in Spark SQL by Ankur DaveSpark Summit
 
A Scalable Approach to Learn Semantic Models of Structured Sources
A Scalable Approach to Learn Semantic Models of Structured SourcesA Scalable Approach to Learn Semantic Models of Structured Sources
A Scalable Approach to Learn Semantic Models of Structured SourcesMohsen Taheriyan
 
Graph technology meetup slides
Graph technology meetup slidesGraph technology meetup slides
Graph technology meetup slidesSean Mulvehill
 
Complex queries in a distributed multi-model database
Complex queries in a distributed multi-model databaseComplex queries in a distributed multi-model database
Complex queries in a distributed multi-model databaseMax Neunhöffer
 
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Databricks
 
Neo4j Graph Database และการประยุกตร์ใช้
Neo4j Graph Database และการประยุกตร์ใช้Neo4j Graph Database และการประยุกตร์ใช้
Neo4j Graph Database และการประยุกตร์ใช้Chakrit Phain
 
MongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB
 
Redis Day TLV 2018 - Graph Distribution
Redis Day TLV 2018 - Graph DistributionRedis Day TLV 2018 - Graph Distribution
Redis Day TLV 2018 - Graph DistributionRedis Labs
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXBenjamin Bengfort
 
PostgreSQL: Advanced features in practice
PostgreSQL: Advanced features in practicePostgreSQL: Advanced features in practice
PostgreSQL: Advanced features in practiceJano Suchal
 
Cypher and apache spark multiple graphs and more in open cypher
Cypher and apache spark  multiple graphs and more in  open cypherCypher and apache spark  multiple graphs and more in  open cypher
Cypher and apache spark multiple graphs and more in open cypherNeo4j
 
Webinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBWebinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBMongoDB
 
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...Databricks
 
Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...
Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...
Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...NoSQLmatters
 
Overiew of Cassandra and Doradus
Overiew of Cassandra and DoradusOveriew of Cassandra and Doradus
Overiew of Cassandra and Doradusrandyguck
 
The 2nd graph database in sv meetup
The 2nd graph database in sv meetupThe 2nd graph database in sv meetup
The 2nd graph database in sv meetupJoshua Bae
 
DataMapper @ RubyEnRails2009
DataMapper @ RubyEnRails2009DataMapper @ RubyEnRails2009
DataMapper @ RubyEnRails2009Dirkjan Bussink
 

Similar to Mapping Graph Queries to PostgreSQL (20)

Writing a Cypher Engine in Clojure
Writing a Cypher Engine in ClojureWriting a Cypher Engine in Clojure
Writing a Cypher Engine in Clojure
 
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
 
Lecture 3.pdf
Lecture 3.pdfLecture 3.pdf
Lecture 3.pdf
 
GraphFrames: Graph Queries in Spark SQL by Ankur Dave
GraphFrames: Graph Queries in Spark SQL by Ankur DaveGraphFrames: Graph Queries in Spark SQL by Ankur Dave
GraphFrames: Graph Queries in Spark SQL by Ankur Dave
 
A Scalable Approach to Learn Semantic Models of Structured Sources
A Scalable Approach to Learn Semantic Models of Structured SourcesA Scalable Approach to Learn Semantic Models of Structured Sources
A Scalable Approach to Learn Semantic Models of Structured Sources
 
Graph technology meetup slides
Graph technology meetup slidesGraph technology meetup slides
Graph technology meetup slides
 
Complex queries in a distributed multi-model database
Complex queries in a distributed multi-model databaseComplex queries in a distributed multi-model database
Complex queries in a distributed multi-model database
 
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
 
Neo4j Graph Database และการประยุกตร์ใช้
Neo4j Graph Database และการประยุกตร์ใช้Neo4j Graph Database และการประยุกตร์ใช้
Neo4j Graph Database และการประยุกตร์ใช้
 
MongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDB
 
Redis Day TLV 2018 - Graph Distribution
Redis Day TLV 2018 - Graph DistributionRedis Day TLV 2018 - Graph Distribution
Redis Day TLV 2018 - Graph Distribution
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
 
PostgreSQL: Advanced features in practice
PostgreSQL: Advanced features in practicePostgreSQL: Advanced features in practice
PostgreSQL: Advanced features in practice
 
Cypher and apache spark multiple graphs and more in open cypher
Cypher and apache spark  multiple graphs and more in  open cypherCypher and apache spark  multiple graphs and more in  open cypher
Cypher and apache spark multiple graphs and more in open cypher
 
Webinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBWebinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDB
 
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
 
Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...
Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...
Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...
 
Overiew of Cassandra and Doradus
Overiew of Cassandra and DoradusOveriew of Cassandra and Doradus
Overiew of Cassandra and Doradus
 
The 2nd graph database in sv meetup
The 2nd graph database in sv meetupThe 2nd graph database in sv meetup
The 2nd graph database in sv meetup
 
DataMapper @ RubyEnRails2009
DataMapper @ RubyEnRails2009DataMapper @ RubyEnRails2009
DataMapper @ RubyEnRails2009
 

More from Gábor Szárnyas

GraphBLAS: A linear algebraic approach for high-performance graph queries
GraphBLAS: A linear algebraic approach for high-performance graph queriesGraphBLAS: A linear algebraic approach for high-performance graph queries
GraphBLAS: A linear algebraic approach for high-performance graph queriesGábor Szárnyas
 
What Makes Graph Queries Difficult?
What Makes Graph Queries Difficult?What Makes Graph Queries Difficult?
What Makes Graph Queries Difficult?Gábor Szárnyas
 
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...Gábor Szárnyas
 
Incremental View Maintenance for openCypher Queries
Incremental View Maintenance for openCypher QueriesIncremental View Maintenance for openCypher Queries
Incremental View Maintenance for openCypher QueriesGábor Szárnyas
 
Learning Timed Automata with Cypher
Learning Timed Automata with CypherLearning Timed Automata with Cypher
Learning Timed Automata with CypherGábor Szárnyas
 
Időzített automatatanulás Cypherrel
Időzített automatatanulás CypherrelIdőzített automatatanulás Cypherrel
Időzített automatatanulás CypherrelGábor Szárnyas
 
Compiling openCypher graph queries with Spark Catalyst
Compiling openCypher graph queries with Spark CatalystCompiling openCypher graph queries with Spark Catalyst
Compiling openCypher graph queries with Spark CatalystGábor Szárnyas
 
Towards the Characterization of Realistic Models: Evaluation of Multidiscipli...
Towards the Characterization of Realistic Models: Evaluation of Multidiscipli...Towards the Characterization of Realistic Models: Evaluation of Multidiscipli...
Towards the Characterization of Realistic Models: Evaluation of Multidiscipli...Gábor Szárnyas
 
Sharded Joins for Scalable Incremental Graph Queries
Sharded Joins for Scalable Incremental Graph QueriesSharded Joins for Scalable Incremental Graph Queries
Sharded Joins for Scalable Incremental Graph QueriesGábor Szárnyas
 
Towards a Macrobenchmark Framework for Performance Analysis of Java Applications
Towards a Macrobenchmark Framework for Performance Analysis of Java ApplicationsTowards a Macrobenchmark Framework for Performance Analysis of Java Applications
Towards a Macrobenchmark Framework for Performance Analysis of Java ApplicationsGábor Szárnyas
 
IncQuery-D: Distributed Incremental Graph Queries
IncQuery-D: Distributed Incremental Graph QueriesIncQuery-D: Distributed Incremental Graph Queries
IncQuery-D: Distributed Incremental Graph QueriesGábor Szárnyas
 
IncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the CloudIncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the CloudGábor Szárnyas
 

More from Gábor Szárnyas (13)

GraphBLAS: A linear algebraic approach for high-performance graph queries
GraphBLAS: A linear algebraic approach for high-performance graph queriesGraphBLAS: A linear algebraic approach for high-performance graph queries
GraphBLAS: A linear algebraic approach for high-performance graph queries
 
What Makes Graph Queries Difficult?
What Makes Graph Queries Difficult?What Makes Graph Queries Difficult?
What Makes Graph Queries Difficult?
 
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
 
Incremental View Maintenance for openCypher Queries
Incremental View Maintenance for openCypher QueriesIncremental View Maintenance for openCypher Queries
Incremental View Maintenance for openCypher Queries
 
Learning Timed Automata with Cypher
Learning Timed Automata with CypherLearning Timed Automata with Cypher
Learning Timed Automata with Cypher
 
Időzített automatatanulás Cypherrel
Időzített automatatanulás CypherrelIdőzített automatatanulás Cypherrel
Időzített automatatanulás Cypherrel
 
Parsing process
Parsing processParsing process
Parsing process
 
Compiling openCypher graph queries with Spark Catalyst
Compiling openCypher graph queries with Spark CatalystCompiling openCypher graph queries with Spark Catalyst
Compiling openCypher graph queries with Spark Catalyst
 
Towards the Characterization of Realistic Models: Evaluation of Multidiscipli...
Towards the Characterization of Realistic Models: Evaluation of Multidiscipli...Towards the Characterization of Realistic Models: Evaluation of Multidiscipli...
Towards the Characterization of Realistic Models: Evaluation of Multidiscipli...
 
Sharded Joins for Scalable Incremental Graph Queries
Sharded Joins for Scalable Incremental Graph QueriesSharded Joins for Scalable Incremental Graph Queries
Sharded Joins for Scalable Incremental Graph Queries
 
Towards a Macrobenchmark Framework for Performance Analysis of Java Applications
Towards a Macrobenchmark Framework for Performance Analysis of Java ApplicationsTowards a Macrobenchmark Framework for Performance Analysis of Java Applications
Towards a Macrobenchmark Framework for Performance Analysis of Java Applications
 
IncQuery-D: Distributed Incremental Graph Queries
IncQuery-D: Distributed Incremental Graph QueriesIncQuery-D: Distributed Incremental Graph Queries
IncQuery-D: Distributed Incremental Graph Queries
 
IncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the CloudIncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the Cloud
 

Recently uploaded

Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringmulugeta48
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXssuser89054b
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdfKamal Acharya
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01KreezheaRecto
 
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank  Design by Working Stress - IS Method.pdfIntze Overhead Water Tank  Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank Design by Working Stress - IS Method.pdfSuman Jyoti
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 

Recently uploaded (20)

Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank  Design by Working Stress - IS Method.pdfIntze Overhead Water Tank  Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 

Mapping Graph Queries to PostgreSQL

  • 1. Mapping Graph Queries to PostgreSQL Gábor Szárnyas, József Marton, Márton Elekes, János Benjamin Antal Budapest DB Meetup — 2018/Nov/13
  • 2. PROPERTY GRAPH DATABASES NoSQL family Data model:  nodes  edges  properties #1 query approach: graph pattern matching
  • 4. CYPHER AND OPENCYPHER Cypher: query language of the Neo4j graph database. „Cypher is a declarative, SQL-inspired language for describing patterns in graphs visually using an ascii-art syntax.” MATCH (d:Database)<-[:RELATED_TO]-(:Talk)-[:PRESENTED_AT]->(m:Meetup) WHERE m.date = 'Tuesday, November 13, 2018' RETURN d „The openCypher project aims to deliver a full and open specification of the industry’s most widely adopted graph database query language: Cypher.” (late 2015)
  • 5. OPENCYPHER SYSTEMS  Increasing adoption  Relational databases: o SAP HANA o AGENS Graph  Research prototypes: o Graphflow (Univesity of Waterloo, Canada) o ingraph (incremental graph engine @ BME) (Source: Keynote talk @ GraphConnect NYC 2017)
  • 6. PROPERTY GRAPHS  Textbook graph: 𝐺 = 𝑉, 𝐸 o Dijkstra, Ford-Fulkerson, etc. o Homogenous nodes o Homogeneous edges  Extensions o Labelled nodes o Typed edges o Properties  The schema is implicit  Very intuitive o Things and connections Bob 53 Carol 38 A Ltd. David 47 20182007 Erin 30 name Person, Student Company COLLEAGUE FRIEND WORKS_AT since name age Person
  • 7. GRAPH VS. RELATIONAL DATABASES  Graph databases o Graph-based modelling is intuitive o Concise query language  Relational databases o Most common o Many legacy systems o Efficient and mature  No tools available to bridge the two o i.e. query data in RDBs as a graph o first you have to wrangle the graph out of the RDB Col1 Col2 1 A 2 B Tables Graph :Course name: 'Phys1' REQUIRES :Course name: 'Phys2'ENROLLED :Student name: 'John' ENROLLED
  • 8. PROPOSED APPROACH To get the best of both worlds, map Cypher queries to SQL: 1. Formulate queries in Cypher 2. Execute inside an existing RDB 3. Return results as a graph relation
  • 9. GRAPH QUERIES IN CYPHER  Subgraph matching and graph traversals  Example: Alice’s colleagues and their colleagues MATCH (p1:Person {name: 'Alice'})-[c:COLL*1..2]-(p2:Person) RETURN p2.name Bob 53 Carol 38 A Ltd. David 47 20182007 Erin 30 name Person, Student Company COLLEAGUE FRIEND WORKS_ATsince name age Person
  • 10. Data and Query Mapping
  • 11. MAPPING BETWEEN DATA MODELS Class 2 Objects :Class1 attr1: String attr2: int :Class 2 attr1: String attr2: int Graph :Course name: 'Phys1' REQUIRES :Course name: 'Phys2'ENROLLED :Student name: 'John' ENROLLED object- graph mapping object-relational mapping (ORM) e.g. JPA, Entity Framework graph-relational mapping Col1 Col2 1 A 2 B Tables
  • 12. DATA MAPPING #1: GENERIC SCHEMA  Useful for representing schema-free data sets  Hopelessly slow Bob 53 Carol 38 A Ltd. David 47 20182007 Erin 30
  • 13. DATA MAPPING #2: CONCRETE SCHEMA
  • 14. QUERY MAPPING Cypher query graph relational algebra SQL algebra SQL query MATCH (p:Person) -[:COLL|:FRIEND]-() … RETURN p ◯ 𝑃𝑒𝑟𝑠𝑜𝑛 ⋈ 𝜎 𝑎=5 ◯ → ◯ COLL, FRIEND 𝜋 𝑝 ⋈ 𝑐𝑜𝑙1,𝑐𝑜𝑙2 𝜎 𝑎=5 𝜋 𝑝 compiler mapping code generation ◯ person ◯ → ◯ colleague ◯ → ◯ friend ⋃ SELECT … FROM … SELECT … FROM … SELECT … FROM … SELECT … FROM … SELECT … FROM … SELECT … FROM … SELECT … FROM … Gábor Szárnyas, József Marton, Dániel Varró: Formalising openCypher Graph Queries in Relational Algebra. ADBIS 2017
  • 15. A SIMPLE EXAMPLE MATCH (node) WITH node.name AS name RETURN name ◯(𝑛𝑜𝑑𝑒,𝑛𝑜𝑑𝑒.𝑛𝑎𝑚𝑒) 𝜋 𝑛𝑜𝑑𝑒.𝑛𝑎𝑚𝑒→𝑛𝑎𝑚𝑒 𝜋 𝑛𝑎𝑚𝑒 SELECT "name" FROM ( SELECT "node.name" AS "name" FROM ( SELECT vertex_id AS "node", (SELECT value FROM vertex_property WHERE parent = vertex_id AND key = 'name') AS "node.name" FROM vertex ) ) Cypher query relational algebra tree ingraph mapping
  • 16. CHALLENGES #1  Variable length paths: union of multiple subqueries  Unbound: WITH RECURSIVE (fixpoint-based evaluation)  WITH RECURSIVE was introduced in SQL:1999 but o PostgreSQL 8.4+ (since 2009) o SQLite 3.8.3+ (since 2014) o MySQL 8.0.1+ (since 2017) MATCH (p1:Person {name: 'Alice'})-[c:COLL*1..2]-(p2:Person) RETURN p2.name MATCH (p1:Person {name: 'Alice'})-[c:COLL*]-(p2:Person) RETURN p2.name
  • 17. CHALLENGES #2  Edges are directed o Undirectedness is modelled in the query o Union of both directions MATCH (p1:Person …)-[:FRIEND]-(p2:Person), (p2)-[:WORKS_AT]->(c:Company) RETURN p2.name, c.name Bob 53 Carol 38 A Ltd. David 47 20182007 Erin 30 name Person, Student Company COLLEAGUE FRIEND WORKS_ATsince name age Person
  • 18. CHALLENGES #3  Multiple tables as sources MATCH (p1:Person …)-[:COLL|:FRIEND]-(p2:Person) RETURN p2.name Bob 53 Carol 38 David 47 Erin 30 Person, Student COLLEAGUE FRIEND name age Person id name 1 Alice 2 Bob person p1 p2 1 2 1 3 colleague p1 p2 2 1 5 4 friend name age
  • 19. CHALLENGES #1 #2 #3 Simple graph patterns turn to many subqueries  Querying edges as undirected: • Enumerate edges in both directions (2 ×)  Variable length paths (1..L, *): • Limited: enumerate 1, 2, … , 𝐿 → 𝐿 × • Unlimited: use WITH RECURSIVE  Multiple node labels/edge types: • Enumerate all source tables (𝑁 ×) Total: union of 2 × 𝐿 × 𝑁 subqueries in SQL Bob 53 Carol 38 David 47 Person, Student COLLEAGUE FRIEND name age Person MATCH (p1:Person …)-[:COLL|:FRIEND*1..2]-(p2:Person)
  • 20. A COMPLEX EXAMPLE WITH q0 AS (-- GetVerticesWithGTop SELECT ROW(0, p_personid)::vertex_type AS "_e186#0", "p_personid" AS "_e186.id#0" FROM person), q1 AS (-- Selection SELECT * FROM q0 AS subquery WHERE ("_e186.id#0" = :personId)), q2 AS (-- GetEdgesWithGTop SELECT ROW(0, edgeTable."k_person1id")::vertex_type AS "_e186#0", ROW(0, edgeTable."k_person1id", edgeTable."k_person2id")::edge_type AS "_e187#0", ROW(0, edgeTable."k_person2id")::vertex_type AS "friend#2", toTable."p_personid" AS "friend.id#2", toTable."p_firstname" AS "friend.firstName#1", toTable."p_lastname" AS "friend.lastName#2" FROM "knows" edgeTable JOIN "person" toTable ON (edgeTable."k_person2id" = toTable."p_personid")), q3 AS (-- GetEdgesWithGTop SELECT ROW(0, edgeTable."k_person1id")::vertex_type AS "friend#2", ROW(0, edgeTable."k_person1id", edgeTable."k_person2id")::edge_type AS "_e187#0", ROW(0, edgeTable."k_person2id")::vertex_type AS "_e186#0", fromTable."p_personid" AS "friend.id#2", fromTable."p_firstname" AS "friend.firstName#1", fromTable."p_lastname" AS "friend.lastName#2" FROM "knows" edgeTable JOIN "person" fromTable ON (fromTable."p_personid" = edgeTable."k_person1id")), q4 AS (-- UnionAll SELECT "_e186#0", "_e187#0", "friend#2", "friend.id#2", "friend.firstName#1", "friend.lastName#2" FROM q2 UNION ALL SELECT "_e186#0", "_e187#0", "friend#2", "friend.id#2", "friend.firstName#1", "friend.lastName#2" FROM q3), q5 AS (-- EquiJoinLike SELECT left_query."_e186#0", left_query."_e186.id#0", right_query."friend#2", right_query."friend.id#2", right_query."_e187#0", right_query."friend.lastName#2", right_query."friend.firstName#1" FROM q1 AS left_query INNER JOIN q4 AS right_query ON left_query."_e186#0" = right_query."_e186#0"), q6 AS (-- GetEdgesWithGTop SELECT ROW(6, fromTable."m_messageid")::vertex_type AS "message#17", ROW(8, fromTable."m_messageid", fromTable."m_creatorid")::edge_type AS "_e188#0", ROW(0, fromTable."m_creatorid")::vertex_type AS "friend#2", fromTable."m_messageid" AS "message.id#2", fromTable."m_content" AS "message.content#2", fromTable."m_ps_imagefile" AS "message.imageFile#0", fromTable."m_creationdate" AS "message.creationDate#13" FROM "message" fromTable WHERE (fromTable."m_c_replyof" IS NULL)), q7 AS (-- GetEdgesWithGTop SELECT ROW(6, fromTable."m_messageid")::vertex_type AS "message#17", ROW(8, fromTable."m_messageid", fromTable."m_creatorid")::edge_type AS "_e188#0", ROW(0, fromTable."m_creatorid")::vertex_type AS "friend#2", fromTable."m_messageid" AS "message.id#2", fromTable."m_content" AS "message.content#2", fromTable."m_creationdate" AS "message.creationDate#13" FROM "message" fromTable WHERE (fromTable."m_c_replyof" IS NOT NULL)), q8 AS (-- UnionAll SELECT "message#17", "_e188#0", "friend#2", "message.id#2", "message.content#2", "message.imageFile#0", "message.creationDate#13" FROM q6 UNION ALL SELECT "message#17", "_e188#0", "friend#2", "message.id#2", "message.content#2", NULL AS "message.imageFile#0", "message.creationDate#13" FROM q7), q9 AS (-- EquiJoinLike SELECT left_query."_e186#0", left_query."_e186.id#0", left_query."_e187#0", left_query."friend#2", left_query."friend.id#2", left_query."friend.firstName#1", left_query."friend.lastName#2", right_query."message#17", right_query."message.id#2", right_query."message.imageFile#0", right_query."_e188#0", right_query."message.creationDate#13", right_query."message.content#2" FROM q5 AS left_query INNER JOIN q8 AS right_query ON left_query."friend#2" = right_query."friend#2"), q10 AS (-- AllDifferent SELECT * FROM q9 AS subquery WHERE is_unique(ARRAY[]::edge_type[] || "_e188#0" || "_e187#0")), q11 AS (-- Selection SELECT * FROM q10 AS subquery WHERE ("message.creationDate#13" <= :maxDate)), q12 AS (-- Projection SELECT "friend.id#2" AS "personId#0", "friend.firstName#1" AS "personFirstName#0", "friend.lastName#2" AS "personLastName#0", "message.id#2" AS "postOrCommentId#0", CASE WHEN ("message.content#2" IS NOT NULL = true) THEN "message.content#2" ELSE "message.imageFile#0" END AS "postOrCommentContent#0", "message.creationDate#13" AS "postOrCommentCreationDate#0" FROM q11 AS subquery), q13 AS (-- SortAndTop SELECT * FROM q12 AS subquery ORDER BY "postOrCommentCreationDate#0" DESC NULLS LAST, ("postOrCommentId#0")::BIGINT ASC NULLS FIRST LIMIT 20) SELECT "personId#0" AS "personId", "personFirstName#0" AS "personFirstName", "personLastName#0" AS "personLastName", "postOrCommentId#0" AS "postOrCommentId", "postOrCommentContent#0" AS "postOrCommentContent", "postOrCommentCreationDate#0" AS "postOrCommentCreationDate" FROM q13 AS subquery MATCH (:Person {id:$personId})-[:KNOWS]-(friend:Person)<- [:HAS_CREATOR]-(message:Message) WHERE message.creationDate <= $maxDate RETURN friend.id AS personId, friend.firstName AS personFirstName, friend.lastName AS personLastName, message.id AS postOrCommentId, CASE exists(message.content) WHEN true THEN message.content ELSE message.imageFile END AS postOrCommentContent, message.creationDate AS postOrCommentCreationDate ORDER BY postOrCommentCreationDate DESC, toInteger(postOrCommentId) ASC LIMIT 20
  • 22. BENCHMARKS: LINKED DATA BENCHMARK COUNCIL LDBC is a non-profit organization dedicated to establishing benchmarks, benchmark practices and benchmark results for graph data management software. The Social Network Benchmark is an industrial and academic initiative, formed by principal actors in the field of graph-like data management.
  • 23. LDBC IN A NUTSHELL Peter Boncz, Thomas Neumann, Orri Erling, TPC-H Analyzed: Hidden Messages and Lessons Learned from an Influential Benchmark, TPCTC 2013 Gábor Szárnyas, József Marton, János Benjamin Antal et al.: An early look at the LDBC Social Network Benchmark’s BI Workload. GRADES-NDA at SIGMOD, 2018 Orri Erling et al., The LDBC Social Network Benchmark: Interactive Workload, SIGMOD 2015
  • 24. PERFORMANCE EXPERIMENTS  LDBC Interactive workload  Tools o PostgreSQL (reference implementation) o Cypher-to-SQL queries on PostgreSQL o Semantic database (anonymized)  Geometric mean of 20+ executions
  • 25. BENCHMARK RESULTS ON LDBC QUERIES
  • 26. RELATED PROJECTS  Cytosm o Cypher to SQL Mapping o HP Labs for Vertica o Project abandoned in 2017 o gTop (graph topology) reused  Cypher for Apache Spark o Neo4j’s project o Executes queries in Spark o Read-only Tool Source Target OSS Updates Paths CAPS Cypher SparkSQL    Cytosm Cypher Vertica SQL    Cypher-to-SQL Cypher PostgreSQL   
  • 27. SUMMARY  Mapping property graph queries to SQL is challenging o Similar to ORM o + edge properties o + reachability  Initial implementation: C2S o Moderate feature coverage o Poor performance o Needs some tweaks, e.g. work around CTE optimization fences Gábor Szárnyas, József Marton, János Maginecz, Dániel Varró: Incremental View Maintenance on Property Graphs. arXiv preprint 2018
  • 28. RELATED RESOURCES ingraph and C2S github.com/ftsrg/ingraph Cypher for Apache Spark github.com/opencypher/cypher-for-apache-spark Cytosm github.com/cytosm/cytosm LDBC github.com/ldbc/ Thanks for the contributions to the whole ingraph team.

Editor's Notes

  1. https://www.youtube.com/watch?v=nCnR6wRo8x4
  2. A teljesítményméréshez implementáltuk az interaktív profil összes lekérdezését és a hozzá tartozó szoftverkomponenseket 3 eszközhöz, azonban a mérés során csak a komplex lekérdezéseket mértük le, a rövid lekérdezések és frissítések mérése a közeljövőbeli terveink között szerepel. A mérés során 5 különböző eszköz teljesítményét mértük le: - PostgreSQL: ez volt a munkánk során a referenciaimplementáció, ezzel validáltuk a lekérdezéseinket biztosítva azt, hogy a különböző nyelveken megírt lekérdezések minden eszköz esetében valóban ugyan azt az eredményt adják. - C2S: A dolgozat elkészültéig 6 lekérdezést sikerült Cypher nyelvről SQL-re transzformálni, azonban 1 lekérdezést tovább technikai probléma miatt nem tudtunk lemérni, így csak 5 lekérdezés eredményét tudjuk bemutatni. Továbbá egy meg nem nevezett tulajdonsággráf alapú adatbázist és 2db szemantikus adatbázis teljesíményét mértük le. Mivel az eredmények nem auditáltak (azaz a fejlesztők által nem megvizsgált és elfogadottak), így a szokásnak megfelelően anonimizált módon közöljük az eredményeket. Az eszközök mérése során minden lekérdezés válaszidejét legalább 20 különböző behelyettesítési paraméter esetében lemértük, majd azok geometriai átlagát vettük.
  3. More papers at http://ldbcouncil.org/publications (30+ in total)