This document summarizes Hannes Voigt's presentation on graph abstraction. It discusses matching patterns to bind variables, and using those variables to construct new graph elements through production patterns. It provides examples of simple graph construction and aggregation through grouping. The goal is to introduce an intuitive way to abstract and construct new subgraphs through pattern matching and variable bindings.
2. 2
Hannes Voigt
Starting June 2018
▪ Language and Standards Group, Neo4j
Before
▪ Postdoc at Database System Group, Technische Universität Dresden
Education and Experience
▪ Ph.D. from Technische Universität Dresden in 2014
▪ Visiting scholar at SAP Labs, Palo Alto, for one year in 2010
▪ Visiting scholar at University of Waterloo for 4 months in 2007
Activities
▪ LDBC Graph Query Language Standardization Task Force
▪ Graph-related industry activities with SAP HANA Graph and openCypher
▪ Involvement with graphs and graph QLs since early 2010
3. 3
Short Commercial Break
Life of a Property Graph Query
▪ Data Models
▪ Query Languages
▪ Constraints
▪ Query Specification
▪ Data Structures and Indexing
▪ Query Processing
▪ Physical Operators
Currently under review …
… and to appear soon
5. 5
Concept Chasm
Users talk about…
▪ Application entities
▪ e.g. discussions, communities, topics, etc.
▪ Multiple abstraction levels
Base data contains…
▪ Fine granular data
▪ Low abstraction
▪ E.g. individual twitter messages, retweet relationships, etc.
[Figure: Martin Grandjean, https://commons.wikimedia.org/wiki/File:Social_Network_Analysis_Visualization.png, 2014]
[Figure: http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=70790]
Query language: main means to bridge the concept chasm
▪ Users talk in high level concepts
▪ Data captured in low level concepts
→ Concept chasm
6. 6
Graph Abstraction
Match -> variable bindings
Different types of variables
▪ Value variable
▪ Vertex variable
▪ Edge variables
▪ (Path variables)
▪ ((Sub)Graph variables)
[Figure: match pattern → variable bindings]
7. 7
Graph Abstraction
Match -> variable bindings -> production
Different types of variables
▪ Value variable
▪ Vertex variable
▪ Edge variables
▪ (Path variables)
▪ ((Sub)Graph variables)
Idea in general
▪ Create the resulting graph by instantiating a production pattern
▪ Make it as intuitive as match patterns
▪ Make it use the same intuition as match
[Figure: match pattern → variable bindings → production pattern]
8. 8
Graph Abstraction
Match -> variable bindings -> production
Different types of variables
▪ Value variable
▪ Vertex variable
▪ Edge variables
▪ (Path variables)
▪ ((Sub)Graph variables)
Idea in a little more detail
▪ Existing vertices from bound vertex variables
▪ New vertices with new unbound vertex variables
▪ Edges either implicit (via vertex variables) or explicit (with edge variables)
▪ Existing values from bound value variables
▪ Bound/unbound is a parse-time property
[Figure: match pattern → variable bindings → production pattern]
10. 10
Graph Abstraction
Match -> variable bindings -> production
Different types of variables
▪ Value variable
▪ Vertex variable
▪ Edge variables
▪ (Path variables)
▪ ((Sub)Graph variables)
Production
▪ Existing vertices from bound vertex variables
▪ New vertices with new unbound vertex variables
▪ Edges either implicit (via vertex variables) or explicit (with edge variables)
▪ Existing values from bound value variables
▪ Bound/unbound is a parse-time property
[Figure: match pattern → variable bindings → production pattern; annotations: "Scope of this talk" and "Hi, I am just for beauty"]
16. 16
Super Simple Example
MATCH (a)-->(b)
CONSTRUCT (a)-->(c)<--(b)
same as
MATCH (a)-->(b)
CONSTRUCT (a)-->()<--(b)
Variables that are not mentioned are unbound.
Think of it as a variable whose name you just do not know.
[Figure: Base Graph with the binding of a and b highlighted, and Result Graph]
17. 17
Super Simple Example
MATCH (a)-->(b)
CONSTRUCT (a)-->(c)<--(b)
same as
MATCH (a)-->(b)
CONSTRUCT (a)-->()<--(b)
Note:
▪ Identities for new vertices are system-generated
▪ Specific ID values (e.g. 4) are not important
▪ As long as they are unique per variable and match group
[Figure: Base Graph with the binding of a and b highlighted, and Result Graph]
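The construction step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base graph edges 2→1, 3→1 and 2→3 are inferred from the orderings quoted on the cross-match slides (for example (1,2) < (1,3) < (3,2)), and `construct_simple` is a hypothetical helper, not part of any query engine.

```python
from itertools import count

# Assumed base graph: directed edges (a, b).
BASE_EDGES = [(2, 1), (3, 1), (2, 3)]

def construct_simple(base_edges):
    """Sketch of MATCH (a)-->(b) CONSTRUCT (a)-->(c)<--(b)."""
    used = {v for edge in base_edges for v in edge}
    fresh = count(max(used) + 1)            # system-generated identities
    nodes, edges = set(used), []
    for a, b in sorted(set(base_edges)):    # set semantics: distinct bindings
        c = next(fresh)                     # c is unbound, so a new vertex is created
        nodes.add(c)
        edges += [(a, c), (b, c)]           # a and b are bound, so they are reused
    return nodes, edges

nodes, edges = construct_simple(BASE_EDGES)
```

Which concrete number each new vertex receives is irrelevant; the sketch only relies on the numbers being distinct per binding.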
18. 18
Identities
New object identities
▪ Assuming a binding table $T$ with schema $S(T) = v_1, \dots, v_n$ and
▪ An object construction pattern with variable $x$ with $x \notin S(T)$
▪ A new identity $i$ is generated with
$i = f(\mathrm{variableNameOf}(x), v_1, \dots, v_n)$
where $f$ is a Skolem function
[Figure: match pattern → variable bindings $v_1 \dots v_n$ → production pattern]
20. 20
Identities
New object identities
▪ Assuming a binding table $T$ with schema $S(T) = v_1, \dots, v_n$ and
▪ An object construction pattern with variable $x$ with $x \notin S(T)$
▪ A new identity $i$ is generated with
$i = f(\mathrm{variableNameOf}(x), v_1, \dots, v_n)$
where $f$ is a Skolem function
Construction Domain: $\mathrm{variableNameOf}(x)$
Construction Instance: $v_1, \dots, v_n$
In a nutshell: $f$ returns the same identities for the same input parameter values
[Figure: match pattern → variable bindings $v_1 \dots v_n$ → production pattern]
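A Skolem function in this sense can be approximated by a memoising identity generator. The sketch below is an assumption-laden illustration (the class name `Skolem` and the starting value 4 are made up): the first argument plays the role of the construction domain, the remaining arguments the construction instance.

```python
from itertools import count

class Skolem:
    """Memoising identity generator: equal arguments give equal identities."""

    def __init__(self, start=1):
        self._fresh = count(start)
        self._known = {}

    def __call__(self, variable_name, *instance):
        key = (variable_name, instance)
        if key not in self._known:
            self._known[key] = next(self._fresh)   # new system-generated identity
        return self._known[key]

f = Skolem(start=4)
first = f("c", 2, 1)   # domain "c", instance (a=2, b=1)
again = f("c", 2, 1)   # same arguments, same identity
other = f("c", 3, 1)   # different instance, new identity
alias = f("d", 2, 1)   # different domain, new identity
```

Creating a fresh `Skolem` object per query would mirror the idea that identities are only meaningful within one query.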
21. 21
Identities
New object identities
▪ Assuming a binding table $T$ with schema $S(T) = v_1, \dots, v_n$ and
▪ An object construction pattern with variable $x$ with $x \notin S(T)$
▪ A new identity $i$ is generated with
$i = f(\mathrm{variableNameOf}(x), \mathit{rowId})$
where $f$ is a Skolem function
Construction Domain: $\mathrm{variableNameOf}(x)$
Construction Instance: $\mathit{rowId}$
For bag semantics, the construction instance is determined by the row id of the row in the binding table
[Figure: match pattern → variable bindings → production pattern]
22. 22
Aspects of Identity Generation
Object Identities
▪ Should be a purely technical aspect and not of user concern
- System-generated in any case
▪ Queries can involve object (node, edge) constructors
▪ Object constructor produces new object identities (ID values)
The scope of ID uniqueness is the transaction (query)
▪ ID uniqueness guaranteed within a single transaction (query)
▪ ID uniqueness not guaranteed across multiple transactions (queries)
Repeatability of ID generation is not guaranteed
▪ The same query on an unchanged dataset can return different IDs
27. 27
Composability
Find vertices connected by a chain of triangles
▪ Simplified syntax
--1st abstraction step
MATCH (p1)-[:friends]-(p2),
      (p2)-[:friends]-(p3)-[:friends]-(p1)
CONSTRUCT (t:FriendsTriangle)-[:contains]->(p1),
          (t)-[:contains]->(p2),
          (t)-[:contains]->(p3)
--2nd abstraction step
MATCH (t1)-[:contains]->(pa)<-[:contains]-(t2),
      (t1)-[:contains]->(pb)<-[:contains]-(t2)
WHERE pa!=pb
CONSTRUCT (t1)-[:connected]->(t2)
--3rd abstraction step
MATCH (pa)<-[:contains]-(ta),
      (ta)-/tp:connected*/->(tb),
      (tb)-[:contains]->(pb)
RETURN pa, pb, length(tp)+1 AS triangleDist
Result
pa  pb  triangleDist
2   6   3
…   …   …
[Figure: base data G1 with [:friends] edges, derived triangle vertices 1 to 4 with [:contains] edges, and [:connected] edges between triangles]
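The three abstraction steps can be mimicked on a toy graph in Python. Everything concrete here is an assumption: the friendship edges are hypothetical (they are not the G1 of the slides), each triangle is created once per unordered triple (the pattern as written would bind it once per ordering of p1, p2, p3), and `triangle_dist` only reports the shortest chain rather than every path.

```python
from collections import deque
from itertools import combinations

# Hypothetical friendship graph: three triangles in a chain.
FRIENDS = {(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5)}

def friends(p, q):
    return (p, q) in FRIENDS or (q, p) in FRIENDS

# 1st step: one triangle object per set of three mutual friends.
people = sorted({p for edge in FRIENDS for p in edge})
triangles = [frozenset(t) for t in combinations(people, 3)
             if all(friends(p, q) for p, q in combinations(t, 2))]

# 2nd step: triangles sharing at least two members are connected.
connected = {t: [u for u in triangles if u != t and len(t & u) >= 2]
             for t in triangles}

def triangle_dist(pa, pb):
    """3rd step, shortest chain only: number of connected hops + 1."""
    starts = [t for t in triangles if pa in t]
    queue = deque((t, 0) for t in starts)
    seen = set(starts)
    while queue:
        t, hops = queue.popleft()
        if pb in t:
            return hops + 1
        for u in connected[t]:
            if u not in seen:
                seen.add(u)
                queue.append((u, hops + 1))
    return None   # pb is not reachable through a chain of triangles
```

Each step only reads what the previous step produced, which is the composability the slide is about.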
28. 28
Use for DML
General
▪ Just a matter of the target graph, which the concept is agnostic about
Create new elements
▪ CONSTRUCT (h:Person {name: 'Hannes'}), (t:Talk {title: 'GraghAgg'})
CONSTRUCT (h)-[g:GIVES]->(t)
▪ Input by definition: binding table containing one empty tuple (no columns)
▪ 1st CONSTRUCT: Nodes h and t are unbound → are created
▪ 2nd CONSTRUCT: Nodes h and t are bound → are already in the target graph
Edge g is unbound → is created
Merge in new elements
▪ Assume (:Person {name: 'Hannes'}) already exists in the target graph
▪ MATCH (h:Person {name: 'Hannes'})
CONSTRUCT (h)-[g:GIVES]->(t:Talk {title: 'GraghAgg'})
▪ Node h is bound → is already in the target graph
▪ Edge g and node t are unbound → are created
32. 32
Simple Example with Grouping
MATCH (a)-->(b)
CONSTRUCT (c GROUP a)
Syntax alternatives
▪ PER
▪ GROUP BY
▪ Outside the pattern
▪ …
[Figure: Base Graph]
37. 37
Simple Example with Grouping
MATCH (a)-->(b)
CONSTRUCT (c)
same as
MATCH (a)-[e]->(b)
CONSTRUCT (c GROUP a,b,e)
Grouping subsumes simple construction (level 1).
Allows syntax shortcuts.
(Assuming set semantics.)
[Figure: Base Graph and Result Graph]
39. 39
Simple Example with Grouping
MATCH (a)-->(b)
CONSTRUCT (c GROUP a), (d GROUP b)
[Figure: Base Graph (nodes 1, 2, 3) and Result Graph (new nodes 4, 5): one group with a=2, one group with a=3]
40. 40
Simple Example with Grouping
MATCH (a)-->(b)
CONSTRUCT (c GROUP a), (d GROUP b)
[Figure: Base Graph (nodes 1, 2, 3) and Result Graph (new nodes 4, 5, 6, 7): one group with b=1, one group with b=3]
41. 41
Simple Example with Grouping
MATCH (a)-->(b)
CONSTRUCT (c GROUP a)-[e]->(d GROUP b)
[Figure: Base Graph (nodes 1, 2, 3) and Result Graph (new nodes 4, 5, 6, 7), with the edges between the new nodes marked "?"]
42. 42
Simple Example with Grouping
MATCH (a)-->(b)
CONSTRUCT (c GROUP a)-[e GROUP a,b]->(d GROUP b)
same as
MATCH (a)-->(b)
CONSTRUCT (c GROUP a)-[e]->(d GROUP b)
Edges are implicitly grouped by the grouping keys of their nodes (and may define additional grouping keys themselves)
[Figure: Base Graph (nodes 1, 2, 3) and Result Graph (new nodes 4, 5, 6, 7)]
46. 46
Simple Example with Grouping
MATCH (a)-->(b)
CONSTRUCT (c GROUP a)-[e GROUP a,b]->(b GROUP b)
same as
MATCH (a)-->(b)
CONSTRUCT (c GROUP a)-[e]->(b)
b is bound, hence already in the graph or replicated, depending on the target graph
[Figure: Base Graph and Result Graph]
47. 47
Aggregation in Graph Construction
Vertex patterns
▪ Have grouping variables: (x GROUP a,b …)
▪ Grouping variables must be a subset of the set of all bound variables
▪ Bound vertices must have their vertex variable as grouping variable, hence (x GROUP x …) is the same as (x …) if x is a bound variable
▪ Unbound vertices without specified grouping variables are implicitly grouped by all bound variables, hence (x …) is the same as (x GROUP a,b …) if x is an unbound variable and a and b are all the bound variables
Edge patterns
▪ Inherit grouping if they connect one or two grouping vertices
▪ Edges can have their own grouping variables, which are added to the inherited grouping variables
General
▪ All bound variables not used as grouping variables can be used in the same object pattern only in an aggregation function
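The grouping rules above can be illustrated with a small Python sketch of `CONSTRUCT (c GROUP a)-[e]->(d GROUP b)`. The binding rows are an assumption (base graph edges 2→1, 3→1, 2→3, inferred from the ordering annotations elsewhere in the deck), and the `matches` property is a hypothetical stand-in for an aggregation such as COUNT(*) over the rows of each group.

```python
from collections import Counter

# Assumed binding rows (a, b) of MATCH (a)-->(b).
ROWS = [(2, 1), (3, 1), (2, 3)]

def construct_grouped(rows):
    """Sketch of CONSTRUCT (c GROUP a)-[e]->(d GROUP b)."""
    # One c per distinct a; b is not a grouping variable of c,
    # so it may only feed an aggregate (here: a row count per group).
    c_nodes = {("c", a): {"matches": n}
               for a, n in Counter(a for a, _ in rows).items()}
    # One d per distinct b.
    d_nodes = {("d", b) for _, b in rows}
    # e inherits the grouping keys of both endpoints: one edge per (a, b).
    e_edges = {(("c", a), ("d", b)) for a, b in rows}
    return c_nodes, d_nodes, e_edges

c_nodes, d_nodes, e_edges = construct_grouped(ROWS)
```

The identities are written as (variable name, grouping values) pairs to make the grouping visible; a real system would hand out opaque IDs instead.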
48. 48
Identities
New object identities with Aggregation
▪ Assuming a binding table $T$ with schema $S(T)$ and
▪ An object construction pattern with variable $x \notin S(T)$ and grouping variables $g_1, \dots, g_k \subseteq S(T)$
▪ A new identity $i$ is generated with
$i = f(\mathrm{variableNameOf}(x), g_1, \dots, g_k)$
where $f$ is a Skolem function
Construction Domain: $\mathrm{variableNameOf}(x)$
Construction Instance: $g_1, \dots, g_k$
[Figure: match pattern → variable bindings → production pattern]
50. 50
Replicating and Copying (cf. Stefan’s Talk)
Let x be a bound variable → Replication
▪ (x) → Cloning is standard behavior for bound variables
▪ Note that (x) is a shortcut for (x GROUP x)
▪ Clones x into the output graph
▪ Creates new node with identity $i = f(\mathrm{variableNameOf}(x), x)$
▪ System may also track lineage of x
Let x be a bound variable → Copying
▪ (y COPY x) → new node with all labels and properties of x copied over
▪ Copies x's data into the output graph
▪ Creates new node with identity $i = f(\mathrm{variableNameOf}(y), v_1, \dots, v_n)$
▪ Note that this may copy the data of a specific node x multiple times into the target graph
▪ 1:1 copying can be ensured by grouping: (y COPY x GROUP x)
▪ No lineage tracking
51. 51
Use for DML
General
▪ Grouping gives explicit control over distinctness of created elements
Create new nodes for distinct values
▪ MATCH (p:Person)
UNWIND p.talks AS s
CONSTRUCT (p)-[:GIVES]->(t GROUP s :Talk {title: s})
▪ Gives you a :Talk node for every distinct s value
▪ If multiple people gave the same talk, they all will be connected to the same :Talk node
54. 54
Use for DML
Create new edges for distinct values
▪ MATCH (h:Person)-[:GIVES]->(t:Talk)
UNWIND t.dates AS d
CONSTRUCT (h)-[g GROUP d :GIVES {date: d}]->(t)
▪ If a talk was given multiple times, the same pair of person and talk nodes gets connected by one edge for each distinct date on which the talk was given (g grouped by h, t, and d)
▪ What happens if a talk was given multiple times on the same day?
Create new elements for every row in the binding table
▪ Note, binding table has bag semantics
▪ MATCH (h:Person {name: 'Hannes'})
UNWIND [1,2,2,3] AS x
CONSTRUCT (h)-[g:FAVORITE_NUMBERS]->(n GROUP ROW {num: x})
▪ Will definitely create four n nodes: ({num:1}), ({num:2}), ({num:2}), ({num:3})
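The difference between grouping by row and grouping by value can be made concrete with a short Python sketch. It is an illustration only: `construct` is a hypothetical helper, and grouping n by x (the second call) is an alternative added here for contrast, not a query from the slides.

```python
# Binding table after UNWIND [1,2,2,3] AS x: one row per list element (bag semantics).
ROWS = [("Hannes", x) for x in [1, 2, 2, 3]]

def construct(rows, group_by_row):
    """Create n nodes keyed by row id (GROUP ROW) or by the value of x (GROUP x)."""
    nodes = {}
    for row_id, (h, x) in enumerate(rows):
        instance = row_id if group_by_row else x   # construction instance
        nodes.setdefault(("n", instance), {"num": x})
    return list(nodes.values())

per_row = construct(ROWS, group_by_row=True)     # one node per binding row
per_value = construct(ROWS, group_by_row=False)  # one node per distinct value
```

With row grouping the duplicate value 2 yields two separate nodes, which is what "definitely create four n nodes" refers to.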
65. 65
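Graph Summarization
Summarize the structure of a graph in a smaller graph
▪ Group all vertices and all edges
▪ Represent the relationship of the groups in a graph
▪ $\gamma_{\mathrm{gender},\,\mathrm{COUNT}(*)}(V),\ \gamma_{\emptyset,\,\mathrm{COUNT}(*)}(E)$
▪ $\gamma_{\mathrm{gender},\,\mathrm{job},\,\mathrm{COUNT}(*)}(V),\ \gamma_{\emptyset,\,\mathrm{COUNT}(*)}(E)$
▪ $\gamma_{\mathrm{gender},\,\mathrm{COUNT}(*)}(V),\ \gamma_{\mathrm{status},\,\mathrm{COUNT}(*)}(E)$
[Peixiang Zhao et al.: Graph Cube: On Warehousing and OLAP Multidimensional Networks, SIGMOD 2011]
[Figure: example network with vertices a to j (legend: Friends, Co-workers; Male/Teacher, Female/Teacher, Male/Lawyer, Female/Lawyer) and one summary graph with counts for each summarization]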
71. 71
Summarize with Construction
Can Graph Summarization be expressed as Graph Construction?
▪ Consider summarization $\gamma_{\mathrm{color}}(V),\ \gamma_{\emptyset}(E)$
▪ First try:
MATCH (a)-[e]->(b)
CONSTRUCT (x GROUP a.color)-->(y GROUP b.color)
[Figure: result of the first try (groups with a.color=red, b.color=blue, b.color=red) ≠ correct summarization]
Does not work!
Problem: x and y are separate construction domains
72. 72
Summarize with Construction
Can Graph Summarization be expressed as Graph Construction?
▪ Consider summarization $\gamma_{\mathrm{color}}(V),\ \gamma_{\emptyset}(E)$
▪ Solution:
MATCH (a)-[e]->(b)
CONSTRUCT (z AS x GROUP a.color)-->(z AS y GROUP b.color)
Idea: Aliasing. Two instances (x and y) in a construction pattern from the same construction domain (z)
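Why aliasing matters can be seen by writing identities as (construction domain, grouping value) pairs. The Python sketch below uses a hypothetical coloured graph (the vertex colours and edges are made up) and a hypothetical `summarize` helper that switches between the first try and the aliased solution.

```python
# Hypothetical coloured graph: vertex -> colour, plus directed edges.
COLOR = {1: "red", 2: "red", 3: "blue"}
EDGES = [(1, 2), (1, 3), (3, 2)]

def summarize(aliasing):
    """Group edge endpoints by colour, with or without a shared construction domain."""
    # Without aliasing x and y are separate domains; with aliasing both use z.
    x_dom, y_dom = ("z", "z") if aliasing else ("x", "y")
    nodes, edges = set(), set()
    for a, b in EDGES:                 # one binding row per matched edge
        x = (x_dom, COLOR[a])          # identity = (domain, grouping value)
        y = (y_dom, COLOR[b])
        nodes |= {x, y}
        edges.add((x, y))
    return nodes, edges

first_try_nodes, _ = summarize(aliasing=False)
solution_nodes, solution_edges = summarize(aliasing=True)
```

In the first try a red source and a red target end up as two different nodes; with the shared domain they collapse into one node per colour, including a red self-loop.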
76. 76
Identities
New object identities with Aggregation
▪ Assuming a binding table $T$ with schema $S(T)$ and
▪ An object construction pattern with variable $x \notin S(T)$, domain $z \notin S(T)$, grouping $g_1, \dots, g_k \subseteq S(T)$
▪ A new identity $i$ is generated with
$i = f(\mathrm{variableNameOf}(z), g_1, \dots, g_k)$
where $f$ is a Skolem function
Construction Domain: $\mathrm{variableNameOf}(z)$
Construction Instance: $g_1, \dots, g_k$
[Figure: match pattern → variable bindings → production pattern]
78. 78
Example
Money Transfer between Groups
MATCH (s1)<-[:in]-(c1)-[e:transfer]->(c2)-[:in]->(s2)
CONSTRUCT (g AS g1 GROUP s1)
          -[{amount=SUM(e.amount)}]->
          (g AS g2 GROUP s2)
[Figure: base graph with transfer amounts 100, 300, 150, 450, 50, 300, 220, 130, 100, 500; result graph with aggregated amounts 750, 900, 600, 50]
79. 79
Example
Money Movement of each Group
MATCH (s1)<-[:in]-(c1)-[e:transfer]->(c2)-[:in]->(s2)
CONSTRUCT (g AS g1 GROUP s1 {flow=SUM(e.amount)})
          -->
          (g AS g2 GROUP s2 {flow=SUM(e.amount)})
[Figure: base graph with transfer amounts 100, 300, 150, 450, 50, 300, 220, 130, 100, 500; result graph with group properties flow: 2850 and flow: 1750]
80. 80
Example
Cost and Revenue of each Group
MATCH (s1)<-[:in]-(c1)-[e:transfer]->(c2)-[:in]->(s2)
CONSTRUCT (g AS g1 GROUP s1 {cost=SUM(-1*e.amount)})
          -->
          (g AS g2 GROUP s2 {revn=SUM(e.amount)})
[Figure: base graph with transfer amounts 100, 300, 150, 450, 50, 300, 220, 130, 100, 500; result graph with group properties cost: -1350, revn: 1500 and cost: -950, revn: 800]
81. 81
Example
Profit of each Group
MATCH (s1)<-[:in]-(c1)-[e:transfer]->(c2)-[:in]->(s2)
CONSTRUCT (g AS g1 GROUP s1 {prof=SUM(-1*e.amount)})
          -->
          (g AS g2 GROUP s2 {prof=SUM(e.amount)})
[Figure: base graph with transfer amounts 100, 300, 150, 450, 50, 300, 220, 130, 100, 500; result graph with group properties prof: 150 and prof: -150]
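The aggregation behind these examples can be sketched in Python. The binding rows below are hypothetical: the individual transfers are made up, with totals chosen so that the profits agree with the 150 and -150 quoted on the slide. The sketch also assumes that when both aliased patterns write the same property of a node, their contributions are summed.

```python
from collections import defaultdict

# Hypothetical binding rows (s1, s2, amount):
# group of the payer, group of the payee, transferred amount.
ROWS = [("g1", "g1", 600), ("g1", "g2", 750), ("g2", "g1", 900), ("g2", "g2", 50)]

amount = defaultdict(int)   # edge property: SUM(e.amount) per (s1, s2)
profit = defaultdict(int)   # node property "prof", written by both patterns
for s1, s2, value in ROWS:
    amount[(s1, s2)] += value
    profit[s1] -= value     # contribution of (g AS g1 GROUP s1 {prof=SUM(-1*e.amount)})
    profit[s2] += value     # contribution of (g AS g2 GROUP s2 {prof=SUM(e.amount)})
```

Because g1 and g2 share the construction domain g, a group that appears as payer in one row and as payee in another is the same node, so its costs and revenues net out in one property.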
84. 84
Reminder: Regular Per-Match Construction
MATCH (a)-->(b)
CONSTRUCT (c GROUP a)
[Figure: Base Graph (nodes 1, 2, 3) and Result Graph (new nodes 4, 5): one group with a=2, one group with a=3]
85. 85
Cross-Match Construction
MATCH (a)-->(b)
CONSTRUCT (c GROUP a)-[e]->(ORDER BY id(a) 1 PRECEDING c)
Idea: Refer to nodes coming from a different match result
▪ (ORDER BY id(a) 1 PRECEDING c): the c created for the preceding match result according to ascending id of a
▪ id(a)=2 < id(a)=3
[Figure: Base Graph and Result Graph]
87. 87
Referencing relative to a given order
Cross-Match Construction
MATCH (a)-->(b)
CONSTRUCT (c)-[e]->(ORDER BY id(b),id(a) 1 PRECEDING c)
ORDER BY id(b),id(a): (1,2) < (1,3) < (3,2)
[Figure: Base Graph and Result Graph]
89. 89
Order is local to reference
Cross-Match Construction
MATCH (a)-->(b)
CONSTRUCT (c)-[e]->(ORDER BY id(b),id(a) 1 PRECEDING c),
          (c)-[e]->(ORDER BY id(b),id(a) 1 SUCCEEDING c),
          (c)-[e]->(ORDER BY id(a),id(b) 1 PRECEDING c)
ORDER BY id(b),id(a): (1,2) < (1,3) < (3,2)
ORDER BY id(a),id(b): (2,1) < (2,3) < (3,1)
[Figure: Base Graph and Result Graph]
90. 90
In conjunction with Aliasing
Cross-Match Construction
MATCH (a)-->(b)
CONSTRUCT (c)-[e:Prev]->(ORDER BY id(b),id(a) 1 PRECEDING c AS ca),
          (c)-[f:Succ]->(ORDER BY id(b),id(a) 1 SUCCEEDING c),
          (c)-[g:Prev2]->(ORDER BY id(a),id(b) 1 PRECEDING c AS cb),
          (ca)-[h:Cross]->(cb)
ORDER BY id(b),id(a): (1,2) < (1,3) < (3,2)
ORDER BY id(a),id(b): (2,1) < (2,3) < (3,1)
[Figure: Base Graph and Result Graph]
91. 91
Referencing absolute to a given order
Cross-Match Construction
MATCH (a)-->(b)
CONSTRUCT (c)-[e]->(ORDER BY id(b),id(a) FIRST c)
ORDER BY id(b),id(a): (1,2) < (1,3) < (3,2)
[Figure: Base Graph and Result Graph]
92. 92
Referencing many
Cross-Match Construction
MATCH (a)-->(b)
CONSTRUCT (c)-[e]->(ORDER BY id(b),id(a) ALL PRECEDING c)
ORDER BY id(b),id(a): (1,2) < (1,3) < (3,2)
[Figure: Base Graph and Result Graph]
93. 93
Partitioning the order
Cross-Match Construction
MATCH (a)-->(b)
CONSTRUCT (c)-[e]->(PARTITION a ORDER BY id(b),id(a) 1 PRECEDING c)
ORDER BY id(b),id(a): (1,2) < (3,2) | (1,3)
[Figure: Base Graph and Result Graph]
94. 94
Cross-Match Construction
Reference Production of other Match Results
▪ No additional identities are produced from the given match
▪ Only referring to identities produced by another match in the match result set
▪ The construction domain which is referred to does not have to be part of the pattern
Syntax
▪ Referencing syntax can be designed analogously to window specifications in SQL OLAP functions
Open questions
▪ Self-referencing? Non-total orders? How does it compare to joins? …
[Figure: matches → variable bindings → productions]
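A relative reference such as `1 PRECEDING c` can be read as a window over the ordered binding table. The Python sketch below is an interpretation under assumptions: the binding rows (2,1), (3,1), (2,3) are taken from the orderings quoted on the slides, `preceding_edges` is a hypothetical helper, and the first row of each partition is assumed to produce no edge because it has no predecessor.

```python
# Binding rows (a, b) of MATCH (a)-->(b).
ROWS = [(2, 1), (3, 1), (2, 3)]

def preceding_edges(rows, partition_by_a=False):
    """Sketch of (c)-[e]->([PARTITION a] ORDER BY id(b),id(a) 1 PRECEDING c)."""
    partitions = {}
    for a, b in rows:
        partitions.setdefault(a if partition_by_a else None, []).append((a, b))
    edges = []
    for part in partitions.values():
        ordered = sorted(part, key=lambda row: (row[1], row[0]))   # id(b), id(a)
        # Each c points at the c created for the row right before it.
        edges += [(("c", cur), ("c", prev))
                  for prev, cur in zip(ordered, ordered[1:])]
    return edges

unpartitioned = preceding_edges(ROWS)
partitioned = preceding_edges(ROWS, partition_by_a=True)
```

No new identities are created by the reference itself; it only selects an identity that another row of the same query already produced.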
98. 98
Example
Coauthor network → Top-coauthor network
[Figure: coauthor network with authors Hannes, Nikolay, Tobias, Stefan; papers QueryGraphs, GCORE, Cypher; venues SIGMOD, Synthesis Lectures on DM]
MATCH (a)-[:AUTHORED]->(p)-[:APPEARED_IN]->(v), (b)-[:AUTHORED]->(p)
CONSTRUCT (a)-[:TOP_COAUTHOR {paper: p.name}]->
          (PARTITION a ORDER BY v.impactFactor * p.citations DESC ALL FIRST b)
a        b        p       p.cit…  v.imp…
Stefan   Tobias   Cypher  150     50
Stefan   Tobias   GCORE   100     50
Stefan   Hannes   GCORE   100     50
Tobias   Stefan   Cypher  150     50
Tobias   Stefan   GCORE   100     50
Tobias   Hannes   GCORE   100     50
Hannes   Stefan   GCORE   100     50
Hannes   Tobias   GCORE   100     50
Hannes   Nikolay  QG      100     40
Nikolay  Hannes   QG      100     40
[Figure: resulting top-coauthor network over Hannes, Nikolay, Tobias, Stefan with edge labels {name: GCORE}, {name: GCORE}, {name: QG}, {name: Cypher}]
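The top-coauthor query can be traced by hand with a small Python sketch over the binding table above. One assumption is made explicit: `ALL FIRST` is read here as "every row tied for the first rank within the partition", and `top_coauthors` is a hypothetical helper written for this illustration.

```python
# Binding table rows: (a, b, p, citations of p, impact factor of v).
ROWS = [
    ("Stefan", "Tobias", "Cypher", 150, 50), ("Stefan", "Tobias", "GCORE", 100, 50),
    ("Stefan", "Hannes", "GCORE", 100, 50), ("Tobias", "Stefan", "Cypher", 150, 50),
    ("Tobias", "Stefan", "GCORE", 100, 50), ("Tobias", "Hannes", "GCORE", 100, 50),
    ("Hannes", "Stefan", "GCORE", 100, 50), ("Hannes", "Tobias", "GCORE", 100, 50),
    ("Hannes", "Nikolay", "QG", 100, 40), ("Nikolay", "Hannes", "QG", 100, 40),
]

def top_coauthors(rows):
    """Per author a, keep every (b, paper) whose score ties for the maximum."""
    best = {}
    for a, b, paper, citations, impact in rows:
        score = impact * citations
        top_score, edges = best.get(a, (score, []))
        if score > top_score:          # strictly better: restart the list
            top_score, edges = score, []
        if score == top_score:         # tie for first place: keep it
            edges.append((b, paper))
        best[a] = (top_score, edges)
    return {a: edges for a, (_, edges) in best.items()}

top = top_coauthors(ROWS)
```

Under this reading Hannes gets two TOP_COAUTHOR edges, both for the GCORE paper, because Stefan and Tobias tie at the same score.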
101. 101
Graph Abstraction
[Figure: kinds of graph abstraction: abstract single subgraph, abstract multiple subgraphs, link abstractions, abstract structure]
103. 103
Concept Chasm
Users talk about…
▪ Application entities
▪ e.g. discussions, communities, topics, etc.
▪ Multiple abstraction levels
Base data contains…
▪ Fine granular data
▪ Low abstraction
▪ E.g. individual twitter messages, retweet relationships, etc.
[Figure: Martin Grandjean, https://commons.wikimedia.org/wiki/File:Social_Network_Analysis_Visualization.png, 2014]
[Figure: http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=70790]
Query language: main means to bridge the concept chasm
▪ Users talk in high level concepts
▪ Data captured in low level concepts
→ Concept chasm