A Tale of Two Graph Frameworks on Spark: GraphFrames and TinkerPop OLAP
Artem Aliev and Russell Spitzer, DataStax
#EUeco3
Pierrot and Harlequin
• Artem: Graph Analytics Expert, Earth
• Russell: Distributed Systems Enthusiast, Earth
TinkerPop and GraphFrames provide Complementary Approaches for Graph Analytics
[Diagram labels: Dataset, Catalyst, GraphFrames]
Graphs are Vertices and Edges
Vertices are things and edges represent their relations to one another.

Example graph, built up one element at a time:

Vertices with the Vertex Label "Ship", each carrying Vertex Properties:
• Registry: USS Enterprise (NCC-1701), Class: Constitution class, Service: 2245–2285 (40 Years)
• Registry: USS Enterprise (NCC-1701-A), Class: Enterprise class, Service: 2286–2293 (7 Years)
• Registry: USS Enterprise (NCC-1701-C), Class: Ambassador, Service: 2332–2344 (12 Years)
• Registry: USS Enterprise (NCC-1701-D), Class: Galaxy, Service: 2363–2371 (8 Years)

Vertices with the Vertex Label "Crew":
• Name: Kirk, Position: Captain
• Name: Picard, Position: Captain

Edges, each carrying an Edge Label:
• "succeeded by": three edges between Ship vertices
• "served on": four edges between Crew and Ship vertices

But why do I want this?
Graphs let us ask questions about our data based on its relations
• Which Captain served after Kirk?
• Which Ship came two after the NCC-1701?
Traversals involve following paths through the Graph
[Figure: the example graph of four Ship vertices and two Crew vertices, joined by "succeeded by" and "served on" edges]
Which Captain came after Kirk?
[Figure: the path Kirk (Crew), "served on", NCC-1701-A (Ship), "succeeded by", NCC-1701-C (Ship), "served on", Picard (Crew)]
Which Ship came two after the NCC-1701?
[Figure: the path NCC-1701 (Ship), "succeeded by", NCC-1701-A (Ship), "succeeded by", NCC-1701-C (Ship)]
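The two questions can be sketched as traversals over a tiny in-memory graph. This is a plain-Python illustration, not TinkerPop or GraphFrames code. The helper names (`out`, `in_`, `ship_two_after`, `captains_after`) and the dictionary layout are assumptions made for the example, and the edge endpoints are read off the adjacency list shown on the "Vertex Stored in a PairRDD" slide.

```python
# Toy in-memory property graph. The layout and helper names are assumptions
# made for illustration; they are not part of TinkerPop or GraphFrames.
vertices = {
    "NCC-1701":   {"label": "Ship", "class": "Constitution class"},
    "NCC-1701-A": {"label": "Ship", "class": "Enterprise class"},
    "NCC-1701-C": {"label": "Ship", "class": "Ambassador"},
    "NCC-1701-D": {"label": "Ship", "class": "Galaxy"},
    "Kirk":   {"label": "Crew", "position": "Captain"},
    "Picard": {"label": "Crew", "position": "Captain"},
}

# Each edge is (source id, edge label, destination id).
edges = [
    ("NCC-1701", "succeeded by", "NCC-1701-A"),
    ("NCC-1701-A", "succeeded by", "NCC-1701-C"),
    ("NCC-1701-C", "succeeded by", "NCC-1701-D"),
    ("Kirk", "served on", "NCC-1701"),
    ("Kirk", "served on", "NCC-1701-A"),
    ("Picard", "served on", "NCC-1701-C"),
    ("Picard", "served on", "NCC-1701-D"),
]


def out(ids, label):
    """Follow edges with the given label forwards from every id in ids."""
    return [dst for v in ids for (src, lbl, dst) in edges if src == v and lbl == label]


def in_(ids, label):
    """Follow edges with the given label backwards into every id in ids."""
    return [src for v in ids for (src, lbl, dst) in edges if dst == v and lbl == label]


def ship_two_after(ship):
    """Two 'succeeded by' hops starting from one ship."""
    return out(out([ship], "succeeded by"), "succeeded by")


def captains_after(name):
    """Crew who served on a ship that succeeded one of this person's ships."""
    ships = out([name], "served on")
    next_ships = out(ships, "succeeded by")
    later = in_(next_ships, "served on")
    # Drop the starting captain and duplicates, keeping first-seen order.
    return list(dict.fromkeys(c for c in later if c != name))


print(ship_two_after("NCC-1701"))
print(captains_after("Kirk"))
```

Each hop is a scan over the edge list here, which is fine for six vertices but is exactly the part a graph framework has to distribute once the data grows.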
TinkerPop is a Powerful and Flexible Graph Framework
• Server, Language, Connectors
• Graph Framework for OLAP and OLTP
• Node Centric Representations
• Fluent API (Gremlin)
• Fully Self Contained Framework
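To give a feel for what a fluent API looks like, here is a toy sketch in plain Python in which every step returns a new traversal object so that steps can be chained. `ToyTraversal` and its methods are hypothetical names that only imitate the chaining style; this is not Gremlin and does not use any TinkerPop library.

```python
class ToyTraversal:
    """A toy fluent traversal; it only imitates the chaining style of Gremlin."""

    def __init__(self, edges, current=None):
        self._edges = edges              # list of (src, label, dst) tuples
        self._current = current or []    # vertex ids the traversal is sitting on

    def V(self, *ids):
        """Start from the given vertex ids, or from every vertex if none are given."""
        if ids:
            start = list(ids)
        else:
            start = sorted({v for (s, _, d) in self._edges for v in (s, d)})
        return ToyTraversal(self._edges, start)

    def out(self, label):
        """Step along outgoing edges that carry the given label."""
        nxt = [d for v in self._current
               for (s, l, d) in self._edges if s == v and l == label]
        return ToyTraversal(self._edges, nxt)

    def to_list(self):
        """Materialize the ids the traversal currently sits on."""
        return list(self._current)


# Short ids as used later in the deck: 1, A, C, D stand for the four ships.
edges = [("1", "succeeded by", "A"), ("A", "succeeded by", "C"), ("C", "succeeded by", "D")]
g = ToyTraversal(edges)
print(g.V("1").out("succeeded by").out("succeeded by").to_list())
```

Because each step returns a fresh object, a query reads left to right as the path it follows, which is the property the "Fluent API" bullet refers to.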
OLTP Examples
MovieLens Example Schema
https://grouplens.org/datasets/movielens/
What happens when you have too much data?
TinkerPop Spark OLAP Mechanism
• Instead of running one traversal, we traverse starting from all nodes simultaneously
Distribution Requires Partitioning
[Figure: Big Data split into Independent Chunks of Data]
Vertex Stored in a PairRDD
Id -> StarVertex(Edge and Property Information)
[Figure: vertices 1, A, C, D]
Star Vertex: Adjacency list representation
1: "A", "Kirk"
A: "C", "Kirk"
C: "D", "Picard"
D: "Picard"
Each entry stores just the Id of the connected Vertex.
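A minimal sketch of how such an adjacency list could be derived from an edge list, in plain Python. The function name `star_vertices` and the tuple layout are assumptions for illustration. Unlike the simplified slide, this sketch records both incoming and outgoing edges at each vertex, so vertex A also lists "1"; a sorted list of (id, star) pairs stands in for the PairRDD.

```python
from collections import defaultdict

# Edge list using the short ids from the slide (1, A, C, D are the ships).
edges = [
    ("1", "succeeded by", "A"),
    ("A", "succeeded by", "C"),
    ("C", "succeeded by", "D"),
    ("Kirk", "served on", "1"),
    ("Kirk", "served on", "A"),
    ("Picard", "served on", "C"),
    ("Picard", "served on", "D"),
]


def star_vertices(edges):
    """Group every edge under both of its endpoints, storing only the other id."""
    stars = defaultdict(list)
    for src, label, dst in edges:
        stars[src].append(("out", label, dst))
        stars[dst].append(("in", label, src))
    # A list of (id, star) pairs plays the role of the PairRDD here.
    return sorted(stars.items())


pairs = star_vertices(edges)
for vertex_id, star in pairs:
    print(vertex_id, [other for (_, _, other) in star])
```

Because a star holds only the ids of its neighbours, each (id, star) pair is self-contained and can be placed on any partition independently of the others.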
Vertex Program Runs, Initializing a Traverser for every Vertex
[Figure: vertices 1, A, C, D]
SparkMemory - Accumulator - Used for GlobalState
Then we cycle through a Message Passing Algorithm
[Figure: vertices 1, A, C, D shown at three successive steps]
SparkMemory - Accumulator - Used for GlobalState
• Passes messages from one Vertex to another with a join
• Repeat
• Stop when All Traversers Halt or the Program Terminates
• Result!
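The cycle described above can be sketched in plain Python as a loop of supersteps. This is an illustration of the general message-passing idea, not TinkerPop's implementation. The name `run_vertex_program`, the `global_state` dictionary (a stand-in for the accumulator-backed global memory), and the restriction to out-edges between the four ship vertices are all assumptions.

```python
from collections import defaultdict

# Out-edge adjacency for the four ship vertices on the slide.
stars = {"1": ["A"], "A": ["C"], "C": ["D"], "D": []}


def run_vertex_program(stars, max_steps):
    """Count, per vertex, the traversers that arrive after up to max_steps hops."""
    # Initialize one traverser at every vertex.
    traversers = {v: 1 for v in stars}
    global_state = {"steps": 0}   # stand-in for the accumulator-style global memory

    for _ in range(max_steps):
        if not traversers:        # all traversers have halted
            break
        # Each vertex emits one message per out-edge, addressed by destination id.
        messages = [(dst, count)
                    for v, count in traversers.items()
                    for dst in stars[v]]
        # "Join" the messages back to the vertices by id and sum them up.
        inbox = defaultdict(int)
        for dst, count in messages:
            inbox[dst] += count
        traversers = dict(inbox)
        global_state["steps"] += 1

    return traversers, global_state["steps"]


print(run_vertex_program(stars, 2))
```

With two steps this corresponds to asking where every vertex ends up after two outgoing hops; with a larger step limit the loop stops on its own once no traverser has anywhere left to go.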
Example OLAP Traversals
TinkerPop Spark OLAP Pros/Cons
Pros
• Every message pass requires only a single shuffle
• Edges and edge properties are accessible without an extra step
• Very flexible; many provider-specific shortcuts are possible
• Internal properties can be any Java type
• All in one: the server is already set up for multiple clients
Cons
• Limited ability to connect to external sources or other Spark applications
• The framework's flexibility allows many platform-specific shortcuts to be added
• Its generality makes some optimizations difficult
• Edges are co-partitioned with vertices, so high-degree nodes can cause memory issues
GraphFrames Background
• Third Party Package
• https://graphframes.github.io/
• Integrates with Dataset/DataFrame in Spark
• Relational under the hood
GraphFrames are built of two DataFrames
[Figure: a DataFrame made up of Rows and Columns]

Vertex DataFrame
id       job              species
Geordi   Chief Engineer   Human
Data     Science Officer  Android

Edge DataFrame
src      dst    relationship
Geordi   Data   Friend

• Column values Can Only Be Spark Types
• No Built in Labels
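The two tables can be sketched as plain Python lists of dictionaries, using the `id`, `src`, and `dst` column names shown on the slide. The helpers `dangling_edges` and `with_label` are hypothetical. Modelling a label as an ordinary extra column is one possible convention under the "No Built in Labels" constraint, not something GraphFrames prescribes.

```python
# Vertex and edge tables as lists of rows; the values come from the slide.
vertices = [
    {"id": "Geordi", "job": "Chief Engineer", "species": "Human"},
    {"id": "Data", "job": "Science Officer", "species": "Android"},
]
edges = [
    {"src": "Geordi", "dst": "Data", "relationship": "Friend"},
]


def dangling_edges(vertices, edges):
    """Return edges whose src or dst does not match any vertex id."""
    ids = {v["id"] for v in vertices}
    return [e for e in edges if e["src"] not in ids or e["dst"] not in ids]


def with_label(rows, label):
    """Emulate a label by adding an ordinary column (a hypothetical convention)."""
    return [{**row, "label": label} for row in rows]


print(dangling_edges(vertices, edges))
print(with_label(vertices, "Crew")[0])
```

Nothing ties the two tables together except the id values, which is why a check like `dangling_edges` can be useful before running graph queries on them.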
Catalyst Optimizes any Requests
• Simple requests using the DataFrame API don't do anything special
• Some methods fall back to GraphX (RDD Based)
• Others use pure DataFrame methods
GraphFrames Motif Matching
Motif (a)-[e]->(b), evaluated against the GraphFrame's Vertex (V) and Edge (E) DataFrames:
• Vertex (a): Vertices as a UDT "A"
  Result: A: <VertexRow>
• Edge [e]: Edges as UDT "E", Join with edges where A.id = E.src
  Result: A: <VertexRow>, E: <EdgeRow>
• Vertex (b): Vertices as UDT "B", Join with vertices where E.dst = B.id
  Result: A: <VertexRow>, E: <EdgeRow>, B: <VertexRow>
THAT'S SO MANY JOINS
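The join sequence for the motif (a)-[e]->(b) can be sketched over plain Python rows. This is an illustration of the relational idea only: `find_a_e_b` is a hypothetical helper using nested-loop joins, whereas GraphFrames itself exposes motif matching through `find("(a)-[e]->(b)")` on a GraphFrame and leaves the join strategy to Spark.

```python
vertices = [
    {"id": "Geordi", "job": "Chief Engineer", "species": "Human"},
    {"id": "Data", "job": "Science Officer", "species": "Android"},
]
edges = [{"src": "Geordi", "dst": "Data", "relationship": "Friend"}]


def find_a_e_b(vertices, edges):
    """Evaluate the motif (a)-[e]->(b) as two nested-loop joins."""
    # Step 1: wrap each vertex row under the name "a".
    rows = [{"a": v} for v in vertices]
    # Step 2: join with edges where a.id = e.src.
    rows = [{**r, "e": e} for r in rows for e in edges if r["a"]["id"] == e["src"]]
    # Step 3: join with vertices where e.dst = b.id.
    rows = [{**r, "b": v} for r in rows for v in vertices if r["e"]["dst"] == v["id"]]
    return rows


matches = find_a_e_b(vertices, edges)
print([(m["a"]["id"], m["e"]["relationship"], m["b"]["id"]) for m in matches])
```

Every additional hop in a motif adds another edge join and another vertex join to this pipeline.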
DataFrames mean Optimizations are Automatic
[Figure: the Vertex, Edge, Vertex join pipeline producing A; then A, E; then A, E, B]
Columns Pruned and Predicates Pushed
[Figure: the same join pipeline with "Select A.ID" applied to its output]
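A hand-written sketch of what pruning columns and pushing predicates buys, in plain Python. It compares a naive plan with a rewritten one for the query "select a.id where b.id equals some value" and counts the rows each plan materializes. The function names, the `work` counter, and the extra "Worf" rows are hypothetical, and the rewrite is done by hand here; it is not Catalyst's actual plan.

```python
# Hypothetical rows; "Worf" and the extra edges are not in the slides.
vertices = [
    {"id": "Geordi", "job": "Chief Engineer"},
    {"id": "Data", "job": "Science Officer"},
    {"id": "Worf", "job": "Security Chief"},
]
edges = [
    {"src": "Geordi", "dst": "Data", "relationship": "Friend"},
    {"src": "Worf", "dst": "Data", "relationship": "Friend"},
    {"src": "Data", "dst": "Geordi", "relationship": "Friend"},
]


def join_then_filter(vertices, edges, wanted_dst):
    """Naive plan: build full (a, e, b) rows, then filter and project."""
    rows = [(a, e, b)
            for a in vertices
            for e in edges if a["id"] == e["src"]
            for b in vertices if e["dst"] == b["id"]]
    work = len(rows)  # number of full-width rows materialized
    return sorted(a["id"] for (a, e, b) in rows if b["id"] == wanted_dst), work


def pushed_down(vertices, edges, wanted_dst):
    """Rewritten plan: filter edges first and carry only the id columns."""
    ids = {v["id"] for v in vertices}                 # vertices pruned to one column
    kept = [(e["src"], e["dst"])                      # predicate applied before joining
            for e in edges if e["dst"] == wanted_dst]
    work = len(kept)
    return sorted(src for (src, dst) in kept if src in ids and dst in ids), work


print(join_then_filter(vertices, edges, "Data"))
print(pushed_down(vertices, edges, "Data"))
```

Both plans return the same ids, but the rewritten one never builds the wide rows it was going to throw away.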
All of the normal optimizations happen within this Framework
[Figure: the join pipeline with "Broadcast?" marked at each of the two joins]
Code Generation and Internal Rows
[Figure: the join pipeline with "Code Generation" marked at five points]
GraphFrames Examples
GraphFrames Pros/Cons
Pros
• Much Faster on basic counts
• Powerful optimizations + CodeGen
• Easy to connect to other sources
Cons
• Slower on complex traversals (2 Joins per hop)
• Relational Model not as Flexible
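The "2 Joins per hop" figure can be set against the "single shuffle per message pass" figure from the TinkerPop pros list with some simple arithmetic. The two functions below are hypothetical helpers that only restate those two numbers; they count operations and are not a performance model.

```python
def graphframes_joins(hops):
    """Joins for a motif chain of `hops` edges, at 2 joins per hop."""
    return 2 * hops


def tinkerpop_shuffles(hops):
    """Shuffles for `hops` message passes, at 1 shuffle per pass."""
    return hops


for hops in (1, 2, 4):
    print(hops, graphframes_joins(hops), tinkerpop_shuffles(hops))
```

The gap between the two counts widens linearly with path length, which is why long paths are where the difference between the frameworks shows up.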
Choosing the Right Framework
Choose TinkerPop OLAP For Long Paths
• More complicated queries
• Traversals that require many hops
• g.V().out().out().out().out()
• Avoid for simple counts and aggregations
• Avoid if you have very high degree Vertices
Choose GraphFrames for Interoperability and Short Paths
• General Edge/Vertex stats: groupCount, min, max
• Connecting to other sources
• Short paths
• High Degree Vertices
• Avoid for long path algorithms
Choosing the Right Framework
• Gremlin on GraphFrames
• OLTP backed by DSE Graph
• Built in Spark
• We write it!
• Search Built In!
• Advanced Security
Thanks for Listening
DataStax Academy Graph Course
https://academy.datastax.com/resources/ds330-datastax-enterprise-graph
Try out DataStax Enterprise!
https://academy.datastax.com/quick-downloads
Apache TinkerPop
http://tinkerpop.apache.org/
GraphFrames Link
https://graphframes.github.io/
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 

Tale of Two Graph Frameworks: Graph Frames and Tinkerpop

  • 1. Artem Aliev and Russell Spitzer, DataStax: A Tale of Two Graph Frameworks on Spark: GraphFrames and TinkerPop OLAP #EUeco3
  • 2. Pierrot and Harlequin • Artem: Graph Analytics Expert, Earth • Russell: Distributed Systems Enthusiast, Earth
  • 3. TinkerPop and GraphFrames provide Complementary Approaches for Graph Analytics • DataSet • Catalyst • GraphFrames
  • 4. Graphs are Vertices and Edges • Vertices are things and edges represent their relations to one another
  • 5. Graphs are Vertices and Edges • Registry: USS Enterprise (NCC-1701-C), Class: Ambassador, Service: 2332–2344 (12 Years) • Registry: USS Enterprise (NCC-1701-D), Class: Galaxy, Service: 2363–2371 (8 Years) • Registry: USS Enterprise (NCC-1701), Class: Constitution class, Service: 2245–2285 (40 Years) • Registry: USS Enterprise (NCC-1701-A), Class: Enterprise class, Service: 2286–2293 (7 Years)
  • 6. Graphs are Vertices and Edges (same four ships as slide 5) • Registry, Class, and Service are Vertex Properties
  • 7. Graphs are Vertices and Edges (same four ships) • Three "succeeded by" relations connect the ships
  • 8. Graphs are Vertices and Edges (same four ships) • Each "succeeded by" relation is an Edge • "succeeded by" is the Edge Label
  • 9. Graphs are Vertices and Edges (same four ships) • Each ship carries the Vertex Label "Ship"
  • 10. Graphs are Vertices and Edges (ships as on slide 5, each labeled Ship, linked by three "succeeded by" edges) • Two vertices labeled Crew are added: Position: Captain, Name: Kirk and Position: Captain, Name: Picard
  • 11. Graphs are Vertices and Edges (same Ship and Crew vertices) • Four "served on" edges connect the Crew vertices to the Ship vertices
  • 12. Graphs are Vertices and Edges (same graph) • But why do I want this?
  • 13. Graphs let us ask questions about our data based on their relations • What Captain Served After Kirk? • What Ship was two after the NCC-1701?
  • 14. Traversals involve following paths through the Graph (shown on the full Ship and Crew graph from slide 11)
  • 15. What Captain was After Kirk? • Path shown: Kirk (Crew, Position: Captain) served on NCC-1701-A (Ship, Enterprise class, 2286–2293), which was succeeded by NCC-1701-C (Ship, Ambassador, 2332–2344), on which Picard (Crew, Position: Captain) served
  • 16. What Ship was two after the NCC-1701? • Path shown: NCC-1701 (Ship, Constitution class, 2245–2285) succeeded by NCC-1701-A (Ship, Enterprise class, 2286–2293) succeeded by NCC-1701-C (Ship, Ambassador, 2332–2344)
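The two questions on slides 15 and 16 can be sketched as traversals over a plain edge list. This is a minimal illustration in plain Python, not TinkerPop or GraphFrames code: the `EDGES` list, the `out` and `in_` helpers, and the Crew-to-Ship direction of "served on" are assumptions made for the example, with the edges taken from the adjacency list shown on slide 24.

```python
# Hypothetical in-memory copy of the slide graph: (source, edge label, destination).
# The direction of "served on" (Crew -> Ship) is an assumption for illustration.
EDGES = [
    ("NCC-1701", "succeeded by", "NCC-1701-A"),
    ("NCC-1701-A", "succeeded by", "NCC-1701-C"),
    ("NCC-1701-C", "succeeded by", "NCC-1701-D"),
    ("Kirk", "served on", "NCC-1701"),
    ("Kirk", "served on", "NCC-1701-A"),
    ("Picard", "served on", "NCC-1701-C"),
    ("Picard", "served on", "NCC-1701-D"),
]


def out(vertex_ids, label):
    """Follow outgoing edges with the given label from every vertex in vertex_ids."""
    return [dst for v in vertex_ids for (src, lbl, dst) in EDGES if src == v and lbl == label]


def in_(vertex_ids, label):
    """Follow incoming edges with the given label into every vertex in vertex_ids."""
    return [src for v in vertex_ids for (src, lbl, dst) in EDGES if dst == v and lbl == label]


# What Captain served after Kirk?
# Kirk -> ships he served on -> their successors -> crew who served on those ships.
kirk_ships = out(["Kirk"], "served on")
later_ships = out(kirk_ships, "succeeded by")
captains_after_kirk = sorted(set(in_(later_ships, "served on")) - {"Kirk"})

# What Ship was two after the NCC-1701? Two "succeeded by" hops from the start vertex.
two_after = out(out(["NCC-1701"], "succeeded by"), "succeeded by")

print(captains_after_kirk)
print(two_after)
```

Each helper call corresponds to one hop of the path drawn on the slide, which is why the second question is simply two chained `out` calls.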
  • 17. TinkerPop is a Powerful and Flexible Graph Framework • Server, Language, Connectors • Graph Framework for OLAP and OLTP • Node Centric Representations • Fluent API (Gremlin) • Fully Self Contained Framework
  • 21. What happens when you have too much data?
  • 22. TinkerPop Spark OLAP Mechanism • Instead of one traversal we traverse starting from all nodes simultaneously
  • 23. Distribution Requires Partitioning • Big Data has to be split into Independent Chunks of Data
  • 24. Vertex Stored in a PairRDD: Id -> StarVertex (Edge and Property Information) • Example vertices: 1, A, C, D • Star Vertex: Adjacency list representation, holding Just the Id Of each Connected Vertex:
 1: "A", "Kirk"
 A: "C", "Kirk"
 C: "D", "Picard"
 D: "Picard"
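The Id -> StarVertex layout on slide 24 can be sketched as key/value pairs that are then split into independent chunks. This is a plain-Python illustration only: the `StarVertex` dataclass here is a hypothetical stand-in for TinkerPop's class of the same name, and the CRC-based `partition_of` function is an assumed partitioner chosen for determinism, not the one Spark or TinkerPop uses.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class StarVertex:
    """Hypothetical stand-in: one vertex, its properties, and only the ids of its neighbours."""
    vertex_id: str
    properties: dict = field(default_factory=dict)
    adjacent_ids: list = field(default_factory=list)


# Adjacency lists exactly as listed on the slide.
ADJACENCY = {
    "1": ["A", "Kirk"],
    "A": ["C", "Kirk"],
    "C": ["D", "Picard"],
    "D": ["Picard"],
}

# The "PairRDD": a collection of (id, StarVertex) pairs.
pairs = [(vid, StarVertex(vid, adjacent_ids=list(adj))) for vid, adj in ADJACENCY.items()]


def partition_of(vertex_id, num_partitions):
    """Assumed partitioner: a deterministic hash of the vertex id picks the chunk."""
    return zlib.crc32(vertex_id.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 2
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for vid, star in pairs:
    partitions[partition_of(vid, NUM_PARTITIONS)].append((vid, star))

for p, chunk in partitions.items():
    print(p, [vid for vid, _ in chunk])
```

Because each star vertex carries its own edge list, a chunk can be processed without looking at any other chunk, which is also why a very high degree vertex makes a single record large.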
  • 25. Vertex Program Runs, Initializing a Traverser for every Vertex (1, A, C, D) • SparkMemory (Accumulator) is Used for Global State
  • 26. Then we cycle through a Message Passing Algorithm • SparkMemory (Accumulator) is Used for Global State
  • 27. Then we cycle through a Message Passing Algorithm • Passes messages from one Vertex to another with a join
  • 28. Then we cycle through a Message Passing Algorithm • Repeat
  • 29. Then we cycle through a Message Passing Algorithm • The cycle ends when All Traversers Halt Or the Program Terminates • Result!
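The cycle on slides 25 to 29 can be sketched as a loop of supersteps. This is a simplified plain-Python model, not TinkerPop's vertex program API: `run_vertex_program`, the `OUT_EDGES` chain (the "succeeded by" edges from the earlier slides), and the `memory` dict standing in for the SparkMemory accumulator are all assumptions for illustration.

```python
from collections import Counter

# "succeeded by" chain between the example vertices 1, A, C, D.
OUT_EDGES = {"1": ["A"], "A": ["C"], "C": ["D"], "D": []}


def run_vertex_program(out_edges, max_hops):
    """Start one traverser at every vertex, then move all of them one hop per superstep."""
    traversers = Counter({vertex: 1 for vertex in out_edges})  # initialization
    memory = {"supersteps": 0}  # stand-in for global state kept in an accumulator
    for _ in range(max_hops):
        if not traversers:  # all traversers have halted
            break
        messages = Counter()
        for vertex, count in traversers.items():
            for neighbour in out_edges[vertex]:
                messages[neighbour] += count  # message addressed to the neighbour
        # Delivering messages by destination id plays the role of the join.
        traversers = messages
        memory["supersteps"] += 1
    return traversers, memory


after_two_hops, memory_two = run_vertex_program(OUT_EDGES, max_hops=2)
after_many_hops, memory_many = run_vertex_program(OUT_EDGES, max_hops=10)
print(dict(after_two_hops), memory_two)
print(dict(after_many_hops), memory_many)
```

The second call shows the halting condition: once no traverser has anywhere left to go, the loop stops even though more hops were allowed.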
  • 31. TinkerPop Spark OLAP Pros/Cons
 Pros: • Every message pass requires only a single shuffle • Edges and edge properties accessible without a step • Very Flexible, Many Provider Specific Shortcuts possible • Internal properties can be any Java type • All in one, Server already ready for multiple clients
 Cons: • Limited in ability to connect to external sources/other Spark applications • Flexibility of framework allows for many platform specific shortcuts to be added • Genericness makes some optimizations difficult • Edges co-partitioned with vertices, so high degree nodes can cause memory issues
  • 32. GraphFrames Background • Third Party Package • https://graphframes.github.io/ • Integrates with Dataset/DataFrame in Spark • Relational under the hood
  • 33. GraphFrames are built of two DataFrames • Each DataFrame is made of Rows and Columns
  • 34. GraphFrames are built of two DataFrames
 Vertex DataFrame:
 id | job | species
 Geordi | Chief Engineer | Human
 Data | Science Officer | Android
 Edge DataFrame:
 src | dst | relationship
 Geordi | Data | Friend
  • 35. GraphFrames are built of two DataFrames (same tables as slide 34) • Column values Can Only Be Spark Types
  • 36. GraphFrames are built of two DataFrames (same tables as slide 34) • No Built in Labels
  • 37. Catalyst Optimizes any Requests • Simple requests using the DataFrame API don't do anything special • Some methods fall back to GraphX (RDD Based) • Others use pure DataFrame methods
  • 39. GraphFrames Motif Matching: GraphFrame (a)-[e]->(b) • Vertex (a): Vertices (V) as a UDT "A" • Rows so far: A: <VertexRow>
  • 40. GraphFrames Motif Matching: GraphFrame (a)-[e]->(b) • Edge [e]: Edges (E) as UDT "E" • Join with edges where A.id = E.src • Rows so far: A: <VertexRow>, E: <EdgeRow>
  • 41. GraphFrames Motif Matching: GraphFrame (a)-[e]->(b) • Vertex (b): Vertices as UDT "B" • Join with vertices where E.dst = B.id • Rows so far: A: <VertexRow>, E: <EdgeRow>, B: <VertexRow>
  • 42. GraphFrames Motif Matching: GraphFrame (a)-[e]->(b) (same two joins as slide 41) • THAT'S SO MANY JOINS
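The two joins behind the motif (a)-[e]->(b) can be sketched over the Geordi and Data tables from slide 34. This is plain Python over lists of dicts rather than Spark DataFrames, and `find_a_e_b` is a hypothetical helper written for the example; in GraphFrames itself the motif is passed as a string to `find`, as in `g.find("(a)-[e]->(b)")`.

```python
# Vertex and edge tables from the slides, one dict per row.
VERTICES = [
    {"id": "Geordi", "job": "Chief Engineer", "species": "Human"},
    {"id": "Data", "job": "Science Officer", "species": "Android"},
]
EDGES = [
    {"src": "Geordi", "dst": "Data", "relationship": "Friend"},
]


def find_a_e_b(vertices, edges):
    """Hypothetical helper: match (a)-[e]->(b) with two nested-loop joins."""
    # Join 1: vertices as "a" with edges as "e" where a.id = e.src.
    a_e = [{"a": a, "e": e} for a in vertices for e in edges if a["id"] == e["src"]]
    # Join 2: the result with vertices as "b" where e.dst = b.id.
    return [{**row, "b": b} for row in a_e for b in vertices if row["e"]["dst"] == b["id"]]


matches = find_a_e_b(VERTICES, EDGES)
for match in matches:
    print(match["a"]["id"], match["e"]["relationship"], match["b"]["id"])
```

Each result row nests a whole vertex row under `a` and `b` and a whole edge row under `e`, mirroring the A, E, B columns built up on the slides.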
  • 43. DataFrames means Optimizations are Automatic • Pipeline: Vertex, then Edge, then Vertex, producing A: <VertexRow>, then A: <VertexRow>, E: <EdgeRow>, then A: <VertexRow>, E: <EdgeRow>, B: <VertexRow>
  • 44. Select A.ID on the same pipeline • Columns Pruned and Predicates Pushed
  • 48. All of the normal optimizations happen within this Framework • Each of the two joins is a candidate for Broadcast
  • 49. Code Generation and Internal Rows • Code Generation applies at every stage of the Vertex, Edge, Vertex pipeline
  • 51. GraphFrames Pros/Cons
 Pros: • Much Faster on basic counts • Powerful optimizations + CodeGen • Easy to connect to other sources
 Cons: • Slower on complex traversals (2 Joins per hop) • Relational Model not as Flexible
  • 53. Choose TinkerPop OLAP For Long Paths • More complicated queries • Traversals that require many hops, e.g. g.V().out.out.out.out • Avoid for simple counts and aggregations • Avoid if you have very high degree Vertices
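The "2 Joins per hop" cost from slide 51 can be made concrete by running a multi-hop traversal such as g.V().out.out.out.out in a relational style and counting the joins. This is a rough plain-Python model, not a measurement of either framework: `out_hops_relational` and the ship tables are assumptions for illustration, and real engines may plan the joins differently.

```python
SHIP_IDS = ["NCC-1701", "NCC-1701-A", "NCC-1701-C", "NCC-1701-D"]
SUCCEEDED_BY = [
    ("NCC-1701", "NCC-1701-A"),
    ("NCC-1701-A", "NCC-1701-C"),
    ("NCC-1701-C", "NCC-1701-D"),
]


def out_hops_relational(start_ids, vertex_ids, edge_rows, hops):
    """Repeat an out() step `hops` times using joins, and count the joins performed."""
    frontier = list(start_ids)
    joins = 0
    for _ in range(hops):
        # Join 1: current vertices with edges where vertex id = edge src.
        matched = [(src, dst) for v in frontier for (src, dst) in edge_rows if src == v]
        joins += 1
        # Join 2: matched edges with vertices where edge dst = vertex id.
        frontier = [v for (_src, dst) in matched for v in vertex_ids if v == dst]
        joins += 1
    return frontier, joins


two_hop_result, two_hop_joins = out_hops_relational(SHIP_IDS, SHIP_IDS, SUCCEEDED_BY, hops=2)
four_hop_result, four_hop_joins = out_hops_relational(SHIP_IDS, SHIP_IDS, SUCCEEDED_BY, hops=4)
print(two_hop_result, two_hop_joins)
print(four_hop_result, four_hop_joins)
```

The join count grows as twice the number of hops, whereas the message passing approach described earlier needs one shuffle per hop, which is the trade-off behind preferring TinkerPop OLAP for long paths.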
  • 54. Choose GraphFrames for Interoperability and Short Paths • General Edge/Vertex stats: groupCount, min, max • Connecting to other sources • Short paths • High Degree Vertices • Avoid: Long path algorithms
  • 55. Choosing the Right Framework • Gremlin on GraphFrames • OLTP backed by DSE Graph • Built in Spark • We write it! • Search Built In! • Advanced Security
  • 56. Thanks for Listening
 DataStax Academy Graph Course: https://academy.datastax.com/resources/ds330-datastax-enterprise-graph
 Try out DataStax Enterprise: https://academy.datastax.com/quick-downloads
 Apache TinkerPop: http://tinkerpop.apache.org/
 GraphFrames: https://graphframes.github.io/