Graph Data Science: The Secret to Accelerating Innovation with AI/ML
1. Neo4j, Inc. All rights reserved 2021
Graph Data Science:
Accelerating AI & Machine Learning
Alicia Frame, PhD
Director, Data Science @ Neo4j
2.
Neo4j: The Connected Data Company
20 of the top 25 financial firms
7 of the top 10 retailers
7 of the top 10 software vendors
Neo4j is the creator of:
• The world’s leading graph database
• The first graph data science platform
• The most flexible graph data model
• The easiest-to-use graph query language
Thousands of Organizations Use Neo4j
Silicon Valley | London | Munich
Paris | Malmö
3.
Node
Represents an entity in the graph
Relationship
Connects nodes to each other
Property
Describes a node or relationship:
e.g. name, age, weight, etc.
What’s a graph?
[Diagram: example property graph]
Nodes:
• ANDRE (Name: “Andre”, Born: May 29, 1970, Twitter: “@dan”)
• MICA (Name: “Mica”, Born: Dec 5, 1975)
• CAR (Brand: “Volvo”, Model: “V70”)
Relationships: LOVES, LIVES WITH, OWNS, DRIVES
Relationship property: Since: Jan 10, 2011
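The node / relationship / property model on this slide can be sketched in a few lines of Python. This is only an illustration of the data model, not Neo4j's storage format or API. The class names, the `outgoing` helper, the direction of each relationship, and the choice to attach "Since" to DRIVES are all assumptions, since the slide text does not state them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, with a label and free-form properties."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection between two nodes, with its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


andre = Node("Person", {"name": "Andre", "born": "May 29, 1970", "twitter": "@dan"})
mica = Node("Person", {"name": "Mica", "born": "Dec 5, 1975"})
car = Node("Car", {"brand": "Volvo", "model": "V70"})

# Assumption: directions and the placement of "since" are guesses for illustration.
relationships = [
    Relationship(andre, "LOVES", mica),
    Relationship(mica, "LOVES", andre),
    Relationship(andre, "LIVES_WITH", mica),
    Relationship(mica, "OWNS", car),
    Relationship(andre, "DRIVES", car, {"since": "Jan 10, 2011"}),
]


def outgoing(node, rel_type, rels):
    """Return the end nodes of `rel_type` relationships that start at `node`."""
    return [r.end for r in rels if r.start is node and r.type == rel_type]


loved_by_andre = [n.properties["name"] for n in outgoing(andre, "LOVES", relationships)]
```

Here `loved_by_andre` holds the names reached from Andre over LOVES relationships, which is the kind of traversal a graph query expresses directly.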
4.
Everything is Naturally Connected
Networks of People
Employees, Customers, Suppliers, Partners, Influencers, etc.
[Diagram relationships: Knows]
Transaction Networks
Risk management, Supply chain, Orders, Payments, etc.
[Diagram relationships: Bought, Viewed, Returned]
Knowledge Networks
Enterprise content, Domain specific content, eCommerce content, etc.
[Diagram relationships: Plays, Lives_in, In_sport, Likes, Fan_of, Plays_for]
5.
Network Structure is Highly Predictive
Higher Pay and More Promotions
• People Near Structural Holes
• Organizational Misfits
Photo by Helena Lopes on Unsplash
“Organizational Misfits and the Origins of Brokerage in Intrafirm Networks” A. Kleinbaum
“Structural Holes and Good Ideas” R. Burt
6.
Consider What Drives Your Business
It’s not the numbers, it’s the relationships behind them
Plants
Warehouses
Suppliers
Distributors
Competitors
Partners
Regulations
Employees
Citizens
Customers
Products
Parts
Services
Regions
7.
“Relationships are the strongest predictors of behavior” (James Fowler)
But You Can’t Analyse What You Can’t See
● Most data science techniques ignore relationships
● It’s painful to manually engineer connected features from tabular data
● Graphs are built on relationships, so…
● You don’t have to guess at the correlations: with graphs, relationships are built in
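To make the point about connected features concrete, here is a minimal sketch of deriving two of them, degree and triangle count, from a tabular list of relationships. The table contents and all names are hypothetical. Once the rows are turned into an adjacency structure, the features fall out of set operations instead of repeated self-joins.

```python
from collections import defaultdict

# Hypothetical "knows" table: one row per undirected relationship.
rows = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "dan")]

# Turn the rows into an adjacency structure: node -> set of neighbors.
neighbors = defaultdict(set)
for a, b in rows:
    neighbors[a].add(b)
    neighbors[b].add(a)


def degree(node):
    """Number of direct connections."""
    return len(neighbors[node])


def triangle_count(node):
    """Number of triangles the node takes part in.

    Each triangle is seen once from each of the node's two other corners,
    so the raw sum is halved.
    """
    nbrs = neighbors[node]
    return sum(len(neighbors[n] & nbrs) for n in nbrs) // 2


# One feature row per node, ready to join onto an existing tabular dataset.
features = {
    node: {"degree": degree(node), "triangles": triangle_count(node)}
    for node in sorted(neighbors)
}
```

The resulting `features` dictionary can be merged into an existing feature table by node key.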
8.
According to Gartner, “Graphs form the foundation of modern D&A, with capabilities to enhance and improve user collaboration, ML models and explainable AI. The recent Gartner AI in Organizations Survey demonstrates that graph techniques are increasingly prevalent as AI maturity grows, going from 13% adoption when AI maturity is lowest to 48% when maturity is highest.”
Source: Top 10 Tech Trends in Data and Analytics, 16 Feb 2021
Analytics & Data Science Interest Exploding in Neo4j Community
[Chart: AI Research Papers Featuring Graph. Source: Dimensions Knowledge System]
● 4x increase in traffic to Neo4j GDS page in 2H-2020
● 100k+ practicing data scientists engaged with Neo4j
● +210k downloads
9.
Graphs & Data Science
Queries
Find the patterns you know exist.
Machine Learning
Uncover trends and make predictions
Visualization
Explore, collaborate, and explain
[Diagram labels: Queries, Machine Learning, Visualization; Analytics, Feature Engineering, Data Exploration; Graph Data Science]
10.
Graphs & Data Science
Knowledge Graphs
Find the patterns you’re looking for in connected data
Graph Algorithms
Use unsupervised machine learning techniques to identify associations, anomalies, and trends.
Graph Native Machine Learning
Use embeddings to learn the features in your graph that you don’t even know are important yet.
Train in-graph supervised ML models to predict links, labels, and missing data.
11.
Better Predictions with Data You Already Have
● Traditional ML ignores network structure because it’s difficult to extract
● Uncover patterns and trends you can’t find any other way
● Easily generate predictive features to incorporate into ML pipelines
Machine Learning Pipeline
12.
Neo4j’s Graph Data Science Framework
● Neo4j Graph Data Science Library: Scalable Graph Algorithms & Analytics Workspace
● Neo4j Database: Native Graph Creation & Persistence
● Neo4j Bloom: Visual Graph Exploration & Prototyping
13.
The Neo4j GDS Library
Robust Graph Algorithms & ML Methods
● Compute metrics about the topology and connectivity
● Build predictive models to enhance your graph
● Highly parallelized; scales to tens of billions of nodes
Efficient & Flexible Analytics Workspace
● Automatically reshapes transactional graphs into an in-memory analytics graph
● Optimized for global traversals and aggregation
● Create workflows and layer algorithms
● Store and manage predictive models in the model catalog
[Diagram labels: Mutable In-Memory Workspace, Computational Graph, Native Graph Store]
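As a rough illustration of what "reshaping a transactional graph into an in-memory analytics graph" can mean, the sketch below projects one node label and one relationship type out of a stored graph into compact integer-indexed arrays (a compressed sparse row layout). This is not the GDS implementation or its API. The `project` function, its arguments, and the sample data are hypothetical, and CSR is a layout chosen here only because it suits global traversals.

```python
def project(nodes, relationships, label, rel_type):
    """Project a subgraph into a CSR-style in-memory structure.

    nodes:         dict mapping node id -> label
    relationships: list of (start_id, type, end_id) tuples
    Returns (ids, offsets, targets) where the out-neighbors of node i are
    targets[offsets[i]:offsets[i + 1]], expressed as integer indexes into ids.
    """
    ids = sorted(n for n, node_label in nodes.items() if node_label == label)
    index = {n: i for i, n in enumerate(ids)}

    # Keep only relationships of the requested type between projected nodes.
    adjacency = [[] for _ in ids]
    for start, rtype, end in relationships:
        if rtype == rel_type and start in index and end in index:
            adjacency[index[start]].append(index[end])

    # Flatten the per-node lists into two contiguous arrays.
    offsets, targets = [0], []
    for nbrs in adjacency:
        targets.extend(sorted(nbrs))
        offsets.append(len(targets))
    return ids, offsets, targets


# Hypothetical stored graph mixing people, products, and two relationship types.
stored_nodes = {"a": "Person", "b": "Person", "c": "Person", "x": "Product"}
stored_rels = [
    ("a", "KNOWS", "b"),
    ("a", "KNOWS", "c"),
    ("b", "KNOWS", "c"),
    ("a", "BOUGHT", "x"),
]

ids, offsets, targets = project(stored_nodes, stored_rels, "Person", "KNOWS")

# Out-degree of every projected node in one pass over the offsets array.
out_degrees = [offsets[i + 1] - offsets[i] for i in range(len(ids))]
```

Because the projected structure holds only the nodes and relationships an algorithm needs, several algorithms can be layered on the same projection without touching the stored graph.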
14.
Neo4j’s Graph Data Science Library
Unsupervised Graph Algorithms
Clustering, Dimension Reduction (generalization), Association
● Community Detection: Which parts of my graph are connected to each other?
● Similarity: Which nodes are most similar?
● Centrality: How important is each node?
● Embeddings
● Pathfinding
More algorithms than any other vendor
Supervised Machine Learning
● Node Classification: What’s the label for this node?
● Link Prediction: Where will connections form next?
Only in Neo4j
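The three questions above map onto simple textbook algorithms. The sketch below uses connected components for "which parts are connected", Jaccard similarity of neighbor sets for "which nodes are most similar", and degree centrality for "how important is each node". These are deliberately basic stand-ins chosen for brevity, not the GDS library's implementations, and the edge list is made up.

```python
from collections import deque

# Hypothetical undirected graph with two separate groups.
edges = [("a", "b"), ("b", "c"), ("d", "e")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def connected_components(adj):
    """Group nodes that can reach each other, using breadth-first search."""
    seen, components = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nbr in sorted(adj[node]):
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(sorted(component))
    return components


def jaccard(adj, u, v):
    """Shared neighbors divided by all neighbors of either node."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def degree_centrality(adj):
    """Score each node by its number of connections."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


components = connected_components(adj)
similarity_a_c = jaccard(adj, "a", "c")
centrality = degree_centrality(adj)
```

In this toy graph, "a" and "c" share their only neighbor, so they come out as maximally similar even though they are not directly connected.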
16.
Graph Features & Graph Models for Predictions
Predictions influenced by graph structure
Traditional ML problems where relationships between your data points are important predictive features
Predictions about graph structure
Enhance your graph by predicting missing data or changes to your graph that will occur in the future
17.
Neo4j’s In-Graph ML Models
Node classification: “What kind of node is this?”
Link prediction: “Should there be a relationship between these nodes?”
Labeled data: Pairs of nodes that are either linked or not
Features: Pre-existing attributes, algorithms (PageRank), embeddings
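A minimal sketch of how the labeled data and features for link prediction might be assembled, assuming a small made-up graph. Positive examples are existing relationships, negative examples are sampled pairs with no relationship, and each pair gets two simple structural features (common neighbors and preferential attachment). The feature choice, the helper names, and the fixed random seed are assumptions for illustration; this is not the Neo4j in-graph pipeline.

```python
import random
from itertools import combinations

# Hypothetical undirected graph.
edges = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")}
nodes = sorted({n for edge in edges for n in edge})

adj = {n: set() for n in nodes}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)


def pair_features(u, v):
    """Structural features for a candidate pair of nodes."""
    common_neighbors = len(adj[u] & adj[v])
    preferential_attachment = len(adj[u]) * len(adj[v])
    return [common_neighbors, preferential_attachment]


# Positive class: pairs that are linked.
positives = sorted(edges)

# Negative class: an equally sized sample of pairs that are not linked.
non_edges = [(u, v) for u, v in combinations(nodes, 2) if v not in adj[u]]
rng = random.Random(42)  # fixed seed so the sample is repeatable
negatives = rng.sample(non_edges, k=min(len(positives), len(non_edges)))

# Feature matrix and labels, ready for any binary classifier.
X = [pair_features(u, v) for u, v in positives + negatives]
y = [1] * len(positives) + [0] * len(negatives)
```

In a real workflow the relationships held out for testing would be removed from the graph before features are computed, so the model cannot see the answer it is asked to predict.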
18.
The Only Completely In-Graph ML Workflow
Graph-Native Feature Engineering
● Queries
● Algorithms
● Embeddings
Train Predictive Model
1. Model Type
2. Property Selection
3. Train & Test
4. Model Selection
Apply Model to Existing / New Data
● Use Predictions for Decisions
● Use Predictions to Enhance the Graph
Publish & Share
Store Model in Database
19.
Graph Data Science Answers the BIG Questions
Connected Data is Powerful
But traditional approaches to data make it impossible to reveal and effectively use those connections as data sizes become large
Predictive signals get lost in big data noise
Graph Data Science uses Connections to Answer Critical Questions
● What’s most important and influential in my business?
● What’s occurring that’s unusual?
● What’s going to happen next?
20.
Resources
Graph Resources
● Video: Advantages of Graph Technology
● Code: https://github.com/neo4j/graph-data-science/
● Whitepaper: Financial Fraud Detection with Graph Data Science
● Case Study: Meredith Corporation
Neo4j BookShelf
● Graph Databases For Dummies
● Graph Data Science For Dummies
● O’Reilly Graph Algorithms