In this webinar we give an overview of our offering for data scientists and show what is already possible today with relatively little time and effort.
1. Neo4j, Inc. All rights reserved 2021
Welcome!
Getting Started with Neo4j Graph Data Science
Alexander.Katzdobler@neo4j.com
andrew.frei@neo4j.com
2. Housekeeping
○ Questions can be posted in the chat at any time during the webinar; they will be answered at the end.
○ Information about the webinar will be sent to all participants afterwards.
3. Neo4j - The Graph Company
The Industry’s Largest Dedicated Investment in Graphs
Creator of the Property Graph and Cypher language at the core of the GQL ISO project
Thousands of Customers World-Wide
HQ in Silicon Valley, offices include London, Munich, Paris & Malmo
Industry Leaders use Neo4j
• 7/10 Top Life Science Firms
• 20/25 Top Financial Firms
• 7/10 Top Software Vendors
Any Way You Like It
• On-Prem
• DB-as-a-Service
• In the Cloud
4. Highly Valuable Connected Data Use Cases Drive Enterprise Adoption
• Network & IT Operations
• Fraud Detection
• Identity & Access Management
• Knowledge Graph
• Master Data Management
• Real-Time Recommendations
5. Graph is the Fastest Growing DBMS Category, Neo4j is the Leading Player
FASTEST GROWING CATEGORY
MOST POPULAR WITH DEVELOPERS
STRONGEST COMMUNITY
• Developers: 220k+
• LinkedIn Skills: 41k+ members
• Meetups: 72k+ members globally
6. Neo4j Invented the Labeled Property Graph Model
[Figure: example property graph. Two PERSON nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975) and a CAR node (brand: “Volvo”, model: “V70”). Relationships: MARRIED TO, LIVES WITH, DRIVES, OWNS. Relationship property since: Jan 10, 2011. Further properties shown: Latitude: 37.5629900°, Longitude: -122.3255300°.]
Nodes
• Can have Labels to classify nodes
• Labels have native indexes
Relationships
• Relate nodes by type and direction
Properties
• Attributes of Nodes & Relationships
• Stored as Name/Value pairs
• Can have indexes and composite indexes
• Visibility security by user/role
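The model on this slide can be sketched in a few lines of plain Python. This is only an illustration of the concepts (labels, typed and directed relationships, name/value properties), not how Neo4j stores data. The class names `Node` and `Relationship` and the helper `outgoing` are hypothetical, and the slide text does not say which person drives or owns the car or which relationship carries `since`, so those assignments are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                      # labels classify a node
    properties: dict = field(default_factory=dict)   # name/value pairs


@dataclass
class Relationship:
    type: str                                        # relationship type
    start: Node                                      # direction: start -> end
    end: Node
    properties: dict = field(default_factory=dict)   # relationships carry properties too


dan = Node({"Person"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "1975-12-05"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship("MARRIED_TO", dan, ann),
    Relationship("LIVES_WITH", dan, ann),
    # Assumption: who drives/owns the car and where "since" belongs
    # cannot be recovered from the slide text.
    Relationship("DRIVES", dan, car, {"since": "2011-01-10"}),
    Relationship("OWNS", ann, car),
]


def outgoing(node, rel_type):
    """Return the end nodes of all relationships of rel_type that start at node."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


driven = outgoing(dan, "DRIVES")
print([n.properties["model"] for n in driven])
```

Because direction is part of the model, `outgoing(car, "DRIVES")` is empty even though a DRIVES relationship touches the car node.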
7. Graph Databases Unlock Data-context for Richer Insights, Better Decisions, and Faster Innovation
Shifting from Data-Driven to Intelligence-Driven
[Figure: example graphs with the relationship labels Bought, Viewed, Returned, Knows, Plays, Lives_in, In_sport, Likes, Fan_of, Plays_for.]
People
E.g., Employees, Customers, Suppliers, Partners, Influencers
Transactions
E.g., Risk management, Supply chain, Payments
Knowledge
E.g., Enterprise content, Knowledgebase, eCommerce content
8. Which of the colored nodes would be considered the most ‘important'?
[Figure: network graph with nodes A through N.]
9. Which of the colored nodes would be considered the most ‘important'?
Node D has the highest valence (degree).
This is the most connected individual in the network. If importance is how well you are personally known, you pick D.
Node G has the highest closeness centrality (0.52).
Information will disperse through the network more quickly through this individual. If you need to get a message out rapidly, choose them.
Node I has the highest betweenness centrality (0.59).
This person is an efficient connector of other people. Risk of disruption is higher if you lose this individual.
[Figure: network graph with nodes A through N.]
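The three notions of importance on this slide can be computed with a short standard-library sketch. The edges of the slide's network are not recoverable from the text, so the edge list below is a hypothetical example and will not reproduce the 0.52 and 0.59 values. Closeness is normalized as (reachable nodes - 1) / sum of distances, and betweenness uses Brandes' algorithm without normalization; the slide may use different normalizations.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Valence: the number of direct connections of each node."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def closeness_centrality(adj):
    """(reachable nodes - 1) / sum of shortest-path distances, via BFS."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(adj):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalized)."""
    score = {node: 0.0 for node in adj}
    for s in adj:
        order, dist = [], {s: 0}
        preds = {node: [] for node in adj}
        sigma = {node: 0 for node in adj}   # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {node: 0.0 for node in adj}
        for w in reversed(order):           # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical network: two small groups joined through a single bridge node.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
         ("d", "e"), ("e", "f"), ("e", "g"), ("f", "g")]
adj = build_adjacency(edges)
for name, metric in [("degree", degree_centrality),
                     ("closeness", closeness_centrality),
                     ("betweenness", betweenness_centrality)]:
    scores = metric(adj)
    print(name, max(scores, key=scores.get))
```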
10. What is Graph Data Science?
Graph Data Science is a science-driven approach to gain knowledge from the relationships and structures in data, typically to power predictions.
Data scientists use relationships to answer questions.
11. Why Graph Data Science?
Relationships and network structures are highly predictive and underutilized, and they are already in your data.
● Relationships are the strongest predictor of behavior - James Fowler (“Connected”)
Graphs are a natural way to store and use this predictive information, but different from what you’re doing today.
Productionize more accurate and predictive models.
12. Neo4j for Graph Data Science
Neo4j Graph Data Science Library: Scalable Graph Algorithms & Analytics Workspace
Practical way to harness the natural power of relationships and network structures to infer behavior
Neo4j Database: Native Graph Creation & Persistence
Automated transformations with an integrated database built to store and protect relationships
Neo4j Bloom: Visual Graph Exploration & Prototyping
Explore results visually, quickly prototype and collaborate with different groups.
13. Graphs & Data Science
Knowledge Graphs
Find the patterns you’re looking for in connected data
Graph Algorithms
Use unsupervised machine learning techniques to identify associations, anomalies, and trends.
Graph Native Machine Learning
Use embeddings to learn the features in your graph that you don’t even know are important yet.
Train in-graph supervised ML models to predict links, labels, and missing data.
14. When do I need Graph Algorithms?
Query (e.g. Cypher/Python): Local Patterns
Real-time, local decisioning and pattern matching
You know what you’re looking for and making a decision
Graph Algorithms: Global Computation
Global analysis and iterations
You’re learning the overall structure of a network, updating data, and predicting
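The difference between a local pattern and a global computation can be illustrated with a minimal Python sketch. The function names and the sample graph are hypothetical; in practice the local query would typically be written in Cypher and the global step run as a library algorithm.

```python
def friends_of_friends(adj, person):
    """Local pattern: starts from one known node and only touches
    its two-hop neighborhood."""
    direct = adj[person]
    two_hops = {fof for friend in direct for fof in adj[friend]}
    return two_hops - direct - {person}


def connected_components(adj):
    """Global computation: has to visit every node in the graph
    to learn its overall structure."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical undirected graph as adjacency sets.
adj = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "x": {"y"}, "y": {"x"},
}
print(friends_of_friends(adj, "a"))
print(connected_components(adj))
```

The local function's cost depends on the neighborhood of the start node, while the global one grows with the size of the whole graph.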
15. The Neo4j GDS Library
Robust Graph Algorithms
● Compute connectivity metrics and learn the topology of your graph
● Highly parallelized and scale to tens of billions of nodes
Efficient & Flexible Analytics Workspace
● Automatically reshapes transactional graphs into an in-memory analytics graph
● Optimized for analytics with global traversals and aggregation
● Create workflows and layer algorithms
[Figure: Native Graph Store and Mutable In-Memory Workspace (Computational Graph).]
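To give a feel for what reshaping a transactional graph into an in-memory analytics graph can mean, here is a conceptual sketch that packs relationships into compressed adjacency arrays. The slide does not describe the library's internal data structures, so this layout and the names `project_to_csr` and `neighbors` are assumptions made for illustration only.

```python
def project_to_csr(node_ids, relationships):
    """Pack directed (source, target) pairs into two flat integer arrays.

    offsets[i]:offsets[i + 1] is the slice of `targets` that holds
    the neighbors of the node with internal index i.
    """
    index = {node_id: i for i, node_id in enumerate(node_ids)}

    # Count outgoing relationships per node.
    degree = [0] * len(node_ids)
    for source, _ in relationships:
        degree[index[source]] += 1

    # A prefix sum turns the degrees into slice boundaries.
    offsets = [0] * (len(node_ids) + 1)
    for i, d in enumerate(degree):
        offsets[i + 1] = offsets[i] + d

    # Fill each node's slice with the internal indexes of its targets.
    targets = [0] * len(relationships)
    cursor = offsets[:-1].copy()
    for source, target in relationships:
        s = index[source]
        targets[cursor[s]] = index[target]
        cursor[s] += 1
    return index, offsets, targets


def neighbors(offsets, targets, i):
    """Neighbors of internal node i as a contiguous slice."""
    return targets[offsets[i]:offsets[i + 1]]


# Hypothetical records as they might come out of a transactional store.
node_ids = ["a", "b", "c"]
relationships = [("a", "b"), ("a", "c"), ("c", "a")]
index, offsets, targets = project_to_csr(node_ids, relationships)
print(neighbors(offsets, targets, index["a"]))
```

Contiguous integer arrays are a common choice for global traversals because every neighbor lookup is a slice rather than a chain of pointer hops.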
16. More, Better, Faster Algorithms
Pathfinding & Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• A* Shortest Path
• Yen’s K Shortest Path
• Minimum Weight Spanning Tree
• K-Spanning Tree (MST)
• Random Walk
• Breadth & Depth First Search
Centrality & Importance
• Degree Centrality
• Closeness Centrality
• Harmonic Centrality
• Betweenness Centrality & Approx.
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
• Hyperlink Induced Topic Search (HITS)
• Influence Maximization (Greedy, CELF)
Community Detection
• Triangle Count
• Local Clustering Coefficient
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• K-1 Coloring
• Modularity Optimization
• Speaker Listener Label Propagation
Supervised Machine Learning
• Node Classification
• Link Prediction
Heuristic Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
Similarity
• Node Similarity
• K-Nearest Neighbors (KNN)
• Jaccard Similarity
• Cosine Similarity
• Pearson Similarity
• Euclidean Distance
• Approximate Nearest Neighbors (ANN)
Graph Embeddings
• Node2Vec
• FastRP
• FastRPExtended
• GraphSAGE
… and more!
• Synthetic Graph Generation
• Scale Properties
• Collapse Paths
• One Hot Encoding
• Split Relationships
• Graph Export
• Pregel API (write your own algos)
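Several of the heuristic link prediction scores listed above have short textbook definitions, sketched here in plain Python on adjacency sets. These are the standard formulas, not the library's code, and the sample graph is hypothetical. Each function scores how likely it is that an unconnected pair (u, v) should be linked.

```python
import math


def common_neighbors(adj, u, v):
    """Number of neighbors that u and v share."""
    return len(adj[u] & adj[v])


def total_neighbors(adj, u, v):
    """Number of distinct neighbors of u and v combined."""
    return len(adj[u] | adj[v])


def preferential_attachment(adj, u, v):
    """Product of the two degrees: well-connected nodes attract links."""
    return len(adj[u]) * len(adj[v])


def adamic_adar(adj, u, v):
    """Shared neighbors, weighted down by the log of their degree.
    A shared neighbor has degree >= 2, so the logarithm is positive."""
    return sum(1.0 / math.log(len(adj[w])) for w in adj[u] & adj[v])


def resource_allocation(adj, u, v):
    """Shared neighbors, weighted down by their degree."""
    return sum(1.0 / len(adj[w]) for w in adj[u] & adj[v])


# Hypothetical undirected graph: a and b are not linked but share c and d.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b", "e"},
    "d": {"a", "b"},
    "e": {"c"},
}
for score in (common_neighbors, total_neighbors, preferential_attachment,
              adamic_adar, resource_allocation):
    print(score.__name__, score(adj, "a", "b"))
```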
17. Graph Algorithms for Drug Discovery
Identify drug mechanisms and new targets based on network structure
• PageRank to identify essential regulatory genes or drug targets
• Shortest path to link drug targets to possible outcomes or side effects
• Node Similarity to find structurally similar chemicals
18. PageRank
What: Finds important nodes based on their relationships
Why: Recommendations, identifying influencers
Features:
- Tolerance
- Damping
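The roles of damping and tolerance are easiest to see in a minimal power-iteration sketch. This is the textbook variant in which scores sum to 1; the library's own implementation and score scaling may differ, and the function signature, the `max_iterations` parameter, and the sample graph are assumptions made for illustration.

```python
def pagerank(out_links, damping=0.85, tolerance=1e-7, max_iterations=200):
    """Textbook PageRank by power iteration.

    out_links maps every node to the list of nodes it points to;
    every target must also appear as a key.
    damping:   probability of following a link instead of jumping randomly.
    tolerance: stop once the total change between iterations drops below it.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        for node in nodes:
            targets = out_links[node]
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < tolerance:
            break
    return rank


# Hypothetical graph: a and b both point to c, c points back to a.
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for node, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(node, round(value, 4))
```

A lower damping value gives more weight to random jumps and flattens the scores; a looser tolerance stops the iteration earlier at the cost of precision.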
19. Get started with Graph Data Science in Neo4j