SlideShare ist ein Scribd-Unternehmen logo
1 von 41
Downloaden Sie, um offline zu lesen
© Copyright 2019 Pivotal Software, Inc. All rights Reserved.
Graph Analytics
with Greenplum and Apache MADlib
Pivotal Korea
HongDon Lee, Sr. Data Scientist
30th January 2019
Agenda
1. Why Graph Analytics?
2. What is Graph Analytics?
3. Graph Analytics w/ MADlib
Graph
Analytics
Why
Where
to use
What
How
Everything is connected!
“Nothing ever exists
entirely alone.
Everything is
in relation to
everything else
(緣起)”
“Learn how to see:
Everything is
connected to
everything else”
“In nature we never
see anything isolated,
but everything in
connection with
something else which
is before it, beside it,
under it and over it”
Buddha Leonardo
da vinci
Goethe
3
What a Small World!
“6 Degrees of Separation”
1973, Stanley Milgram, Small-world experiment
1 2 3 4 5 6
4
From Reductionism to Holism
Reductionism Holism
“Divide and Conquer”
vs.
“Everything has to be understood
in relation to the whole”
5
From Individual to Relation
Time
Features
2019.01.01
2019.01.02
2019.01.03
2019.01.04
2019.01.05
2019.01.06
2019.01.30
...
Cross-sectional
Perspective
Longitudinal
Perspective
At the individual levelDemographics
Behaviors
Preferences
Economic Status
Education Background
...
W
ho
are
you?
6
From Individual to Relation
Time
Features
Demographics
Behaviors
Preferences
Economic Status
Education Background
2019.01.01
...
2019.01.02
2019.01.03
2019.01.04
2019.01.05
2019.01.06
2019.01.30
...
Cross-sectional
Perspective
Longitudinal
Perspective
Relation/ Connection
Family
Friends
Colleagues
Community
...
“Tell me who your friends are
and I’ll tell you who you are”
- Mexican Proverb -
7
Graph Analytics, one of the Data Scientist’s knifes
Graph Analytics
t-Test, ANOVA
CNN, RNN, GAN
Random Forest,
XGBoost
Bayesian
Statistics
Regression,
Logistic Regression
PCA, factor
analysis
Clustering
Text Analysis, NLP
Depends on
business
problem and
data
8
Network: Everywhere with Everything, All the time
MMO Role-Playing Game
* www.researchgate.net
Chemistry
* https://www.nature.com/articles/
Social Network Epidemiology
* http://www.netminer.com/community* Grandjean, M. (2016)
Bank Risk
* https://cambridge-intelligence.com
1st
Party Fraud Manufacturing
* www.infoglide.com * https://blog.trifinance.com* www.researchgate.net
Gene
9
Use Cases - PageRank
● Measures the importance of a vertex in a graph by counting the number and
quality of the links to that vertex
❏ Web Search
❏ Scientific impact of researchers
❏ Neuroscience
❏ Street and space usage
* Image from https://en.wikipedia.org/wiki/PageRank
10
Use Cases - Single Source Shortest Path
● Find a path to every vertex so that the sum of the weights of its constituent
edges is minimized
❏ Vehicle routing/ navigation
❏ Degrees of separation in a social network
❏ Mid-delay path in a telecommunications network
❏ Plant and facility layout
❏ VLSI (Very-Large-Scale Integration)
design
11
Use Cases - Cyber-security by Graph model
● Using historical window events data to build
historical graphs* of typical user behavior
• Which machines does the user log in to?
• Which machines does the user log in from?
• How often?
• In which order?
● Is this behavior typical?
• Is it typical for this user?
• Is it typical for someone in a particular department?
• Is this typical for someone in the user’s job role?
● Graph models are sensitive to direction, order,
and frequency.
34.23.123.4
Typical Behavior
Anomalous Behavior
DB with financial
information
34.23.123.51
34.23.1.1
34.23.0.1
34.23.2.8
34.23.123.4
34.23.1.1
34.23.0.1
34.23.2.8
34.23.123.51
*Reference: Alexander D. Kenta, Lorie M. Liebrock, Joshua C. Neila. Authentication graphs: Analyzing user behavior within an enterprise network.
vs.
12
Use Cases - Connected Component
● Calculate the Jaccard Dissimilarity
Scores for each pair of materials
● If material X and Y are potential
duplicates and material Y and Z are
potential duplicates then X, Y, Z is a
connected component in the graph of
all materials and form a cluster
⇒ Connected component analysis
resulted in 10% of materials identified
as potential duplicates based on their
bill of material attributes
Z
X Y
Z
Features for each material
• part type
• material type & group
• product line & family
• revision key
• weld, material & coating
specs
• quality matrix
• unit of measurement
• Weight
:
13
Agenda
1. Why Graph Analytics?
2. What is Graph Analytics?
3. Graph Analytics w/ MADlib
Graph
Analytics
Why
Where
to use
What
How
The Origin of Graph Theory
● Seven bridges of Konigsberg problem
● Leonhard Euler, a mathematician, proved that the problem has no solution
“The problem was to devise a walk through the city that would
cross each of those bridges once and only once.”
Euler, 1753
15
[ Terminology of Graph theory ]
What is Graph Theory?
● Graph theory is the study of graphs, which are mathematical structures
used to model pairwise relations between objects.
0 1
2
4
3
5
6
7
1
2
10
1
10
1
1
3
1
-2
1
1
vertice
edge weight ● Vertex
● Node
● Point
● Actor
V
● Edge
● Link
● Arc
● Line
Directed
Undirected
[ Directed Network Graph with Weight (example) ]
10
Weight
E
16
Graph Algorithms and Measures
Group
Structure
Centrality
Types Question Feasures
Path
“What are the sub-graphs,
component, communities?”
“What is the character of the
network structure?”
“What is the most important
vertices within a graph”
“What is the shortest
path(distance) among vertices”
weakly-connected component
Density, Diameter,
Average path length,
Modularity
Degree (in/out, weight),
Closeness,
PageRank, Hub, Authority,
Betweenness,
Clustering coefficient
Single source shortest path,
All pairs shortest path,
Breadth-First Search
Graph-based
Features
1
2
3
4
17
Graph Algorithms and Measures - (1) Group
Group
Structure
Centrality
Path
Weakly-Connected Component
● A Connected Component (or just Component) of an undirected graph is
subgraph in which any two vertices are connected to each other by paths,
and which is connected to no additional vertices in the supergraph
* source: Wikipedia
[ A supergraph with three connected components ]
Component 1
Component 2
Component 3
Supergraph
18
D =
|E|
|V| (|V| - 1) / 2
Graph Algorithms and Measures - (2) Structure
Group
Structure
Centrality
Path
Density
● A dense graph is a graph in which the number of edges is close to the
maximal number of edges. The opposite, a graph with only a few edges, is a
sparse graph. The distinction between sparse and dense graphs is rather vague,
and depends on the context.
* source: Wikipedia
❏ For Undirected simple graphs
❏ For Directed simple graphs
D =
|E|
|V| (|V| - 1)
E : the number of Edges, V : the number of Vertices
[ Density by components (example) ]
D=
|6|
|4|(|4|-1)/2
D =
|3|
|4|(|4|-1)/2
=1 =0.5
19
Graph Algorithms and Measures - (3) Path
Group
Structure
Centrality
Path
Single Source Shortest Path (SSSP)
● Given a graph and a source vertex, the Single Source Shortest Path (SSSP)
algorithm finds a path from the source vertex to every other vertex in the
graph, such that the sum of the weights of the path is minimized.
0 1
2
4
3
5
6 7
1
2
10
1
10
1
1
3
1
-2
1
1
[ Shortest paths from vertex ‘0’ (example) ]
ID weight parent
0 0 0
1 1 0
2 1 0
3 2 (= 1+1) 2
4 10 0
5 2 2
6 3 5
7 4 6
* weight : The total weight of the shortest path from the source vertex to this particular vertex.
* parent : The parent of this vertex in the shortest path from source.
23
0
20
Graph Algorithms and Measures - (4) Centrality
Group
Structure
Centrality
Path
In-Degree, Out-Degree
● The node in-degree is the number of edges pointing in to the node
● The node out-degree is the number of edges pointing out of the node
0 1
2
4
3
5
6 7
1
2
10
1
10
1
1
3
1
-2
1
1
[ The in-out degree for each node (example) ]
ID In-degree Out-degree
0 2 3
1 1 2
2 2 3
3 2 1
4 1 1
5 1 1
6 2 1
7 1 0
21
Graph Algorithms and Measures - (4) Centrality
Group
Structure
Centrality
Path
PageRank (1 / 2)
The size of each face is proportional to the total
size of the other faces which are pointing to it.
- PR(A): PageRank of node A
- N: the total number of Nodes
- L(B): the number of Links from node B
- d: damping factor (probability, at any step, that a surfer will
continue randomly clicking on links)
● PageRank works by counting the number and quality of links to a page to
determine a rough estimate of how important the website is. The underlying
assumption is that more important websites are likely to receive more links
from other websites
* source: Wikipedia
22
Graph Algorithms and Measures - (4) Centrality
Group
Structure
Centrality
Path
PageRank (2 / 2)
A
B
C
D
0.25
0.25
0.25
0.25
1st round 2nd round
PR(A) = (1-0.85)/4 +
0.85*(0.25/2 + 0.25/1 +
0.25/3) = 0.427
PR(B) = (1-0.85)/4 +
0.85*(0.25/3) = 0.108
PR(C) = (1-0.85)/4 +
0.85*(0.25/2 + 0.25/3) =
0.214
PR(D) = (1-0.85)/4 +
0.85*0 = 0.037
PR(A) = 0.25
PR(B) =0.25
PR(C) = 0.25
PR(D) = 0.25
Final round
A
B
C
D
0.108
0.214
0.037
0.427
...
Recursive calculation → converged
A
B
C
D
0.048
0.069
0.038
0.127
* PR(A): PageRank of node A, N: the total number of Nodes, L(B): the number of Links from node B, d: damping factor (typically 0.85) 23
Graph Algorithms and Measures - (4) Centrality
Group
Structure
Centrality
Path
Closeness
● Closeness of a node is a measure of centrality in a network, calculated as the
sum of the length of the shortest paths between the node and all other
nodes in the graph (ie, based on All Pairs Shortest Path)
0 1
2
4
3
5
6 7
1
2
10
1
10
1
1
3
1
-2
1
1
[ All Pairs Shortest Path ]
source destination weight
0 0 0
0 1 1
0 2 1
0 3 2
0 4 10
0 5 2
0 6 3
0 7 4
1 0 4
1 1 0
1 2 2
1 3 3
1 4 14
1 5 3
1 6 4
1 7 5
2 0 2
...
[ Closeness Centrality ]
src_id closeness
0 0.043
1 0.028
2 0.041
3 0.035
* N: The number of nodes, * d(y, x) : The distance between y and x node 24
Big Issue of Graph Algorithms - High Complexity
* image source: https://www.xkcd.com/399/
25
Big Issue of Graph Algorithms - High Complexity
Type Algorithms/ Measures Time Complexity
Group Weakly-Connected Component O(|V| + |E|)
Structure
Density O(|V|+|E|log|E|)
Diameter O(|V|3
)
Path
All Pairs Shortest Path O(|V|3
)
Single Source Shortest Path O(|V|2
)
Breadth-First Search O(E + V)
Centrality
In-Degree, Out-Degree O(|V| + |E|)
Closeness Centrality O(|V|3
)
PageRank O(log(network size)/(1-damping factor))
Betweenness Centrality O(|V|2
log|V|+|V||E|)
* |V|: the number of Vertices in graph
* |E|: the number of Edges in graph
● Computationally
Intensive
- exponential to the
number of vertices
and edges
Graph Analysis
at Scale,
parallel processing
with MADlib
on Greenplum
26
Agenda
1. Why Graph Analytics?
2. What is Graph Analytics?
3. Graph Analytics w/ MADlib
Why
Where
to use
What
How
Graph
Analytics
Tools for Graph Analytics
● Graph Analytics at Scale with Open Source MADlib on Greenplum
Commercial
Open Source
Small Data Big Data/ Parallel Processing
Data Size & Processing
WhetherOSSornot
sna,
igraph,
ergm,
network
NetworkX,
graph-tool,
SNAP,
pygraphviz
...
Interactive visualization focused Graph DB
28
Analytics Platform, GPDB
Graph Analytics at Scale
● Designed for very large graphs
(billions of vertices/edges)
● No need to move data and transform
for external graph engine
- One analytics database to deploy and
manage
● Familiar SQL interface
● Combine context-based graph
analytics with other content-based
techniques
❏ Advanced Analytics In Database
Extended Language
GPText
❏ Scale Out
❏ MPP (Massively Parallel Processing) Architecture
REGRESSIONCLASSIFICATIONCLUSTERING GEOSPATIAL GRAPHTEXT IMAGE
Graph Analytics with MADlib on Greenplum
29
: Scaleable, In-Database Machine Learning
● Open source https://github.com/apache/madlib
● Downloads and docs http://madlib.apache.org/
● Wiki https://cwiki.apache.org/confluence/display/MADLIB/
Apache MADlib: Big Data Machine Learning in SQL
Open source,
top level
Apache
project
For
PostgreSQL
and Greenplum
Database
Powerful machine
learning, graph,
statistics and
analytics for data
scientists
30
: Functions
Data Types and Transformations
Array and Matrix Operations
Matrix Factorization
• Low Rank
• Singular Value Decomposition (SVD)
Norms and Distance Functions
Sparse Vectors
Encoding Categorical Variables
Path Functions
Pivot
Sessionize
Stemming
April 2018
Graph
All Pairs Shortest Path (APSP)
Breadth-First Search
Hyperlink-Induced Topic Search (HITS)
Average Path Length
Closeness Centrality
Graph Diameter
In-Out Degree
PageRank and Personalized PageRank
Single Source Shortest Path (SSSP)
Weakly Connected Components
Model Selection
Cross Validation
Prediction Metrics
Train-Test Split
Statistics
Descriptive Statistics
• Cardinality Estimators
• Correlation and Covariance
• Summary
Inferential Statistics
• Hypothesis Tests
Probability Functions
Supervised Learning
Neural Networks
Support Vector Machines (SVM)
Conditional Random Field (CRF)
Regression Models
• Clustered Variance
• Cox-Proportional Hazards Regression
• Elastic Net Regularization
• Generalized Linear Models
• Linear Regression
• Logistic Regression
• Marginal Effects
• Multinomial Regression
• Naïve Bayes
• Ordinal Regression
• Robust Variance
Tree Methods
• Decision Tree
• Random Forest
Time Series Analysis
• ARIMA
Unsupervised Learning
Association Rules (Apriori)
Clustering (k-Means)
Principal Component Analysis (PCA)
Topic Modelling (Latent Dirichlet Allocation)
Utility Functions
Columns to Vector
Conjugate Gradient
Linear Solvers
• Dense Linear Systems
• Sparse Linear Systems
Mini-Batching
PMML Export
Term Frequency for Text
Vector to Columns
Nearest Neighbors
• k-Nearest Neighbors
Sampling
Balanced/ Random/ Stratified Sampling
31
: Graph Representation in MADlib
Source
Vertex
Dest
Vertex
Edge
Weight
Edge
Params
0 3 1.0 ...
1 0 5.0 ...
1 2 3.0 ...
2 3 8.0 ...
3 0 3.0 ...
3 1 2.0 ...
Vertex Vertex
Params
0 ...
1 ...
2 ...
3 ...
. . . . . .
Vertex Table Edge Table
...
...
0
1
2
3
5
3
8
2
1
3
[ Directed Graph (example) ]
V
32
example : PageRank in MADlib
● Create vertex and edge tables to represent the graph
* http://madlib.apache.org/docs/latest/group__grp__pagerank.html
DROP TABLE IF EXISTS vertex;
CREATE TABLE vertex(
id INTEGER
);
INSERT INTO vertex VALUES
(0),
(1),
(2),
(3),
(4),
(5),
(6);
DROP TABLE IF EXISTS edge;
CREATE TABLE edge(
src INTEGER,
dest INTEGER,
user_id INTEGER
)
DISTRIBUTED BY (user_id);
INSERT INTO edge VALUES
(0, 1, 1), (0, 2, 1), -- user id 1
(0, 4, 1), (1, 2, 1),
(1, 3, 1), (2, 3, 1),
(2, 5, 1), (2, 6, 1),
(3, 0, 1), (4, 0, 1),
(5, 6, 1), (6, 3, 1),
(0, 1, 2), (0, 2, 2), -- user id 2
(0, 4, 2), (1, 2, 2),
(1, 3, 2), (2, 3, 2),
(3, 0, 2), (4, 0, 2),
(5, 6, 2), (6, 3, 2);
33
example : PageRank in MADlib
● Compute the PageRank with All IDs
DROP TABLE IF EXISTS pagerank_out, pagerank_out_summary;
SELECT madlib.pagerank(
'vertex' -- Vertex table
, 'id' -- Vertex id column
, 'edge' -- Edge table
, 'src=src, dest=dest' -- Comma delimited string of edge arguments
, 'pagerank_out' -- Output table of RageRank
, NULL); -- Damping factor (default 0.85)
SELECT * FROM pagerank_out
ORDER BY pagerank DESC;
34
example : PageRank in MADlib
● Network Diagram by Graphviz and PyGraphviz
* PyGraphviz is a Python interface to the Graphviz graph layout and visualization package
A vertex with a high PageRank is usually
considered more "important" or more
"influential" or more "relevant" than a
vertex with a low PageRank.
Size of node is proportional
to PageRank value
User ID 1
User ID 2
ID
(PageRank)
35
example : PageRank in MADlib
● PageRank of vertices associated with each user by the grouping feature
DROP TABLE IF EXISTS pagerank_gr_out, pagerank_gr_out_summary;
SELECT madlib.pagerank(
'vertex' -- Vertex table
, 'id' -- Vertex id column
, 'edge' -- Edge table
, 'src=src, dest=dest' -- Comma delimited string of edge arguments
, 'pagerank_gr_out' -- Output table of PageRank
, NULL -- Default damping factor (0.85)
, NULL -- Default max iterations (100)
, 0.00000001 -- Threshold
, 'user_id' -- Grouping column name
);
SELECT * FROM pagerank_gr_out
ORDER BY user_id, pagerank DESC;
PageRank
of user id 1
PageRank
of user id 2
36
example : PageRank in MADlib
● Personalized PageRank of vertices {2, 4}, for Recommendations
DROP TABLE IF EXISTS pagerank_pers_out,
pagerank_pers_out_summary;
SELECT madlib.pagerank(
'vertex' -- Vertex table
, 'id' -- Vertex id column
, 'edge' -- Edge table
, 'src=src, dest=dest' -- Comma delimited string of edge arguments
, 'pagerank_pers_out' -- Output table of PageRank
, NULL -- Default damping factor (0.85)
, NULL -- Default max iterations (100)
, NULL -- Default Threshold (1/number of vertices*1000)
, NULL -- No Grouping
, '{2, 4}' -- Personalization vertices
);
SELECT * FROM pagerank_pers_out
ORDER BY pagerank DESC;
SELECT * FROM
pagerank_pers_out_summary;
* Personalized PageRank = (1-p)*Ax + p*E , where ‘E’ is the list of vertices for personalized PageRank, ‘p’ is the damping factor 37
example : PageRank in MADlib
Greenplum cluster:
● 1 master
● 4 segment hosts with 6
segments per host
Normal random graphs with
mean degrees 50 edges per vertex
(i.e., 5B edges in the largest case)
5B edges
(1K) (10K) (100K) (1M) (10M) (100M)
* Note: log-log scale
(100s)
(1s)
(10K s)
(1M s)
PageRank Performance on Greenplum w/ MADlib
38
In Summary
● Capture the Relationship in Networks using Graph Analytics
→ Community, Structure, Path, Centrality
→ Combine context-based graph analytics with other content-based insights
● Graph analytics at SCALE with Open Source Software
→ Apache MADlib on Greenplum, massively parallel processing
39
One more thing...
GREENPLUM SUMMIT at PostgresConf 2019
by Pivotal
40
Transforming How The World Builds Software
© Copyright 2019 Pivotal Software, Inc. All rights Reserved.

Weitere ähnliche Inhalte

Ähnlich wie Graph Analytics with Greenplum and Apache MADlib

Using Networks to Measure Influence and Impact
Using Networks to Measure Influence and ImpactUsing Networks to Measure Influence and Impact
Using Networks to Measure Influence and ImpactYunhao Zhang
 
CS583-unsupervised-learning.ppt
CS583-unsupervised-learning.pptCS583-unsupervised-learning.ppt
CS583-unsupervised-learning.pptHathiramN1
 
CS583-unsupervised-learning.ppt learning
CS583-unsupervised-learning.ppt learningCS583-unsupervised-learning.ppt learning
CS583-unsupervised-learning.ppt learningssuserb02eff
 
Improve ML Predictions using Graph Analytics (today!)
Improve ML Predictions using Graph Analytics (today!)Improve ML Predictions using Graph Analytics (today!)
Improve ML Predictions using Graph Analytics (today!)Neo4j
 
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace DataMPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace DataIRJET Journal
 
Attentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene GraphsAttentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene GraphsSangmin Woo
 
mini project_shortest path visualizer.pptx
mini project_shortest path visualizer.pptxmini project_shortest path visualizer.pptx
mini project_shortest path visualizer.pptxtusharpawar803067
 
Modeling the Impact of R & Python Packages: Dependency and Contributor Networks
Modeling the Impact of R & Python Packages: Dependency and Contributor NetworksModeling the Impact of R & Python Packages: Dependency and Contributor Networks
Modeling the Impact of R & Python Packages: Dependency and Contributor NetworksMelissa Moody
 
A Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphsA Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphsaftab alam
 
Spanning Tree in data structure and .pptx
Spanning Tree in data structure and .pptxSpanning Tree in data structure and .pptx
Spanning Tree in data structure and .pptxasimshahzad8611
 
IRJET- Survey on Implementation of Graph Theory in Routing Protocols of Wired...
IRJET- Survey on Implementation of Graph Theory in Routing Protocols of Wired...IRJET- Survey on Implementation of Graph Theory in Routing Protocols of Wired...
IRJET- Survey on Implementation of Graph Theory in Routing Protocols of Wired...IRJET Journal
 
Unsupervised Learning.pptx
Unsupervised Learning.pptxUnsupervised Learning.pptx
Unsupervised Learning.pptxGandhiMathy6
 
cs224w-79-final
cs224w-79-finalcs224w-79-final
cs224w-79-finalDarren Koh
 
Conceptual Fixture Design Method Based On Petri Net
Conceptual Fixture Design Method Based On Petri NetConceptual Fixture Design Method Based On Petri Net
Conceptual Fixture Design Method Based On Petri NetIJRES Journal
 
Neo4j workshop at GraphSummit London 14 Nov 2023.pdf
Neo4j workshop at GraphSummit London 14 Nov 2023.pdfNeo4j workshop at GraphSummit London 14 Nov 2023.pdf
Neo4j workshop at GraphSummit London 14 Nov 2023.pdfNeo4j
 
Mat189: Cluster Analysis with NBA Sports Data
Mat189: Cluster Analysis with NBA Sports DataMat189: Cluster Analysis with NBA Sports Data
Mat189: Cluster Analysis with NBA Sports DataKathleneNgo
 
log6kntt4i4dgwfwbpxw-signature-75c4ed0a4b22d2fef90396cdcdae85b38911f9dce0924a...
log6kntt4i4dgwfwbpxw-signature-75c4ed0a4b22d2fef90396cdcdae85b38911f9dce0924a...log6kntt4i4dgwfwbpxw-signature-75c4ed0a4b22d2fef90396cdcdae85b38911f9dce0924a...
log6kntt4i4dgwfwbpxw-signature-75c4ed0a4b22d2fef90396cdcdae85b38911f9dce0924a...ABINASHPADHY6
 

Ähnlich wie Graph Analytics with Greenplum and Apache MADlib (20)

Using Networks to Measure Influence and Impact
Using Networks to Measure Influence and ImpactUsing Networks to Measure Influence and Impact
Using Networks to Measure Influence and Impact
 
CS583-unsupervised-learning.ppt
CS583-unsupervised-learning.pptCS583-unsupervised-learning.ppt
CS583-unsupervised-learning.ppt
 
CS583-unsupervised-learning.ppt learning
CS583-unsupervised-learning.ppt learningCS583-unsupervised-learning.ppt learning
CS583-unsupervised-learning.ppt learning
 
Improve ML Predictions using Graph Analytics (today!)
Improve ML Predictions using Graph Analytics (today!)Improve ML Predictions using Graph Analytics (today!)
Improve ML Predictions using Graph Analytics (today!)
 
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace DataMPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
 
Attentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene GraphsAttentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene Graphs
 
mini project_shortest path visualizer.pptx
mini project_shortest path visualizer.pptxmini project_shortest path visualizer.pptx
mini project_shortest path visualizer.pptx
 
Modeling the Impact of R & Python Packages: Dependency and Contributor Networks
Modeling the Impact of R & Python Packages: Dependency and Contributor NetworksModeling the Impact of R & Python Packages: Dependency and Contributor Networks
Modeling the Impact of R & Python Packages: Dependency and Contributor Networks
 
A Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphsA Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphs
 
Spanning Tree in data structure and .pptx
Spanning Tree in data structure and .pptxSpanning Tree in data structure and .pptx
Spanning Tree in data structure and .pptx
 
HalifaxNGGs
HalifaxNGGsHalifaxNGGs
HalifaxNGGs
 
IRJET- Survey on Implementation of Graph Theory in Routing Protocols of Wired...
IRJET- Survey on Implementation of Graph Theory in Routing Protocols of Wired...IRJET- Survey on Implementation of Graph Theory in Routing Protocols of Wired...
IRJET- Survey on Implementation of Graph Theory in Routing Protocols of Wired...
 
Unsupervised Learning.pptx
Unsupervised Learning.pptxUnsupervised Learning.pptx
Unsupervised Learning.pptx
 
Hierarchical Clustering
Hierarchical ClusteringHierarchical Clustering
Hierarchical Clustering
 
Keynote at AImWD
Keynote at AImWDKeynote at AImWD
Keynote at AImWD
 
cs224w-79-final
cs224w-79-finalcs224w-79-final
cs224w-79-final
 
Conceptual Fixture Design Method Based On Petri Net
Conceptual Fixture Design Method Based On Petri NetConceptual Fixture Design Method Based On Petri Net
Conceptual Fixture Design Method Based On Petri Net
 
Neo4j workshop at GraphSummit London 14 Nov 2023.pdf
Neo4j workshop at GraphSummit London 14 Nov 2023.pdfNeo4j workshop at GraphSummit London 14 Nov 2023.pdf
Neo4j workshop at GraphSummit London 14 Nov 2023.pdf
 
Mat189: Cluster Analysis with NBA Sports Data
Mat189: Cluster Analysis with NBA Sports DataMat189: Cluster Analysis with NBA Sports Data
Mat189: Cluster Analysis with NBA Sports Data
 
log6kntt4i4dgwfwbpxw-signature-75c4ed0a4b22d2fef90396cdcdae85b38911f9dce0924a...
log6kntt4i4dgwfwbpxw-signature-75c4ed0a4b22d2fef90396cdcdae85b38911f9dce0924a...log6kntt4i4dgwfwbpxw-signature-75c4ed0a4b22d2fef90396cdcdae85b38911f9dce0924a...
log6kntt4i4dgwfwbpxw-signature-75c4ed0a4b22d2fef90396cdcdae85b38911f9dce0924a...
 

Mehr von VMware Tanzu

What AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItWhat AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItVMware Tanzu
 
Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023VMware Tanzu
 
Enhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at ScaleEnhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at ScaleVMware Tanzu
 
Spring Update | July 2023
Spring Update | July 2023Spring Update | July 2023
Spring Update | July 2023VMware Tanzu
 
Platforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a ProductPlatforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a ProductVMware Tanzu
 
Building Cloud Ready Apps
Building Cloud Ready AppsBuilding Cloud Ready Apps
Building Cloud Ready AppsVMware Tanzu
 
Spring Boot 3 And Beyond
Spring Boot 3 And BeyondSpring Boot 3 And Beyond
Spring Boot 3 And BeyondVMware Tanzu
 
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdfSpring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdfVMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023VMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023VMware Tanzu
 
tanzu_developer_connect.pptx
tanzu_developer_connect.pptxtanzu_developer_connect.pptx
tanzu_developer_connect.pptxVMware Tanzu
 
Tanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - FrenchTanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - FrenchVMware Tanzu
 
Tanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - EnglishTanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - EnglishVMware Tanzu
 
Virtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - EnglishVirtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - EnglishVMware Tanzu
 
Tanzu Developer Connect - French
Tanzu Developer Connect - FrenchTanzu Developer Connect - French
Tanzu Developer Connect - FrenchVMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023VMware Tanzu
 
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring BootSpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring BootVMware Tanzu
 
SpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software EngineerSpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software EngineerVMware Tanzu
 
SpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs PracticeSpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs PracticeVMware Tanzu
 
SpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense SolutionsSpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense SolutionsVMware Tanzu
 

Mehr von VMware Tanzu (20)

What AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItWhat AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About It
 
Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023
 
Enhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at ScaleEnhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at Scale
 
Spring Update | July 2023
Spring Update | July 2023Spring Update | July 2023
Spring Update | July 2023
 
Platforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a ProductPlatforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a Product
 
Building Cloud Ready Apps
Building Cloud Ready AppsBuilding Cloud Ready Apps
Building Cloud Ready Apps
 
Spring Boot 3 And Beyond
Spring Boot 3 And BeyondSpring Boot 3 And Beyond
Spring Boot 3 And Beyond
 
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdfSpring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
 
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
 
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
 
tanzu_developer_connect.pptx
tanzu_developer_connect.pptxtanzu_developer_connect.pptx
tanzu_developer_connect.pptx
 
Tanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - FrenchTanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - French
 
Tanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - EnglishTanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - English
 
Virtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - EnglishVirtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - English
 
Tanzu Developer Connect - French
Tanzu Developer Connect - FrenchTanzu Developer Connect - French
Tanzu Developer Connect - French
 
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
 
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring BootSpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
 
SpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software EngineerSpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software Engineer
 
SpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs PracticeSpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs Practice
 
SpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense SolutionsSpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
 

Kürzlich hochgeladen

WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in sowetomasabamasaba
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...masabamasaba
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...masabamasaba
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...Jittipong Loespradit
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2
 
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benonimasabamasaba
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...masabamasaba
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfonteinmasabamasaba
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...SelfMade bd
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationJuha-Pekka Tolvanen
 

Kürzlich hochgeladen (20)

WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaS
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 

Graph Analytics with Greenplum and Apache MADlib

  • 1. © Copyright 2019 Pivotal Software, Inc. All rights Reserved. Graph Analytics with Greenplum and Apache MADlib Pivotal Korea HongDon Lee, Sr. Data Scientist 30th January 2019
  • 2. Agenda 1. Why Graph Analytics? 2. What is Graph Analytics? 3. Graph Analytics w/ MADlib Graph Analytics Why Where to use What How
  • 3. Everything is connected! “Nothing ever exists entirely alone. Everything is in relation to everything else (緣起)” “Learn how to see: Everything is connected to everything else” “In nature we never see anything isolated, but everything in connection with something else which is before it, beside it, under it and over it” Buddha Leonardo da vinci Goethe 3
  • 4. What a Small World! “6 Degrees of Separation” 1973, Stanley Milgram, Small-world experiment 1 2 3 4 5 6 4
  • 5. From Reductionism to Holism Reductionism Holism “Divide and Conquer” vs. “Everything has to be understood in relation to the whole” 5
  • 6. From Individual to Relation Time Features 2019.01.01 2019.01.02 2019.01.03 2019.01.04 2019.01.05 2019.01.06 2019.01.30 ... Cross-sectional Perspective Longitudinal Perspective At the individual levelDemographics Behaviors Preferences Economic Status Education Background ... W ho are you? 6
  • 7. From Individual to Relation Time Features Demographics Behaviors Preferences Economic Status Education Background 2019.01.01 ... 2019.01.02 2019.01.03 2019.01.04 2019.01.05 2019.01.06 2019.01.30 ... Cross-sectional Perspective Longitudinal Perspective Relation/ Connection Family Friends Colleagues Community ... “Tell me who your friends are and I’ll tell you who you are” - Mexican Proverb - 7
  • 8. Graph Analytics, one of the Data Scientist’s knifes Graph Analytics t-Test, ANOVA CNN, RNN, GAN Random Forest, XGBoost Bayesian Statistics Regression, Logistic Regression PCA, factor analysis Clustering Text Analysis, NLP Depends on business problem and data 8
  • 9. Network: Everywhere with Everything, All the time MMO Role-Playing Game * www.researchgate.net Chemistry * https://www.nature.com/articles/ Social Network Epidemiology * http://www.netminer.com/community* Grandjean, M. (2016) Bank Risk * https://cambridge-intelligence.com 1st Party Fraud Manufacturing * www.infoglide.com * https://blog.trifinance.com* www.researchgate.net Gene 9
  • 10. Use Cases - PageRank ● Measures the importance of a vertex in a graph by counting the number and quality of the links to that vertex ❏ Web Search ❏ Scientific impact of researchers ❏ Neuroscience ❏ Street and space usage * Image from https://en.wikipedia.org/wiki/PageRank 10
  • 11. Use Cases - Single Source Shortest Path ● Find a path to every vertex so that the sum of the weights of its constituent edges is minimized ❏ Vehicle routing/ navigation ❏ Degrees of separation in a social network ❏ Mid-delay path in a telecommunications network ❏ Plant and facility layout ❏ VLSI (Very-Large-Scale Integration) design 11
  • 12. Use Cases - Cyber-security by Graph model ● Using historical window events data to build historical graphs* of typical user behavior • Which machines does the user log in to? • Which machines does the user log in from? • How often? • In which order? ● Is this behavior typical? • Is it typical for this user? • Is it typical for someone in a particular department? • Is this typical for someone in the user’s job role? ● Graph models are sensitive to direction, order, and frequency. 34.23.123.4 Typical Behavior Anomalous Behavior DB with financial information 34.23.123.51 34.23.1.1 34.23.0.1 34.23.2.8 34.23.123.4 34.23.1.1 34.23.0.1 34.23.2.8 34.23.123.51 *Reference: Alexander D. Kenta, Lorie M. Liebrock, Joshua C. Neila. Authentication graphs: Analyzing user behavior within an enterprise network. vs. 12
  • 13. Use Cases - Connected Component ● Calculate the Jaccard Dissimilarity Scores for each pair of materials ● If material X and Y are potential duplicates and material Y and Z are potential duplicates then X, Y, Z is a connected component in the graph of all materials and form a cluster ⇒ Connected component analysis resulted in 10% of materials identified as potential duplicates based on their bill of material attributes Z X Y Z Features for each material • part type • material type & group • product line & family • revision key • weld, material & coating specs • quality matrix • unit of measurement • Weight : 13
  • 14. Agenda 1. Why Graph Analytics? 2. What is Graph Analytics? 3. Graph Analytics w/ MADlib Graph Analytics Why Where to use What How
  • 15. The Origin of Graph Theory ● Seven bridges of Konigsberg problem ● Leonhard Euler, a mathematician, proved that the problem has no solution “The problem was to devise a walk through the city that would cross each of those bridges once and only once.” Euler, 1753 15
  • 16. [ Terminology of Graph theory ] What is Graph Theory? ● Graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects. 0 1 2 4 3 5 6 7 1 2 10 1 10 1 1 3 1 -2 1 1 vertice edge weight ● Vertex ● Node ● Point ● Actor V ● Edge ● Link ● Arc ● Line Directed Undirected [ Directed Network Graph with Weight (example) ] 10 Weight E 16
  • 17. Graph Algorithms and Measures Group Structure Centrality Types Question Feasures Path “What are the sub-graphs, component, communities?” “What is the character of the network structure?” “What is the most important vertices within a graph” “What is the shortest path(distance) among vertices” weakly-connected component Density, Diameter, Average path length, Modularity Degree (in/out, weight), Closeness, PageRank, Hub, Authority, Betweenness, Clustering coefficient Single source shortest path, All pairs shortest path, Breadth-First Search Graph-based Features 1 2 3 4 17
  • 18. Graph Algorithms and Measures - (1) Group Group Structure Centrality Path Weakly-Connected Component ● A Connected Component (or just Component) of an undirected graph is subgraph in which any two vertices are connected to each other by paths, and which is connected to no additional vertices in the supergraph * source: Wikipedia [ A supergraph with three connected components ] Component 1 Component 2 Component 3 Supergraph 18
  • 19. D = |E| |V| (|V| - 1) / 2 Graph Algorithms and Measures - (2) Structure Group Structure Centrality Path Density ● A dense graph is a graph in which the number of edges is close to the maximal number of edges. The opposite, a graph with only a few edges, is a sparse graph. The distinction between sparse and dense graphs is rather vague, and depends on the context. * source: Wikipedia ❏ For Undirected simple graphs ❏ For Directed simple graphs D = |E| |V| (|V| - 1) E : the number of Edges, V : the number of Vertices [ Density by components (example) ] D= |6| |4|(|4|-1)/2 D = |3| |4|(|4|-1)/2 =1 =0.5 19
  • 20. Graph Algorithms and Measures - (3) Path Group Structure Centrality Path Single Source Shortest Path (SSSP) ● Given a graph and a source vertex, the Single Source Shortest Path (SSSP) algorithm finds a path from the source vertex to every other vertex in the graph, such that the sum of the weights of the path is minimized. 0 1 2 4 3 5 6 7 1 2 10 1 10 1 1 3 1 -2 1 1 [ Shortest paths from vertex ‘0’ (example) ] ID weight parent 0 0 0 1 1 0 2 1 0 3 2 (= 1+1) 2 4 10 0 5 2 2 6 3 5 7 4 6 * weight : The total weight of the shortest path from the source vertex to this particular vertex. * parent : The parent of this vertex in the shortest path from source. 23 0 20
  • 21. Graph Algorithms and Measures - (4) Centrality Group Structure Centrality Path In-Degree, Out-Degree ● The node in-degree is the number of edges pointing in to the node ● The node out-degree is the number of edges pointing out of the node 0 1 2 4 3 5 6 7 1 2 10 1 10 1 1 3 1 -2 1 1 [ The in-out degree for each node (example) ] ID In-degree Out-degree 0 2 3 1 1 2 2 2 3 3 2 1 4 1 1 5 1 1 6 2 1 7 1 0 21
  • 22. Graph Algorithms and Measures - (4) Centrality Group Structure Centrality Path PageRank (1 / 2) The size of each face is proportional to the total size of the other faces which are pointing to it. - PR(A): PageRank of node A - N: the total number of Nodes - L(B): the number of Links from node B - d: damping factor (probability, at any step, that a surfer will continue randomly clicking on links) ● PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites * source: Wikipedia 22
  • 23. Graph Algorithms and Measures - (4) Centrality Group Structure Centrality Path PageRank (2 / 2) A B C D 0.25 0.25 0.25 0.25 1st round 2nd round PR(A) = (1-0.85)/4 + 0.85*(0.25/2 + 0.25/1 + 0.25/3) = 0.427 PR(B) = (1-0.85)/4 + 0.85*(0.25/3) = 0.108 PR(C) = (1-0.85)/4 + 0.85*(0.25/2 + 0.25/3) = 0.214 PR(D) = (1-0.85)/4 + 0.85*0 = 0.037 PR(A) = 0.25 PR(B) =0.25 PR(C) = 0.25 PR(D) = 0.25 Final round A B C D 0.108 0.214 0.037 0.427 ... Recursive calculation → converged A B C D 0.048 0.069 0.038 0.127 * PR(A): PageRank of node A, N: the total number of Nodes, L(B): the number of Links from node B, d: damping factor (typically 0.85) 23
  • 24. Graph Algorithms and Measures - (4) Centrality Group Structure Centrality Path Closeness ● Closeness of a node is a measure of centrality in a network, calculated as the sum of the length of the shortest paths between the node and all other nodes in the graph (ie, based on All Pairs Shortest Path) 0 1 2 4 3 5 6 7 1 2 10 1 10 1 1 3 1 -2 1 1 [ All Pairs Shortest Path ] source destination weight 0 0 0 0 1 1 0 2 1 0 3 2 0 4 10 0 5 2 0 6 3 0 7 4 1 0 4 1 1 0 1 2 2 1 3 3 1 4 14 1 5 3 1 6 4 1 7 5 2 0 2 ... [ Closeness Centrality ] src_id closeness 0 0.043 1 0.028 2 0.041 3 0.035 * N: The number of nodes, * d(y, x) : The distance between y and x node 24
  • 25. Big Issue of Graph Algorithms - High Complexity * image source: https://www.xkcd.com/399/ 25
  • 26. Big Issue of Graph Algorithms - High Complexity Type Algorithms/ Measures Time Complexity Group Weakly-Connected Component O(|V| + |E|) Structure Density O(|V|+|E|log|E|) Diameter O(|V|3 ) Path All Pairs Shortest Path O(|V|3 ) Single Source Shortest Path O(|V|2 ) Breadth-First Search O(E + V) Centrality In-Degree, Out-Degree O(|V| + |E|) Closeness Centrality O(|V|3 ) PageRank O(log(network size)/(1-damping factor)) Betweenness Centrality O(|V|2 log|V|+|V||E|) * |V|: the number of Vertices in graph * |E|: the number of Edges in graph ● Computationally Intensive - exponential to the number of vertices and edges Graph Analysis at Scale, parallel processing with MADlib on Greenplum 26
  • 27. Agenda 1. Why Graph Analytics? 2. What is Graph Analytics? 3. Graph Analytics w/ MADlib Why Where to use What How Graph Analytics
  • 28. Tools for Graph Analytics ● Graph Analytics at Scale with Open Source MADlib on Greenplum Commercial Open Source Small Data Big Data/ Parallel Processing Data Size & Processing WhetherOSSornot sna, igraph, ergm, network NetworkX, graph-tool, SNAP, pygraphviz ... Interactive visualization focused Graph DB 28
  • 29. Analytics Platform, GPDB Graph Analytics at Scale ● Designed for very large graphs (billions of vertices/edges) ● No need to move data and transform for external graph engine - One analytics database to deploy and manage ● Familiar SQL interface ● Combine context-based graph analytics with other content-based techniques ❏ Advanced Analytics In Database Extended Language GPText ❏ Scale Out ❏ MPP (Massively Parallel Processing) Architecture REGRESSIONCLASSIFICATIONCLUSTERING GEOSPATIAL GRAPHTEXT IMAGE Graph Analytics with MADlib on Greenplum 29
  • 30. : Scaleable, In-Database Machine Learning ● Open source https://github.com/apache/madlib ● Downloads and docs http://madlib.apache.org/ ● Wiki https://cwiki.apache.org/confluence/display/MADLIB/ Apache MADlib: Big Data Machine Learning in SQL Open source, top level Apache project For PostgreSQL and Greenplum Database Powerful machine learning, graph, statistics and analytics for data scientists 30
  • 31. : Functions Data Types and Transformations Array and Matrix Operations Matrix Factorization • Low Rank • Singular Value Decomposition (SVD) Norms and Distance Functions Sparse Vectors Encoding Categorical Variables Path Functions Pivot Sessionize Stemming April 2018 Graph All Pairs Shortest Path (APSP) Breadth-First Search Hyperlink-Induced Topic Search (HITS) Average Path Length Closeness Centrality Graph Diameter In-Out Degree PageRank and Personalized PageRank Single Source Shortest Path (SSSP) Weakly Connected Components Model Selection Cross Validation Prediction Metrics Train-Test Split Statistics Descriptive Statistics • Cardinality Estimators • Correlation and Covariance • Summary Inferential Statistics • Hypothesis Tests Probability Functions Supervised Learning Neural Networks Support Vector Machines (SVM) Conditional Random Field (CRF) Regression Models • Clustered Variance • Cox-Proportional Hazards Regression • Elastic Net Regularization • Generalized Linear Models • Linear Regression • Logistic Regression • Marginal Effects • Multinomial Regression • Naïve Bayes • Ordinal Regression • Robust Variance Tree Methods • Decision Tree • Random Forest Time Series Analysis • ARIMA Unsupervised Learning Association Rules (Apriori) Clustering (k-Means) Principal Component Analysis (PCA) Topic Modelling (Latent Dirichlet Allocation) Utility Functions Columns to Vector Conjugate Gradient Linear Solvers • Dense Linear Systems • Sparse Linear Systems Mini-Batching PMML Export Term Frequency for Text Vector to Columns Nearest Neighbors • k-Nearest Neighbors Sampling Balanced/ Random/ Stratified Sampling 31
  • 32. : Graph Representation in MADlib Source Vertex Dest Vertex Edge Weight Edge Params 0 3 1.0 ... 1 0 5.0 ... 1 2 3.0 ... 2 3 8.0 ... 3 0 3.0 ... 3 1 2.0 ... Vertex Vertex Params 0 ... 1 ... 2 ... 3 ... . . . . . . Vertex Table Edge Table ... ... 0 1 2 3 5 3 8 2 1 3 [ Directed Graph (example) ] V 32
  • 33. example : PageRank in MADlib ● Create vertex and edge tables to represent the graph * http://madlib.apache.org/docs/latest/group__grp__pagerank.html DROP TABLE IF EXISTS vertex; CREATE TABLE vertex( id INTEGER ); INSERT INTO vertex VALUES (0), (1), (2), (3), (4), (5), (6); DROP TABLE IF EXISTS edge; CREATE TABLE edge( src INTEGER, dest INTEGER, user_id INTEGER ) DISTRIBUTED BY (user_id); INSERT INTO edge VALUES (0, 1, 1), (0, 2, 1), -- user id 1 (0, 4, 1), (1, 2, 1), (1, 3, 1), (2, 3, 1), (2, 5, 1), (2, 6, 1), (3, 0, 1), (4, 0, 1), (5, 6, 1), (6, 3, 1), (0, 1, 2), (0, 2, 2), -- user id 2 (0, 4, 2), (1, 2, 2), (1, 3, 2), (2, 3, 2), (3, 0, 2), (4, 0, 2), (5, 6, 2), (6, 3, 2); 33
  • 34. example : PageRank in MADlib ● Compute the PageRank with All IDs DROP TABLE IF EXISTS pagerank_out, pagerank_out_summary; SELECT madlib.pagerank( 'vertex' -- Vertex table , 'id' -- Vertex id column , 'edge' -- Edge table , 'src=src, dest=dest' -- Comma delimited string of edge arguments , 'pagerank_out' -- Output table of RageRank , NULL); -- Damping factor (default 0.85) SELECT * FROM pagerank_out ORDER BY pagerank DESC; 34
  • 35. example : PageRank in MADlib ● Network Diagram by Graphviz and PyGraphviz * PyGraphviz is a Python interface to the Graphviz graph layout and visualization package A vertex with a high PageRank is usually considered more "important" or more "influential" or more "relevant" than a vertex with a low PageRank. Size of node is proportional to PageRank value User ID 1 User ID 2 ID (PageRank) 35
  • 36. example : PageRank in MADlib ● PageRank of vertices associated with each user by the grouping feature DROP TABLE IF EXISTS pagerank_gr_out, pagerank_gr_out_summary; SELECT madlib.pagerank( 'vertex' -- Vertex table , 'id' -- Vertex id column , 'edge' -- Edge table , 'src=src, dest=dest' -- Comma delimited string of edge arguments , 'pagerank_gr_out' -- Output table of PageRank , NULL -- Default damping factor (0.85) , NULL -- Default max iterations (100) , 0.00000001 -- Threshold , 'user_id' -- Grouping column name ); SELECT * FROM pagerank_gr_out ORDER BY user_id, pagerank DESC; PageRank of user id 1 PageRank of user id 2 36
  • 37. example : PageRank in MADlib ● Personalized PageRank of vertices {2, 4}, for Recommendations DROP TABLE IF EXISTS pagerank_pers_out, pagerank_pers_out_summary; SELECT madlib.pagerank( 'vertex' -- Vertex table , 'id' -- Vertex id column , 'edge' -- Edge table , 'src=src, dest=dest' -- Comma delimited string of edge arguments , 'pagerank_pers_out' -- Output table of PageRank , NULL -- Default damping factor (0.85) , NULL -- Default max iterations (100) , NULL -- Default Threshold (1/number of vertices*1000) , NULL -- No Grouping , '{2, 4}' -- Personalization vertices ); SELECT * FROM pagerank_pers_out ORDER BY pagerank DESC; SELECT * FROM pagerank_pers_out_summary; * Personalized PageRank = (1-p)*Ax + p*E , where ‘E’ is the list of vertices for personalized PageRank, ‘p’ is the damping factor 37
  • 38. example : PageRank in MADlib Greenplum cluster: ● 1 master ● 4 segment hosts with 6 segments per host Normal random graphs with mean degrees 50 edges per vertex (i.e., 5B edges in the largest case) 5B edges (1K) (10K) (100K) (1M) (10M) (100M) * Note: log-log scale (100s) (1s) (10K s) (1M s) PageRank Performance on Greenplum w/ MADlib 38
  • 39. In Summary ● Capture the Relationship in Networks using Graph Analytics → Community, Structure, Path, Centrality → Combine context-based graph analytics with other content-based insights ● Graph analytics at SCALE with Open Source Software → Apache MADlib on Greenplum, massively parallel processing 39
  • 40. One more thing... GREENPLUM SUMMIT at PostgresConf 2019 by Pivotal 40
  • 41. Transforming How The World Builds Software © Copyright 2019 Pivotal Software, Inc. All rights Reserved.