Representation Learning on Graphs with
Complex Structures
Prof. Dr. Philippe Cudré-Mauroux
eXascale Infolab, U. of Fribourg–Switzerland
DL4G-SDE @ WWW2019
San Francisco, May 13, 2019
Representation Learning on Graphs
■ Projecting nodes of a graph onto a vector space while preserving key
structural properties of the graph (e.g., topological proximity of the nodes)
8/5/19 WWW2019@San Francisco
[Figure: neural embedding techniques (e.g., word2vec) turn the nodes of a graph into vectors, as in DeepWalk¹]
¹ Perozzi, Bryan, Rami Al-Rfou, and Steven Skiena. "DeepWalk: Online learning of social representations." In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701-710. ACM, 2014.
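As a minimal sketch of the sampling half of this idea, the snippet below generates truncated uniform random walks in the spirit of DeepWalk. The toy graph and the names `random_walks`, `walk_length` and `walks_per_node` are illustrative assumptions, and the word2vec/SkipGram training step that would consume the walks is left out.

```python
import random

def random_walks(adj, walk_length=5, walks_per_node=2, seed=0):
    """Generate truncated uniform random walks over an adjacency dict."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)  # visit start nodes in random order
        for start in nodes:
            walk = [start]
            # Extend the walk by picking a uniformly random neighbor.
            while len(walk) < walk_length and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks

# Hypothetical toy graph (undirected, stored as adjacency lists).
toy_graph = {1: [2], 2: [1, 3], 3: [2, 4, 5], 4: [3], 5: [3]}
walks = random_walks(toy_graph)
```

Each walk plays the role of a sentence and each node the role of a word when the walks are passed to a SkipGram-style model.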
What if the graph at hand exhibits
a much more complex structure?
Outline
■ JUST: Embedding heterogeneous graphs without meta-paths
[CIKM’18]
■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs
[WWW’19]
■ NodeSketch: Highly-efficient graph embeddings via recursive
sketching [KDD’19]
Heterogeneous Graphs
■ Heterogeneous Graphs contain multiple node types:
● Homogeneous edges: linking nodes from the same domain
● Heterogeneous edges: linking nodes across different domains
Meta-Paths in Heterogeneous Graphs
■ A meta-path is a sequence of node types encoding key composite relations among the
involved node types.
■ Meta-paths are used to guide random walks to redefine the neighborhood of a node.
[Figure: in Metapath2vec¹, meta-path-guided random walks are fed to neural embedding techniques (e.g., word2vec) to produce node vectors]
¹ Yuxiao Dong, Nitesh V. Chawla, and Ananthram Swami. 2017. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 135–144.
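A meta-path-guided walk can be sketched as follows. This is an illustration under assumptions, not the Metapath2vec implementation: the function name `metapath_walk`, the toy author/paper/venue graph, and the choice to write the meta-path A-P-V-P-A as the repeating pattern `["A", "P", "V", "P"]` are all hypothetical.

```python
import random

def metapath_walk(adj, node_type, start, metapath, length, seed=0):
    """Walk that only follows neighbors whose type matches the meta-path.

    `metapath` is a repeating list of node types such as ["A", "P", "V", "P"];
    the walk stops early if no neighbor of the required type exists.
    """
    if node_type[start] != metapath[0]:
        raise ValueError("start node must match the first meta-path type")
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        wanted = metapath[len(walk) % len(metapath)]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk

# Hypothetical bibliographic graph: authors (A), papers (P), one venue (V).
toy_adj = {
    "a1": ["p1"], "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "v1"], "p2": ["a2", "v1"],
    "v1": ["p1", "p2"],
}
toy_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
walk = metapath_walk(toy_adj, toy_type, "a1", ["A", "P", "V", "P"], 9)
```

The type sequence of the walk is fixed by the meta-path, which is why the choice of meta-path shapes the neighborhood each node sees.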
Challenges with Meta-Paths
■ The choice of meta-paths strongly affects the quality of the learnt node
embeddings for a specific task.
■ How to select meta-paths?
● The choice is graph-specific and depends heavily on prior knowledge from domain experts.
● Strategies to combine a set of meta-paths can be complex and computationally
expensive.
Are meta-paths necessary?
JUST: Embedding Heterogeneous Graphs without Meta-Paths
■ Random Walk with JUmp and STay strategies to probabilistically control the
random walk.
■ Two steps to balance the random walk:
● Step I: Jump or stay?
−Objective: Balance the number of heterogeneous and homogeneous edges traversed during
random walks (stay with probability 𝝰, with exponential decay).
● Step II: If jump, where to jump?
−Objective: Control the randomness in choosing a target domain
(memory window to favor diversity).
■ Learn node embeddings with the SkipGram model.
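The two steps can be sketched as a single walk routine. This is a sketch under stated assumptions rather than the authors' code: "exponential decay" is read here as a stay probability of α^l, where l counts the consecutive nodes visited in the current domain, and the "memory window" is modelled as a fixed-size queue of recently visited domains that the jump step tries to avoid. The names `just_walk`, `memory` and the toy graph are hypothetical.

```python
import random
from collections import deque

def just_walk(adj, node_type, start, length, alpha=0.5, memory=2, rng=None):
    """Random walk with jump and stay over a typed graph (illustrative)."""
    rng = rng or random.Random()
    walk = [start]
    recent = deque([node_type[start]], maxlen=memory)  # recently visited domains
    stay_run = 1  # consecutive nodes visited in the current domain
    cur = start
    while len(walk) < length:
        cur_type = node_type[cur]
        homo = [n for n in adj[cur] if node_type[n] == cur_type]
        hetero = [n for n in adj[cur] if node_type[n] != cur_type]
        if not homo and not hetero:
            break
        # Step I: jump or stay? Staying gets less likely the longer we stay.
        if not hetero:
            stay = True
        elif not homo:
            stay = False
        else:
            stay = rng.random() < alpha ** stay_run
        if stay:
            cur = rng.choice(homo)
            stay_run += 1
        else:
            # Step II: prefer target domains outside the memory window.
            domains = sorted({node_type[n] for n in hetero})
            fresh = [d for d in domains if d not in recent]
            target = rng.choice(fresh or domains)
            cur = rng.choice([n for n in hetero if node_type[n] == target])
            recent.append(target)
            stay_run = 1
        walk.append(cur)
    return walk

# Hypothetical graph: authors (A), papers (P), one venue (V).
toy_adj = {
    "a1": ["a2", "p1"], "a2": ["a1", "p1", "p2"],
    "p1": ["a1", "a2", "v1"], "p2": ["a2", "v1"],
    "v1": ["p1", "p2"],
}
toy_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
walk = just_walk(toy_adj, toy_type, "a1", 10, alpha=0.5, memory=1,
                 rng=random.Random(0))
```

No meta-path appears anywhere in the routine: the only knobs are `alpha` and the size of the memory window.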
Results
JUST achieves state-of-the-art performance without using meta-paths.
Node classification results
Runtime Performance
■ End-to-end node embedding learning time for all random-walk based
methods, in seconds.
Method                    DBLP    Movie   Foursquare
DeepWalk                   236      333          484
Metapath2vec (original)    965   19,200        2,248
Metapath2vec (ours)        290      408          550
Hin2vec                    904    1,301        1,801
JUST                       310      442          616
• Compared to DeepWalk and Metapath2vec (ours), JUST adds a minor overhead in learning time, but achieves
better results in classification and clustering tasks.
• Compared to Hin2vec, JUST learns about 3x faster and achieves better results in most
experiments.
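The ratios behind these two statements can be checked directly from the reported learning times. The snippet below only redoes that arithmetic; the dictionary layout and the helper name `ratio` are illustrative.

```python
# Learning times in seconds, copied from the runtime table above.
times = {
    "Hin2vec": {"DBLP": 904, "Movie": 1301, "Foursquare": 1801},
    "JUST": {"DBLP": 310, "Movie": 442, "Foursquare": 616},
    "DeepWalk": {"DBLP": 236, "Movie": 333, "Foursquare": 484},
}

def ratio(slower, faster):
    """Per-dataset ratio of learning times, rounded to two decimals."""
    return {d: round(times[slower][d] / times[faster][d], 2)
            for d in times[slower]}

speedup_over_hin2vec = ratio("Hin2vec", "JUST")
overhead_over_deepwalk = ratio("JUST", "DeepWalk")
```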
Outline
■ JUST: Embedding heterogeneous graphs without meta-paths
[CIKM’18]
■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs
[WWW’19]
■ NodeSketch: Highly-efficient graph embeddings via recursive
sketching [KDD’19]
Social Relationships vs. Human Mobility
How to quantify the impact of social relationships and
mobility on each other?
Location Based Social Networks
■ A hypergraph with
● Four data domains
−Spatial: POI
−Temporal: Time slot
−Semantic: Activity category
−Social: User
● Two types of links
−Friendships
−Check-ins (Hyperedges)
Hypergraph Embedding
[Figure: graph embedding maps the nodes of the LBSN hypergraph to vectors using neural embedding techniques (e.g., SkipGram)]
1. How to sample from an LBSN hypergraph?
2. How to preserve n-wise proximity from hyperedges?
1. Sample from a Hypergraph: Random Walk with Stay
■ Balancing the impact of social relationships and mobility on the learnt embeddings
Sample and learn from
• A check-in hyperedge with probability 𝛼
• A user-user pair with probability (1-𝛼)
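One way to picture this sampling scheme is the sketch below. It is a simplified reading, not the LBSN2Vec implementation: at each step the walker either stays on the current user and emits one of that user's check-in hyperedges (probability 𝛼), or moves to a random friend and emits the user-user pair. The function name `sample_lbsn`, the tuple layout `(user, poi, time_slot, category)` and the toy data are assumptions.

```python
import random

def sample_lbsn(friends, checkins, start_user, steps, alpha=0.2, seed=0):
    """Return a list of ("checkin", hyperedge) or ("friendship", (u, v)).

    friends:  dict user -> list of friends
    checkins: dict user -> list of (user, poi, time_slot, category)
    """
    rng = random.Random(seed)
    user = start_user
    samples = []
    for _ in range(steps):
        if checkins.get(user) and rng.random() < alpha:
            # Stay: learn from one check-in hyperedge of the current user.
            samples.append(("checkin", rng.choice(checkins[user])))
        elif friends.get(user):
            # Move: learn from a user-user pair along the social graph.
            nxt = rng.choice(friends[user])
            samples.append(("friendship", (user, nxt)))
            user = nxt
        else:
            break
    return samples

# Hypothetical toy LBSN.
friends = {"u1": ["u2"], "u2": ["u1", "u3"], "u3": ["u2"]}
checkins = {
    "u1": [("u1", "cafe", "Mon-09", "Coffee Shop")],
    "u2": [("u2", "gym", "Tue-18", "Fitness")],
    "u3": [("u3", "cafe", "Sat-10", "Coffee Shop")],
}
samples = sample_lbsn(friends, checkins, "u1", steps=20, alpha=0.2)
```

Raising `alpha` shifts the training signal toward mobility data, lowering it shifts the signal toward the social graph.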
2. Learn from Hyperedges: Learning via Best-Fit-Line
■ Maximizing the similarity between the nodes of a hyperedge and their
best-fit-line under cosine similarity.
1. Compute the best-fit-line
2. Maximize the cosine similarity between each node
and the best-fit-line
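The first step has a closed form under cosine similarity: by the Cauchy-Schwarz inequality, the direction that maximizes the summed cosine similarity to a set of nonzero vectors is the normalized sum of their unit vectors (as long as that sum is not zero). The sketch below computes it in plain Python; whether the authors compute it exactly this way is not stated on the slide, the embedding values are made up, and the gradient update of the node vectors (step 2) is omitted.

```python
import math

def unit(v):
    """Scale a nonzero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(u, v):
    return sum(a * b for a, b in zip(unit(u), unit(v)))

def best_fit_line(vectors):
    """Unit direction maximizing the summed cosine similarity to `vectors`."""
    units = [unit(v) for v in vectors]
    total = [sum(col) for col in zip(*units)]
    return unit(total)

def summed_cosine(vectors, direction):
    return sum(cosine(v, direction) for v in vectors)

# Hypothetical 3-d embeddings of the four nodes of one check-in hyperedge
# (user, POI, time slot, activity category).
hyperedge = [
    [0.2, 0.9, 0.1],
    [0.3, 0.8, 0.2],
    [0.1, 0.7, 0.4],
    [0.4, 0.6, 0.1],
]
line = best_fit_line(hyperedge)
score = summed_cosine(hyperedge, line)
```

Step 2 would then nudge each node vector toward `line`, which raises `score` for the sampled hyperedge.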
Task I: Friendship Prediction
■ Comparison with other graph embedding techniques
● (S) Social network only
● (S&M) Social and mobility through clique expansion
[Figure: friendship prediction results, with the clique-expansion baselines marked]
↑ 32.95% on precision@10
Task II: Location Prediction
■ Comparison with other graph embedding techniques
● (M) Mobility (Check-in) network only
● (S&M) Social and mobility through clique expansion
[Figure: location prediction results]
↑ 25.32% on accuracy@10
Balancing the Impact of Social Relationships and Mobility Matters!
Asymmetric impact of mobility and social relationships on predicting each other:
• Friendship prediction: 80% social and 20% mobility data
• Location prediction: 60% social and 40% mobility data
Outline
■ JUST: Embedding heterogeneous graphs without meta-paths
[CIKM’18]
■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs
[WWW’19]
■ NodeSketch: Highly-efficient graph embeddings via recursive
sketching [KDD’19]
Graph Embeddings
■ Graph-sampling based techniques
● Sample node pairs from a graph, and preserve node proximity from the node pairs
● Examples: DeepWalk, Node2Vec, LINE, SDNE and VERSE, etc.
● Efficiency bottleneck: A large number of node pairs -> significant computation resources (CPU time)
■ Factorization based techniques
● Factorize a (transformed, e.g., high-order) proximity/adjacency matrix of a graph
● Examples: GraRep, HOPE and NetMF, etc.
● Efficiency bottleneck: Large matrix factorization -> significant computation resources (both CPU time and
RAM)
■ Node proximity is preserved using cosine similarity
● Efficiency bottleneck: cosine similarity is less efficient to compute than Hamming similarity, for example.
Similarity-Preserving Hashing/Sketching
■ Efficient similarity approximation of high dimensional data
● Data-dependent hashing (learning-to-hash)
−Learning dataset-specific hashing functions
−Examples: spectral hashing, iterative quantization, etc.
−Efficient in similarity computation, but requires learning hashing functions
● Data-independent hashing/sketching (locality sensitive hashing)
−Hashing without involving any learning process from data
−Examples: minhash, consistent weighted sampling, etc.
−Efficient in both similarity approximation and hashing
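To make the data-independent case concrete, here is a minimal minhash sketch. It assumes sets of non-negative integer ids and uses simple linear hash functions modulo a prime, which is a common textbook construction rather than anything prescribed by the slides; the names `make_hashes`, `minhash` and `hamming_similarity` are illustrative.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, larger than any id used here

def make_hashes(num_hashes, seed=0):
    """Random (a, b) pairs defining h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]

def minhash(items, hashes):
    """Sketch a non-empty set of integers in [0, PRIME)."""
    return [min((a * x + b) % PRIME for x in items) for a, b in hashes]

def hamming_similarity(s1, s2):
    """Fraction of sketch positions on which two sketches agree."""
    return sum(x == y for x, y in zip(s1, s2)) / len(s1)

hashes = make_hashes(256)
set_a = set(range(0, 100))
set_b = set(range(50, 150))  # Jaccard similarity with set_a is 50/150
estimate = hamming_similarity(minhash(set_a, hashes), minhash(set_b, hashes))
```

No parameter is learned from the data: the hash functions are drawn once and reused for every set, and comparing two sketches is a position-wise equality count.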
Can we sketch nodes in a graph as embeddings?
Preliminary: Consistent Weighted Sampling¹
■ Principled techniques for highly-efficient similarity approximation
[Figure: a vector V = (1.32, 2.77, 1.11, 3.29, 1.31) with D = 5 is mapped to a sketch S = (S1, …, Sj, …, SL) using random hash functions hj, j = 1, …, L]
■ The min-max similarity between the original data can be approximated by the Hamming similarity between the sketches
¹ Dingqi Yang, Bin Li, Laura Rettig, Philippe Cudré-Mauroux. D2HistoSketch: Discriminative and Dynamic Similarity-Preserving Sketching of Streaming Histograms. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2018.
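A toy version of such a sketching function is shown below. The sampling rule used here, S_j = argmin over i of -log(h_j(i)) / V_i, is one simplified form of consistent weighted sampling and is an assumption of this example, not something given on the slide. It is scale-invariant, so it is only meaningful for vectors on a common scale (for instance normalized to sum to 1), and the snippet makes no claim about approximation quality.

```python
import math
import random

def make_hash_table(sketch_len, dim, seed=0):
    """Shared random values h_j(i) in (0, 1], one per hash j and dimension i."""
    rng = random.Random(seed)
    return [[1.0 - rng.random() for _ in range(dim)] for _ in range(sketch_len)]

def sketch(vector, hash_table):
    """S_j = argmin_i -log(h_j(i)) / V_i over dimensions with V_i > 0.

    Returns 1-based dimension indices; `vector` needs a positive entry.
    """
    result = []
    for row in hash_table:
        best = min(
            (i for i, v in enumerate(vector) if v > 0),
            key=lambda i: -math.log(row[i]) / vector[i],
        )
        result.append(best + 1)
    return result

def hamming_similarity(s1, s2):
    return sum(a == b for a, b in zip(s1, s2)) / len(s1)

def min_max_similarity(u, v):
    return sum(map(min, u, v)) / sum(map(max, u, v))

V = [1.32, 2.77, 1.11, 3.29, 1.31]       # vector shown on the slide, D = 5
table = make_hash_table(sketch_len=128, dim=len(V))  # L = 128 is arbitrary
S = sketch(V, table)
```

Because every vector is sketched with the same `table`, similar vectors tend to pick the same dimension at the same sketch position, which is what the Hamming comparison counts.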
Sketching the Adjacency Matrix?
■ Adjacency matrix vs. Self-Loop-Augmented (SLA) adjacency matrix
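Self-loop augmentation amounts to setting the diagonal of the adjacency matrix to 1, so that each node's own index appears in its adjacency vector. A minimal sketch on a hypothetical five-node graph (the edge list and the name `sla_adjacency` are assumptions for illustration):

```python
def sla_adjacency(adj_matrix):
    """Return a copy of a square 0/1 adjacency matrix with a unit diagonal."""
    n = len(adj_matrix)
    return [[1 if i == j else adj_matrix[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical undirected graph with edges 1-2, 2-3, 3-4, 3-5.
A = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
A_sla = sla_adjacency(A)
```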
NodeSketch: Low-Order Node Embeddings
[Figure: example graph with nodes 1 to 5]
NodeSketch: High-Order Node Embeddings
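[Figure: recursive sketching on an example graph with nodes 1 to 5, illustrated for node 1 and its neighbor node 2]
■ Inputs: the SLA adjacency matrix $\tilde{A}$ and the (𝑘-1)-order node embeddings $S(k-1)$
■ For each neighbor $n \in \Gamma(r)$ of node $r$, take its (𝑘-1)-order sketch $S^{n}(k-1)$ and compute the sketch element distribution
$\frac{1}{L}\sum_{j=1}^{L} \mathbb{1}\left[S_j^{n}(k-1) = i\right], \quad i = 1, \dots, D$
■ Merge: add this distribution, weighted by α, to the SLA adjacency vector $\tilde{V}^{r}$ to obtain the approximate 𝑘-order SLA adjacency vector $\tilde{V}^{r}(k)$
■ Sketch $\tilde{V}^{r}(k)$ using Eq. 3 to obtain the 𝑘-order embeddings $S(k)$
■ Example with α = 0.2: node 2 has the (𝑘-1)-order sketch (2, 3, 1), so its element distribution is 0.33, 0.33, 0.33; merging it into the SLA adjacency vector of node 1 (entries 1, 1) gives 1.066, 1.066, 0.066
(𝑘-1)-order node embeddings $S(k-1)$, one row per node:
2 1 1
2 3 1
2 3 4
4 3 4
5 3 5
𝑘-order embeddings $S(k)$, one row per node:
2 1 3
2 3 4
2 3 4
2 3 4
4 3 5
■ Uniformity of the generated samples: the foundation of our recursive sketching process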
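The merge step of the recursion can be reproduced with a few lines of Python. The numbers below are the ones visible on the slide (node 1 with SLA adjacency vector 1, 1, 0, 0, 0; neighbor node 2 with (𝑘-1)-order sketch 2, 3, 1; α = 0.2); the function names are hypothetical and the subsequent sketching of the merged vector is left out.

```python
def sketch_distribution(sketch, dim):
    """Fraction of sketch positions holding each element i = 1..dim."""
    counts = [0] * dim
    for element in sketch:
        counts[element - 1] += 1
    return [c / len(sketch) for c in counts]

def merge(sla_vector, neighbor_sketches, alpha):
    """Approximate k-order SLA adjacency vector.

    Adds the alpha-weighted element distribution of each neighbor's
    (k-1)-order sketch to the node's SLA adjacency vector.
    """
    merged = list(sla_vector)
    for s in neighbor_sketches:
        dist = sketch_distribution(s, len(sla_vector))
        merged = [m + alpha * d for m, d in zip(merged, dist)]
    return merged

# Values read off the slide: node 1, its neighbor node 2, alpha = 0.2.
v1 = merge([1, 1, 0, 0, 0], [[2, 3, 1]], alpha=0.2)
```

The first three entries of `v1` come out close to 1.067, 1.067 and 0.067, matching the slide's 1.066, 1.066, 0.066 up to rounding.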
Results: Node Classification Performance using Kernel SVM
[Figure: node classification performance for three groups of baselines: classical graph embedding techniques (preserving cosine similarity), learning-to-hash techniques, and sketching techniques]
NodeSketch shows comparable performance to the best-performing state-of-the-art techniques.
Results: Runtime Performance
NodeSketch is highly efficient and significantly outperforms all baselines, showing a 9x-273x speedup.
Hamming similarity also shows improved efficiency (1.19x-1.68x speedup) over cosine similarity.
Take-Away Messages
■ JUST: Meta-path-free heterogeneous graph embedding can achieve state-of-the-art performance efficiently. [CIKM’18]
■ LBSN2Vec: Social relationships and mobility have an asymmetric impact on each other. [WWW’19]
■ NodeSketch: High-quality node embeddings can be generated via highly efficient sketching techniques. [KDD’19]
[CIKM’18] Rana Hussein, Dingqi Yang, and Philippe Cudré-Mauroux. “Are Meta-Paths Necessary? Revisiting Heterogeneous Graph Embeddings.” CIKM’18.
[WWW’19] Dingqi Yang, Bingqing Qu, Jie Yang, and Philippe Cudré-Mauroux. “Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach.” WWW’19.
[KDD’19] Dingqi Yang, Paolo Rosso, Bin Li, and Philippe Cudré-Mauroux. “NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching.” KDD’19.
Future Plan for Representation Learning on Graphs
■ Attributed graph structure (e.g., property graphs)
■ Heterogeneous data structures (e.g., structured knowledge graph + unstructured text)
■ Dynamic graphs (e.g., streaming LBSN graphs)

More Related Content

What's hot

Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Xiaohan Zeng
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and ApplicationsHoang Nguyen
 
Deep Learning for Graphs
Deep Learning for GraphsDeep Learning for Graphs
Deep Learning for GraphsDeepLearningBlr
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)DheerajPachauri
 
Social network analysis intro part I
Social network analysis intro part ISocial network analysis intro part I
Social network analysis intro part ITHomas Plotkowiak
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network AnalysisPremsankar Chakkingal
 
Social network analysis part ii
Social network analysis part iiSocial network analysis part ii
Social network analysis part iiTHomas Plotkowiak
 
Domain Transfer and Adaptation Survey
Domain Transfer and Adaptation SurveyDomain Transfer and Adaptation Survey
Domain Transfer and Adaptation SurveySangwoo Mo
 
4.2 spatial data mining
4.2 spatial data mining4.2 spatial data mining
4.2 spatial data miningKrish_ver2
 
Social Media Mining: An Introduction
Social Media Mining: An IntroductionSocial Media Mining: An Introduction
Social Media Mining: An IntroductionAli Abbasi
 
Multilayer tutorial-netsci2014-slightlyupdated
Multilayer tutorial-netsci2014-slightlyupdatedMultilayer tutorial-netsci2014-slightlyupdated
Multilayer tutorial-netsci2014-slightlyupdatedMason Porter
 
Graph neural networks overview
Graph neural networks overviewGraph neural networks overview
Graph neural networks overviewRodion Kiryukhin
 
Artificial Neural Networks - ANN
Artificial Neural Networks - ANNArtificial Neural Networks - ANN
Artificial Neural Networks - ANNMohamed Talaat
 
Social network analysis
Social network analysisSocial network analysis
Social network analysisCaleb Jones
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networknainabhatt2
 

What's hot (20)

Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
 
Particle filter
Particle filterParticle filter
Particle filter
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network Analysis
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and Applications
 
Deep Learning for Graphs
Deep Learning for GraphsDeep Learning for Graphs
Deep Learning for Graphs
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)
 
Social network analysis intro part I
Social network analysis intro part ISocial network analysis intro part I
Social network analysis intro part I
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network Analysis
 
Social network analysis part ii
Social network analysis part iiSocial network analysis part ii
Social network analysis part ii
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network Analysis
 
Domain Transfer and Adaptation Survey
Domain Transfer and Adaptation SurveyDomain Transfer and Adaptation Survey
Domain Transfer and Adaptation Survey
 
06 Community Detection
06 Community Detection06 Community Detection
06 Community Detection
 
4.2 spatial data mining
4.2 spatial data mining4.2 spatial data mining
4.2 spatial data mining
 
Social Media Mining: An Introduction
Social Media Mining: An IntroductionSocial Media Mining: An Introduction
Social Media Mining: An Introduction
 
Multilayer tutorial-netsci2014-slightlyupdated
Multilayer tutorial-netsci2014-slightlyupdatedMultilayer tutorial-netsci2014-slightlyupdated
Multilayer tutorial-netsci2014-slightlyupdated
 
HOPFIELD NETWORK
HOPFIELD NETWORKHOPFIELD NETWORK
HOPFIELD NETWORK
 
Graph neural networks overview
Graph neural networks overviewGraph neural networks overview
Graph neural networks overview
 
Artificial Neural Networks - ANN
Artificial Neural Networks - ANNArtificial Neural Networks - ANN
Artificial Neural Networks - ANN
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 

Similar to Representation Learning on Complex Graphs

High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingNesreen K. Ahmed
 
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisA New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisJason Riedy
 
Ling liu part 01:big graph processing
Ling liu part 01:big graph processingLing liu part 01:big graph processing
Ling liu part 01:big graph processingjins0618
 
Euro30 2019 - Benchmarking tree approaches on street data
Euro30 2019 - Benchmarking tree approaches on street dataEuro30 2019 - Benchmarking tree approaches on street data
Euro30 2019 - Benchmarking tree approaches on street dataFabion Kauker
 
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...miyurud
 
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...thanhdowork
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworkstm1966
 
The Future is Big Graphs: A Community View on Graph Processing Systems
The Future is Big Graphs: A Community View on Graph Processing SystemsThe Future is Big Graphs: A Community View on Graph Processing Systems
The Future is Big Graphs: A Community View on Graph Processing SystemsNeo4j
 
Deep learning for 3 d point clouds presentation
Deep learning for 3 d point clouds presentationDeep learning for 3 d point clouds presentation
Deep learning for 3 d point clouds presentationVijaylaxmiNagurkar
 
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESDyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESSubhajit Sahu
 
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
[20240408_LabSeminar_Huy]PivotalSTGNN.pptxthanhdowork
 
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernelsivaderivader
 
On Integrating Information Visualization Techniques into Data Mining: A Revie...
On Integrating Information Visualization Techniques into Data Mining: A Revie...On Integrating Information Visualization Techniques into Data Mining: A Revie...
On Integrating Information Visualization Techniques into Data Mining: A Revie...Sushant Gautam
 
Laplacian-regularized Graph Bandits
Laplacian-regularized Graph BanditsLaplacian-regularized Graph Bandits
Laplacian-regularized Graph Banditslauratoni4
 
Skyline Query Processing using Filtering in Distributed Environment
Skyline Query Processing using Filtering in Distributed EnvironmentSkyline Query Processing using Filtering in Distributed Environment
Skyline Query Processing using Filtering in Distributed EnvironmentIJMER
 

Similar to Representation Learning on Complex Graphs (20)

Cikm 2018
Cikm 2018Cikm 2018
Cikm 2018
 
High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
 
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisA New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
 
Ling liu part 01:big graph processing
Ling liu part 01:big graph processingLing liu part 01:big graph processing
Ling liu part 01:big graph processing
 
PointNet
PointNetPointNet
PointNet
 
Euro30 2019 - Benchmarking tree approaches on street data
Euro30 2019 - Benchmarking tree approaches on street dataEuro30 2019 - Benchmarking tree approaches on street data
Euro30 2019 - Benchmarking tree approaches on street data
 
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
 
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks
 
The Future is Big Graphs: A Community View on Graph Processing Systems
The Future is Big Graphs: A Community View on Graph Processing SystemsThe Future is Big Graphs: A Community View on Graph Processing Systems
The Future is Big Graphs: A Community View on Graph Processing Systems
 
Portfolio
PortfolioPortfolio
Portfolio
 
MapReduce Algorithm Design
MapReduce Algorithm DesignMapReduce Algorithm Design
MapReduce Algorithm Design
 
Deep learning for 3 d point clouds presentation
Deep learning for 3 d point clouds presentationDeep learning for 3 d point clouds presentation
Deep learning for 3 d point clouds presentation
 
Visual Network Narrations
Visual Network NarrationsVisual Network Narrations
Visual Network Narrations
 
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESDyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
 
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
 
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
 
On Integrating Information Visualization Techniques into Data Mining: A Revie...
On Integrating Information Visualization Techniques into Data Mining: A Revie...On Integrating Information Visualization Techniques into Data Mining: A Revie...
On Integrating Information Visualization Techniques into Data Mining: A Revie...
 
Laplacian-regularized Graph Bandits
Laplacian-regularized Graph BanditsLaplacian-regularized Graph Bandits
Laplacian-regularized Graph Bandits
 
Skyline Query Processing using Filtering in Distributed Environment
Skyline Query Processing using Filtering in Distributed EnvironmentSkyline Query Processing using Filtering in Distributed Environment
Skyline Query Processing using Filtering in Distributed Environment
 

More from eXascale Infolab

Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link PredictionBeyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link PredictioneXascale Infolab
 
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...eXascale Infolab
 
A force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory mapA force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory mapeXascale Infolab
 
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...eXascale Infolab
 
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...eXascale Infolab
 
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data OceansDependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data OceanseXascale Infolab
 
SANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference ResolutionSANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference ResolutioneXascale Infolab
 
Efficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataEfficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataeXascale Infolab
 
Entity-Centric Data Management
Entity-Centric Data ManagementEntity-Centric Data Management
Entity-Centric Data ManagementeXascale Infolab
 
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked DataLDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked DataeXascale Infolab
 
Executing Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataeXascale Infolab
 
The Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task CrowdsourcingThe Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task CrowdsourcingeXascale Infolab
 
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...eXascale Infolab
 
CIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition rankingCIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition rankingeXascale Infolab
 
An Introduction to Big Data
An Introduction to Big DataAn Introduction to Big Data
An Introduction to Big DataeXascale Infolab
 
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)eXascale Infolab
 

More from eXascale Infolab (20)

Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link PredictionBeyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
 
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
 
A force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory mapA force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory map
 
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
 
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
 
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data OceansDependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
 
Crowd scheduling www2016
Crowd scheduling www2016Crowd scheduling www2016
Crowd scheduling www2016
 
SANAPHOR: Ontology-based Coreference Resolution


Representation Learning on Complex Graphs

  • 1. Representation Learning on Graphs with Complex Structures
      Prof. Dr. Philippe Cudré-Mauroux
      eXascale Infolab, U. of Fribourg–Switzerland
      DL4G-SDE @ WWW2019, San Francisco, May 13, 2019
  • 2. Representation Learning on Graphs
      ■ Projecting the nodes of a graph onto a vector space while preserving key structural properties of the graph (e.g., topological proximity of the nodes).
      [Figure: a graph is mapped by neural embedding techniques (e.g., word2vec) to one embedding vector per node; method shown: DeepWalk [1].]
      [1] Perozzi, Bryan, Rami Al-Rfou, and Steven Skiena. "DeepWalk: Online learning of social representations." In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701-710. ACM, 2014.
      8/5/19 WWW2019@San Francisco
  • 3. What if the graph at hand exhibits a much more complex structure?
  • 4. Outline
      ■ JUST: Embedding heterogeneous graphs without meta-paths [CIKM’18]
      ■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs [WWW’19]
      ■ NodeSketch: Highly-efficient graph embeddings via recursive sketching [KDD’19]
  • 5. Heterogeneous Graphs
      ■ Heterogeneous graphs contain multiple node types and two kinds of edges:
        ● Homogeneous edges: linking nodes from the same domain
        ● Heterogeneous edges: linking nodes across different domains
  • 6. Meta-Paths in Heterogeneous Graphs
      ■ A meta-path is a sequence of node types encoding key composite relations among the involved node types.
      ■ Meta-paths are used to guide random walks and thereby redefine the neighborhood of a node.
      [Figure: meta-path-guided walks are fed to neural embedding techniques (e.g., word2vec) to produce node vectors; method shown: Metapath2vec [1].]
      [1] Yuxiao Dong, Nitesh V. Chawla, and Ananthram Swami. 2017. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 135–144.
  • 7. Challenges with Meta-Paths
      ■ The choice of meta-paths strongly affects the quality of the learnt node embeddings for a specific task.
      ■ How should meta-paths be selected?
        ● The selection is graph-specific and depends heavily on prior knowledge from domain experts.
        ● Strategies for combining a set of meta-paths can be complex and computationally expensive.
  • 8. Are meta-paths necessary?
  • 9. JUST: Embedding Heterogeneous Graphs without Meta-Paths
      ■ A random walk with JUmp and STay strategies that probabilistically control the walk.
      ■ Two steps balance the random walk:
        ● Step I: Jump or stay?
          − Objective: balance the number of heterogeneous and homogeneous edges traversed during random walks (stay with probability 𝝰, with exponential decay).
        ● Step II: If jump, where to jump?
          − Objective: control the randomness in choosing a target domain (a memory window favors diversity).
      ■ Node embeddings are then learned with the SkipGram model.
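The two steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `just_walk`, the toy graph, and the parameters `alpha` and `memory` are hypothetical, and the concrete decay rule (stay with probability `alpha ** run`, where `run` counts consecutive nodes visited in the current domain) is an assumption about what "exponential decay" means on the slide.

```python
import random
from collections import deque

def just_walk(graph, node_type, start, walk_length, alpha=0.5, memory=2, rng=None):
    """Generate one walk that either stays in the current domain or jumps to another.

    graph: dict mapping each node to a list of neighbors.
    node_type: dict mapping each node to its domain label.
    """
    rng = rng or random.Random()
    walk = [start]
    recent = deque([node_type[start]], maxlen=memory)  # memory window of visited domains
    run = 1  # consecutive nodes visited in the current domain
    while len(walk) < walk_length:
        cur = walk[-1]
        homo = [n for n in graph[cur] if node_type[n] == node_type[cur]]
        hetero = [n for n in graph[cur] if node_type[n] != node_type[cur]]
        if not homo and not hetero:
            break  # isolated node, nothing to sample
        # Step I: jump or stay? (assumed rule: stay with probability alpha ** run)
        if not hetero:
            stay = True
        elif not homo:
            stay = False
        else:
            stay = rng.random() < alpha ** run
        if stay:
            nxt = rng.choice(homo)
            run += 1
        else:
            # Step II: prefer target domains that are not in the memory window
            candidates = {node_type[n] for n in hetero}
            fresh = candidates - set(recent)
            target = rng.choice(sorted(fresh or candidates))
            nxt = rng.choice([n for n in hetero if node_type[n] == target])
            recent.append(target)
            run = 1
        walk.append(nxt)
    return walk

# Hypothetical toy graph: two authors, two papers, one venue.
toy_graph = {
    "a1": ["a2", "p1"], "a2": ["a1", "p2"],
    "p1": ["a1", "v1", "p2"], "p2": ["a2", "v1", "p1"],
    "v1": ["p1", "p2"],
}
toy_types = {"a1": "author", "a2": "author", "p1": "paper", "p2": "paper", "v1": "venue"}
print(just_walk(toy_graph, toy_types, "a1", 10, alpha=0.5, rng=random.Random(0)))
```

The resulting walks would then be passed to a SkipGram trainer as sentences, in the same way DeepWalk-style methods do.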
  • 10. Results: Node Classification
      JUST achieves state-of-the-art performance without using meta-paths.
  • 11. Runtime Performance
      ■ End-to-end node embedding learning time, in seconds, for all random-walk-based methods.

        Method                    DBLP   Movie    Foursquare
        DeepWalk                  236    333      484
        Metapath2vec (original)   965    19,200   2,248
        Metapath2vec (ours)       290    408      550
        Hin2vec                   904    1,301    1,801
        JUST                      310    442      616

      ■ Compared to DeepWalk and Metapath2vec, JUST adds a minor overhead in learning time but achieves better results in classification and clustering tasks.
      ■ Compared to Hin2vec, JUST achieves a 3x speedup in learning time and better results in most experiments.
  • 12. Outline
      ■ JUST: Embedding heterogeneous graphs without meta-paths [CIKM’18]
      ■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs [WWW’19]
      ■ NodeSketch: Highly-efficient graph embeddings via recursive sketching [KDD’19]
  • 13. Social Relationships vs. Human Mobility
  • 14. How to quantify the impact of social relationships and mobility on each other?
  • 15. Location Based Social Networks
      ■ A hypergraph with
        ● Four data domains
          − Spatial: POI
          − Temporal: time slot
          − Semantic: activity category
          − Social: user
        ● Two types of links
          − Friendships
          − Check-ins (hyperedges)
  • 16. Hypergraph Embedding
      [Figure: an LBSN hypergraph is turned into node vectors by graph embedding with neural embedding techniques (e.g., SkipGram).]
      1. How to sample from an LBSN hypergraph?
      2. How to preserve n-wise proximity from hyperedges?
  • 17. 1. Sample from a Hypergraph: Random Walk with Stay
      ■ Balancing the impact of social relationships and mobility on the learnt embeddings.
      ■ Sample and learn from
        ● a check-in hyperedge with probability 𝛼
        ● a user-user pair with probability (1-𝛼)
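The 𝛼-weighted choice above can be illustrated with a small sketch. It deliberately leaves out the random walk itself and only shows how one training example might be drawn; the function name `sample_training_example`, the tuple layout of a check-in, and the toy data are all hypothetical.

```python
import random

def sample_training_example(checkins, friendships, alpha, rng):
    """Draw a check-in hyperedge with probability alpha, else a user-user pair."""
    if rng.random() < alpha:
        return "checkin", rng.choice(checkins)
    return "friendship", rng.choice(friendships)

# Hypothetical toy data: a check-in links a user, a time slot, a POI and a category.
toy_checkins = [
    ("u1", "Mon-09h", "poi_cafe", "Coffee Shop"),
    ("u2", "Sat-21h", "poi_bar", "Bar"),
]
toy_friendships = [("u1", "u2"), ("u2", "u3")]

rng = random.Random(42)
samples = [sample_training_example(toy_checkins, toy_friendships, 0.2, rng) for _ in range(5)]
for kind, item in samples:
    print(kind, item)
```

Setting `alpha` close to 1 makes the embeddings learn mostly from mobility data, while values close to 0 favor the social network.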
  • 18. 2. Learn from Hyperedges: Learning via Best-Fit-Line
      ■ Maximizing the similarity between the nodes of a hyperedge and their best-fit-line under cosine similarity:
        1. Compute the best-fit-line.
        2. Maximize the cosine similarity between each node and the best-fit-line.
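A rough sketch of the two steps above, in plain Python. The slide does not say how the best-fit-line is computed, so the choice made here (the normalized sum of the unit-normalized node vectors) is an assumption used for illustration only; the function names are hypothetical, and all vectors are assumed to be non-zero.

```python
import math

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def best_fit_line(vectors):
    """Assumed approximation: direction of the sum of the unit-normalized vectors."""
    units = [normalize(v) for v in vectors]
    return normalize([sum(column) for column in zip(*units)])

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))

def hyperedge_loss(vectors):
    """Sum of (1 - cosine) between each node vector and the best-fit-line.

    Minimizing this value maximizes the cosine similarity named in step 2.
    """
    line = best_fit_line(vectors)
    return sum(1.0 - cosine(v, line) for v in vectors)

# Nodes of one hypothetical check-in hyperedge (user, time slot, POI, category).
hyperedge = [[1.0, 0.2], [0.9, 0.1], [0.8, 0.3], [1.0, 0.0]]
print(hyperedge_loss(hyperedge))
```

In a real trainer the loss would be minimized by gradient steps on the node vectors, typically together with negative sampling.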
  • 19. Task I: Friendship Prediction
      ■ Comparison with other graph embedding techniques
        ● (S) Social network only
        ● (S&M) Social and mobility through clique expansion
      ■ Result: ↑ 32.95% on precision@10
  • 20. Task II: Location Prediction
      ■ Comparison with other graph embedding techniques
        ● (M) Mobility (check-in) network only
        ● (S&M) Social and mobility through clique expansion
      ■ Result: ↑ 25.32% on accuracy@10
  • 21. Balancing the Impact of Social Relationships and Mobility Matters!
      ■ Asymmetric impact of mobility and social relationships on predicting each other:
        ● Friendship prediction: 80% social and 20% mobility data
        ● Location prediction: 60% social and 40% mobility data
  • 22. Outline
      ■ JUST: Embedding heterogeneous graphs without meta-paths [CIKM’18]
      ■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs [WWW’19]
      ■ NodeSketch: Highly-efficient graph embeddings via recursive sketching [KDD’19]
  • 23. Graph Embeddings
      ■ Graph-sampling based techniques
        ● Sample node pairs from a graph, and preserve node proximity from the node pairs
        ● Examples: DeepWalk, Node2Vec, LINE, SDNE, VERSE, etc.
        ● Efficiency bottleneck: a large number of node pairs -> significant computation resources (CPU time)
      ■ Factorization based techniques
        ● Factorize a (transformed, e.g., high-order) proximity/adjacency matrix of a graph
        ● Examples: GraRep, HOPE, NetMF, etc.
        ● Efficiency bottleneck: large matrix factorization -> significant computation resources (both CPU time and RAM)
      ■ Both families preserve node proximity using cosine similarity
        ● Efficiency bottleneck: cosine similarity is less efficient to compute than, for example, Hamming similarity.
  • 24. Similarity-Preserving Hashing/Sketching
      ■ Efficient similarity approximation of high-dimensional data
        ● Data-dependent hashing (learning-to-hash)
          − Learning dataset-specific hashing functions
          − Examples: spectral hashing, iterative quantization, etc.
          − Efficient in similarity computation, but requires learning hashing functions
        ● Data-independent hashing/sketching (locality-sensitive hashing)
          − Hashing without involving any learning process from data
          − Examples: minhash, consistent weighted sampling, etc.
          − Efficient in both similarity approximation and hashing
  • 25. Can we sketch nodes in a graph as embeddings?
  • 26. Preliminary: Consistent Weighted Sampling [1]
      ■ Principled techniques for highly-efficient similarity approximation: the min-max similarity between the original data can be approximated by the Hamming similarity between their sketches.
      [Figure: a vector V of dimension D = 5, V = (1.32, 2.77, 1.11, 3.29, 1.31), is mapped to a sketch S = (S_1, …, S_j, …, S_L) using random hash functions h_j, j = 1, …, L.]
      [1] Dingqi Yang, Bin Li, Laura Rettig, and Philippe Cudré-Mauroux. D2HistoSketch: Discriminative and Dynamic Similarity-Preserving Sketching of Streaming Histograms. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2018.
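The idea above can be sketched in a few lines of Python. The sampling rule used here, S_j = argmin over i of −log h_j(i) / V_i, is an assumption (a simplified rule of the kind used in the sketching literature, not taken from the slide), and `unit_hash` is a hypothetical stand-in for the random hash functions h_j. A simplified rule like this tracks the min-max similarity only approximately, so the example prints both values side by side rather than claiming they agree.

```python
import hashlib
import math

def unit_hash(j, i):
    """Deterministic pseudo-random value in (0, 1], standing in for h_j(i)."""
    digest = hashlib.sha256(f"{j}:{i}".encode()).digest()
    return (int.from_bytes(digest, "big") + 1) / 2 ** 256

def cws_sketch(vector, length):
    """Sketch of `length` elements; element j is the index i minimizing -log(h_j(i)) / V_i."""
    support = [i for i, weight in enumerate(vector) if weight > 0]
    return [
        min(support, key=lambda i: -math.log(unit_hash(j, i)) / vector[i])
        for j in range(length)
    ]

def hamming_similarity(s, t):
    """Fraction of positions at which two equal-length sketches agree."""
    return sum(a == b for a, b in zip(s, t)) / len(s)

def min_max_similarity(u, v):
    """Sum of element-wise minima divided by sum of element-wise maxima."""
    return sum(map(min, u, v)) / sum(map(max, u, v))

u = [1.32, 2.77, 1.11, 3.29, 1.31]  # the example vector from the slide
v = [1.00, 2.50, 0.00, 3.00, 2.00]  # hypothetical second vector
print(min_max_similarity(u, v))
print(hamming_similarity(cws_sketch(u, 256), cws_sketch(v, 256)))
```

Because each sketch element is a small integer index, comparing two sketches only needs equality tests, which is where the efficiency gain over cosine similarity comes from.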
  • 27. Sketching the Adjacency Matrix?
      ■ Adjacency matrix vs. Self-Loop-Augmented (SLA) adjacency matrix
  • 28. NodeSketch: Low-Order Node Embeddings
      [Figure: example graph with nodes 1 to 5.]
  • 29. NodeSketch: High-Order Node Embeddings
      [Figure: recursive sketching on an example graph with nodes 1 to 5, shown for node 1 with weight α = 0.2. The SLA adjacency vector $\tilde{V}^r$ of node $r$ is merged with the sketch element distributions of the $(k-1)$-order sketches $S^n(k-1)$ of its neighbors $n \in \Gamma(r)$, giving the approximate $k$-order SLA adjacency vector $\tilde{V}^r(k)$. Sketching this vector using Eq. 3 turns the $(k-1)$-order node embeddings $S(k-1)$ into the $k$-order embeddings $S(k)$.]
      ■ Sketch element distribution of a neighbor $n$: $\frac{1}{L}\sum_{j=1}^{L} \mathbb{1}[S_j^n(k-1) = i]$, $i = 1, \dots, D$
      ■ Uniformity of the generated samples: the foundation of our recursive sketching process.
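The merge step of the recursion can be sketched as below. This is an illustration under assumptions: the function name `merge_step` is hypothetical, indices are 0-based, and the contributions of several neighbors are assumed to be summed, each weighted by α/L. The sketching of the merged vector (Eq. 3 on the slide) is not shown.

```python
def merge_step(sla_vector, neighbor_sketches, alpha):
    """Approximate k-order SLA adjacency vector of one node.

    sla_vector: the node's row of the SLA adjacency matrix (length D).
    neighbor_sketches: one (k-1)-order sketch per neighbor, each a list of
        L indices in range(D).
    alpha: weight given to the neighbors' sketch element distributions.
    """
    merged = [float(x) for x in sla_vector]
    for sketch in neighbor_sketches:
        weight = alpha / len(sketch)  # alpha times the 1/L factor of the distribution
        for element in sketch:
            merged[element] += weight
    return merged

# Hypothetical toy input: D = 5, one neighbor with a sketch of length L = 3.
sla_row = [1, 1, 0, 0, 0]
neighbor_sketches = [[0, 1, 2]]
print(merge_step(sla_row, neighbor_sketches, alpha=0.2))
```

Repeating this step and re-sketching the merged vector for every node would yield embeddings of increasing order, with α controlling how quickly the influence of distant nodes decays.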
  • 30. Results: Node Classification Performance using Kernel SVM
      ■ Baselines compared:
        ● Classical graph embedding techniques (preserving cosine similarity)
        ● Learning-to-hash techniques
        ● Sketching techniques
      ■ NodeSketch shows comparable performance to the best-performing state-of-the-art techniques.
  • 31. Results: Runtime Performance
      ■ NodeSketch is highly efficient and significantly outperforms all baselines, showing a 9x-273x speedup.
      ■ Hamming similarity also shows improved efficiency (1.19x-1.68x speedup) over cosine similarity.
  • 32. Take-Away Messages
      ■ JUST: Meta-path-free heterogeneous graph embedding can achieve state-of-the-art performance efficiently. [CIKM’18]
      ■ LBSN2Vec: Social relationships and mobility have an asymmetric impact on each other. [WWW’19]
      ■ NodeSketch: High-quality node embeddings can be generated via highly-efficient sketching techniques. [KDD’19]
      [CIKM’18] Rana Hussein, Dingqi Yang, and Philippe Cudré-Mauroux. "Are Meta-Paths Necessary?: Revisiting Heterogeneous Graph Embeddings." CIKM’18.
      [WWW’19] Dingqi Yang, Bingqing Qu, Jie Yang, and Philippe Cudré-Mauroux. "Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach." WWW’19.
      [KDD’19] Dingqi Yang, Paolo Rosso, Bin Li, and Philippe Cudré-Mauroux. "NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching." KDD’19.
  • 33. Future Plan for Representation Learning on Graphs
      ■ Attributed graph structure (e.g., property graphs)
      ■ Heterogeneous data structures (e.g., structured knowledge graph + unstructured text)
      ■ Dynamic graphs (e.g., streaming LBSN graphs)
      4/29/19 Dingqi's job talk @ University of Luxembourg