Tensor Spectral Clustering is an algorithm that generalizes graph partitioning and spectral clustering methods to account for higher-order network structures. It defines a new objective function called motif conductance that measures how partitions cut motifs like triangles in addition to edges. The algorithm represents a tensor of higher-order random walk transitions as a matrix and computes eigenvectors to find a partition that minimizes the number of motifs cut, allowing networks to be clustered based on higher-order connectivity patterns. Experiments on synthetic and real networks show it can discover meaningful partitions by accounting for motifs that capture important structural relationships.
Tensor Spectral Clustering for Motif-Based Graph Partitioning
1. TENSOR SPECTRAL CLUSTERING FOR PARTITIONING HIGHER-ORDER NETWORK STRUCTURES
Austin Benson
ICME, Stanford University
arbenson@stanford.edu
Joint work with
David Gleich, Purdue
Jure Leskovec, Stanford
SIAM Data Mining 2015
Vancouver, BC
2. Background: graph partitioning and applications
Goal: find a "balanced" partition of a graph that does not cut many edges.
Applications: community structure in social networks, decomposing networks into functional modules.
3. Background: graph partitioning and clustering
A popular measure of the quality of a cut is conductance, where vol(S) is the number of edge end points in the set S.
Minimizing conductance is NP-hard in general, but there are approximation algorithms.
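The slide's formula was an image; in standard notation, the conductance of a vertex set S is:

```latex
\phi(S) \;=\; \frac{\mathrm{cut}(S)}{\min\{\mathrm{vol}(S),\ \mathrm{vol}(\bar{S})\}},
\qquad
\mathrm{cut}(S) = \#\{\text{edges with exactly one endpoint in } S\},
\quad
\mathrm{vol}(S) = \sum_{v \in S} \deg(v).
```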
4. Background: spectral clustering and random walks
P = AᵀD⁻¹ is a transition matrix representing the random-walk Markov chain; for example, P₄₃ = Pr(3 → 4) = 1/3.
Central computation: the second left eigenvector, zᵀP = λ₂zᵀ.
The entries of z are used to partition the graph.
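A minimal NumPy sketch of this pipeline (the toy graph and the simple middle-split sweep rule are my own illustration, not from the slides):

```python
import numpy as np

# Toy undirected graph: two triangles {0,1,2} and {3,4,5} joined by
# the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

# Column-stochastic random-walk matrix. For an undirected graph A = A^T,
# so this equals the slide's P = A^T D^{-1}.
d = A.sum(axis=0)
P = A / d                          # divide each column j by deg(j)

# Second left eigenvector z of P, i.e. z^T P = lambda_2 z^T,
# computed as a right eigenvector of P^T.
vals, vecs = np.linalg.eig(P.T)
order = np.argsort(-vals.real)     # eigenvalues in decreasing order
z = vecs[:, order[1]].real

# Simplest possible sweep cut: sort vertices by z and cut in the middle.
perm = np.argsort(z)
S = sorted(perm[:3].tolist())
print(S)                           # one of the two triangles
```

On this symmetric example the second eigenvector is positive on one triangle and negative on the other, so the cut recovers the two communities.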
6. Problem: clustering methods are based on edges and do not use higher-order relations or motifs, which can better model problems.
[Figure: the same network viewed through edges vs. motifs]
7. Problem: current methods only consider edges
… and that is not enough to model many problems
In social networks, we want to penalize cutting
triangles more than cutting edges. The triangle motif
represents stronger social ties.
8. Problem: current methods only consider edges
… and that is not enough to model many problems
In transcription networks, the "feedforward loop" motif represents biological function. Thus, we want to look for clusters of this structure.
[Figure: feedforward loops in a yeast transcription network, with nodes SWI4_SWI6, SPT16, HO, CLN1, CLN2]
9. Our contributions
1. We generalize the definition of conductance to motifs.
2. We provide an algorithm for optimizing this objective.
Tensor Spectral Clustering (TSC) algorithm:
Input: a set of motifs and weights.
Output: a partition of the graph that does not cut the motifs corresponding to the weights (subject to some normalization).
10. Roadmap of Tensor Spectral Clustering
Standard spectral pipeline: graph G → random-walk transition matrix P → second eigenvector zᵀP = λ₂zᵀ → sweep cut on z → partition S scored by the objective Φ.
Tensor spectral pipeline: motifs and weights that model the problem → random-walk transition tensor P → represent the tensor by a random-walk matrix → sweep cut → partition scored by the new objective Φ'.
11. Motif-based conductance
Edges cut vs. triangles cut: vol(S) = #(edge end points in S), while vol₃(S) = #(triangle end points in S).
Our algorithm is a heuristic for minimizing this objective based on the random walk interpretation of spectral clustering.
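The slide's formulas were images; for the triangle motif, the motif conductance of a set S can be written, consistently with the vol₃ definition above, as:

```latex
\phi_3(S) \;=\; \frac{\#\{\text{triangles cut by } S\}}{\min\{\mathrm{vol}_3(S),\ \mathrm{vol}_3(\bar{S})\}},
\qquad
\mathrm{vol}_3(S) = \#\{\text{triangle end points in } S\}.
```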
12. First-order vs. second-order Markov chain
First-order walk: from node i with neighbors j, k, r, each step is uniform over the neighbors, e.g. Prob(i → j) = 1/3.
Second-order walk: the state is a directed edge, and the next step depends on where the walk came from, e.g. Prob((i, j) → (j, k)) = 1/2 when two motif transitions are available from (i, j).
[Figure: the same neighborhood under the first-order matrix P (steps of probability 1/3) and the second-order tensor P (steps of probability 1/2)]
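A simplified sketch of how a triangle-based second-order transition tensor could be assembled (the toy graph and the zero-column treatment of dangling states are my own simplifications, not the paper's exact construction):

```python
import numpy as np

# Toy undirected graph: one triangle {0, 1, 2} plus a pendant edge (2, 3).
n = 4
A = np.zeros((n, n))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

# Second-order transition tensor T[k, j, i]: probability of stepping to k
# given the walk is at j and came from i, uniform over the nodes k that
# close a triangle with the edge (i, j). Columns with no triangle-closing
# step are left all-zero here (a dangling state the real algorithm fixes).
T = np.zeros((n, n, n))
for i in range(n):
    for j in range(n):
        if A[i, j]:
            closers = [k for k in range(n) if A[j, k] and A[k, i]]
            for k in closers:
                T[k, j, i] = 1.0 / len(closers)

# From state (prev=0, cur=1) the only triangle-closing step is to node 2.
print(T[2, 1, 0])  # 1.0
```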
13. Representing the transition tensor
Idea: represent the tensor as a matrix, respecting the motif transitions of the data. Then we can compute eigenvectors.
Problem 1: even the stationary distribution of the second-order Markov chain requires O(n²) storage.
Problem 2: tensor eigenvectors are hard to compute.
14. Representing the transition tensor
Each slice P(:, :, k) of the transition tensor is a transition matrix, and any convex combination of these slices is also a transition matrix. Which combination should we use?
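A tiny check of why any convex combination of slices works (the 2×2 slices below are made-up numbers):

```python
import numpy as np

# Two column-stochastic slices of a hypothetical 2x2x2 transition tensor.
P0 = np.array([[0.5, 1.0],
               [0.5, 0.0]])
P1 = np.array([[0.2, 0.3],
               [0.8, 0.7]])

# A convex combination (weights nonnegative, summing to 1) of stochastic
# slices is again column-stochastic, so it defines an ordinary random walk.
x = np.array([0.6, 0.4])
Px = x[0] * P0 + x[1] * P1
print(Px.sum(axis=0))   # each column sums to 1
```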
15. Transition tensor → transition matrix
1. Compute the tensor PageRank vector x [Gleich+14].
2. Collapse back to a probability matrix: a convex combination of the slices P(:, :, k), weighted by the entries of x.
16. Theorem
Suppose there is a partition of the graph that
does not cut any of the motifs of interest. Then
the second left eigenvector of the matrix P[x]
properly partitions the graph.
17. Layered flow network
The network “flows” downward
Use directed 3-cycles to model flow:
Tensor spectral clustering: {0,1,2,3}, {4,5,6,7}, {8,9,10,11}
Standard spectral: {0,1,2,3,4,5,6,7}, {8,10,11}, {9}
18. Planted motif communities
Plant a group of 6 nodes with high motif frequency into a random graph.
Tensor spectral clustering: {0,1,2,3,4,5,12,13,16}
Standard spectral: {0,1,4,5,9,11,16,17,19,20}
20. Summary of results
1. New objective function: motif conductance.
2. The Tensor Spectral Clustering algorithm, a heuristic for minimizing motif conductance.
Input: different motifs and weights.
Output: a partition minimizing the number of motifs cut, corresponding to the weights.
More recent work: an algorithm with a Cheeger-like inequality for motif conductance.