Jia Wang, James Cheng
The Chinese University of Hong Kong
Definition
• Support of an edge e in H: the number of triangles in H that contain e
• k-truss of G: the largest subgraph H of G such that each edge in H has support ≥ k-2 in H
• Truss decomposition: find all k-trusses of graph G
• Truss number of an edge e: the maximum k such that e is in the k-truss
[Figure: the example graph on vertices a to l, shown four times with its 2-truss, 3-truss, 4-truss and 5-truss]
kmax = 5
truss-number(fj) = 4
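The definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple undirected graph given as a list of vertex pairs; the function names and the toy graph are hypothetical and are not the graph shown on the slide.

```python
def build_adjacency(edges):
    """Adjacency sets of a simple undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def support(adj, u, v):
    """Number of triangles containing edge (u, v) = common neighbours of u and v."""
    return len(adj[u] & adj[v])


def k_truss(edges, k):
    """Edge set of the k-truss: drop edges with support < k-2 until none is left."""
    adj = build_adjacency(edges)
    remaining = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for e in list(remaining):
            u, v = tuple(e)
            if support(adj, u, v) < k - 2:
                remaining.discard(e)
                adj[u].discard(v)
                adj[v].discard(u)
                changed = True
    return remaining


def truss_number(edges, e):
    """Largest k such that edge e (assumed to be in the graph) is in the k-truss."""
    k = 2
    while frozenset(e) in k_truss(edges, k + 1):
        k += 1
    return k


# Hypothetical toy graph: a 4-clique {a,b,c,d}, a triangle {d,e,f}, a pendant edge (f,g).
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("f", "g")]

print(len(k_truss(EDGES, 3)), len(k_truss(EDGES, 4)), len(k_truss(EDGES, 5)))  # 9 6 0
print(truss_number(EDGES, ("d", "e")))  # 3
```

In this toy graph the pendant edge has truss number 2, the triangle edges 3 and the clique edges 4, so kmax = 4.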
Advantages
• Notions of cohesive subgraphs
  • n-clique, n-clan, k-plex, n-club: NP-hard
  • lambda set: O(n^4)
  • k-core: O(m+n), a “seedbed”
    (the largest subgraph H of graph G such that every vertex in H is connected to at least k vertices within H)
  • k-truss: O(m^1.5), refines the k-core, stronger ties
Advantages
[Figure: a sample network on vertices 1 to 21, its kmax-core (kmax = 3) and its kmax-truss (kmax = 4)]
• Example
  • Redundancy removed
  • Smaller and denser
  • Higher clustering coefficient (0.65 vs. 0.80)
k-truss vs. k-core
• T: the maximum truss
• C: the maximum core
• V, E: number of vertices and edges (of T or C)
• CC: (global) clustering coefficient

Dataset   V_T/V_C    E_T/E_C     CC_T/CC_C
Amazon    5K/33K     55K/442K    0.99/0.72
Wiki      237/700    32K/147K    0.64/0.42
Skitter   185/222    16K/33K     0.95/0.71
Blog      49/387     2K/54K      1.00/0.52
LJ        383/395    146K/155K   1.00/0.99
BTC       653/1295   10K/838K    0.45/0.00002
Web       498/862    82K/148K    1.00/0.59

• The maximum truss is refined and denser, and in some cases a maximum clique!
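For reference, the global clustering coefficient used in the comparison is the fraction of connected vertex triples that are closed into a triangle. A minimal sketch, assuming a simple undirected graph given as vertex pairs (the function name and the toy graphs are hypothetical):

```python
from itertools import combinations


def global_clustering_coefficient(edges):
    """Closed connected triples divided by all connected triples."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    closed = 0   # triples whose two end vertices are adjacent (3 per triangle)
    triples = 0  # all paths of length two, counted once per centre vertex
    for centre, neighbours in adj.items():
        for x, y in combinations(neighbours, 2):
            triples += 1
            if y in adj[x]:
                closed += 1
    return closed / triples if triples else 0.0


# A 4-clique: every triple is closed, so the coefficient is 1.0.
CLIQUE = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")]
# The same clique with a triangle {d,e,f} and a pendant edge (f,g) attached.
MIXED = CLIQUE + [("d", "e"), ("d", "f"), ("e", "f"), ("f", "g")]

print(global_clustering_coefficient(CLIQUE))  # 1.0
print(global_clustering_coefficient(MIXED))   # 15/23, about 0.65
```

A value of 1.00 in the table therefore indicates that every connected triple in the subgraph is closed, as in a clique.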
Outline
• In-memory Algorithms
  • Existing algorithm
  • Improved algorithm
• I/O-Efficient Algorithms
  • Bottom-up approach
  • Top-down approach
• Experimental Results
• Conclusion
In-memory Algorithm
• Support computation by triangle listing
• Bottom-up (from small to large k)
  • Remove edges with support less than k-2, update the supports of the other edges
  • Repeat on the remaining graph
[Figure: bottom-up peeling of the example graph]
  • Remove isolated vertices: 2-truss
  • Remove edges with support less than 1: 3-truss
  • Remove edges with support less than 2: 4-truss
  • Remove edges with support less than 3: 5-truss
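The bottom-up peeling can be sketched as follows. This is a straightforward illustration that assigns a truss number to every edge, assuming a simple undirected graph; it is not the improved O(|E|^1.5) algorithm of the talk, and the names and toy graph are hypothetical.

```python
def truss_decomposition(edges):
    """Map each edge (as a frozenset) to its truss number by bottom-up peeling."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Initial supports: number of common neighbours of the two endpoints.
    sup = {frozenset((u, v)): len(adj[u] & adj[v]) for u, v in edges}
    truss = {}
    k = 2
    while sup:
        # Edges that cannot be in the (k+1)-truss have support < (k+1) - 2.
        queue = [e for e, s in sup.items() if s < k - 1]
        while queue:
            e = queue.pop()
            if e not in sup:
                continue  # already removed via an earlier queue entry
            u, v = tuple(e)
            del sup[e]
            truss[e] = k
            # Each triangle (u, v, w) disappears: update its two other edges.
            for w in adj[u] & adj[v]:
                for other in (frozenset((u, w)), frozenset((v, w))):
                    sup[other] -= 1
                    if sup[other] < k - 1:
                        queue.append(other)
            adj[u].discard(v)
            adj[v].discard(u)
        k += 1
    return truss


# Hypothetical toy graph: a 4-clique {a,b,c,d}, a triangle {d,e,f}, a pendant edge (f,g).
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("f", "g")]

for edge, number in sorted((sorted(e), n) for e, n in truss_decomposition(EDGES).items()):
    print(edge, number)
```

The k-truss is then the set of edges whose truss number is at least k.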
In-memory Algorithm
• Complexity
  • State-of-the-art: O(Σ_v deg(v)^2)
  • Improved: O(|E|^1.5), the cost of triangle listing, by updating supports carefully
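The gap between the two bounds can be large when the degree distribution is skewed. A small arithmetic sketch, using a star graph as a hypothetical extreme case (the example and the function name are illustrative and not from the slides); it compares only the quantities inside the O(...) terms, not measured running times.

```python
def bounds_on_star(leaves):
    """Return (sum of squared degrees, |E|^1.5) for a star with the given number of leaves."""
    m = leaves                            # one edge per leaf
    sum_deg_sq = leaves ** 2 + leaves     # centre has degree `leaves`, each leaf has degree 1
    return sum_deg_sq, m ** 1.5


sum_deg_sq, m_pow = bounds_on_star(10_000)
print(sum_deg_sq)            # 100010000
print(sum_deg_sq / m_pow)    # roughly a factor of 100
```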
Weakness
• Rapidly growing networks cannot fit in memory
• Disk-resident graph: random access of the graph, high I/O cost
• Options
  • Parallel: hard
  • MapReduce: inefficient
  • I/O-efficient: most efficient
Graph Partitioning
• Partition the graph into a set of memory-resident subgraphs
[Figure: the example graph on vertices a to l split into partitions]
Lower-bounding Truss Number
• Truss decomposition locally, within each subgraph
  • The local truss number serves as a lower bound
[Figure: local truss decomposition of the partitions of the example graph, with the resulting lower bounds (3, 5, 4, 2)]
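The lower bound holds because a k-truss of a subgraph is also a subgraph of G in which every edge has support at least k-2, so it is contained in the k-truss of G. A minimal sketch that checks this on a toy graph, assuming a deliberately naive decomposition routine (the names, the graph and the partition are hypothetical):

```python
def truss_numbers(edges):
    """Naive truss numbers: for k = 2, 3, ... compute the (k+1)-truss by fixpoint."""
    edges = {frozenset(e) for e in edges}
    truss, k = {}, 2
    while edges:
        survivors = set(edges)
        while True:
            adj = {}
            for e in survivors:
                u, v = tuple(e)
                adj.setdefault(u, set()).add(v)
                adj.setdefault(v, set()).add(u)
            # Edges with support < (k+1) - 2 cannot be in the (k+1)-truss.
            weak = {e for e in survivors
                    if len(set.intersection(*(adj[x] for x in e))) < k - 1}
            if not weak:
                break
            survivors -= weak
        for e in edges - survivors:
            truss[e] = k
        edges, k = survivors, k + 1
    return truss


# Hypothetical toy graph: a 4-clique {a,b,c,d}, a triangle {d,e,f}, a pendant edge (f,g).
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("f", "g")]
PART = {"a", "b", "c", "d", "e"}  # one hypothetical partition
LOCAL_EDGES = [(u, v) for u, v in EDGES if u in PART and v in PART]

global_truss = truss_numbers(EDGES)
local_truss = truss_numbers(LOCAL_EDGES)
for e, lower in local_truss.items():
    print(sorted(e), "lower bound", lower, "truss number", global_truss[e])
```

Edge (d,e) loses its only triangle when vertex f falls outside the partition, so its local truss number 2 is a strict lower bound on its global truss number 3.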
Truss Decomposition
• Bottom-up decomposition, from small to large k
  • Only on a small candidate subgraph H: induced by the vertices connected to edges with lower bound ≤ k
  • On H: remove the edges with (global) truss number ≤ k
    • These edges are added to the k-truss of G
• O(kmax) scans of the graph
  • (assuming the candidate graph fits in main memory)
Truss Decomposition
• Example (lower bounds from the local decompositions: 3, 5, 4)
  • k=3: candidate subgraph on vertices e, d, l, f, g, h, k
    • Recursively remove edges with support less than 2, i.e. edges that cannot be in the 4-truss
    • Removed edges have truss number 3
  • k=4: candidate subgraph on vertices a, b, c, e, d, f, h, i, j
    • Recursively remove edges with support less than 3, i.e. edges that cannot be in the 5-truss
    • Removed edges have truss number 4
  • k=5: candidate subgraph on vertices a, b, c, e, d
    • Recursively remove edges with support less than 4, i.e. edges that cannot be in the 6-truss
    • Removed edges have truss number 5; no edges remain
[Figure: the example graph, its partitions with their lower bounds, and the candidate subgraph at each step]
Top-down Approach
• Many applications want only the top-t k-trusses
  • The core of the network
  • Small k implies a sparse subgraph
• Top-down approach for ONLY the top-t k-trusses
  • Avoids wasting time on sparser trusses
Overview
• Partition the graph into a set of memory-resident subgraphs
• Upper-bound the truss number locally using neighborhood information
  • Repeat for multiple rounds to refine the bounds
• Top-down truss decomposition
  • Compute k-trusses in descending order of k on a candidate subgraph H
  • H: induced by the vertices connected to edges with upper bound ≥ k
Compute k-truss Top-down
• O(t) scans of the graph
  • (assuming the candidate graph fits in main memory)
• Example: top-1 k-truss (kmax = 5)
[Figure: the example graph and the candidate subgraph on vertices a, b, c, e, d]
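The top-down idea for the top-1 truss can be sketched in memory as follows. This is a simplified illustration under stated assumptions: it uses only the initial upper bound support + 2, takes the candidate subgraph to be the edges whose bound is at least k (rather than a vertex-induced subgraph read from disk), and all names and toy graphs are hypothetical.

```python
def peel(edge_set, k):
    """k-truss of the graph formed by edge_set (edges are 2-element frozensets)."""
    adj = {}
    for e in edge_set:
        u, v = tuple(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    alive = set(edge_set)
    changed = True
    while changed:
        changed = False
        for e in list(alive):
            u, v = tuple(e)
            if len(adj[u] & adj[v]) < k - 2:
                alive.discard(e)
                adj[u].discard(v)
                adj[v].discard(u)
                changed = True
    return alive


def top_truss(edges):
    """Return (kmax, edges of the kmax-truss) for a non-empty simple graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Initial upper bound on the truss number of each edge: support in G plus 2.
    upper = {frozenset((u, v)): len(adj[u] & adj[v]) + 2 for u, v in edges}
    k = max(upper.values())
    while True:
        # Every edge of the k-truss has upper bound >= k, so the candidates suffice.
        candidate = {e for e, bound in upper.items() if bound >= k}
        truss = peel(candidate, k)
        if truss:
            return k, truss
        k -= 1  # terminates at k = 2 at the latest, where no edge is peeled


# Hypothetical "book" graph: edge (u,v) shared by three triangles.
BOOK = [("u", "v")] + [(x, w) for w in ("w1", "w2", "w3") for x in ("u", "v")]
print(top_truss(BOOK)[0])  # 3, although the initial bound of (u,v) is 5
```

Starting from the largest upper bound means the sparse low-k trusses are never computed, which is the saving the top-down approach aims for.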
Setting
• Disk-resident datasets

Dataset   |VG|    |EG|     Disk size
Amazon    0.4M    3.4M     47.9M
Wiki      2.4M    5.0M     66.5M
Skitter   1.7M    11.0M    149.1M
Blog      1.0M    12.8M    177.2M
LJ        4.8M    69M      809.1M
BTC       165M    773M     10.0G
Web       106M    1092M    12.2G

• Intel Core2 Duo 2.80GHz CPU
• 4GB RAM, Ubuntu 11.04
In-memory
• TD-inmem (state-of-the-art algorithm)
• TD-inmem+ (our algorithm)
• Time (in seconds)

Dataset       Wiki   Skitter   Amazon   Blog
TD-inmem      8856   9204      68       1261
TD-inmem+     121    281       31       361
Speed ratio   73.2   32.8      2.2      3.5

• Great speed-up
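The speed ratios are simply the TD-inmem time divided by the TD-inmem+ time, rounded to one decimal. A quick check of the arithmetic, using the timings from the table (the variable names are illustrative):

```python
# (TD-inmem seconds, TD-inmem+ seconds) per dataset, copied from the table.
TIMES = {
    "Wiki": (8856, 121),
    "Skitter": (9204, 281),
    "Amazon": (68, 31),
    "Blog": (1261, 361),
}

RATIOS = {name: round(old / new, 1) for name, (old, new) in TIMES.items()}
print(RATIOS)
```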
On Massive Networks
• Time (in seconds)

Algorithm           LJ    Web    BTC
Top-down (top-20)   149   2354   1744
Bottom-up           664   6317   1768
Speed ratio         4.5   2.7    1.0

• Significant savings
Conclusion
• Improved in-memory algorithm
  • Better theoretical complexity
• Two I/O-efficient algorithms
  • Bottom-up approach: truss decomposition
  • Top-down approach: top-t k-trusses
• Experiments
  • Efficiency of the proposed algorithms
  • Usefulness of k-trusses, e.g., clustering coefficients
Thank you!
Compare with MapReduce Approach
• Datasets

Dataset   |VG|    |EG|     Disk size
P2P       6.3K    41.6K    237K
HEP       9.9K    52.0K    317K

• Time (in seconds)

Algorithm   P2P    HEP     LJ    BTC    Web
Bottom-up   < 1    < 1     664   1768   6314
MapReduce   4200   14700   -     -      -
Upper-bound truss number
• Initial bound
  • Support (in G): the truss number of e is at most support(e) + 2
• Further bound
  • Suppose edge e = (u,v)
  • If fewer than x edges with support at least x are connected to u, then e cannot be in the (x+2)-truss (same for v)
• Example: edge (e,d)
[Figure: the example graph on vertices a to l, illustrating the bound for edge (e,d)]
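Both bounds can be sketched in Python. The refinement is computed here as an h-index-style value per vertex (the largest x such that at least x incident edges have support at least x); this is one way to apply the rule above, assuming a simple undirected graph, and the names and the toy graph are hypothetical.

```python
def upper_bounds(edges):
    """Return (initial, refined) upper bounds on the truss number of every edge."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    sup = {frozenset((u, v)): len(adj[u] & adj[v]) for u, v in edges}

    def h(u):
        # Largest x such that at least x edges at u have support >= x.
        values = sorted((sup[frozenset((u, w))] for w in adj[u]), reverse=True)
        x = 0
        while x < len(values) and values[x] >= x + 1:
            x += 1
        return x

    hval = {u: h(u) for u in adj}
    # Initial bound: an edge in the k-truss has support >= k-2 in G.
    initial = {e: s + 2 for e, s in sup.items()}
    # Further bound: with fewer than x such edges at u or v, e is not in the (x+2)-truss.
    refined = {e: min(s, *(hval[x] for x in e)) + 2 for e, s in sup.items()}
    return initial, refined


# Hypothetical "book" graph: edge (u,v) shared by three triangles.
BOOK = [("u", "v")] + [(x, w) for w in ("w1", "w2", "w3") for x in ("u", "v")]
initial, refined = upper_bounds(BOOK)
spine = frozenset(("u", "v"))
print(initial[spine], refined[spine])  # 5 3
```

In this toy graph the refined bound of 3 for edge (u,v) is tight: every other edge has support 1, so the graph has no 4-truss.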


Editor's notes

  1. James, from…
  2. Definition of Support? No 6-truss Show how support can be used as upper bound Decomposition by computing truss number
  3. NP-hard to compute Complexity of k-core
  4. k-core compared to sample network
  5. Triangle listing algorithm
  6. smoother
  7. How?
  8. What’s decomposition? Recall that truss decomposition is to … Equivalently, since the k-truss consists of , we may instead compute …
  9. external
  10. Reference Blue colors indicate..
  11. Initialize k to the maximum upper-bound
  12. Show other edges have support less than 4