Community Detection in Social Networks
A Brief Overview
Satyaki Sikdar
Heritage Institute of Technology, Kolkata
8 January 2016
Satyaki Sikdar Community Detection 8 January 2016 1 / 37
Introduction
Table of Contents
1 Introduction
About Me
Social Networks
Mathematical background
2 Motivation
3 The Hunt for Communities
4 The Need for Speed (and quality)
about me
Extremely lazy - I’ve been told
Working with social networks for the past 8 months under the supervision of Prof. Partha Basuchowdhuri
Conversant in Python, C++ and C - an average programmer at best
Vice Chair of Heritage Institute of Technology ACM Student Chapter
Networks
Networks are everywhere. They crop up wherever there are interactions between actors.
friendship networks
follower networks
neural networks
telecom networks
trade of goods and services
protein-protein interactions - drug design
citations and collaborations
power grid networks
predator-prey networks
Figure: Citation and Email networks
Figure: Telecommunication and Protein networks
Figure: Friendship and Les Misérables networks
High school relationship network
Nearly bipartite
One giant component and a lot of little ones
Almost no cycles, nearly tree-like - information / disease spreads fast
Network representation
Networks portray the interactions between different actors.
Actors or individuals are nodes/vertices in the graph
If there’s interaction between two nodes, there’s an edge/link between them
The links can have weights or intensities signifying the strength of connections
The links can be directed, as in the web graph: there’s a directed link from page A to page B if A contains a hyperlink to B
Degree and degree distribution
The degree of a node is the number of edges incident to it (in a directed graph, the outward edges give the out-degree)
The degree distribution of a network gives, for each degree k, the fraction of nodes that have degree k
Node Degree
1 3
2 2
3 4
4 2
5 3
6 3
7 3
8 2
9 2
10 2
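The degree distribution for the table above can be tabulated with a few lines of Python. This is a minimal sketch that assumes the graph is undirected; the helper name `degree_distribution` is illustrative and not from the slides.

```python
from collections import Counter

# Degrees copied from the table above (node -> degree).
degrees = {1: 3, 2: 2, 3: 4, 4: 2, 5: 3, 6: 3, 7: 3, 8: 2, 9: 2, 10: 2}

def degree_distribution(degrees):
    """Return {k: fraction of nodes whose degree is k}."""
    counts = Counter(degrees.values())
    n = len(degrees)
    return {k: counts[k] / n for k in sorted(counts)}

dist = degree_distribution(degrees)

# Assumption: the graph is undirected, so the degrees sum to 2m.
num_edges = sum(degrees.values()) // 2

print(dist)
print(num_edges)
```

Half of the nodes have degree 2, which is the kind of skew a degree distribution makes visible at a glance.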
Motivation
Table of Contents
1 Introduction
2 Motivation
What are they and why do we even care?
Communities!
Justification for the presence of communities
3 The Hunt for Communities
4 The Need for Speed (and quality)
Community Structure: An Informal Definition
In many real networks, the degree distribution follows a power law and is long-tailed
The distribution of edges is inhomogeneous
There are high concentrations of edges within special groups of vertices, and low concentrations between them. This feature of real networks is called community structure
Figure: Degree distributions of real-life networks
Why bother about communities?
Communities are groups of vertices which probably share common properties and/or play similar roles within the graph.
Society offers a wide variety of possible group organizations: families, working and friendship circles, villages, towns, nations.
Communities also occur in many networked systems from biology, computer science, engineering, economics, politics, etc.
In protein-protein interaction networks, communities are likely to group proteins having the same specific function within the cell
In the graph of the World Wide Web they may correspond to groups of pages dealing with the same or related topics
Applications of Community Detection
Clustering Web clients who have similar interests and are geographically near each other improves the performance of services
Identifying clusters of customers with similar interests in the purchase networks of online retailers makes it possible to set up efficient recommendation systems
Clusters of large graphs can be used to create data structures that store the graph data efficiently and handle navigational queries, like path searches
Allocation of tasks to processors in parallel computing. This can be accomplished by splitting the computer cluster into groups with roughly the same number of processors, such that the number of physical connections between processors of different groups is minimal.
A few real world examples
Figure: Zachary’s Karate Club
Figure: Collaboration network between scientists working at the Santa Fe Institute
An Empirical Justification
Figure: Add Health friendship data, coded by race: Blue = Black, Yellow = White, Red = Hispanic, Green = Asian, White = Other
Homophily: Birds of a feather stick together
There’s a visible bias in friendships
White students make up 52% of the population, yet 86% of their friendships are white-white
Black students make up 38%, yet 85% of their friendships are black-black
Hispanic students make up 5%, and only 2% of their friendships are Hispanic-Hispanic
Asymmetric behavior highlights homophily
Results in non-uniform edge distributions
Promotes the formation and maintenance of community structure
The Hunt for Communities
Table of Contents
1 Introduction
2 Motivation
3 The Hunt for Communities
Where to start?
Definitions
A naïve approach - NP hardness
Girvan-Newman Algorithm
Girvan-Newman in Action
Modularity
Louvain Method
Our method - methodical graph sparsification
Formalizing the problem
For a given graph G(V, E), find a cover $C = \{C_1, C_2, \ldots, C_k\}$ such that $\bigcup_i C_i = V$
For disjoint communities, $C_i \cap C_j = \emptyset \;\; \forall i \neq j$
For overlapping communities, $C_i \cap C_j \neq \emptyset$ for at least one pair $i \neq j$
Figure: Zachary’s Karate Club Network
$C = \{C_1, C_2, C_3\}$, with $C_1$ = yellow nodes, $C_2$ = green nodes and $C_3$ = blue nodes, is a disjoint cover
However, $\bar{C} = \{\bar{C}_1, \bar{C}_2\}$, with $\bar{C}_1$ = yellow & green nodes and $\bar{C}_2$ = blue & green nodes, is an overlapping cover
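The two conditions can be checked mechanically. The sketch below uses a hypothetical 6-vertex example rather than the karate club data, and the function names `is_cover` and `is_disjoint` are illustrative.

```python
from itertools import combinations

def is_cover(communities, vertices):
    """True if the union of the communities equals the vertex set."""
    return set().union(*communities) == set(vertices)

def is_disjoint(communities):
    """True if no two communities share a vertex."""
    return all(a.isdisjoint(b) for a, b in combinations(communities, 2))

# Hypothetical vertex set and covers, for illustration only.
V = range(1, 7)
disjoint_cover = [{1, 2}, {3, 4}, {5, 6}]
overlapping_cover = [{1, 2, 3, 4}, {3, 4, 5, 6}]  # vertices 3 and 4 are shared

print(is_cover(disjoint_cover, V), is_disjoint(disjoint_cover))
print(is_cover(overlapping_cover, V), is_disjoint(overlapping_cover))
```

The second cover mirrors the yellow & green / blue & green example: both communities contain the same middle group of vertices.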
A few more definitions
Figure: A simple graph with three communities. Intra-community edges are blue and inter-community ones are green
Let C be a community of a graph G(V, E) with $|C| = n_c$, $|V| = n$ and $|E| = m$. We define,
Average link density $\delta(G) = \frac{m}{n(n-1)/2}$
Intra-cluster density $\delta_{int}(C) = \frac{\#\text{internal edges of } C}{n_c(n_c-1)/2}$
Inter-cluster density $\delta_{ext}(C) = \frac{\#\text{inter-cluster edges of } C}{n_c(n-n_c)}$
For a good community, we expect $\delta_{int}(C) \gg \delta(G)$ and $\delta_{ext}(C) \ll \delta(G)$
We look to maximize $\sum_{C} \left(\delta_{int}(C) - \delta_{ext}(C)\right)$
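As a rough illustration of these three densities, the sketch below evaluates them on a hypothetical graph made of two triangles joined by one edge. The graph and the function names are assumptions for the example, not the figure from the slide, and the community is assumed to satisfy 2 <= n_c < n.

```python
def link_density(n, m):
    """Average link density: m / (n(n-1)/2)."""
    return m / (n * (n - 1) / 2)

def intra_density(edges, community):
    """Internal edges of the community over the n_c(n_c-1)/2 possible ones."""
    nc = len(community)
    internal = sum(1 for u, v in edges if u in community and v in community)
    return internal / (nc * (nc - 1) / 2)

def inter_density(edges, community, n):
    """Edges leaving the community over the n_c(n - n_c) possible ones."""
    nc = len(community)
    external = sum(1 for u, v in edges if (u in community) != (v in community))
    return external / (nc * (n - nc))

# Hypothetical graph: triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
C = {0, 1, 2}

d_avg = link_density(n, len(edges))
d_int = intra_density(edges, C)
d_ext = inter_density(edges, C, n)
print(d_int, d_avg, d_ext)
```

Here the triangle is a complete subgraph, so its intra-cluster density is 1, well above the average, while only one of the nine possible outgoing edges exists.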
A Naïve Approach
We have an objective function $f(\mathcal{C}) = \sum_{C \in \mathcal{C}} \left(\delta_{int}(C) - \delta_{ext}(C)\right)$
How do we find a good cover $\mathcal{C}$?
Exhaustive enumeration, or in simple words, brute force!
Try out all possible communities C of all possible sizes, and pick the set of communities $\mathcal{C}$ that maximizes $f(\mathcal{C})$
What’s the problem? Too many choices to pick from - needle in a haystack!
Even for small graphs, brute forcing becomes infeasible
Can we do better?
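To get a feel for how quickly the search space grows, one can count the candidate covers. The sketch below assumes disjoint communities, in which case every candidate is a partition of the vertex set and the count is the Bell number B(n); this framing and the helper `bell` are additions for illustration.

```python
def bell(n):
    """Number of ways to partition a set of n elements (Bell triangle)."""
    row = [1]
    for _ in range(1, n):
        new_row = [row[-1]]          # each row starts with the previous row's last entry
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
    return row[-1]

# Number of disjoint covers a brute-force search would have to score.
for n in (5, 10, 20, 34):
    print(n, bell(n))
```

Even the 34-node karate club graph is far beyond exhaustive search, and allowing overlapping communities only makes the count larger.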
A Little Background: Edge Betweenness Centrality
Betweenness centrality of an edge e is the sum, over all pairs of nodes, of the fraction of shortest paths that pass through e: $c_B(e) = \sum_{s,t \in V} \frac{\sigma(s, t \mid e)}{\sigma(s, t)}$, where $\sigma(s, t)$ is the number of shortest paths from s to t and $\sigma(s, t \mid e)$ is the number of shortest paths from s to t passing through the edge e
Top 6 edges
Edge cB(e) type
(10, 13) 0.3 inter
(3, 5) 0.23333 inter
(7, 15) 0.2079 inter
(1, 8) 0.1873 inter
(13, 15) 0.1746 intra
(5, 7) 0.1476 intra
Bottom 6 edges
Edge cB(e) type
(8, 11) 0.022 intra
(1, 2) 0.0269 intra
(9, 11) 0.031 intra
(8, 9) 0.0412 intra
(12, 15) 0.052 intra
(3, 4) 0.060 intra
The Girvan-Newman Algorithm
Proposed by Girvan and Newman in 2002, and improved in 2004.
Based on reachability of nodes - shortest paths
Edges are selected on the basis of their edge betweenness centrality
The algorithm
1 Compute the betweenness centrality of all edges
2 Remove the edge with the largest centrality; ties can be broken randomly
3 Recompute the centralities on the remaining graph
4 Iterate from step 2; stop when you get clusters of desirable quality
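The four steps can be sketched in plain Python for small unweighted, undirected graphs. Several choices here are assumptions rather than part of the slides: edge betweenness is computed with a BFS-based accumulation and left unnormalised, ties go to the first maximum instead of a random one, and "desirable quality" is replaced by a target number of components k.

```python
from collections import deque

def edge_betweenness(adj):
    """Unnormalised edge betweenness; each unordered pair {s, t} counts once."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist, sigma = {s: 0}, {v: 0 for v in adj}
        sigma[s] = 1
        preds, order, queue = {v: [] for v in adj}, [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Push path fractions back from the farthest nodes towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    # Every pair was seen from both endpoints, so halve the totals.
    return {e: b / 2.0 for e, b in bet.items()}

def components(adj):
    """Connected components as a list of sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman(edges, k):
    """Remove highest-betweenness edges until the graph has k components."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    removed = []
    while len(components(adj)) < k and any(adj.values()):
        bet = edge_betweenness(adj)       # steps 1 and 3: (re)compute centralities
        u, v = max(bet, key=bet.get)      # step 2: pick the most central edge
        adj[u].discard(v)
        adj[v].discard(u)
        removed.append((u, v))
    return components(adj), removed

# Hypothetical graph: two triangles joined by the bridge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
communities, removed = girvan_newman(edges, k=2)
print(communities, removed)
```

Recomputing betweenness after every removal is what makes the method slow on large graphs, which motivates the faster approaches that follow.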
Figure: Girvan-Newman in Action. (a) Best edge: (10, 13); (b) Best edge: (3, 5); (c) Best edge: (7, 15); (d) Best edge: (1, 8); (e) Best edge: (2, 11); (f) Final graph
Modularity
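For a given graph G(V, E), and a disjoint cover $C = \{C_1, C_2, \ldots, C_k\}$, let $A_{ij}$ be the adjacency matrix, $k_i$ the degree of node i, $c_i$ the community of node i, and $\delta(c_i, c_j) = 1$ if $c_i = c_j$ and 0 otherwise. We have,
the number of intra-community edges as $\frac{1}{2} \sum_{ij} A_{ij}\, \delta(c_i, c_j)$
the expected number of edges between all pairs of nodes in a community as $\frac{1}{2} \sum_{ij} \frac{k_i k_j}{2m}\, \delta(c_i, c_j)$
the difference of the actual and the expected values is $\frac{1}{2} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$
We define modularity $Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$, with $Q \in [-1/2, 1]$
The higher the modularity, the better the community structure.
The lower it is, the more random the edge distribution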
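The definition of Q translates directly into a double sum over node pairs. The following is a minimal, unoptimised sketch for unweighted, undirected graphs; the example graph (two triangles joined by one edge) and the function name are hypothetical.

```python
def modularity(edges, community_of):
    """Q = 1/(2m) * sum_ij [A_ij - k_i k_j / (2m)] * delta(c_i, c_j)."""
    nodes = sorted(community_of)
    m = len(edges)
    A = {(i, j): 0 for i in nodes for j in nodes}   # adjacency matrix
    k = {i: 0 for i in nodes}                       # degrees
    for u, v in edges:
        A[u, v] += 1
        A[v, u] += 1
        k[u] += 1
        k[v] += 1
    total = 0.0
    for i in nodes:
        for j in nodes:
            if community_of[i] == community_of[j]:  # delta(c_i, c_j) = 1
                total += A[i, j] - k[i] * k[j] / (2 * m)
    return total / (2 * m)

# Hypothetical graph: triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {v: "a" for v in range(6)}

q_split = modularity(edges, two_triangles)
q_whole = modularity(edges, one_block)
print(q_split, q_whole)
```

Splitting along the bridge scores 5/14, while putting every node in a single community scores 0, since the observed and expected edge counts then coincide.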
Louvain Method: A Greedy Approach
Proposed by Blondel et al. in 2008.
Takes the greedy maximization approach to modularity
Very fast in practice; at the time of this talk it was the state of the art in disjoint community detection.
Performs hierarchical partitioning, stopping when there cannot be any further improvement in modularity
Contracts the graph in each iteration, thereby speeding up the process
The Algorithm
1 Initially each node is in its own community
2 A sequential sweep over the nodes is performed. Given a node i, the gain in weighted modularity (∆Q) coming from putting i in the community of its neighbor j is computed. i is put in the community for which ∆Q is maximum, provided ∆Q > 0.
3 Communities are replaced by supernodes, and two supernodes are connected by an edge iff there’s at least one edge between vertices of the two communities.
4 The above two steps are repeated as long as ∆Q > 0
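Steps 1 and 2 (the local moving phase) can be sketched as follows. This is a deliberately naive version for unweighted graphs: it recomputes Q from scratch for every trial move instead of using an incremental ∆Q formula, and it leaves out the supernode contraction of step 3. The function names and the example graph are assumptions for illustration.

```python
def modularity(edges, comm):
    """Q as a sum over communities of L_c/m - (d_c / 2m)^2."""
    m = len(edges)
    inside, degree = {}, {}
    for u, v in edges:
        degree[comm[u]] = degree.get(comm[u], 0) + 1
        degree[comm[v]] = degree.get(comm[v], 0) + 1
        if comm[u] == comm[v]:
            inside[comm[u]] = inside.get(comm[u], 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

def local_moving(edges):
    """Sweep the nodes until no single move improves modularity."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    comm = {v: v for v in adj}                 # step 1: singleton communities
    improved = True
    while improved:
        improved = False
        for i in sorted(adj):                  # step 2: sequential sweep
            base = modularity(edges, comm)
            best_gain, best_c = 1e-12, comm[i]
            # Only communities of i's neighbours are candidates.
            for c in sorted({comm[j] for j in adj[i]} - {comm[i]}):
                trial = dict(comm)
                trial[i] = c
                gain = modularity(edges, trial) - base
                if gain > best_gain:
                    best_gain, best_c = gain, c
            if best_c != comm[i]:              # move only when the gain is positive
                comm[i] = best_c
                improved = True
    return comm

# Hypothetical graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
comm = local_moving(edges)
print(comm, modularity(edges, comm))
```

Because every accepted move strictly increases Q, the sweep is guaranteed to stop; a full implementation would then contract each community into a supernode and repeat.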
Figure: Louvain Method in Action
Figure: Belgian mobile phone network. The red nodes are French speakers and the green ones are Dutch speakers
Community Detection by Graph Sparsification
Proposed by Basuchowdhuri, Sikdar, Shreshtha, Majumder in 2015. Accepted at ACM CoDS 2016 as a full paper.
The input graph is methodically sparsified while preserving the community structure. A t-spanner is used for this purpose.
Louvain Method is applied on the reduced graph to obtain the clusters
Very fast in practice. Performance is comparable to Louvain Method both in terms of quality and modularity.
The Algorithm
1 Construct a t-spanner for the given network. Take the complement of the spanner in the original network
2 Form a cover using any fast community detection algorithm on the sparsified graph
3 Run Louvain method to refine the clusters
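The slides do not say how the t-spanner is built. Purely as an illustration of step 1, the sketch below uses the classic greedy construction for unweighted graphs: an edge is kept only if its endpoints are currently more than t hops apart in the spanner. This may differ from the construction used in the paper, and the function names are hypothetical.

```python
from collections import deque

def bounded_distance(adj, s, t, limit):
    """BFS hop distance from s to t, or None if it exceeds limit."""
    if s == t:
        return 0
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        if dist[v] == limit:          # do not explore beyond the hop limit
            continue
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                if w == t:
                    return dist[w]
                queue.append(w)
    return None

def greedy_spanner(edges, t):
    """Greedy t-spanner: skip an edge if a path of at most t hops already exists."""
    adj, spanner = {}, []
    for u, v in edges:
        if bounded_distance(adj, u, v, t) is None:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
            spanner.append((u, v))
    return spanner

# Hypothetical input: the complete graph on 4 nodes, with t = 3.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
spanner = greedy_spanner(edges, t=3)
# Complement of the spanner within the original edge set, as in step 1.
complement = [e for e in edges if e not in spanner]
print(spanner, complement)
```

On this input the spanner is a star centred on node 0, and every dropped edge is replaced by a 2-hop detour, which respects the stretch bound of 3.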
Figure: Original network. n = 115, m = 613
Figure: Sparsified network. n = 115, m = 137
Figure: Final network. n = 115, m = 137
The Need for Speed (and quality)
Table of Contents
1 Introduction
2 Motivation
3 The Hunt for Communities
4 The Need for Speed (and quality)
Performance comparison
Performance Comparison
                               Louvain Method      Our Algorithm
Dataset   n        m           Modularity  Time    t  Modularity  Time
Karate    34       78          0.415       0       7  0.589422    0.5
Dolphins  62       159         0.518       0       5  0.6744      0.53
Football  115      613         0.604       0       9  0.8627      0.69
Enron     33,696   180,811     0.596       0.38    3  0.855       13.13
DBLP      317,080  1,049,866   0.819       11      9  0.9589864   78.56
Wrapping Up
Social network analysis is a vibrant, dynamic field that spans sociology, economics, physics and biology, not just CS
Community detection is an active field of research.
Comparatively little work has been done on dynamic networks.
Thank you for listening!
Microteaching on terms used in filtration .Pharmaceutical EngineeringPrajakta Shinde
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptArshadWarsi13
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuinethapagita
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxMurugaveni B
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 

Recently uploaded (20)

Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024
 
Four Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptFour Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.ppt
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 

Community Detection in Social Networks: A Brief Overview

  • 1. Community Detection in Social Networks A Brief Overview Satyaki Sikdar Heritage Institute of Technology, Kolkata 8 January 2016 Satyaki Sikdar Community Detection 8 January 2016 1 / 37
  • 2. Table of Contents
      • 1 Introduction: About Me; Social Networks; Mathematical background
      • 2 Motivation
      • 3 The Hunt for Communities
      • 4 The Need for Speed (and quality)
  • 3. Introduction: About Me
      • Extremely lazy - or so I’ve been told
      • Working with social networks for the past 8 months under the supervision of Prof. Partha Basuchowdhuri
      • Conversant in Python, C++ and C - an average programmer at best
      • Vice Chair of the Heritage Institute of Technology ACM Student Chapter
  • 12. Introduction: Social Networks. Networks
      • Networks are everywhere. They crop up wherever there are interactions between actors.
      • friendship networks
      • follower networks
      • neural networks
      • telecom networks
      • trade of goods and services
      • protein-protein interactions - medicine design
      • citations and collaborations
      • power grid networks
      • predator-prey networks
  • 13. Introduction: Social Networks. Figure: Citation and Email networks
  • 14. Introduction: Social Networks. Figure: Telecommunication and Protein networks
  • 15. Introduction: Social Networks. Figure: Friendship and Les Misérables
  • 16. Introduction: Social Networks. High school relationship network
      • Nearly bipartite
      • One giant component and a lot of little ones
      • No cycles, almost tree-like - information / disease spreads fast
  • 17. Introduction: Mathematical background. Network representation
      • Networks portray the interactions between different actors.
      • Actors or individuals are nodes/vertices in the graph
      • If there’s interaction between two nodes, there’s an edge/link between them
      • The links can have weights or intensities signifying the strength of connections
      • The links can be directed, like in the web graph. There’s a directed link from node (page) A to node (page) B if there’s a hyperlink to B from A
  • 18. Introduction: Mathematical background. Degree and degree distribution
      • The degree of a node is the number of edges leaving that node (in an undirected graph, the number of edges incident on it)
      • The degree distribution of a network gives, for each degree k, the fraction of nodes that have degree k
      • Example:
          Node     1  2  3  4  5  6  7  8  9  10
          Degree   3  2  4  2  3  3  3  2  2  2
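The degree table on this slide can be turned into a degree distribution in a few lines. The sketch below is only illustrative: the function name `degree_distribution` is made up here, and the input is simply the node-to-degree table from the slide.

```python
from collections import Counter

# Node -> degree, copied from the table on the slide.
degrees = {1: 3, 2: 2, 3: 4, 4: 2, 5: 3, 6: 3, 7: 3, 8: 2, 9: 2, 10: 2}


def degree_distribution(degrees):
    """Map each degree k to the fraction of nodes that have degree k."""
    counts = Counter(degrees.values())
    n = len(degrees)
    return {k: counts[k] / n for k in sorted(counts)}


print(degree_distribution(degrees))
```

For this table, half of the nodes have degree 2, four out of ten have degree 3, and one has degree 4.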
  • 19. Table of Contents
      • 1 Introduction
      • 2 Motivation: What are they and why do we even care?; Communities!; Justification for the presence of communities
      • 3 The Hunt for Communities
      • 4 The Need for Speed (and quality)
  • 20. Motivation: What are they and why do we even care? Community Structure: An Informal Definition
      • The degree distribution follows a power law and is long-tailed
      • The distribution of edges is inhomogeneous
      • High concentrations of edges within special groups of vertices, and low concentrations between them. This feature of real networks is called community structure
  • 21. Motivation: What are they and why do we even care? Figure: Degree distributions of real life networks
  • 22. Motivation: Communities! Why bother about communities?
      • Communities are groups of vertices which probably share common properties and/or play similar roles within the graph.
      • Society offers a wide variety of possible group organizations: families, working and friendship circles, villages, towns, nations.
      • Communities also occur in many networked systems from biology, computer science, engineering, economics, politics, etc.
      • In protein-protein interaction networks, communities are likely to group proteins having the same specific function within the cell
      • In the graph of the World Wide Web they may correspond to groups of pages dealing with the same or related topics
  • 23. Motivation: Communities! Applications of Community Detection
      • Clustering Web clients who have similar interests and are geographically near to each other improves the performance of services
      • Identifying clusters of customers with similar interests in the purchase networks of online retailers makes it possible to set up efficient recommendation systems
      • Clusters of large graphs can be used to create data structures that store the graph data efficiently and handle navigational queries, like path searches
      • Allocation of tasks to processors in parallel computing. This can be accomplished by splitting the computer cluster into groups with roughly the same number of processors, such that the number of physical connections between processors of different groups is minimal.
  • 24. Motivation: Communities! A few real world examples
      • Figure: Zachary’s Karate Club
      • Figure: Collaboration network between scientists working at the Santa Fe Institute
  • 25. Motivation: Justification for the presence of communities. An Empirical Justification
      • Figure: Add Health friendship data, coded by race: Blue = Black, Yellow = White, Red = Hispanic, Green = Asian, White = Other
  • 29. Motivation: Justification for the presence of communities. Homophily: Birds of a feather stick together
      • There’s a visible bias in friendships
      • 52% white students, white-white friendships 86%
      • 38% black students, black-black friendships 85%
      • 5% Hispanics, Hispanic-Hispanic friendships 2%
      • Asymmetric behavior highlights homophily
      • Results in non-uniform edge distributions
      • Promotes the formation and maintenance of community structure
  • 30. Table of Contents
      • 1 Introduction
      • 2 Motivation
      • 3 The Hunt for Communities: Where to start?; Definitions; A naïve approach - NP hardness; Girvan-Newman Algorithm; Girvan-Newman in Action; Modularity; Louvain Method; Our method - methodical graph sparsification
      • 4 The Need for Speed (and quality)
  • 31. The Hunt for Communities: Where to start? Formalizing the problem
      • For a given graph $G(V, E)$, find a cover $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ such that $\bigcup_i C_i = V$
      • For disjoint communities, $C_i \cap C_j = \emptyset$ for all $i \neq j$
      • For overlapping communities, $C_i \cap C_j \neq \emptyset$ for some $i \neq j$
      • Figure: Zachary’s Karate Club Network
      • $\mathcal{C} = \{C_1, C_2, C_3\}$, with $C_1$ = yellow nodes, $C_2$ = green and $C_3$ = blue, is a disjoint cover
      • However, $\bar{\mathcal{C}} = \{\bar{C}_1, \bar{C}_2\}$, with $\bar{C}_1$ = yellow & green nodes and $\bar{C}_2$ = blue & green nodes, is an overlapping cover
  • 32. The Hunt for Communities: Definitions. A few more definitions
      • Figure: A simple graph with three communities. Intra-community edges are blue and inter-community ones are green
      • Let $C$ be a community of a graph $G(V, E)$ with $|C| = n_c$, $|V| = n$ and $|E| = m$. We define:
      • Average link density: $\delta(G) = \dfrac{m}{n(n-1)/2}$
      • Intra-cluster density: $\delta_{int}(C) = \dfrac{\#\,\text{internal edges of } C}{n_c(n_c-1)/2}$
      • Inter-cluster density: $\delta_{ext}(C) = \dfrac{\#\,\text{inter-cluster edges of } C}{n_c(n-n_c)}$
      • For a good community, we expect $\delta_{int}(C) \gg \delta(G)$ and $\delta_{ext}(C) \ll \delta(G)$
      • We look to maximize $\sum_{C} \left(\delta_{int}(C) - \delta_{ext}(C)\right)$
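As a rough illustration of the three densities, the sketch below evaluates them on a small made-up graph: two triangles joined by a single bridge edge. The graph, the adjacency-dictionary representation and the function name `densities` are assumptions for this example only, and the code assumes $1 < n_c < n$ so that no denominator is zero.

```python
def densities(adj, community):
    """Return (delta(G), delta_int(C), delta_ext(C)) for an undirected graph.

    adj maps each node to the set of its neighbours; community is a set of nodes.
    """
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge is seen twice
    c = set(community)
    nc = len(c)
    # Internal edges are seen from both endpoints, hence the division by 2.
    internal = sum(1 for u in c for v in adj[u] if v in c) // 2
    # Edges leaving the community are seen once, from the inside endpoint.
    external = sum(1 for u in c for v in adj[u] if v not in c)
    delta_g = m / (n * (n - 1) / 2)
    delta_int = internal / (nc * (nc - 1) / 2)
    delta_ext = external / (nc * (n - nc))
    return delta_g, delta_int, delta_ext


# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(densities(toy, {0, 1, 2}))
```

Here the triangle is fully connected internally (intra-cluster density 1), has a single outgoing edge out of nine possible ones, and the whole graph has 7 of 15 possible edges, so the triangle satisfies both inequalities on the slide.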
  • 33. The Hunt for Communities: A naïve approach - NP hardness. A Naïve Approach
      • We have an objective function $f(\mathcal{C}) = \sum_{C \in \mathcal{C}} \left(\delta_{int}(C) - \delta_{ext}(C)\right)$
      • How do we find a good $\mathcal{C}$? Exhaustive enumeration, or in simple words, brute force!
      • Try out all the possible communities $C$ of all possible sizes, and pick the best set of communities that maximizes $f(\mathcal{C})$
      • What’s the problem? Too many choices of $C$ to pick from - needle in a haystack!
      • Even for small graphs, brute forcing becomes infeasible
      • Can we do better?
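To get a feel for how quickly brute force blows up, one can count the candidate solutions. The sketch below assumes the disjoint case only, where every candidate cover is a partition of the node set, so the number of candidates for n nodes is the Bell number B(n). The function name `bell` is made up for this example; it uses the standard Bell triangle recurrence.

```python
def bell(n):
    """Number of ways to partition a set of n elements (Bell number B(n))."""
    row = [1]  # row 0 of the Bell triangle
    for _ in range(n):
        nxt = [row[-1]]  # each new row starts with the last entry of the previous one
        for x in row:
            nxt.append(nxt[-1] + x)
        row = nxt
    return row[0]


for n in (5, 10, 20, 34):
    print(n, bell(n))
```

Already at 10 nodes there are 115,975 partitions to score, and the count grows faster than exponentially from there.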
  • 34. The Hunt for Communities: Girvan-Newman Algorithm. A Little Background: Edge Betweenness Centrality
      • Betweenness centrality of an edge $e$ is the sum of the fraction of all-pairs shortest paths that pass through $e$: $c_B(e) = \sum_{s,t \in V} \dfrac{\sigma(s, t \mid e)}{\sigma(s, t)}$
      • Here $\sigma(s, t)$ is the number of shortest paths from $s$ to $t$, and $\sigma(s, t \mid e)$ is the number of shortest paths from $s$ to $t$ passing through the edge $e$
      • Top 6 edges:
          Edge       c_B(e)    Type
          (10, 13)   0.3       inter
          (3, 5)     0.23333   inter
          (7, 15)    0.2079    inter
          (1, 8)     0.1873    inter
          (13, 15)   0.1746    intra
          (5, 7)     0.1476    intra
      • Bottom 6 edges:
          Edge       c_B(e)    Type
          (8, 11)    0.022     intra
          (1, 2)     0.0269    intra
          (9, 11)    0.031     intra
          (8, 9)     0.0412    intra
          (12, 15)   0.052     intra
          (3, 4)     0.060     intra
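The definition above can be evaluated directly on a small graph. The sketch below is a deliberately simple, unoptimized reading of the formula: it runs a breadth-first search from every node to get shortest-path lengths and counts, then sums $\sigma(s, t \mid e) / \sigma(s, t)$ over unordered pairs. It assumes an unweighted, undirected graph with sortable node labels, and it returns raw sums rather than normalized values, so the numbers are not on the same scale as the table on the slide. The function names and the toy graph are made up for illustration.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, s):
    """Shortest-path lengths and shortest-path counts from s to reachable nodes."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u precedes v on some shortest path
                sigma[v] += sigma[u]
    return dist, sigma


def edge_betweenness(adj):
    """Unnormalized c_B(e), summed over unordered node pairs {s, t}."""
    info = {s: bfs_counts(adj, s) for s in adj}
    edges = {tuple(sorted((u, v))) for u in adj for v in adj[u]}
    cb = {e: 0.0 for e in edges}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue  # s and t lie in different components
        for u, v in edges:
            # The edge lies on a shortest s-t path in at most one orientation.
            for a, b in ((u, v), (v, u)):
                if a in dist_s and b in dist_t and dist_s[a] + 1 + dist_t[b] == dist_s[t]:
                    cb[(u, v)] += sigma_s[a] * sigma_t[b] / sigma_s[t]
    return cb


# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
for edge, score in sorted(edge_betweenness(toy).items(), key=lambda kv: -kv[1]):
    print(edge, score)
```

In this toy graph the bridge (2, 3) carries every shortest path between the two triangles, so it gets the highest score, which matches the pattern in the slide's table where inter-community edges rank at the top.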
  • 35. The Hunt for Communities: Girvan-Newman Algorithm. The Girvan-Newman Algorithm
      • Proposed by Girvan and Newman in 2002 and improved in 2004.
      • Based on reachability of nodes - shortest paths
      • Edges are selected on the basis of their edge betweenness centrality
      • The algorithm:
          1 Compute the betweenness centrality of all edges
          2 Remove the edge with the largest centrality; ties can be broken randomly
          3 Recalculate the centralities on the running graph
          4 Iterate from step 2; stop when you get clusters of desirable quality
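The four steps can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the stopping rule "clusters of desirable quality" is replaced by a target number of components `k`, ties are broken by dictionary order rather than randomly, and betweenness is accumulated per source in the style of Brandes' algorithm. All function names and the toy graph are assumptions for this example.

```python
from collections import deque


def edge_betweenness(adj):
    """Unnormalized edge betweenness via per-source dependency accumulation."""
    cb = {}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):  # farthest nodes first
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                edge = (min(v, w), max(v, w))
                cb[edge] = cb.get(edge, 0.0) + share
                delta[v] += share
    return {e: c / 2.0 for e, c in cb.items()}  # each pair was counted twice


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman(adj, k):
    """Remove highest-betweenness edges until the graph has k components."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    while len(components(adj)) < k and any(adj.values()):
        cb = edge_betweenness(adj)        # steps 1 and 3: (re)compute centralities
        u, v = max(cb, key=cb.get)        # step 2: pick the largest
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(girvan_newman(toy, 2))
```

On the toy graph a single removal (the bridge) is enough to split the two triangles. Recomputing all centralities after every removal is what makes the method slow on large graphs.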
  • 36. The Hunt for Communities: Girvan-Newman in Action
      • Figure panels: (a) Best edge: (10, 13); (b) Best edge: (3, 5); (c) Best edge: (7, 15); (d) Best edge: (1, 8); (e) Best edge: (2, 11); (f) Final graph
  • 37. The Hunt for Communities: Modularity
      • For a given graph $G(V, E)$ and a disjoint cover $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$, we have:
      • the number of intra-community edges as $\frac{1}{2} \sum_{ij} A_{ij}\, \delta(c_i, c_j)$
      • the expected number of edges between all pairs of nodes in a community as $\frac{1}{2} \sum_{ij} \frac{k_i k_j}{2m}\, \delta(c_i, c_j)$
      • the difference of the actual and the expected values as $\frac{1}{2} \sum_{ij} \left(A_{ij} - \frac{k_i k_j}{2m}\right) \delta(c_i, c_j)$
      • We define modularity $Q = \frac{1}{2m} \sum_{ij} \left(A_{ij} - \frac{k_i k_j}{2m}\right) \delta(c_i, c_j)$, with $Q \in [-1, 1]$
      • The higher the modularity, the better the community structure*. The lower it is, the more random the edge distribution
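The modularity formula translates almost term by term into code. The sketch below assumes an unweighted, undirected graph without self-loops, stored as an adjacency dictionary, and a `community` dictionary mapping each node to its community label $c_i$; these names and the toy graph are made up for the example. The double loop mirrors the double sum and is meant for readability, not speed.

```python
def modularity(adj, community):
    """Q = (1 / 2m) * sum_ij (A_ij - k_i * k_j / 2m) * delta(c_i, c_j)."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    total = 0.0
    for i in adj:
        for j in adj:
            if community[i] != community[j]:
                continue  # delta(c_i, c_j) = 0
            a_ij = 1.0 if j in adj[i] else 0.0
            total += a_ij - len(adj[i]) * len(adj[j]) / (2 * m)
    return total / (2 * m)


# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_blob = {node: "a" for node in toy}
print(modularity(toy, two_triangles), modularity(toy, one_blob))
```

Splitting the toy graph into its two triangles gives Q = 5/14, while putting every node in one community gives Q = 0, since the actual and expected edge counts then cancel exactly.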
  • 38. The Hunt for Communities: Louvain Method. Louvain Method: A Greedy Approach
      • Proposed by Blondel et al. in 2008. Takes the greedy maximization approach
      • Very fast in practice; it’s the current state of the art in disjoint community detection.
      • Performs hierarchical partitioning, stopping when there cannot be any further improvement in modularity
      • Contracts the graph in each iteration, thereby speeding up the process
  • 39. The Hunt for Communities: Louvain Method. The Algorithm
      1 Initially each node is in its own community
      2 A sequential sweep over the nodes is performed. Given a node $i$, the gain in weighted modularity ($\Delta Q$) coming from putting $i$ in the community of its neighbor $j$ is computed. $i$ is put in the community for which $\Delta Q$ is maximum (and $\Delta Q > 0$).
      3 Communities are replaced by supernodes, and two supernodes are connected by an edge iff there’s at least one edge between vertices of the two communities.
      4 The above two steps are repeated as long as $\Delta Q > 0$
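Steps 1 and 2 (the local moving phase) can be sketched as below. This is a simplified illustration rather than the reference implementation: it assumes an unweighted, undirected graph without self-loops, leaves out the supernode contraction of step 3, and compares candidate communities using the standard gain for inserting an isolated node $i$ into a community $c$, $\Delta Q = k_{i,c}/m - \Sigma_{tot}^{c}\, k_i / (2m^2)$, where $k_{i,c}$ is the number of edges from $i$ into $c$ and $\Sigma_{tot}^{c}$ is the total degree of $c$. The function name and the toy graph are assumptions.

```python
def louvain_local_moves(adj):
    """One Louvain local-moving phase; returns a node -> community label dict."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    comm = {u: u for u in adj}            # step 1: every node on its own
    tot = {u: len(adj[u]) for u in adj}   # total degree of each community
    improved = True
    while improved:                        # repeat sweeps until nothing moves
        improved = False
        for i in adj:                      # step 2: sequential sweep
            k_i = len(adj[i])
            old = comm[i]
            tot[old] -= k_i                # temporarily take i out of its community
            links = {}                     # edges from i into each neighbouring community
            for j in adj[i]:
                links[comm[j]] = links.get(comm[j], 0) + 1

            def gain(c):
                return links.get(c, 0) / m - tot[c] * k_i / (2 * m * m)

            best, best_gain = old, gain(old)
            for c in links:
                g = gain(c)
                if g > best_gain:          # move only on a strict improvement
                    best, best_gain = c, g
            tot[best] += k_i
            if best != old:
                comm[i] = best
                improved = True
    return comm


# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(louvain_local_moves(toy))
```

On the toy graph the sweeps settle on the two triangles as communities. A full Louvain run would then contract each community into a supernode and repeat the same phase on the smaller graph.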
  • 40. The Hunt for Communities: Louvain Method. Louvain Method in Action
  • 41. The Hunt for Communities: Louvain Method. Figure: Belgian mobile phone network. The red nodes are French speakers and the green ones are Dutch speakers
  • 42. The Hunt for Communities: Our method - methodical graph sparsification. Community Detection by Graph Sparsification
      • Proposed by Basuchowdhuri, Sikdar, Shreshtha, Majumder in 2015. Accepted at ACM CoDS 2016 as a full paper.
      • The input graph is methodically sparsified, preserving the community structure. A t-spanner is used for this purpose.
      • Louvain Method is applied on the reduced graph to obtain the clusters
      • Very fast in practice. Performance is comparable to Louvain Method both in terms of quality and modularity.
  • 43. The Hunt for Communities: Our method - methodical graph sparsification. The Algorithm
      1 Construct a t-spanner for the given network. Take the complement of the spanner in the original network
      2 Form a cover using any fast community detection algorithm on the sparsified graph
      3 Run the Louvain method to refine the clusters
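The slides do not say how the t-spanner is built, so the sketch below swaps in the classic greedy construction for unweighted graphs purely as an illustration: an edge (u, v) is kept only if u and v are not already within t hops of each other in the spanner built so far. It covers only the spanner construction in step 1, not the complement or the later steps, and it should not be read as the authors' procedure. The function names and the example graph are assumptions.

```python
from collections import deque


def within_distance(adj, s, t, limit):
    """True if t can be reached from s using at most `limit` edges."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        if dist[u] == limit:
            continue  # do not expand beyond the hop limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return False


def greedy_spanner(edges, t):
    """Greedy t-spanner of an unweighted graph given as a list of edges."""
    adj, kept = {}, []
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if not within_distance(adj, u, v, t):
            adj[u].add(v)
            adj[v].add(u)
            kept.append((u, v))
    return kept


# Hypothetical example: the complete graph on 4 nodes.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(greedy_spanner(k4, 3))
```

With t = 3 the six edges of the complete graph shrink to a three-edge star, since every dropped edge can be replaced by a two-hop detour. With t = 1 nothing can be dropped.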
  • 44. The Hunt for Communities: Our method - methodical graph sparsification
      • Figure: Original network. n = 115, m = 613
      • Figure: Sparsified network. n = 115, m = 137
      • Figure: Final network. n = 115, m = 137
  • 45. Table of Contents
      • 1 Introduction
      • 2 Motivation
      • 3 The Hunt for Communities
      • 4 The Need for Speed (and quality): Performance comparison
  • 46. The Need for Speed (and quality): Performance comparison. Performance Comparison
                                        Louvain Method          Our Algorithm
          Dataset    n         m           Modularity   Time    t   Modularity   Time
          Karate     34        78          0.415        0       7   0.589422     0.5
          Dolphins   62        159         0.518        0       5   0.6744       0.53
          Football   115       613         0.604        0       9   0.8627       0.69
          Enron      33,696    180,811     0.596        0.38    3   0.855        13.13
          DBLP       317,080   1,049,866   0.819        11      9   0.9589864    78.56
  • 47. The Need for Speed (and quality): Performance comparison. Wrapping Up
      • Social network analysis is a vibrant, dynamic field spanning sociology, economics, physics and biology, not just CS
      • Community detection is an active field of research.
      • Not much work has been done on dynamic networks.
  • 48. Thank you for listening!