centrality measures
Survey and comparisons
Authors: Antonio Esposito
Emanuele Pesce
Supervisors: Prof. Vincenzo Auletta
Ph.D Diodato Ferraioli
April 2015
University of Salerno, Department of Computer Science
outline
Introduction
Centrality measures
Geometric measures
Path-based measures
Spectral measures
Effectiveness of centrality measures
Axioms for centrality
Information retrieval
Conclusions
introduction
centrality of a network
What is a centrality measure?
∙ Given a network, centrality is a quantitative measure that aims
to reveal the importance of a node
∙ The more central a node is, the more important it is
∙ Formally, a centrality measure is a real-valued function on the
nodes of a graph
What do we mean by center?
∙ There are many intuitive ideas about what a center is, so there are
many different centrality measures
definition of center
The center of a star is at the same time:
∙ the node with largest degree
∙ the node that is closest to the other nodes
∙ the node through which most shortest paths pass
∙ the node with the largest number of incoming paths
∙ the node with the largest score in the dominant eigenvector of the graph
matrix
Several centrality indices
∙ Different centrality indices capture different properties of a
network
centrality: some applications
Centrality is often used to determine:
∙ how influential a person is in a social network
∙ how heavily used a road is in a transportation network
∙ how important a web page is
∙ how important a room is in a building
centrality measures
centrality measures
Geometric measures
∙ Indegree
∙ Closeness
∙ Harmonic
∙ Lin’s Index
Path-based measures
∙ Betweenness
Spectral measures
∙ The left dominant eigenvector
∙ Seeley’s index
∙ Katz’s index
∙ PageRank
∙ HITS
∙ SALSA
different centrality measures
Example of different centrality measures applied to the same
network
geometric measures
The idea
∙ In geometric measures the importance is a function of distances.
∙ A geometric centrality depends on how many nodes exist at every
distance
geometric measures: indegree centrality
∙ Indegree centrality is defined as the number of incoming arcs of a
node x:
$$C_{\text{indegree}}(x) = d^{-}(x) \tag{1}$$
∙ The node with the highest indegree is the most important
When to use it?
∙ To identify people whom you can talk to
∙ To identify people who will do favors for you
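As a minimal sketch of equation (1), the snippet below counts incoming arcs from a list of (source, target) pairs. The function name `indegree_centrality` and the toy network are hypothetical, chosen only for illustration.

```python
from collections import Counter

def indegree_centrality(arcs):
    """Return {node: number of incoming arcs} for a list of (source, target) arcs."""
    nodes = {node for arc in arcs for node in arc}
    incoming = Counter(target for _, target in arcs)
    # Counter returns 0 for nodes that never appear as a target.
    return {node: incoming[node] for node in nodes}

# Hypothetical toy network: three nodes point at "hub", which points back at "a".
arcs = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "a")]
scores = indegree_centrality(arcs)
print(scores["hub"], scores["a"], scores["b"])  # 3 1 0
```

Because only the arcs entering a node are counted, the score says nothing about where the node sits in the wider network.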
indegree centrality: examples
Indegree measure applied on different networks
indegree centrality: examples
Indegree centrality can be deceiving because it is a local measure
Indegree centrality does not work well for:
∙ detecting nodes that act as brokers between two groups
∙ predicting whether a piece of information reaches a node
geometric measures: closeness centrality
∙ Closeness centrality of x is defined by:
$$C_{\text{closeness}}(x) = \frac{1}{\sum_{d(y,x)<\infty} d(y,x)} \tag{2}$$
∙ To normalize closeness centrality, multiply it by the maximum number of other nodes, n − 1
∙ Nodes with an empty coreachable set have centrality 0
∙ The closer a node is to all the others, the more important it is
When to use it?
∙ To identify people who tend to be very influential within their local
network
∙ They may often not be public figures, but they are often respected locally
∙ To measure how long it will take to spread information from node x to all other
nodes
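Equation (2) can be sketched with a breadth-first search over reversed arcs, which yields d(y, x) for every y that can reach x. This is an illustrative sketch for unweighted directed graphs; the adjacency-dict representation and the names `distances_to` and `closeness` are assumptions, not part of the slides.

```python
from collections import deque

def distances_to(graph, x):
    """BFS on reversed arcs: return {y: d(y, x)} for every y that can reach x."""
    reverse = {node: [] for node in graph}
    for source, targets in graph.items():
        for target in targets:
            reverse[target].append(source)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        node = queue.popleft()
        for pred in reverse[node]:
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return dist

def closeness(graph, x):
    """Reciprocal of the summed distances into x; 0 if no other node reaches x."""
    total = sum(distances_to(graph, x).values())  # d(x, x) = 0 adds nothing
    return 1 / total if total > 0 else 0.0

# Hypothetical directed path a -> b -> c; every node must appear as a key.
graph = {"a": ["b"], "b": ["c"], "c": []}
for node in graph:
    print(node, closeness(graph, node))
```

On this path, c is reached from b (distance 1) and a (distance 2), so its closeness is 1/3, while a has an empty coreachable set and scores 0.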
closeness centrality: example
Closeness measure applied to different networks
geometric measures: harmonic centrality
∙ Harmonic centrality of x, with the convention $\infty^{-1} = 0$, is defined
by:
$$C_{\text{harmonic}}(x) = \sum_{y \neq x} \frac{1}{d(y,x)} \tag{3}$$
∙ It is correlated with closeness centrality in simple networks, but it
also accounts for nodes y that cannot reach x
When to use it?
∙ In the same situations as closeness, but it can also be applied to graphs that
are not connected
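A minimal sketch of equation (3) follows, assuming an unweighted directed graph stored as an adjacency dict (a hypothetical representation). Nodes that cannot reach x never enter the BFS result, which realizes the convention that an infinite distance contributes 0.

```python
from collections import deque

def harmonic(graph, x):
    """Sum of 1/d(y, x) over y != x; nodes that cannot reach x contribute 0."""
    reverse = {node: [] for node in graph}
    for source, targets in graph.items():
        for target in targets:
            reverse[target].append(source)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        node = queue.popleft()
        for pred in reverse[node]:
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return sum(1 / d for y, d in dist.items() if y != x)

# Hypothetical graph with two components: a -> b -> c, plus d <-> e.
graph = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": ["d"]}
print(harmonic(graph, "c"))  # 1.5
print(harmonic(graph, "e"))  # 1.0
```

The graph is disconnected, yet every node receives a score that reflects the size and nearness of the set of nodes that can reach it.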
harmonic centrality: examples
Harmonic and indegree measures applied to the same network
(Zachary’s karate club)
lin’s index
∙ Lin’s index of x is defined by:
$$C_{\text{lin}}(x) = \frac{|\{y \mid d(y,x) < \infty\}|^{2}}{\sum_{d(y,x)<\infty} d(y,x)} \tag{4}$$
∙ Like closeness, but here nodes with a larger coreachable set are
more important
A fact
∙ Surprisingly, Lin’s index has been largely ignored in the literature, even though it
seems to provide a reasonable solution for detecting centers in
networks
path-based measures
The idea
∙ Path-based measures exploit not only the existence of shortest
paths but actually examine all shortest paths (or all
paths) coming into a node
path-based measures: betweenness centrality
∙ The intuition behind betweenness centrality is to measure the
probability that a random shortest path passes through a given
node. Betweenness of x is defined as:
$$C_{\text{betweenness}}(x) = \sum_{y,z \neq x,\ \alpha_{yz} \neq 0} \frac{\alpha_{yz}(x)}{\alpha_{yz}} \tag{5}$$
∙ $\alpha_{yz}$ is the number of shortest paths going from y to z
∙ $\alpha_{yz}(x)$ is the number of shortest paths from y to z that pass through x
∙ The higher the fraction of shortest paths that pass through
a node, the more important the node is
When to use it?
∙ To identify nodes that have a large influence on the transfer of
items through the network
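Equation (5) can be illustrated with a deliberately simple sketch: run a BFS from every node to count shortest paths, then use the fact that a shortest path from y to z passes through x exactly when d(y, x) + d(x, z) = d(y, z). This is a brute-force illustration for small unweighted graphs, not an efficient algorithm; the function names and the toy graph are hypothetical.

```python
from collections import deque

def shortest_path_counts(graph, source):
    """BFS from source: return (dist, count), count[v] = number of shortest paths source -> v."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for succ in graph[node]:
            if succ not in dist:
                dist[succ] = dist[node] + 1
                count[succ] = 0
                queue.append(succ)
            if dist[succ] == dist[node] + 1:
                count[succ] += count[node]
    return dist, count

def betweenness(graph, x):
    """Sum over pairs (y, z), both different from x, of alpha_yz(x) / alpha_yz."""
    info = {y: shortest_path_counts(graph, y) for y in graph}
    dist_x, count_x = info[x]
    total = 0.0
    for y in graph:
        dist_y, count_y = info[y]
        if y == x or x not in dist_y:
            continue
        for z in graph:
            if z in (x, y) or z not in dist_y or z not in dist_x:
                continue
            if dist_y[x] + dist_x[z] == dist_y[z]:
                # Paths through x = (paths y -> x) * (paths x -> z).
                total += count_y[x] * count_x[z] / count_y[z]
    return total

# Hypothetical diamond: two shortest paths from s to t, one via a and one via b.
graph = {"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": []}
print(betweenness(graph, "a"))  # 0.5
```

Node a lies on one of the two shortest paths from s to t, so it collects half of that pair's weight.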
betweenness centrality: examples
Betweenness applied to different networks
betweenness and indegree
Betweenness and indegree measures applied to the same network
(Zachary’s karate club)
betweenness and closeness
∙ Betweenness and closeness measures applied to the same
network
∙ The nodes are sized by degree and colored by betweenness
spectral measures
The idea
∙ In spectral measures, importance is related to the iterated
computation of the left dominant eigenvector of the adjacency
matrix.
∙ In spectral centrality, the importance of a node is given by the
importance of its neighbourhood
∙ The more important the nodes pointing at you are, the more
important you are
spectral measures
How many of them?
∙ The dominant eigenvector
∙ Seeley’s index
∙ Katz’s index
∙ PageRank
∙ HITS
∙ SALSA
spectral measures: some useful notation
Given the adjacency matrix A we can compute:
∙ The ℓ1-normalized matrix $\bar{A}$
∙ Each element of row i is divided by the sum of the elements of that row
∙ The symmetric graph $G'$ of the given graph G
∙ The transpose $A^{T}$ of the adjacency matrix A
∙ The number of paths of length k from a node i to another node j
∙ $A^{k}$: in this matrix, each element $a_{ij}$ is the number of paths of
length k from node i to node j
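The row normalization, transpose, and matrix power listed above can be sketched on a small matrix stored as a list of lists. The helper names and the three-node example are hypothetical; leaving all-zero rows as zeros is an assumption of this sketch.

```python
def row_normalize(A):
    """Divide each row by its sum (all-zero rows are left as zeros)."""
    return [[a / sum(row) if sum(row) else 0.0 for a in row] for row in A]

def transpose(A):
    """Swap rows and columns, i.e. reverse every arc."""
    return [list(column) for column in zip(*A)]

def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*B)] for row in A]

def matrix_power(A, k):
    """A multiplied by itself k times; entry (i, j) counts paths of length k from i to j."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

# Hypothetical graph with arcs 0->1, 0->2, 1->2, 2->0.
A = [[0, 1, 1],
     [0, 0, 1],
     [1, 0, 0]]
print(row_normalize(A))    # [[0.0, 0.5, 0.5], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
print(matrix_power(A, 2))  # [[1, 0, 1], [1, 0, 0], [0, 1, 1]]
```

For instance, the entry in row 0, column 2 of the squared matrix is 1 because 0->1->2 is the only path of length 2 from node 0 to node 2.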
spectral measures: the left dominant eigenvector
Dominant eigenvector
∙ Taking the left dominant eigenvector into consideration means considering the
incoming edges of a node.
∙ To find a node’s importance, we iterate the computation:
$$x_i^{(t+1)} = \frac{1}{\lambda} \sum_{j=1}^{n} A_{ji}\, x_j^{(t)} \tag{6}$$
where:
∙ $x_i^{(0)} = 1 \;\forall i$ at step 0
∙ $x^{(t)}$ is the score vector after t iterations
∙ λ is the dominant eigenvalue of the adjacency matrix A
∙ After each step, the vector x is normalized, and the process is iterated until convergence
∙ Each node starts with the same score. Then, at each iteration, it receives the sum of the
scores of the neighbors pointing at it
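The iteration described on this slide can be sketched as a power iteration. In this sketch the score vector is rescaled to sum to 1 after every step instead of being divided by a known λ, which is an implementation choice; the example graph is hypothetical and was picked to be strongly connected and aperiodic so that the iteration converges.

```python
def left_dominant_eigenvector(A, iterations=200):
    """Power iteration x <- x A, rescaled so that the scores sum to 1."""
    n = len(A)
    x = [1.0] * n  # every node starts with the same score
    for _ in range(iterations):
        # Node i collects the scores of the nodes j that point at it (A[j][i] = 1).
        x = [sum(A[j][i] * x[j] for j in range(n)) for i in range(n)]
        total = sum(x)
        x = [value / total for value in x]
    return x

# Hypothetical graph with arcs 0->1, 0->2, 1->2, 2->0.
A = [[0, 1, 1],
     [0, 0, 1],
     [1, 0, 0]]
scores = left_dominant_eigenvector(A)
print(scores)
```

Node 2 ends up with the highest score because it is pointed at by both other nodes, and node 0 outranks node 1 because its single in-neighbor (node 2) is more important than node 1's (node 0).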
eigenvector centrality: example
Figure 1 shows degree and eigenvector centrality applied to the same
graph
Figure 1: Degree and eigenvector centrality
spectral measures: seeley’s index
∙ Why give away all of our importance?
∙ It would make more sense to divide our importance equally among our successors
∙ The process remains the same, but from an algebraic point of view this means
normalizing each row of the adjacency matrix:
$$x_i^{(t+1)} = \frac{1}{\lambda} \sum_{j=1}^{n} \bar{A}_{ji}\, x_j^{(t)} \tag{7}$$
where:
∙ $x_i^{(0)} = 1 \;\forall i$ at step 0
∙ $x^{(t)}$ is the score vector after t iterations
∙ λ is the dominant eigenvalue of the normalized matrix $\bar{A}$
∙ $\bar{A}$ is the row-normalized form of the adjacency matrix
∙ In a graph that is not strongly connected, isolated nodes end up with a null score as the
iterations proceed
spectral measures: katz’s index
Katz’s index weighs all incoming paths to a node and then computes:
$$x = \mathbf{1} \sum_{k=0}^{\infty} \beta^{k} A^{k} \tag{8}$$
where:
∙ x is the output score vector
∙ $\mathbf{1}$ is the weight vector (for example, all ones)
∙ β is an attenuation factor ($\beta < 1/\lambda$); the term $\beta^{k}$ discounts paths of length k
∙ $A^{k}$ contains, in its generic element $a_{ij}$, the number of paths of length k
from i to j
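Equation (8) can be approximated by truncating the series after a fixed number of terms. The sketch below does that with plain lists; the function name `katz`, the truncation length, and the small acyclic example are assumptions made for illustration.

```python
def katz(A, beta, terms=200):
    """Truncated Katz series: the row vector 1 times the sum of (beta * A)^k."""
    n = len(A)
    term = [1.0] * n  # k = 0 term: the all-ones weight vector
    x = term[:]
    for _ in range(terms):
        # Multiply the previous term by beta * A to obtain the next one.
        term = [beta * sum(term[j] * A[j][i] for j in range(n)) for i in range(n)]
        x = [xi + ti for xi, ti in zip(x, term)]
    return x

# Hypothetical directed acyclic graph with arcs 0->1, 0->2, 1->2.
A = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
print(katz(A, beta=0.5))  # [1.0, 1.5, 2.25]
```

Node 2 scores 1 + 0.5 * 2 + 0.25 * 1: its own unit weight, two incoming paths of length 1, and one incoming path of length 2 (0->1->2). On an acyclic graph the series stops by itself once k exceeds the longest path.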
spectral measures: pagerank
PageRank - a little overview
∙ It is reputed to be the basis of how Google’s search engine ranks pages
∙ It is the unique vector p satisfying
$$p = (1 - \alpha)\, v\, (I - \alpha \bar{A})^{-1}$$
∙ where:
∙ α ∈ [0, 1) is a damping factor
∙ v is a preference vector (a distribution)
∙ $\bar{A}$ is the ℓ1-normalized adjacency matrix
∙ I is the identity matrix
∙ Since $(I - \alpha \bar{A})^{-1} = \sum_{k \geq 0} \alpha^{k} \bar{A}^{k}$, PageRank and Katz’s index differ by a constant factor
and by the ℓ1 normalization of the adjacency matrix A
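The PageRank vector defined above is the fixed point of the update p <- α p Ā + (1 − α) v, which gives a simple way to approximate it without inverting a matrix. The sketch below assumes a uniform preference vector, α = 0.85, and a graph in which every node has at least one outgoing arc; all three are illustrative assumptions rather than choices stated on the slides.

```python
def pagerank(A, alpha=0.85, iterations=200):
    """Iterate p <- alpha * p * A_bar + (1 - alpha) * v until (approximately) fixed."""
    n = len(A)
    # Row-normalize A; this sketch assumes no row is all zeros.
    A_bar = [[a / sum(row) for a in row] for row in A]
    v = [1.0 / n] * n  # uniform preference vector (an assumption)
    p = v[:]
    for _ in range(iterations):
        p = [alpha * sum(p[j] * A_bar[j][i] for j in range(n)) + (1 - alpha) * v[i]
             for i in range(n)]
    return p

# Hypothetical graph with arcs 0->1, 0->2, 1->2, 2->0.
A = [[0, 1, 1],
     [0, 0, 1],
     [1, 0, 0]]
p = pagerank(A)
print(p)
```

At the fixed point p (I − α Ā) = (1 − α) v, which is the closed form on the slide. The scores remain a probability distribution throughout because Ā is row-stochastic and v sums to 1.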
spectral measures: eigenvector and pagerank
Figure 2 shows eigenvector and PageRank centrality applied to the same
graph
Figure 2: Eigenvector and PageRank centrality
spectral measures: hits
HITS (by Kleinberg) - a little overview
∙ The key here is mutual reinforcement
∙ A node (such as a page) is authoritative if it is pointed to by many
good hubs
∙ Hubs: pages containing good lists of authoritative pages
∙ Thus a hub is good if it points to many authoritative pages
∙ We iteratively compute:
∙ $a_i$: authoritativeness score (where $a_0 = \mathbf{1}$)
∙ $h_i$: hubbiness score
as follows:
$$h_{i+1} = a_i A^{T}$$
$$a_{i+1} = h_{i+1} A$$
∙ This process converges to the left dominant eigenvector of the
matrix $A^{T}A$, giving the final authoritativeness score, called “HITS”
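The two update rules above can be sketched as follows. Rescaling both vectors to sum to 1 after each round is an implementation choice of this sketch (the slide does not specify a normalization), and the tiny hub/authority graph is hypothetical. The sketch assumes the graph has at least one arc.

```python
def hits(A, iterations=100):
    """Alternate h = a A^T and a = h A, rescaling both vectors to sum to 1."""
    n = len(A)
    a = [1.0] * n
    h = [0.0] * n
    for _ in range(iterations):
        # Hub score of j: sum of the authority scores of the nodes j points to.
        h = [sum(a[k] * A[j][k] for k in range(n)) for j in range(n)]
        # Authority score of k: sum of the hub scores of the nodes pointing at k.
        a = [sum(h[j] * A[j][k] for j in range(n)) for k in range(n)]
        h_total, a_total = sum(h), sum(a)
        h = [value / h_total for value in h]
        a = [value / a_total for value in a]
    return a, h

# Hypothetical graph: hubs 0 and 1 point at authorities 2 and 3 (arcs 0->2, 0->3, 1->2).
A = [[0, 0, 1, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
authority, hub = hits(A)
print(authority)
print(hub)
```

Node 2 is the strongest authority because both hubs point at it, and node 0 is the stronger hub because it points at both authorities.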
spectral measures: salsa
SALSA was devised by Lempel and Moran
∙ Based on the same mutual reinforcement between
authoritativeness and hubbiness, but ℓ1-normalizing the matrices A
and $A^{T}$.
∙ Starting value: $a_0 = \mathbf{1}$
∙ $h_{i+1} = a_i \overline{A^{T}}$
∙ $a_{i+1} = h_{i+1} \bar{A}$
∙ Unlike HITS, SALSA does not need a large number of iterations
spectral measures: some applications
∙ Left dominant eigenvector: the idea on which network structure
analysis is based
∙ Seeley’s index: feedback networks
∙ Katz’s index: citation networks
∙ especially good with directed acyclic graphs (where the basic dominant
eigenvector does not perform well)
∙ HITS: web page citations
∙ PageRank: Google’s search engine
∙ SALSA: link structure analysis
effectiveness of centrality measures
axioms for centrality
∙ In 2013, Boldi and Vigna proposed a method to evaluate and
compare different centrality measures
∙ They defined three axioms that an index should satisfy to behave
predictably
∙ Size axiom
∙ Density axiom
∙ Score-monotonicity axiom
axioms for centrality: size axiom
Given a graph $S_{k,p}$ (figure 3), made of a k-clique and a directed
p-cycle, the size axiom is satisfied if there are threshold values
of p and k such that:
∙ for p sufficiently larger than k (the cycle is very large), the nodes of the cycle are more
important
∙ for k sufficiently larger than p, the nodes of the clique are more important
∙ intuitively, for p = k, the nodes of the clique are more important
Figure 3: Graph $S_{k,p}$
axioms for centrality: density axiom
∙ Consider a graph $D_{k,p}$ (figure 4), made of a k-clique and a directed
p-cycle connected by a bidirectional bridge x ↔ y, where x is a
node of the clique and y is a node of the cycle.
∙ A centrality measure satisfies the density axiom if, for k = p, the
centrality of x is strictly larger than the centrality of y.
Figure 4: Graph $D_{k,p}$
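The density axiom can be checked numerically for one measure and one graph size. The sketch below builds the clique-plus-cycle graph with k = p = 5 and compares the harmonic centrality of the two bridge endpoints. A single instance illustrates the axiom but does not prove it; the builder `build_d` and its node labels are hypothetical.

```python
from collections import deque

def harmonic(graph, x):
    """Harmonic centrality of x: sum of 1/d(y, x) over all y != x that can reach x."""
    reverse = {node: [] for node in graph}
    for source, targets in graph.items():
        for target in targets:
            reverse[target].append(source)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        node = queue.popleft()
        for pred in reverse[node]:
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return sum(1 / d for y, d in dist.items() if y != x)

def build_d(k, p):
    """k-clique plus directed p-cycle, joined by a bidirectional bridge x <-> y."""
    clique = [("clique", i) for i in range(k)]
    cycle = [("cycle", j) for j in range(p)]
    graph = {u: [v for v in clique if v != u] for u in clique}
    for j, u in enumerate(cycle):
        graph[u] = [cycle[(j + 1) % p]]  # each cycle node points at the next one
    x, y = clique[0], cycle[0]
    graph[x].append(y)
    graph[y].append(x)
    return graph, x, y

graph, x, y = build_d(5, 5)
hx, hy = harmonic(graph, x), harmonic(graph, y)
print(hx > hy)  # True
```

Here x is one step away from the four other clique nodes, whereas y sees them only through x at distance 2, which is what tips the comparison in favor of the clique endpoint.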
axioms for centrality: the score-monotonicity axiom
∙ A centrality measure satisfies the score-monotonicity axiom if, for
every graph G and every pair of nodes x, y such that x ↛ y, adding
the arc x → y to G increases the centrality of y.
axioms for centrality: comparisons
Figure 5: For each centrality measure and each axiom, whether the axiom is
satisfied
Harmonic centrality satisfies all three axioms.
information retrieval: sanity check
∙ Boldi and Vigna applied centrality measures to standard
datasets in order to study the behavior of the different indices
∙ There are standard datasets with associated queries and ground
truth about which documents are relevant for every query
∙ Those collections are typically used to compare the merits and
demerits of retrieval methods
information retrieval: datasets
Dataset GOV2, tested in two different ways:
∙ with all links: the complete dataset
∙ with inter-host links only: links between pages of the same host
are excluded from the graph
Measures of effectiveness chosen:
∙ P@10: precision at 10, the fraction of relevant documents among
the first ten retrieved
∙ NDCG@10: normalized discounted cumulative gain at 10, which measures the
usefulness, or gain, of a document based on its position in the
result list
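Both effectiveness measures can be sketched from a ranked list of relevance grades. The formulas below follow a common textbook formulation (gain divided by the base-2 logarithm of rank + 1, with the ideal ordering taken from the same list); whether the study cited on these slides used exactly this variant is an assumption, and the function names are hypothetical.

```python
import math

def precision_at_k(relevances, k=10):
    """Fraction of the top-k results that are relevant (grade > 0)."""
    return sum(1 for grade in relevances[:k] if grade > 0) / k

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain: grades discounted by log2(rank + 1)."""
    return sum(grade / math.log2(rank + 1)
               for rank, grade in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k=10):
    """DCG divided by the DCG of the same grades sorted in the ideal order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical ranking: relevance grades of the first ten retrieved documents.
ranking = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(precision_at_k(ranking))  # 0.2
print(ndcg_at_k(ranking))
```

P@10 ignores where in the top ten the relevant documents appear, whereas NDCG@10 rewards rankings that place them earlier.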
information retrieval: results
For each centrality measure, NDCG@10 and P@10 on the GOV2
dataset, using all links (on the left) and using only inter-host links (on the right).
Figure 6: All links
Figure 7: Inter-host links
conclusions
conclusions
∙ A very simple measure such as harmonic centrality turned out to be a
good notion of centrality.
∙ it satisfies all the centrality axioms proposed
∙ it works well for information retrieval
Choose the right measure
∙ No centrality measure is better than the others in every situation
∙ Some are better than others for a particular goal, but this
depends on the specific application domain
∙ So, the best approach is to understand which measure fits the
problem best
references and useful resources
Paolo Boldi and Sebastiano Vigna
Axioms for centrality.
Nicola Perra and Santo Fortunato
Spectral centrality measures in complex networks.
M. E. J. Newman
Networks: an introduction
Thank you for your attention!
Network centrality measures and their effectiveness

  • 1. centrality measures Survey and comparisons Authors: Antonio Esposito Emanuele Pesce Supervisors: Prof. Vincenzo Auletta Ph.D Diodato Ferraioli Aprile 2015 University of Salerno, deparment of computer science 0
  • 2. outline Introduction Centrality measures Geometric measures Path-based measures Spectral measures Effectiveness of centrality measures Axioms for centrality Information retrieval Conclusions 1
  • 4. centrality of a network What is a centrality measure? ∙ Given a network, the centrality is a quantitative measure which aims at reveling the importance of a node ∙ The more a node is centered, the more it is important ∙ Formally, a centrality measure is a real valued function on the nodes of a graph What do you mean by center? ∙ There are many intuitive ideas about what a center is, so there are many different centrality measures 3
  • 5. definition of center The center of a star is at the same time: ∙ the node with largest degree ∙ the node that is closest to the other nodes ∙ the node through which most shortest paths pass ∙ the node with the largest number of incoming paths ∙ the node that maximize the dominant eigenvector of the graph matrix Several centrality indices ∙ Different centrality indices capture different properties of a network 4
  • 6. centrality: some applications Centrality is used often for detecting: ∙ how influential a person is in a social network? ∙ how well used a road is in a transportation network? ∙ how important a web page is? ∙ how important a room is in a building? 5
  • 8. centrality measures Geometric measures ∙ Indegree ∙ Closeness ∙ Harmonic ∙ Lin’s Index Path-based measures ∙ Betweeness Spectral measures ∙ The left dominant eigenvector ∙ Seeley’s index ∙ Katz’s index ∙ PageRank ∙ HITS ∙ SALSA 7
  • 9. different centrality measures Example of different centrality measures applied to the same network
  • 10. geometric measures The idea ∙ In geometric measures the importance is a function of distances. ∙ A geometric centrality depends on how many nodes exist at every distance
  • 11. geometric measures: indegree centrality ∙ Indegree centrality is defined as the number of incoming arcs of a node x: $C_{\mathrm{indegree}}(x) = d^{-}(x)$ (1) ∙ The node with the highest indegree is the most important When to use it? ∙ To identify people whom you can talk to ∙ To identify people who will do favors for you
  • 12. indegree centrality: examples Indegree measure applied to different networks
  • 13. indegree centrality: examples Indegree centrality can be deceiving because it is a local measure. Indegree centrality does not work well for: ∙ detecting nodes that act as brokers between two groups ∙ predicting whether a piece of information reaches a node
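As a minimal sketch of the indegree definition in slide 11, the snippet below counts incoming arcs from an edge list. The function name `indegree_centrality` and the toy graph are illustrative assumptions, not part of the original slides.

```python
from collections import Counter

def indegree_centrality(nodes, arcs):
    """Return {node: number of incoming arcs}; arcs are (source, target) pairs."""
    counts = Counter(target for _, target in arcs)
    return {x: counts[x] for x in nodes}

# Hypothetical toy graph: three arcs point at "c", and "c" points back at "a".
nodes = ["a", "b", "c", "d"]
arcs = [("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")]
scores = indegree_centrality(nodes, arcs)
```

Because the score only looks at direct arcs, it says nothing about how far a node is from the rest of the graph, which is the locality issue raised in slide 13.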
  • 14. geometric measures: closeness centrality ∙ Closeness centrality of x is defined by: $C_{\mathrm{closeness}}(x) = \frac{1}{\sum_{d(y,x)<\infty} d(y,x)}$ (2) ∙ To normalize it, divide the sum of distances by the maximum number of other nodes (n − 1), which is the same as multiplying the score by (n − 1) ∙ Nodes with an empty coreachable set have centrality 0 ∙ The closer a node is to all others, the more important it is When to use it? ∙ To identify people who tend to be very influential within their local network ∙ They may often not be public figures, but they are often respected locally ∙ To measure how long it will take to spread information from node x to all other nodes
  • 15. closeness centrality: example Closeness measure applied to different networks
  • 16. geometric measures: harmonic centrality ∙ Harmonic centrality of x, with the convention $\infty^{-1} = 0$, is defined by: $C_{\mathrm{harmonic}}(x) = \sum_{y \neq x} \frac{1}{d(y,x)}$ (3) ∙ It is correlated with closeness centrality in simple networks, but it also accounts for nodes y that cannot reach x When to use it? ∙ In the same cases as closeness, but it can also be applied to graphs that are not connected
  • 17. harmonic centrality: examples Harmonic and indegree measures applied to the same network (Zachary’s karate club)
  • 18. lin’s index ∙ Lin’s index of x: $C_{\mathrm{lin}}(x) = \frac{|\{y \mid d(y,x) < \infty\}|^{2}}{\sum_{d(y,x)<\infty} d(y,x)}$ (4) ∙ Like closeness, but here nodes with a larger coreachable set are more important A fact ∙ Surprisingly, Lin’s index was ignored in the literature, even though it seems to provide a reasonable solution for detecting centers in networks
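The three distance-based measures in slides 14 to 18 can be sketched together, since they all rely on the distances d(y, x) towards x. The code below is an illustration under stated assumptions: the helper names are hypothetical, distances come from a breadth-first search on reversed arcs, and a node that nobody can reach gets Lin's index 1 (a convention assumed here; the slides do not state it).

```python
from collections import deque

def distances_to(x, nodes, arcs):
    """BFS on reversed arcs: {y: d(y, x)} for every y that can reach x (x included)."""
    preds = {v: [] for v in nodes}
    for u, v in arcs:
        preds[v].append(u)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        for u in preds[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def geometric_centralities(x, nodes, arcs):
    """Return (closeness, harmonic, lin) for node x."""
    dist = distances_to(x, nodes, arcs)
    total = sum(dist.values())  # d(x, x) = 0 adds nothing to the sum
    closeness = 1 / total if total > 0 else 0.0
    # Unreachable nodes are simply absent, matching the convention 1/infinity = 0.
    harmonic = sum(1 / d for y, d in dist.items() if y != x)
    # Assumption: score 1 when only x itself is in the coreachable set.
    lin = len(dist) ** 2 / total if total > 0 else 1.0
    return closeness, harmonic, lin

# Hypothetical toy graph: a directed path a -> b -> c and an isolated node d.
nodes = ["a", "b", "c", "d"]
arcs = [("a", "b"), ("b", "c")]
closeness_c, harmonic_c, lin_c = geometric_centralities("c", nodes, arcs)
closeness_d, harmonic_d, lin_d = geometric_centralities("d", nodes, arcs)
```

On this graph, closeness ignores the fact that d cannot reach c at all, while the harmonic score is a plain sum in which d contributes 0.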
  • 19. path-based measures The idea ∙ Path-based measures exploit not only the existence of shortest paths but actually take into account all shortest paths (or all paths) coming into a node
  • 20. path-based measures: betweenness centrality ∙ The intuition behind betweenness centrality is to measure the probability that a random shortest path passes through a given node. Betweenness of x is defined as: $C_{\mathrm{betweenness}}(x) = \sum_{y,z \neq x,\ \alpha_{yz} \neq 0} \frac{\alpha_{yz}(x)}{\alpha_{yz}}$ (5) ∙ $\alpha_{yz}$ is the number of shortest paths going from y to z ∙ $\alpha_{yz}(x)$ is the number of shortest paths from y to z that pass through x ∙ The higher the fraction of shortest paths that pass through a node, the more important the node is When to use it? ∙ To identify nodes which have a large influence on the transfer of items through the network
  • 21. betweenness centrality: examples Betweenness applied to different networks
  • 22. betweenness and indegree Betweenness and indegree measures applied to the same network (Zachary’s karate club)
  • 23. betweenness and closeness ∙ Betweenness and closeness measures applied to the same network ∙ The nodes are sized by degree and colored by betweenness
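The betweenness formula of slide 20 can be sketched by counting shortest paths with a breadth-first search from every node. This is a simple illustration that favors readability over speed, not an optimized algorithm; it uses the fact that the number of shortest y to z paths through x is the number of shortest y to x paths times the number of shortest x to z paths, whenever d(y, x) + d(x, z) = d(y, z). Function names and the toy graph are assumptions.

```python
from collections import deque

def bfs_counts(source, succ):
    """Return (dist, sigma): shortest-path lengths and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in succ[v]:
            if w not in dist:          # first time w is reached
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:  # v precedes w on a shortest path
                sigma[w] += sigma[v]
    return dist, sigma

def betweenness(nodes, arcs):
    succ = {v: [] for v in nodes}
    for u, v in arcs:
        succ[u].append(v)
    info = {s: bfs_counts(s, succ) for s in nodes}
    score = {x: 0.0 for x in nodes}
    for y in nodes:
        dist_y, sigma_y = info[y]
        for z in dist_y:                # only pairs with at least one shortest path
            if z == y:
                continue
            for x in nodes:
                if x in (y, z) or x not in dist_y:
                    continue
                dist_x, sigma_x = info[x]
                if z in dist_x and dist_y[x] + dist_x[z] == dist_y[z]:
                    score[x] += sigma_y[x] * sigma_x[z] / sigma_y[z]
    return score

# Hypothetical toy graph: a "diamond" with two shortest paths from a to d.
nodes = ["a", "b", "c", "d"]
arcs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
scores = betweenness(nodes, arcs)
```

In the diamond, b and c each lie on one of the two shortest paths from a to d, so each collects a fraction of 1/2.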
  • 24. spectral measures The idea ∙ In spectral measures the importance is related to the iterated computation of the left dominant eigenvector of the adjacency matrix. ∙ In spectral centrality the importance of a node is given by the importance of its neighbourhood ∙ The more important the nodes pointing at you are, the more important you are
  • 25. spectral measures How many of them? ∙ The dominant eigenvector ∙ Seeley’s index ∙ Katz’s index ∙ PageRank ∙ HITS ∙ SALSA
  • 26. spectral measures: some useful notation Given the adjacency matrix A we can compute: ∙ The ℓ1-normalized matrix $\bar{A}$ ∙ Each element of row i is divided by the sum of the elements of that row ∙ The symmetric graph G′ of the given graph G ∙ The transpose $A^{T}$ of the adjacency matrix A ∙ The number of paths of length k from a node i to another node j ∙ $A^{k}$: in this matrix, each element $a_{ij}$ is the number of paths of length k from node i to node j
  • 27. spectral measures: the left dominant eigenvector Dominant eigenvector ∙ Taking the left dominant eigenvector into consideration means considering the incoming edges of a node. ∙ To find out a node’s importance, we perform an iterated computation of: $x_i^{t+1} = \frac{1}{\lambda} \sum_{j=1}^{n} A_{ji}\, x_j^{t}$ (6) where: ∙ $x_i^{0} = 1\ \forall i$ at step 0 ∙ $x^{t}$ is the score vector after t iterations ∙ λ is the dominant eigenvalue of the adjacency matrix A ∙ After each step, the vector x is normalized and the process is iterated until convergence ∙ Each node starts with the same score. Then, at each iteration, it receives the sum of the scores of the neighbours pointing to it
  • 28. eigenvector centrality: example Figure 1 shows degree and eigenvector centrality applied to the same graph. Figure 1: Degree and eigenvector centrality
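The iteration described in slide 27 can be sketched as a plain power iteration. One assumption differs from the slide: instead of dividing by a known λ, the sketch rescales the vector by its largest entry at every step, which also yields an estimate of λ. The function name and the example graph (a triangle 0, 1, 2 with a pendant node 3 attached to node 2) are illustrative.

```python
def left_dominant_eigenvector(A, iterations=200):
    """Power iteration for x A = lambda x, starting from the all-ones vector."""
    n = len(A)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        # Node i receives the scores of the nodes j that point to it (A[j][i] = 1).
        new = [sum(A[j][i] * x[j] for j in range(n)) for i in range(n)]
        lam = max(new)
        if lam == 0:          # no arcs at all: every score collapses to 0
            return new, lam
        x = [value / lam for value in new]
    return x, lam

# Hypothetical undirected graph stored as a symmetric 0/1 matrix.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
x, lam = left_dominant_eigenvector(A)
```

Node 2 ends up with the largest score and the pendant node 3 with the smallest, since node 3 only inherits importance from a single neighbour.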
  • 29. spectral measures: seeley’s index ∙ Why give away all of our importance? ∙ It would make more sense to divide our importance equally among our successors ∙ The process remains the same, but from an algebraic point of view this means normalizing each row of the adjacency matrix: $x_i^{t+1} = \frac{1}{\lambda} \sum_{j=1}^{n} \bar{A}_{ji}\, x_j^{t}$ (7) where: ∙ $x_i^{0} = 1\ \forall i$ at step 0 ∙ $x^{t}$ is the score vector after t iterations ∙ λ is the dominant eigenvalue of the adjacency matrix A ∙ $\bar{A}$ is the normalized form of the adjacency matrix ∙ Isolated nodes of a graph that is not strongly connected will have a null score over the iterations
  • 30. spectral measures: katz’s index Katz’s index weighs all incoming paths to a node and then computes: $x = \mathbf{1} \sum_{i=0}^{\infty} \beta^{i} A^{i}$ (8) where: ∙ x is the output score vector ∙ $\mathbf{1}$ is the weight vector (for example all 1) ∙ β is an attenuation factor ($\beta < \frac{1}{\lambda}$) ∙ $A^{i}$ contains, in its generic element, the number of paths of length i between the corresponding pair of nodes
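Equation (8) can be sketched by summing the series term by term: each new term is the previous one multiplied by β and by A. The sketch truncates the series after a fixed number of terms, which is an assumption made here for simplicity; the value β = 0.5 and the three-node path graph are illustrative only.

```python
def katz_index(A, beta, terms=100):
    """Approximate x = 1 * sum_{i>=0} beta^i A^i by truncating the series."""
    n = len(A)
    term = [1.0] * n          # i = 0: the all-ones weight vector
    total = term[:]
    for _ in range(terms):
        # Multiply the row vector `term` by beta * A.
        term = [beta * sum(term[i] * A[i][j] for i in range(n)) for j in range(n)]
        total = [t + s for t, s in zip(total, term)]
    return total

# Hypothetical directed acyclic graph: 0 -> 1 -> 2.
A = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
x = katz_index(A, beta=0.5)
```

On this acyclic path the series stops by itself after paths of length 2, and node 2 scores highest because it is the endpoint of both a length-1 and a length-2 path.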
  • 31. spectral measures: pagerank PageRank - a little overview ∙ It is reportedly the basis of how Google’s search engine works ∙ It is the unique vector p satisfying $p = (1 - \alpha)\, v\, (I - \alpha \bar{A})^{-1}$ ∙ where: ∙ α ∈ [0, 1) is a damping factor ∙ v is a preference vector (a distribution) ∙ $\bar{A}$ is the ℓ1-normalized adjacency matrix ∙ As shown, PageRank and Katz’s index differ by a constant factor and by the ℓ1 normalization of the adjacency matrix A
  • 32. spectral measures: eigenvector and pagerank Figure 2 shows eigenvector and PageRank centrality applied to the same graph. Figure 2: Degree and eigenvector centrality
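The PageRank vector of slide 31 can be approximated without inverting a matrix: repeating the update p ← (1 − α)v + α p Ā converges to the same p, because its fixed point satisfies p(I − αĀ) = (1 − α)v. The sketch below assumes every node has at least one outgoing arc (rows without arcs are left as zeros and are not treated specially); α = 0.85, the uniform v, and the toy graph are illustrative choices.

```python
def row_normalize(A):
    """l1-normalize each row; rows with no arcs are left as all zeros (assumption)."""
    normalized = []
    for row in A:
        s = sum(row)
        normalized.append([a / s for a in row] if s else [0.0] * len(row))
    return normalized

def pagerank(A, alpha=0.85, v=None, iterations=200):
    n = len(A)
    A_bar = row_normalize(A)
    v = v or [1.0 / n] * n      # uniform preference vector by default
    p = v[:]
    for _ in range(iterations):
        p = [
            (1 - alpha) * v[j] + alpha * sum(p[i] * A_bar[i][j] for i in range(n))
            for j in range(n)
        ]
    return p

# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
A = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
p = pagerank(A)
```

Node 2 gets the highest score here because it receives half of node 0's importance and all of node 1's.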
  • 33. spectral measures: hits HITS - a little overview by Kleinberg ∙ The key here is mutual reinforcement ∙ A node (such as a page) is authoritative if it is pointed to by many good hubs ∙ Hubs: pages containing good lists of authoritative pages ∙ Then a hub is good if it points to many authoritative pages ∙ We iteratively compute: ∙ $a_i$: authoritativeness score (where $a_0 = \mathbf{1}$) ∙ $h_i$: hubbiness score, as follows: $h_{i+1} = a_i A^{T}$, $a_{i+1} = h_{i+1} A$ ∙ This process converges to the left dominant eigenvector of the matrix $A^{T} A$, giving the final authoritativeness score, called ”HITS”
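The mutual reinforcement of slide 33 can be sketched by alternating the two updates. As an assumption not stated on the slide, both vectors are rescaled by their largest entry after every update so that the numbers stay bounded. Function names and the toy graph (hubs 0 and 1, authorities 2 and 3) are illustrative.

```python
def normalize(vec):
    """Rescale so that the largest entry is 1 (left unchanged if all entries are 0)."""
    m = max(vec)
    return [value / m for value in vec] if m > 0 else vec

def hits(A, iterations=50):
    """Return (authority, hub) scores using h = a A^T and a = h A."""
    n = len(A)
    a = [1.0] * n
    h = [1.0] * n
    for _ in range(iterations):
        # Hub score of i: sum of the authority scores of the nodes i points to.
        h = normalize([sum(a[j] * A[i][j] for j in range(n)) for i in range(n)])
        # Authority score of j: sum of the hub scores of the nodes pointing to j.
        a = normalize([sum(h[i] * A[i][j] for i in range(n)) for j in range(n)])
    return a, h

# Hypothetical graph: 0 -> 2, 0 -> 3, 1 -> 2.
A = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
authority, hub = hits(A)
```

Node 2 is the best authority (two hubs point to it) and node 0 is the best hub (it points to both authorities).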
  • 34. spectral measures: salsa SALSA was devised by Lempel and Moran ∙ Based on the same mutual reinforcement between authoritativeness and hubbiness, but ℓ1-normalizing the matrices A and $A^{T}$. ∙ Starting value: $a_0 = \mathbf{1}$ ∙ $h_{i+1} = a_i \overline{A^{T}}$ ∙ $a_{i+1} = h_{i+1} \bar{A}$ ∙ Unlike HITS, SALSA does not need a large number of iterations
  • 35. spectral measures: some applications ∙ Left dominant eigenvector: the idea on which network structure analysis is based ∙ Seeley’s index: feedback networks ∙ Katz’s index: citation networks ∙ especially good with directed acyclic graphs (where the basic dominant eigenvector does not perform well) ∙ HITS: web page citations ∙ PageRank: Google’s search engine ∙ SALSA: link structure analysis
  • 37. axioms for centrality ∙ In 2013, Boldi and Vigna proposed a method to evaluate and compare different centrality measures ∙ They defined three axioms that an index should satisfy to behave predictably ∙ Size axiom ∙ Density axiom ∙ Score-monotonicity axiom
  • 38. axioms for centrality: size axiom Given a graph $S_{k,p}$ (figure 3), made of a k-clique and a directed p-cycle, the size axiom is satisfied if there are threshold values of p and k such that: ∙ for p > k (if the cycle is very large) the nodes of the cycle are more important ∙ for k > p the nodes of the clique are more important ∙ intuitively, for p = k, the nodes of the clique are more important Figure 3: Graph $S_{k,p}$
  • 39. axioms for centrality: density axiom ∙ Given a graph $D_{k,p}$ (figure 4), made of a k-clique and a directed p-cycle connected by a bidirectional bridge x ↔ y, where x is a node of the clique and y a node of the cycle. ∙ A centrality measure satisfies the density axiom if, for k = p, the centrality of x is strictly larger than the centrality of y. Figure 4: Graph $D_{k,p}$
  • 40. axioms for centrality: the score-monotonicity axiom ∙ A centrality measure satisfies the score-monotonicity axiom if, for every graph G and every pair of nodes x, y such that x ↛ y, adding the arc x → y to G increases the centrality of y.
  • 41. axioms for centrality: comparisons Figure 5: For each centrality measure and each axiom, the table reports whether the axiom is satisfied. Harmonic centrality satisfies all axioms.
  • 42. information retrieval: sanity check ∙ Boldi and Vigna applied centrality measures to standard datasets in order to find out the behavior of different indices ∙ There are standard datasets with associated queries and ground truth about which documents are relevant for every query ∙ Those collections are typically used to compare the merits and demerits of retrieval methods
  • 43. information retrieval: datasets Dataset GOV2, tested in two different ways: ∙ with all links: the complete dataset ∙ with inter-host links only: links between pages of the same host are excluded from the graph Measures of effectiveness chosen: ∙ P@10: precision at 10, the fraction of relevant documents among the first ten retrieved ∙ NDCG@10: normalized discounted cumulative gain at 10, which measures the usefulness, or gain, of a document based on its position in the result list
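The two effectiveness measures of slide 43 can be sketched as follows. The sketch assumes binary relevance (a document is either relevant or not) and the common logarithmic discount 1 / log2(rank + 1); the slides do not specify these details, and the document identifiers are made up for illustration.

```python
import math

def precision_at_k(ranked, relevant, k=10):
    """Fraction of the first k retrieved documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def ndcg_at_k(ranked, relevant, k=10):
    """DCG of the ranking divided by the DCG of an ideal ranking (binary gains)."""
    dcg = sum(
        1 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(k, len(relevant))   # an ideal ranking lists relevant documents first
    idcg = sum(1 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical result list and ground truth; "d12" is relevant but was not retrieved.
ranked = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
relevant = {"d1", "d3", "d12"}
p10 = precision_at_k(ranked, relevant)
ndcg10 = ndcg_at_k(ranked, relevant)
```

Unlike P@10, NDCG@10 rewards placing relevant documents near the top: swapping "d3" into second position would raise NDCG@10 while leaving P@10 unchanged.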
  • 44. information retrieval: results For each centrality measure, the normalized discounted cumulative gain and precision at 10 on the GOV2 dataset, using all links (on the left) and using only inter-host links (on the right). Figure 6: All links Figure 7: Inter-host links
  • 46. conclusions ∙ A very simple measure such as harmonic centrality turned out to be a good notion of centrality. ∙ it satisfies all the centrality axioms proposed ∙ it works well for information retrieval Choose the right measure ∙ No centrality measure is better than the others in every situation ∙ Some are better than others for a particular goal, but this depends on the specific application domain ∙ So, the best approach is to understand which measure fits the problem best
  • 47. references and useful resources ∙ Paolo Boldi and Sebastiano Vigna, Axioms for centrality. ∙ Nicola Perra and Santo Fortunato, Spectral centrality measures in complex networks. ∙ M. E. J. Newman, Networks: An Introduction.
  • 48. Thank you for your attention!