SlideShare a Scribd company logo
1 of 6
Download to read offline
FINDING IMPORTANT NODES IN SOCIAL
NETWORKS BASED ON MODIFIED
PAGERANK
Li-qing Qiu1, Yong-quan Liang2, Jing-Chen3
1, 2

College of Information Science and Technology, Shandong University
of Science and Technology, Qingdao, China
3
Shandong labor vocational and technical college Jinan, China
liqingqiu2005@126.com1, lyq@sdust.edu.cn2
wfchenj@126.com3

ABSTRACT
Important nodes are individuals who have huge influence on social network. Finding important
nodes in social networks is of great significance for research on the structure of the social
networks. Based on the core idea of Pagerank, a new ranking method is proposed by
considering the link similarity between the nodes. The key concept of the method is the use of
the link vector which records the contact times between nodes. Then the link similarity is
computed based on the vectors through the similarity function. The proposed method
incorporates the link similarity into original Pagerank. The experiment results show that the
proposed method can get better performance.

KEYWORDS
social networks, Pagerank, link similarity

1. INTRODUCTION
In modern society, social networks play an important role in a quickly changing world, and more
and more people prefer to obtain information from social networks. This explains the increasing
interest in social networks analysis which examines topology of a network in order to find
interesting structure within it. Recent works have pointed that some very active nodes have huge
impacts on other nodes [1]. Therefore, the problem of how to appropriately find important nodes
in social networks among the huge nodes becomes an important issue.
Pagerank[2,3] is a well known method for identifying authoritative pages in a hyperlink network
of web pages. Pagerank relies on the democratic nature of the web by using its topology as an
indicator of the value to be attached to any page. Pagerank is so useful that we can also apply it to
social networks in that the mutual relationship in social networks can be structured as links to the
microblogs, and nodes can also be regarded as websites in Pagerank algorithm [4]. In the paper,
we present a new importance ranking method based on the modified Pagerank to find the
important nodes in social networks. In other words, the proposed method is derives from
David C. Wyld et al. (Eds) : CST, ITCS, JSE, SIP, ARIA, DMS - 2014
pp. 39–44, 2014. © CS & IT-CSCP 2014

DOI : 10.5121/csit.2014.4104
40

Computer Science & Information Technology (CS & IT)

Pagerank by considering the link similarity which measures the similarity between the nodes.
The paper is organized as follows. In section 2 we describe our generalization of new
measurement, and the experimental results together with the experimental settings are given in
section 3. At last we conclude the paper by summarizing our findings in section 4.

2. PROPOSED METHOD
In the section, we propose a new method to find important nodes based on modified Pagerank, by
considering the link similarity. We introduce the modified Pagerank model in section 2.1, and
then we analyze the importance of nodes using the model in section 2.2.

2.1 Modified Pagerank
The core idea of Pagerank is that of introducing a notion of page authority. In pagerank, the
authority reminds the notion of citation in the scientific literature. In particular, the authority of a
page p depends on the number of incoming hyperlinks and on the authority of the page q which
cites p with a forward link. Moreover, selective citations from q to p are assumed to provide
more contribution to the value of p than uniform citations. Therefore, the Pagerank value PR p
of p is computed by taking into account the set of pages pa[ p] pointing to p .The Pagerank
value PR p is defined as follows:
PR p = (1 − d ) + d

∑
q∈ pa [ p ]

PRq
hq

(1)

Here d ∈ (0,1) is a dumping factor which corresponds to the probability with which a surfer
jumps to a page picked uniformly at random.
The quality of the links, as measured by Pagerank, is a good choice for ranking nodes but we
think there are some other features that can incorporate the activity of the node. We propose to
incorporate the features via the link similarity taking into account contact times of the node. The
idea behind is that the node has higher similarity must be prized with a higher value. Our main
idea consists in constructing the link vector that records the contact times of the nodes, defining a
link similarity function to measure the similarity of the nodes according to the link vector, and
then reconstructing Pagerank model by considering the link similarity. These ideas are a work in
progress.
Definition 1. For node vi , its link vector is defined as:
Vi = {ti1 , ti1 ,L , ti −1 , ti +1 ,L , tin }

(2)

Where tim (0 ≤ m ≤ n ) is the contact times of vi and vm .
Definition 2. For node vi and v j , the link similarity is defined as:
Computer Science & Information Technology (CS & IT)

41

n

ur uu
r
Similarity (Vi ,V j ) = Vi ⋅ V j =

∑V

im

⋅ V jm

m =1
n

(3)

n

∑V

2

im

m =1

∑V

2
jm

m =1

Obviously, the link similarity measures are functions that take two link vectors as arguments and
compute a real value in the interval [0..1] , the value 1 means that the two nodes are closely
related while the value 0 means the nodes are quite different.
Definition 3. For node vi , its modified Pagerank value is defined :

PRvi = (1 − d ) + d

∑
v j ∈ pa[ vi ]

PRv j ⋅ Similarity (Vi ,V j )
hv j

(4)

Here pa[vi ] is the set of nodes which point to vi , and the dump factor d is set to 0.15 in our
experiments.
In the Modified Pagerank, the link similarity of the nodes is taken into account, which is a
positive indicator to modify the original Pagerank(see Equation 1).

2.2 Node Analysis based on modified Pagerank
The node analysis based on modified Pagerank is composed of 3 steps as follows:
Step1: read network as graph;
Step2: computer modified Pagerank value according to Equation 4;
Step3: obtain node modified Pagerank value list.
From above steps, we can see that a bit change is made to the original Pagerank by adding the
link similarity indicator. The modified Pagerank assumes that the initial Pagerank values of all
nodes are the same. Firstly, we calculate the first iterative ranking of each node based on the
initial Pagerank values, and then calculate the second rank according to the first iteration. The
process continues until the termination condition is satisfied. At last the Pagerank estimator
converges to its practical value, which has proven no matter what the initial value is. The whole
processing is implemented without manual intervention.

3. EXPERIMENTS
In the following, we use several different networks to study the performance of our generalization
of novel algorithm for detecting local community structure.

3.1 First Experiment
We start with the first synthetic dataset, which is shown as Figure 1, to illustrate the process of
the proposed method in detail. The network contains 5 nodes, and the contact times are associated
42

Computer Science & Information Technology (CS & IT)

with corresponding edges.

Figure1. A simple network of 5 nodes and 8 edges.

So the mutual relationship matrix can be produced as briefly shown in Table 1.
Table1. Mutual relationship matrix, A.
V1

V3

V4

V5

2

V1

V2

2

1

1

1

0

1

1

0

V2

2

V3

2

1

V4

1

0

1

V5

1

1

0

1
1

a

Parameters in the matrix ij is defined as the contact times between the nodes. Then the link
vectors are: V 1 = {2, 2,1,1} , V 2 = {2,1,0,1} , V 3 = {2,1,1,0} , V 4 = {1,0,1,1} , V 5 = {1,1,0,1} .Thereby,
the link similarity can be computed as:

Similarity (V 1,V 2) =

2 * 2 + 2 *1 + 1* 0 + 1*1
2

2 + 22 + 12 + 12 * 22 + 12 + 02 + 12

= 0.904

The process continues until we get the total link similarity of the network. The link similarity
matrix is shown as follows:
Table2. The link similarity matrix,B.
V1

V3

V4

V5

0.904

V1

V2

0.904

0.730

0.913

0.833

0.707

0.904

0.707

0.707

V2

0.904

V3

0.904

0.833

V4

0.730

0.707

0.707

V5

0.913

0.904

0.707

0.816
0.816
Computer Science & Information Technology (CS & IT)

43

Then the modified Pagerank can be computed according to Equation 4, and we get the following
Pagerank values:
Table3. The modified Pagerank Values.
Value
V1

0.116

V2

0.099

V3

0.101

V4

0.093

V5

0.109

3.2 Second Experiment
Secondly, we apply our modified Pagerank method to one small network, which is the muchdiscussed “karate club” network of friendships between 34 members of a karate club at a US
university, assembled by Zachary [5] by direct observation of the club’s members. This network
is of particular interest because the club split in two during the course of Zachary’s observations
as a result of an internal dispute between the director and the coach. In other words the network
can be classified into two communities-one’s center is the director, the other‘s center is the coach.
We select degree centrality and PageRank algorithm as baseline methods, which be identified as
“degree” and “PageRank” respectively. And our proposed modified Pagerank method is
identified as “M-Pagerank”.Table4 shows the result of the experiment. We select the nodes in the
top 10 according to different methods.
Table4. The comparison of different methods
degree

Node ID

PageRank

Node ID

M-Pagerank

Node ID

0.515
0.485
0.364
0.303
0.273
0.182
0.182
0.152
0.152
0.152

34
1
33
3
2
32
4
24
9
24

0.101
0.097
0.072
0.057
0.053
0.037
0.036
0.032
0.030
0.030

34
1
33
3
2
32
4
24
9
14

0.093
0.089
0.065
0.050
0.036
0.028
0.017
0.016
0.014
0.014

34
1
33
3
2
32
14
4
9
31

From the above table, we can see that three methods appear to have many common results.
However, degree can not distinguish nodes because some nodes have the equal value. For
example, node 32 and node 4 have the common value according to degree method, which shows
that degree method can not distinguish nodes well. M-Pagerank performs better than Pagerank in
that M-Pagerank considers the link similarity of the nodes. For example, node 31 is on the
boundary of two communities, which has much connection between the two communities.
44

Computer Science & Information Technology (CS & IT)

According to M-Pagerank, node 31 is important than node 24 obviously. However, Pagerank rank
node 24 higher than node 31, which is not reasonable.

4. CONCLUSION
In the paper, we have shown a new method to find important nodes in social works based on
modified Pagerank. The method is capable of incorporate the link similarity of the nodes via the
link vector. The final goal for this model is to incorporate some features into original Pagerank.
The proposed method enables a new way of ranking the nodes in social works. We have analyzed
the experiment results using test networks. In our future work, we will further discuss how to
achieve better performance to detect importance nodes.

ACKNOWLEDGEMENT
This paper is supported by National Science Foundation for Post-doctoral Scientists of China
under grant 2013M541938, and Shandong Province Postdoctoral special funds for innovative
projects of China under grant 201302036.

REFERENCES
[1]
[2]
[3]
[4]

[5]

Khorasgani RR, Chen J, Zaiane OR.Top leaders community detection approach in information
networks. KDD 2010, 1-9.
Brin S, Page L. The anatomy of a large-scale hypertextual web search engine. WWW7, 1998.
Bianchini M, Gori M, Scarselli F.Inside Pagerank. ACM transactions on internet technology, 2006,92128.
Abbassi Z, Minokni VS. A recommender system based on local random walks and spectral methods.
9th Web KDD and 1st SNA-KDD 2007 workshop on web mining and social network
anaysis,2007,102-108
Zachary WW. An information flow model for conflict and fission in small groups, Journal of
anthropological Research, 1977, 33: 452-473.

More Related Content

What's hot

Using Networks to Measure Influence and Impact
Using Networks to Measure Influence and ImpactUsing Networks to Measure Influence and Impact
Using Networks to Measure Influence and Impact
Yunhao Zhang
 
What Sets Verified Users apart? Insights Into, Analysis of and Prediction of ...
What Sets Verified Users apart? Insights Into, Analysis of and Prediction of ...What Sets Verified Users apart? Insights Into, Analysis of and Prediction of ...
What Sets Verified Users apart? Insights Into, Analysis of and Prediction of ...
IIIT Hyderabad
 
Operating Task Redistribution in Hyperconverged Networks
Operating Task Redistribution in Hyperconverged Networks Operating Task Redistribution in Hyperconverged Networks
Operating Task Redistribution in Hyperconverged Networks
IJECEIAES
 
Done reread maximizingpagerankviaoutlinks(2)
Done reread maximizingpagerankviaoutlinks(2)Done reread maximizingpagerankviaoutlinks(2)
Done reread maximizingpagerankviaoutlinks(2)
James Arnold
 
Detection and elemination of block hole attack
Detection and elemination of block hole attackDetection and elemination of block hole attack
Detection and elemination of block hole attack
Maalolan Narasimhan
 

What's hot (18)

Using Networks to Measure Influence and Impact
Using Networks to Measure Influence and ImpactUsing Networks to Measure Influence and Impact
Using Networks to Measure Influence and Impact
 
J046026268
J046026268J046026268
J046026268
 
08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)
 
Using spectral radius ratio for node degree
Using spectral radius ratio for node degreeUsing spectral radius ratio for node degree
Using spectral radius ratio for node degree
 
07 Statistical approaches to randomization
07 Statistical approaches to randomization07 Statistical approaches to randomization
07 Statistical approaches to randomization
 
The Web Science MacroScope: Mixed-methods Approach for Understanding Web Acti...
The Web Science MacroScope: Mixed-methods Approach for Understanding Web Acti...The Web Science MacroScope: Mixed-methods Approach for Understanding Web Acti...
The Web Science MacroScope: Mixed-methods Approach for Understanding Web Acti...
 
sanju
sanjusanju
sanju
 
String Identifier in Multiple Medical Databases
String Identifier in Multiple Medical DatabasesString Identifier in Multiple Medical Databases
String Identifier in Multiple Medical Databases
 
A Study on Hardware and Software Link Quality Metrics for Wireless Multimedia...
A Study on Hardware and Software Link Quality Metrics for Wireless Multimedia...A Study on Hardware and Software Link Quality Metrics for Wireless Multimedia...
A Study on Hardware and Software Link Quality Metrics for Wireless Multimedia...
 
What Sets Verified Users apart? Insights Into, Analysis of and Prediction of ...
What Sets Verified Users apart? Insights Into, Analysis of and Prediction of ...What Sets Verified Users apart? Insights Into, Analysis of and Prediction of ...
What Sets Verified Users apart? Insights Into, Analysis of and Prediction of ...
 
Business Market Research on Instant Messaging -2013
Business Market Research on Instant Messaging -2013Business Market Research on Instant Messaging -2013
Business Market Research on Instant Messaging -2013
 
Operating Task Redistribution in Hyperconverged Networks
Operating Task Redistribution in Hyperconverged Networks Operating Task Redistribution in Hyperconverged Networks
Operating Task Redistribution in Hyperconverged Networks
 
Probability Density Functions of the Packet Length for Computer Networks With...
Probability Density Functions of the Packet Length for Computer Networks With...Probability Density Functions of the Packet Length for Computer Networks With...
Probability Density Functions of the Packet Length for Computer Networks With...
 
Poster presentation 5th BENet (Belgium Network Research Meting), Namur
Poster presentation 5th BENet (Belgium Network Research Meting), Namur Poster presentation 5th BENet (Belgium Network Research Meting), Namur
Poster presentation 5th BENet (Belgium Network Research Meting), Namur
 
Done reread maximizingpagerankviaoutlinks(2)
Done reread maximizingpagerankviaoutlinks(2)Done reread maximizingpagerankviaoutlinks(2)
Done reread maximizingpagerankviaoutlinks(2)
 
Page Rank
Page RankPage Rank
Page Rank
 
Detection and elemination of block hole attack
Detection and elemination of block hole attackDetection and elemination of block hole attack
Detection and elemination of block hole attack
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learning
 

Viewers also liked

יחידה אסטרטגית
יחידה אסטרטגיתיחידה אסטרטגית
יחידה אסטרטגית
AshdodSlides
 
מינהל תפעול
מינהל תפעולמינהל תפעול
מינהל תפעול
AshdodSlides
 
מינהל כללי מצגת לחברי מועצה
מינהל כללי  מצגת לחברי מועצה מינהל כללי  מצגת לחברי מועצה
מינהל כללי מצגת לחברי מועצה
AshdodSlides
 
מינהל כללי
מינהל כללי מינהל כללי
מינהל כללי
AshdodSlides
 
photo trip advisor 2
photo trip advisor 2photo trip advisor 2
photo trip advisor 2
Katy McInnes
 
Chef Bio_Dave Pretino
Chef Bio_Dave PretinoChef Bio_Dave Pretino
Chef Bio_Dave Pretino
David Pretino
 
מינהל החינוך
מינהל החינוךמינהל החינוך
מינהל החינוך
AshdodSlides
 

Viewers also liked (15)

יחידה אסטרטגית
יחידה אסטרטגיתיחידה אסטרטגית
יחידה אסטרטגית
 
יובלים
יובליםיובלים
יובלים
 
מינהל תפעול
מינהל תפעולמינהל תפעול
מינהל תפעול
 
מינהל כללי מצגת לחברי מועצה
מינהל כללי  מצגת לחברי מועצה מינהל כללי  מצגת לחברי מועצה
מינהל כללי מצגת לחברי מועצה
 
Tribunal Superior de Justicia - Guia de autoridades 2015
Tribunal Superior de Justicia - Guia de autoridades 2015Tribunal Superior de Justicia - Guia de autoridades 2015
Tribunal Superior de Justicia - Guia de autoridades 2015
 
מינהל כללי
מינהל כללי מינהל כללי
מינהל כללי
 
photo trip advisor 2
photo trip advisor 2photo trip advisor 2
photo trip advisor 2
 
Chef Bio_Dave Pretino
Chef Bio_Dave PretinoChef Bio_Dave Pretino
Chef Bio_Dave Pretino
 
חפא
חפאחפא
חפא
 
מינהל החינוך
מינהל החינוךמינהל החינוך
מינהל החינוך
 
Investigacion como insertar un vídeo en eclipse
Investigacion como insertar un vídeo en eclipseInvestigacion como insertar un vídeo en eclipse
Investigacion como insertar un vídeo en eclipse
 
Presentación
Presentación Presentación
Presentación
 
The Puzzling Piece Challenge
The Puzzling Piece ChallengeThe Puzzling Piece Challenge
The Puzzling Piece Challenge
 
Google+ in numbers
Google+ in numbersGoogle+ in numbers
Google+ in numbers
 
New doc 5
New doc 5New doc 5
New doc 5
 

Similar to Finding important nodes in social networks based on modified pagerank

A new approach to erd s collaboration network using page rank
A new approach to erd s collaboration network using page rankA new approach to erd s collaboration network using page rank
A new approach to erd s collaboration network using page rank
Alexander Decker
 
A new approach to erd s collaboration network using page rank
A new approach to erd s collaboration network using page rankA new approach to erd s collaboration network using page rank
A new approach to erd s collaboration network using page rank
Alexander Decker
 
Predicting_new_friendships_in_social_networks
Predicting_new_friendships_in_social_networksPredicting_new_friendships_in_social_networks
Predicting_new_friendships_in_social_networks
Anvardh Nanduri
 
2014 USA
2014 USA2014 USA
2014 USA
LI HE
 
Network Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine KangNetwork Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine Kang
Eugine Kang
 

Similar to Finding important nodes in social networks based on modified pagerank (20)

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK
FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANKFINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK
FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK
 
Ranking nodes in growing networks: when PageRank fails
Ranking nodes in growing networks: when PageRank failsRanking nodes in growing networks: when PageRank fails
Ranking nodes in growing networks: when PageRank fails
 
Centrality Prediction in Mobile Social Networks
Centrality Prediction in Mobile Social NetworksCentrality Prediction in Mobile Social Networks
Centrality Prediction in Mobile Social Networks
 
A new approach to erd s collaboration network using page rank
A new approach to erd s collaboration network using page rankA new approach to erd s collaboration network using page rank
A new approach to erd s collaboration network using page rank
 
A new approach to erd s collaboration network using page rank
A new approach to erd s collaboration network using page rankA new approach to erd s collaboration network using page rank
A new approach to erd s collaboration network using page rank
 
Ripple Algorithm to Evaluate the Importance of Network Nodes
Ripple Algorithm to Evaluate the Importance of Network NodesRipple Algorithm to Evaluate the Importance of Network Nodes
Ripple Algorithm to Evaluate the Importance of Network Nodes
 
Predicting_new_friendships_in_social_networks
Predicting_new_friendships_in_social_networksPredicting_new_friendships_in_social_networks
Predicting_new_friendships_in_social_networks
 
An Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer NetworksAn Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer Networks
 
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKSEVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
 
Sub-Graph Finding Information over Nebula Networks
Sub-Graph Finding Information over Nebula NetworksSub-Graph Finding Information over Nebula Networks
Sub-Graph Finding Information over Nebula Networks
 
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKSEVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
 
IRJET- Link Prediction in Social Networks
IRJET- Link Prediction in Social NetworksIRJET- Link Prediction in Social Networks
IRJET- Link Prediction in Social Networks
 
IRJET- Study of Topological Analogies of Perfect Difference Network and Compl...
IRJET- Study of Topological Analogies of Perfect Difference Network and Compl...IRJET- Study of Topological Analogies of Perfect Difference Network and Compl...
IRJET- Study of Topological Analogies of Perfect Difference Network and Compl...
 
2014 USA
2014 USA2014 USA
2014 USA
 
Performance Analysis and Parallelization of CosineSimilarity of Documents
Performance Analysis and Parallelization of CosineSimilarity of DocumentsPerformance Analysis and Parallelization of CosineSimilarity of Documents
Performance Analysis and Parallelization of CosineSimilarity of Documents
 
Network Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine KangNetwork Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine Kang
 
An Efficient Algorithm For Ranking Research Papers Based On Citation Network
An Efficient Algorithm For Ranking Research Papers Based On Citation NetworkAn Efficient Algorithm For Ranking Research Papers Based On Citation Network
An Efficient Algorithm For Ranking Research Papers Based On Citation Network
 
Using PageRank Algorithm to Improve Coupling Metrics
Using PageRank Algorithm to Improve Coupling MetricsUsing PageRank Algorithm to Improve Coupling Metrics
Using PageRank Algorithm to Improve Coupling Metrics
 
A New Bi-level Program Based on Unblocked Reliability for a Continuous Road N...
A New Bi-level Program Based on Unblocked Reliability for a Continuous Road N...A New Bi-level Program Based on Unblocked Reliability for a Continuous Road N...
A New Bi-level Program Based on Unblocked Reliability for a Continuous Road N...
 
Performance neo4j-versus (2)
Performance neo4j-versus (2)Performance neo4j-versus (2)
Performance neo4j-versus (2)
 

Recently uploaded

Recently uploaded (20)

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Finding important nodes in social networks based on modified pagerank

  • 1. FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK Li-qing Qiu1, Yong-quan Liang2, Jing-Chen3 1, 2 College of Information Science and Technology, Shandong University of Science and Technology, Qingdao, China 3 Shandong labor vocational and technical college Jinan, China liqingqiu2005@126.com1, lyq@sdust.edu.cn2 wfchenj@126.com3 ABSTRACT Important nodes are individuals who have huge influence on social network. Finding important nodes in social networks is of great significance for research on the structure of the social networks. Based on the core idea of Pagerank, a new ranking method is proposed by considering the link similarity between the nodes. The key concept of the method is the use of the link vector which records the contact times between nodes. Then the link similarity is computed based on the vectors through the similarity function. The proposed method incorporates the link similarity into original Pagerank. The experiment results show that the proposed method can get better performance. KEYWORDS social networks, Pagerank, link similarity 1. INTRODUCTION In modern society, social networks play an important role in a quickly changing world, and more and more people prefer to obtain information from social networks. This explains the increasing interest in social networks analysis which examines topology of a network in order to find interesting structure within it. Recent works have pointed that some very active nodes have huge impacts on other nodes [1]. Therefore, the problem of how to appropriately find important nodes in social networks among the huge nodes becomes an important issue. Pagerank[2,3] is a well known method for identifying authoritative pages in a hyperlink network of web pages. Pagerank relies on the democratic nature of the web by using its topology as an indicator of the value to be attached to any page. Pagerank is so useful that we can also apply it to social networks in that the mutual relationship in social networks can be structured as links to the microblogs, and nodes can also be regarded as websites in Pagerank algorithm [4]. In the paper, we present a new importance ranking method based on the modified Pagerank to find the important nodes in social networks. In other words, the proposed method is derives from David C. Wyld et al. (Eds) : CST, ITCS, JSE, SIP, ARIA, DMS - 2014 pp. 39–44, 2014. © CS & IT-CSCP 2014 DOI : 10.5121/csit.2014.4104
  • 2. 40 Computer Science & Information Technology (CS & IT) Pagerank by considering the link similarity which measures the similarity between the nodes. The paper is organized as follows. In section 2 we describe our generalization of new measurement, and the experimental results together with the experimental settings are given in section 3. At last we conclude the paper by summarizing our findings in section 4. 2. PROPOSED METHOD In the section, we propose a new method to find important nodes based on modified Pagerank, by considering the link similarity. We introduce the modified Pagerank model in section 2.1, and then we analyze the importance of nodes using the model in section 2.2. 2.1 Modified Pagerank The core idea of Pagerank is that of introducing a notion of page authority. In pagerank, the authority reminds the notion of citation in the scientific literature. In particular, the authority of a page p depends on the number of incoming hyperlinks and on the authority of the page q which cites p with a forward link. Moreover, selective citations from q to p are assumed to provide more contribution to the value of p than uniform citations. Therefore, the Pagerank value PR p of p is computed by taking into account the set of pages pa[ p] pointing to p .The Pagerank value PR p is defined as follows: PR p = (1 − d ) + d ∑ q∈ pa [ p ] PRq hq (1) Here d ∈ (0,1) is a dumping factor which corresponds to the probability with which a surfer jumps to a page picked uniformly at random. The quality of the links, as measured by Pagerank, is a good choice for ranking nodes but we think there are some other features that can incorporate the activity of the node. We propose to incorporate the features via the link similarity taking into account contact times of the node. The idea behind is that the node has higher similarity must be prized with a higher value. Our main idea consists in constructing the link vector that records the contact times of the nodes, defining a link similarity function to measure the similarity of the nodes according to the link vector, and then reconstructing Pagerank model by considering the link similarity. These ideas are a work in progress. Definition 1. For node vi , its link vector is defined as: Vi = {ti1 , ti1 ,L , ti −1 , ti +1 ,L , tin } (2) Where tim (0 ≤ m ≤ n ) is the contact times of vi and vm . Definition 2. For node vi and v j , the link similarity is defined as:
  • 3. Computer Science & Information Technology (CS & IT) 41 n ur uu r Similarity (Vi ,V j ) = Vi ⋅ V j = ∑V im ⋅ V jm m =1 n (3) n ∑V 2 im m =1 ∑V 2 jm m =1 Obviously, the link similarity measures are functions that take two link vectors as arguments and compute a real value in the interval [0..1] , the value 1 means that the two nodes are closely related while the value 0 means the nodes are quite different. Definition 3. For node vi , its modified Pagerank value is defined : PRvi = (1 − d ) + d ∑ v j ∈ pa[ vi ] PRv j ⋅ Similarity (Vi ,V j ) hv j (4) Here pa[vi ] is the set of nodes which point to vi , and the dump factor d is set to 0.15 in our experiments. In the Modified Pagerank, the link similarity of the nodes is taken into account, which is a positive indicator to modify the original Pagerank(see Equation 1). 2.2 Node Analysis based on modified Pagerank The node analysis based on modified Pagerank is composed of 3 steps as follows: Step1: read network as graph; Step2: computer modified Pagerank value according to Equation 4; Step3: obtain node modified Pagerank value list. From above steps, we can see that a bit change is made to the original Pagerank by adding the link similarity indicator. The modified Pagerank assumes that the initial Pagerank values of all nodes are the same. Firstly, we calculate the first iterative ranking of each node based on the initial Pagerank values, and then calculate the second rank according to the first iteration. The process continues until the termination condition is satisfied. At last the Pagerank estimator converges to its practical value, which has proven no matter what the initial value is. The whole processing is implemented without manual intervention. 3. EXPERIMENTS In the following, we use several different networks to study the performance of our generalization of novel algorithm for detecting local community structure. 3.1 First Experiment We start with the first synthetic dataset, which is shown as Figure 1, to illustrate the process of the proposed method in detail. The network contains 5 nodes, and the contact times are associated
  • 4. 42 Computer Science & Information Technology (CS & IT) with corresponding edges. Figure1. A simple network of 5 nodes and 8 edges. So the mutual relationship matrix can be produced as briefly shown in Table 1. Table1. Mutual relationship matrix, A. V1 V3 V4 V5 2 V1 V2 2 1 1 1 0 1 1 0 V2 2 V3 2 1 V4 1 0 1 V5 1 1 0 1 1 a Parameters in the matrix ij is defined as the contact times between the nodes. Then the link vectors are: V 1 = {2, 2,1,1} , V 2 = {2,1,0,1} , V 3 = {2,1,1,0} , V 4 = {1,0,1,1} , V 5 = {1,1,0,1} .Thereby, the link similarity can be computed as: Similarity (V 1,V 2) = 2 * 2 + 2 *1 + 1* 0 + 1*1 2 2 + 22 + 12 + 12 * 22 + 12 + 02 + 12 = 0.904 The process continues until we get the total link similarity of the network. The link similarity matrix is shown as follows: Table2. The link similarity matrix,B. V1 V3 V4 V5 0.904 V1 V2 0.904 0.730 0.913 0.833 0.707 0.904 0.707 0.707 V2 0.904 V3 0.904 0.833 V4 0.730 0.707 0.707 V5 0.913 0.904 0.707 0.816 0.816
  • 5. Computer Science & Information Technology (CS & IT) 43 Then the modified Pagerank can be computed according to Equation 4, and we get the following Pagerank values: Table3. The modified Pagerank Values. Value V1 0.116 V2 0.099 V3 0.101 V4 0.093 V5 0.109 3.2 Second Experiment Secondly, we apply our modified Pagerank method to one small network, which is the muchdiscussed “karate club” network of friendships between 34 members of a karate club at a US university, assembled by Zachary [5] by direct observation of the club’s members. This network is of particular interest because the club split in two during the course of Zachary’s observations as a result of an internal dispute between the director and the coach. In other words the network can be classified into two communities-one’s center is the director, the other‘s center is the coach. We select degree centrality and PageRank algorithm as baseline methods, which be identified as “degree” and “PageRank” respectively. And our proposed modified Pagerank method is identified as “M-Pagerank”.Table4 shows the result of the experiment. We select the nodes in the top 10 according to different methods. Table4. The comparison of different methods degree Node ID PageRank Node ID M-Pagerank Node ID 0.515 0.485 0.364 0.303 0.273 0.182 0.182 0.152 0.152 0.152 34 1 33 3 2 32 4 24 9 24 0.101 0.097 0.072 0.057 0.053 0.037 0.036 0.032 0.030 0.030 34 1 33 3 2 32 4 24 9 14 0.093 0.089 0.065 0.050 0.036 0.028 0.017 0.016 0.014 0.014 34 1 33 3 2 32 14 4 9 31 From the above table, we can see that three methods appear to have many common results. However, degree can not distinguish nodes because some nodes have the equal value. For example, node 32 and node 4 have the common value according to degree method, which shows that degree method can not distinguish nodes well. M-Pagerank performs better than Pagerank in that M-Pagerank considers the link similarity of the nodes. For example, node 31 is on the boundary of two communities, which has much connection between the two communities.
  • 6. 44 Computer Science & Information Technology (CS & IT) According to M-Pagerank, node 31 is important than node 24 obviously. However, Pagerank rank node 24 higher than node 31, which is not reasonable. 4. CONCLUSION In the paper, we have shown a new method to find important nodes in social works based on modified Pagerank. The method is capable of incorporate the link similarity of the nodes via the link vector. The final goal for this model is to incorporate some features into original Pagerank. The proposed method enables a new way of ranking the nodes in social works. We have analyzed the experiment results using test networks. In our future work, we will further discuss how to achieve better performance to detect importance nodes. ACKNOWLEDGEMENT This paper is supported by National Science Foundation for Post-doctoral Scientists of China under grant 2013M541938, and Shandong Province Postdoctoral special funds for innovative projects of China under grant 201302036. REFERENCES [1] [2] [3] [4] [5] Khorasgani RR, Chen J, Zaiane OR.Top leaders community detection approach in information networks. KDD 2010, 1-9. Brin S, Page L. The anatomy of a large-scale hypertextual web search engine. WWW7, 1998. Bianchini M, Gori M, Scarselli F.Inside Pagerank. ACM transactions on internet technology, 2006,92128. Abbassi Z, Minokni VS. A recommender system based on local random walks and spectral methods. 9th Web KDD and 1st SNA-KDD 2007 workshop on web mining and social network anaysis,2007,102-108 Zachary WW. An information flow model for conflict and fission in small groups, Journal of anthropological Research, 1977, 33: 452-473.