PageRank

Adding uncertainty to the PageRank random surfer
DAVID F. GLEICH, PURDUE UNIVERSITY, COMPUTER SCIENCE
UTRC SEMINAR, 13 DECEMBER 2011

1/40
UTRC Seminar
David Gleich, Purdue

+

+
Uncertainty Quantification

2/40


are a great way to model and
study problems in network
science and physical science
I hope I’m preaching to the choir.

4/40

A cartoon websearch primer
1.  Crawl webpages
2.  Analyze webpage text (information retrieval)
3.  Analyze webpage links
4.  Fit measures to human evaluations
5.  Produce rankings
6.  Continuously update

5/40


6/40

What is PageRank?
PageRank by Google

[Figure: a six-node example graph]

The Model
1. follow edges uniformly with probability α, and
2. randomly jump with probability 1 − α; we'll assume everywhere is equally likely

The places we find the surfer most often are important pages.

7/40

The most important page on the web.

8/40

PageRank details
PageRank by Google

[Figure: the six-node example graph]

$$
P = \begin{bmatrix}
1/6 & 1/2 & 0 & 0 & 0 & 0 \\
1/6 & 0 & 0 & 1/3 & 0 & 0 \\
1/6 & 1/2 & 0 & 1/3 & 0 & 0 \\
1/6 & 0 & 1/2 & 0 & 0 & 0 \\
1/6 & 0 & 1/2 & 1/3 & 0 & 1 \\
1/6 & 0 & 0 & 0 & 1 & 0
\end{bmatrix},
\qquad P_j \ge 0, \quad e^T P = e^T
$$

"jump": $v = [\tfrac{1}{n} \; \dots \; \tfrac{1}{n}]^T \ge 0$, $e^T v = 1$

Markov chain: $[\alpha P + (1 - \alpha) v e^T] x = x$, with a unique $x$ such that $x_j \ge 0$, $e^T x = 1$

Linear system: $(I - \alpha P) x = (1 - \alpha) v$

Ignored: dangling nodes patched back to $v$; algorithms later

9/40
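The linear-system view can be checked numerically. The sketch below is only illustrative: the matrix is the six-node example as we read it from the slide (with the dangling first column patched to the uniform vector), and the function name `pagerank_direct` is our own, not something from the talk.

```python
import numpy as np

# Column-stochastic matrix of the six-node example, as read from the slide.
# Column 1 belongs to the dangling node and is patched to the uniform vector.
P = np.array([
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
])


def pagerank_direct(P, alpha, v=None):
    """Solve (I - alpha P) x = (1 - alpha) v with a dense direct solver."""
    n = P.shape[0]
    if v is None:
        v = np.full(n, 1.0 / n)  # uniform jump, as assumed on the slide
    return np.linalg.solve(np.eye(n) - alpha * P, (1 - alpha) * v)


x = pagerank_direct(P, alpha=0.85)
print(np.round(x, 4))
```

A dense solve is only sensible for toy graphs like this one; it is used here because it makes the equivalence between the Markov chain and the linear system easy to verify.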

Other uses for PageRank
What else people use PageRank to do

ProteinRank
GeneRank
ObjectRank
EventRank
IsoRank
Clustering (graph partitioning)
Sports ranking
Food webs
Centrality
Teaching

10/40
Note Conjectured new papers: TweetRank (Done, WSDM 2010), WaveRank, [...]Rank, PaperRank, UniversityRank, LabRank. I think the last one involves a

What else people use PageRank to do

GeneRank

[Figure: plot with rows labeled by gene and contig identifiers (NM_003748, NM_003862, Contig32125_RC, ...) and a horizontal axis running from 10 to 70]

Use $(I - \alpha G D^{-1}) x = w$ to find "nearby" important genes.

11/40

Richardson is a robust, simple
algorithm to compute PageRank
Given α, P, v

$(I - \alpha P) x = (1 - \alpha) v$

Richardson $\Rightarrow x^{(k+1)} = \alpha P x^{(k)} + (1 - \alpha) v$

error $= \| x^{(k)} - x \|_1 \le 2 \alpha^k$
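A minimal sketch of the Richardson iteration follows. The three-node matrix, the stopping tolerance, and the name `pagerank_richardson` are hypothetical choices for illustration; starting from $x^{(0)} = v$ is also our assumption.

```python
import numpy as np


def pagerank_richardson(P, alpha, v, tol=1e-12, max_iter=10_000):
    """Iterate x <- alpha P x + (1 - alpha) v until successive iterates agree."""
    x = v.copy()  # assumed starting point x^(0) = v
    for k in range(1, max_iter + 1):
        x_next = alpha * (P @ x) + (1 - alpha) * v
        if np.abs(x_next - x).sum() < tol:  # 1-norm of the update
            return x_next, k
        x = x_next
    return x, max_iter


# Hypothetical three-node column-stochastic matrix.
P = np.array([
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0],
])
v = np.full(3, 1.0 / 3)
x, iterations = pagerank_richardson(P, 0.85, v)
print(x, iterations)
```

Each step costs one matrix-vector product, which is why the method scales to graphs where a direct solve is out of reach.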

12/40

Sensitivity

13/40

Which sensitivity?
PageRank circa 2006

$(I - \alpha P) x = (1 - \alpha) v$

Sensitivity to the links $P$: examined and understood

Sensitivity to the jump $v$: examined, understood, and useful

Sensitivity to $\alpha$: less well understood

14/40
For information about how to compute the PageRank derivative, see:
Gleich, Glynn, Golub, Greif. Three results on the PageRank vector, 2007.

Wikipedia test case
PageRank on Wikipedia

α = 0.50           α = 0.85                α = 0.99
United States      United States           C:Contents
C:Living people    C:Main topic classif.   C:Main topic classif.
France             C:Contents              C:Fundamental
Germany            C:Living people         United States
England            C:Ctgs. by country      C:Wikipedia admin.
United Kingdom     United Kingdom          P:List of portals
Canada             C:Fundamental           P:Contents/Portals
Japan              C:Ctgs. by topic        C:Portals
Poland             C:Wikipedia admin.      C:Society
Australia          France                  C:Ctgs. by topic

Note Top 10 articles on Wikipedia with highest PageRank

15/40

What is alpha?
The teleportation parameter!

Author                   α
Brin and Page (1998)     0.85
Najork et al. (2007)     0.85
Litvak et al. (2006)     0.5
Experiment (slide 19)    0.63
Algorithms (...)         0.85

For you, α is clear.
Google wants PageRank for everyone.

16/40

What about me?
Multiple surfers should have an impact!
Each person picks α from distribution A

x(E[A]) ≠ E[x(A)]

17/40
Random alpha PageRank
RAPr
or PageRank meets UQ

Model PageRank as the random variables x(A) and look at E[x(A)] and Std[x(A)].

18/40
Explored in Constantine and Gleich, WAW2007; and
Constantine and Gleich, J. Internet Mathematics 2011.
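To make the random-variable view concrete, here is a small Monte Carlo sketch of E[x(A)] and Std[x(A)]. Everything specific in it is an assumption for illustration: the three-node matrix is hypothetical, the helper names are ours, and A is drawn from NumPy's standard Beta(17, 3), which has mean 17/20 = 0.85.

```python
import numpy as np


def pagerank(P, alpha, v):
    """PageRank vector x(alpha) from the linear system (I - alpha P) x = (1 - alpha) v."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * P, (1 - alpha) * v)


def rapr_monte_carlo(P, v, sample_alpha, n_samples=2000, seed=0):
    """Estimate E[x(A)] and Std[x(A)] by sampling alpha and solving each system."""
    rng = np.random.default_rng(seed)
    alphas = sample_alpha(rng, n_samples)
    X = np.array([pagerank(P, a, v) for a in alphas])
    return X.mean(axis=0), X.std(axis=0, ddof=1)


# Hypothetical three-node column-stochastic matrix.
P = np.array([
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0],
])
v = np.full(3, 1.0 / 3)

# Assumed distribution for A: standard Beta(17, 3), mean 0.85.
mean_x, std_x = rapr_monte_carlo(P, v, lambda rng, n: rng.beta(17, 3, size=n))
x_at_mean = pagerank(P, 0.85, v)  # x(E[A]); generally differs from E[x(A)]
print(mean_x, std_x, x_at_mean)
```

Because x(α) is nonlinear in α, the vector computed at the mean of A and the mean of the sampled vectors need not coincide, which is the point of treating α as random.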

Alpha, measured from users!
What is alpha based on users?

[Figure: density of raw α measured from users, with the fit InfBeta(3.2, 2.0, 1.9e−05, 0.0019); mean 0.63, mode 0.69; horizontal axis "Raw α" from 0.0 to 1.0, vertical axis "density" from 0.0 to 3.0]

19/40
Constantine, Flaxman, Gleich, Gunawardana, Tracking the Random Surfer, WWW2010.

What is A?
A simple model for alpha

A ∼ Bet(a, b, l, r)

20/40

An Example
The PageRank random variables

[Figure: the six-node example graph next to the densities of the PageRank random variables x1, ..., x6; horizontal axis from 0 to 0.5]

21/40

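A theoretical concern isn't really a problem
Just one second ...

$$
E[x(A)] = \int_0^1 x(\alpha)\,\rho(\alpha)\,d\alpha
        = \int_0^1 (1 - \alpha)(I - \alpha P)^{-1} v\,\rho(\alpha)\,d\alpha
$$

$\alpha = 1 \;\to\; (I - P)^{-1}$

$P$ stochastic, so is $I - P$ singular?

Yes, but ...
$\lim_{\alpha \to 1} (1 - \alpha)(I - \alpha P)^{-1} v = x$ is unique

22/40
(Think about $P = 1$, use the Jordan form of $P$ to generalize)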

Many PageRank properties are unchanged by a random alpha
What changes? Really, what stays the same!

$x(A)$, $A \sim \mathrm{Bet}(a, b, l, r)$ with $0 \le l < r \le 1$

1. $E[x_i(A)] \ge 0$ and $\| E[x(A)] \| = 1$;
   thus $E[x(A)]$ is a probability distribution.

2. $E[x(A)] = \sum_{\ell=0}^{\infty} E[A^{\ell} - A^{\ell+1}]\, P^{\ell} v$;
   thus we can interpret $E[x(A)]$ in length-$\ell$ paths.

3. for page $i$ with no in-links, $x_i(A) = (1 - A) v_i$;
   thus $E[x_i(A)] = x_i(E[A])$ and $\mathrm{Std}[x_i(A)] = v_i\, \mathrm{Std}[A]$
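The path-length series for E[x(A)] can be evaluated directly from the moments of A. The sketch below assumes A follows a standard Beta(p, q) on [0, 1], whose moments satisfy E[A^(j+1)] = E[A^j] (p + j) / (p + q + j); the choice p = 17, q = 3, the three-node matrix, and the function names are all hypothetical.

```python
import numpy as np


def beta_moments(p, q, count):
    """Return E[A^0], ..., E[A^(count-1)] for A ~ standard Beta(p, q) on [0, 1]."""
    m = np.empty(count)
    m[0] = 1.0
    for j in range(count - 1):
        m[j + 1] = m[j] * (p + j) / (p + q + j)
    return m


def expected_pagerank_path_damping(P, v, p, q, n_terms):
    """Truncated series sum_{l=0}^{n_terms} E[A^l - A^(l+1)] P^l v."""
    m = beta_moments(p, q, n_terms + 2)
    total = np.zeros_like(v)
    term = v.copy()  # holds P^l v
    for ell in range(n_terms + 1):
        total += (m[ell] - m[ell + 1]) * term
        term = P @ term
    return total


# Hypothetical three-node column-stochastic matrix.
P = np.array([
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0],
])
v = np.full(3, 1.0 / 3)
x_pd = expected_pagerank_path_damping(P, v, p=17, q=3, n_terms=2000)
print(x_pd, x_pd.sum())
```

The weights telescope, so the truncated sum has total mass 1 − E[A^(N+1)]; how quickly that remainder vanishes depends on the distribution of A.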

23/40
But is this one useful?

Wikipedia test case (take 2)
RAPr on Wikipedia

E[x(A)]            Std[x(A)]
United States      United States
C:Living people    C:Living people
France             C:Main topic classif.
United Kingdom     C:Contents
Germany            C:Ctgs. by country
England            United Kingdom
Canada             France
Japan              C:Fundamental
Poland             England
Australia          C:Ctgs. by topic

24/40
Note A ∼ Bet(0.5, 1.5, [0, 1]) ≈ empirical distribution on Wikipedia

Ulam Networks
PageRank on a dynamical system nicely illustrates the uncertainty.

Chirikov map
$y_{t+1} = y_t + k \sin(x_t + \theta_t)$
$x_{t+1} = x_t + y_{t+1}$

Ulam network
1. divide phase space into uniform cells
2. form $P$ based on trajectories.

[Figure: two panels over the phase space, $\log(E[x(A)])$ and $\log(\mathrm{Std}[x(A)]) / \log(E[x(A)])$]

25/40
Note White is larger, black is smaller. A ∼ Bet(2, 16)
Model from Shepelyansky and Zhirov, Google matrix, dynamical attractors, and Ulam networks, Phys. Rev. E, 2011 (arXiv).
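The two-step Ulam construction can be sketched as follows. This is a simplified illustration rather than the model from the cited paper: we assume the map lives on the torus [0, 2π) × [0, 2π), fix the phase θ_t to zero, and estimate each column of P from a single map step of randomly placed points. The value of k, the grid size, and all function names are hypothetical.

```python
import numpy as np

TWO_PI = 2 * np.pi


def chirikov_step(x, y, k, theta=0.0):
    """One step of the map, wrapped onto the torus (the wrapping is our assumption)."""
    y_next = (y + k * np.sin(x + theta)) % TWO_PI
    x_next = (x + y_next) % TWO_PI
    return x_next, y_next


def ulam_network(k, cells_per_side=8, samples_per_cell=50, seed=0):
    """Column-stochastic P: P[i, j] is the fraction of samples moving from cell j to cell i."""
    rng = np.random.default_rng(seed)
    m = cells_per_side
    width = TWO_PI / m
    P = np.zeros((m * m, m * m))
    for cx in range(m):
        for cy in range(m):
            j = cx * m + cy
            # Step 1: uniform cells; place random sample points inside cell j.
            x = (cx + rng.random(samples_per_cell)) * width
            y = (cy + rng.random(samples_per_cell)) * width
            # Step 2: follow each point for one step and record the cell it lands in.
            xn, yn = chirikov_step(x, y, k)
            ix = np.minimum((xn / width).astype(int), m - 1)
            iy = np.minimum((yn / width).astype(int), m - 1)
            np.add.at(P[:, j], ix * m + iy, 1.0 / samples_per_cell)
    return P


P = ulam_network(k=0.5)
print(P.shape, P.sum(axis=0)[:5])
```

The resulting matrix can be fed to any of the PageRank solvers; finer grids and more samples per cell give a closer approximation of the dynamics at a higher cost.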

Algorithms & Convergence

1. Monte Carlo
$E[x(A)] \approx \frac{1}{N} \sum_{i=1}^{N} x(\alpha_i)$, $\alpha_i \sim A$

2. Path Damping (No Std)
$E[x(A)] \approx \sum_{\ell=0}^{N} E[A^{\ell} - A^{\ell+1}]\, P^{\ell} v$

3. Quadrature
$E[x(A)] = \int_l^r x(\alpha)\, d\rho(\alpha) \approx \sum_{i=1}^{N} w_i\, x(z_i)$

[Figure: three convergence plots (Monte Carlo, Path Damping, Quadrature), error on a log scale from 10^0 down to 10^-15 against N]
Convergence to semi-exact solutions on a 335-node graph (harvard500 strong component).

26/40
Blue = Beta(2, 16)
Green = Beta(1, 1, 0.1, 0.9)
Salmon = Uniform(0.6, 0.9)
Red = Beta(-0.5, -0.5, 0.2, 0.7)
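A quadrature estimate of E[x(A)] and Std[x(A)] can be sketched with NumPy alone. Note the swap: instead of a Gauss rule built for the Beta weight, this sketch uses a Gauss-Legendre rule and folds the (unnormalised) density into the weights. The density α^16 (1 − α)^2, the three-node matrix, and the function names are assumptions chosen for illustration.

```python
import numpy as np


def pagerank(P, alpha, v):
    """PageRank vector x(alpha) from (I - alpha P) x = (1 - alpha) v."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * P, (1 - alpha) * v)


def rapr_quadrature(P, v, density, n_points, left=0.0, right=1.0):
    """Approximate E[x(A)] and Std[x(A)] with an n_points Gauss-Legendre rule."""
    t, w = np.polynomial.legendre.leggauss(n_points)  # nodes and weights on [-1, 1]
    z = 0.5 * (right - left) * t + 0.5 * (right + left)  # map nodes to [left, right]
    w = w * density(z)
    w = w / w.sum()  # normalise, so the density may be unnormalised
    X = np.array([pagerank(P, a, v) for a in z])  # one PageRank solve per node
    mean = w @ X
    second = w @ X ** 2
    std = np.sqrt(np.maximum(second - mean ** 2, 0.0))
    return mean, std


# Hypothetical three-node column-stochastic matrix.
P = np.array([
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0],
])
v = np.full(3, 1.0 / 3)
density = lambda a: a ** 16 * (1 - a) ** 2  # assumed unnormalised density of A
mean_q, std_q = rapr_quadrature(P, v, density, n_points=20)
print(mean_q, std_q)
```

The Gauss nodes lie strictly inside the interval, so the singular case α = 1 is never evaluated; each node costs one PageRank solve, matching the "N PageRank systems" work estimate.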

$$
x_1(\alpha) = -\frac{23}{6030} \cdot \frac{f(\alpha)}{g(\alpha)}
$$

$f(\alpha) = 1724683103168320512000\,\alpha^{102} - 351689859974563275916800\,\alpha^{101} + 1046657678560756011923040\,\alpha^{100} + \dots$

$g(\alpha) = 21252680112847680000\,\alpha^{102} - 3542775096896042918400\,\alpha^{101} - 377301357230918051819160\,\alpha^{100} + \dots$

[Figure 2.5: A PageRank function. The slide prints every term of the two degree-102 polynomials $f$ and $g$.]

27/40

Random alpha PageRank has a rigorous convergence theory.

Method                             Conv.                  Work Required                   What is N?
Monte Carlo                        $1/\sqrt{N}$           N PageRank systems              number of samples from A
Path Damping (without Std[x(A)])   $r^{N+2} / N^{1+a}$    N + 1 matrix vector products    terms of Neumann series
Gaussian Quadrature                $r^{2N}$               N PageRank systems              number of quadrature points

a and r are parameters from Bet(a, b, l, r)

28/40

Convergence of quadrature in the r=1 regime is matrix dependent.
Singularities

[Figure: two plots of the singularities 1/λ. Left panel: $\log_{10}(9 + |1/\lambda|)\, e^{i \arg(1/\lambda)}$, both axes from −10 to 10. Right panel: 1/λ near 1, horizontal axis from 0.97 to 1.03, vertical axis from −0.03 to 0.03, with one point labeled 1.00129]

29/40
Note 500-node harvard500 graph from Cleve Moler, left plot is [...]

Establishing this theoretical convergence proved independently useful.

Constantine, Gleich, and Iaccarino. Spectral Methods for Parameterized Matrix Equations, SIMAX, 2010.

$A(s) x(s) = b(s)$

$\Leftrightarrow A(J_\infty) x(J_\infty) = b(J_\infty)$

$\Rightarrow A(J_N) x(J_N) = b(J_N)$ or

$\Rightarrow A_N(J_\infty) x_N(J_\infty) = b_N(J_\infty)$

Constantine, Gleich, and Iaccarino. A factorization of the spectral Galerkin system for parameterized matrix equations: derivation and applications, SISC 2011.

30/40

How to compute the Galerkin solution in a weakly intrusive manner.

A real test-case
Webspam application
Hosts of uk-2006 are labeled as spam, not-spam, other

                 P       R       f       FP      FN
Baseline         0.694   0.558   0.618   0.034   0.442
Beta(0.5,1.5)    0.695   0.561   0.621   0.034   0.439
Beta(1,1)        0.698   0.562   0.622   0.033   0.438
Beta(2,16)       0.699   0.562   0.623   0.033   0.438

31/40
Note Bagged (10) J48 decision tree classifier in Weka, mean of 50 repetitions from 10-fold cross-validation of 4948 non-spam and 674 spam hosts (5622 total).
Becchetti et al. Link analysis for Web spam detection, 2008.

New directions

32/40

Data driven surrogate functions
Beyond spectral methods for UQ

33/40

Network alignment

[Figure: graphs A and B joined by the candidate matches L; a "square" is highlighted]

maximize $w^T x + \tfrac{1}{2} x^T S x$

34/40

Nuclear-norm
matrix completion
based ranking
Gleich and Lim, KDD2011


Overlapping clusters
for distributed computation
Andersen, Gleich, and Mirrokni, WSDM2012

35/40

Local methods for massive network analysis
TOP-K ALGORITHM FOR KATZ
Approximate    
                           
where     is sparse

Keep     sparse too
Ideally, don’t “touch” all of    

This is possible for
personalized PageRank!

36/40

Graph spectra

37/40


What about time?
Real networks evolve in time.

What to do?

Look towards dynamical systems!

Now I must be preaching to the choir!

39/40

Questions?
www.cs.purdue.edu/homes/dgleich
Google “David Gleich”

40
