This document summarizes a talk on influence propagation in large graphs. It discusses theorems and algorithms related to modeling the spread of information, viruses, and diseases over networks. The document begins by motivating the importance of understanding dynamical processes over networks through examples related to epidemiology, viral marketing, cybersecurity, and more. It then outlines threshold results for epidemic models on static graphs that depend on the largest eigenvalue of the graph's adjacency matrix and properties of the propagation model. The talk discusses proofs of these results and also covers extensions to dynamic graphs and competing viruses. Finally, it discusses algorithms for determining who to immunize to control outbreaks.
Influence Propagation in Large Graphs - Theorems and Algorithms
1. Influence propagation in
large graphs - theorems and
algorithms
B. Aditya Prakash
http://www.cs.cmu.edu/~badityap
Christos Faloutsos
http://www.cs.cmu.edu/~christos
Carnegie Mellon University
A.U.T.’12
3. Networks are everywhere!
Facebook Network [2010]
Gene Regulatory Network
[Decourty 2008]
Human Disease Network
[Barabasi 2007]
The Internet [2005]
AUT '12 Prakash and Faloutsos 2012 3
5. Why do we care?
• Information Diffusion
• Viral Marketing
• Epidemiology and Public Health
• Cyber Security
• Human mobility
• Games and Virtual Worlds
• Ecology
• Social Collaboration
........
AUT '12 Prakash and Faloutsos 2012 5
6. Why do we care? (1:
Epidemiology)
• Dynamical Processes over networks
[AJPH 2007]
CDC data: Visualization
of the first 35
Diseases over contact networks tuberculosis (TB) patients
and their 1039 contacts
AUT '12 Prakash and Faloutsos 2012 6
7. Why do we care? (1:
Epidemiology)
• Dynamical Processes over networks
• Each circle is a hospital
• ~3000 hospitals
• More than 30,000 patients
transferred
[US-MEDICARE Problem: Given k units of
NETWORK 2005]
disinfectant, whom to immunize?
AUT '12 Prakash and Faloutsos 2012 7
8. Why do we care? (1:
Epidemiology)
~6x
fewer!
[US-MEDICARE
NETWORK 2005]
CURRENT PRACTICE OUR METHOD
Hospital-acquired inf. took 99K+ lives, cost $5B+ (all per year)
AUT '12 Prakash and Faloutsos 2012 8
9. Why do we care? (2: Online
Diffusion)
> 800m users, ~$1B
revenue [WSJ 2010]
~100m active users
> 50m users
AUT '12 Prakash and Faloutsos 2012 9
10. Why do we care? (2: Online
Diffusion)
• Dynamical Processes over networks
Buy Versace™!
Followers
Celebrity
Social Media Marketing
AUT '12 Prakash and Faloutsos 2012 10
11. High Impact – Multiple Settings
epidemic out-breaks
Q. How to squash rumors faster?
products/viruses
Q. How do opinions spread?
transmit s/w patches
Q. How to market better?
AUT '12 Prakash and Faloutsos 2012 11
12. Research Theme
ANALYSIS
Understanding
POLICY/
DATA ACTION
Large real-world Managing
networks & processes
AUT '12 Prakash and Faloutsos 2012 12
13. In this talk
Given propagation models:
Q1: Will an epidemic
happen?
ANALYSIS
Understanding
AUT '12 Prakash and Faloutsos 2012 13
14. In this talk
Q2: How to immunize and
control out-breaks better?
POLICY/
ACTION
Managing
AUT '12 Prakash and Faloutsos 2012 14
15. Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
AUT '12 Prakash and Faloutsos 2012 15
18. Problem Statement
# Infected
above (epidemic)
Separate the
regimes?
below (extinction)
time
Find, a condition under which
– virus will die out exponentially quickly
– regardless of initial infection condition
AUT '12 Prakash and Faloutsos 2012 18
19. Threshold (static version)
Problem Statement
• Given:
– Graph G, and
– Virus specs (attack prob. etc.)
• Find:
– A condition for virus extinction/invasion
AUT '12 Prakash and Faloutsos 2012 19
20. Threshold: Why important?
• Accelerating simulations
• Forecasting (‘What-if’ scenarios)
• Design of contagion and/or topology
• A great handle to manipulate the spreading
– Immunization
– Maximize collaboration
…..
AUT '12 Prakash and Faloutsos 2012 20
21. Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
AUT '12 Prakash and Faloutsos 2012 21
22. “SIR” model: life immunity
(mumps)
• Each node in the graph is in one of three states
– Susceptible (i.e. healthy)
– Infected
– Removed (i.e. can’t get infected again)
Prob. δ
.β
Prob
t=1 t=2 t=3
AUT '12 Prakash and Faloutsos 2012 22
23. Terminology: continued
• Other virus propagation models (“VPM”)
– SIS : susceptible-infected-susceptible, flu-like
– SIRS : temporary immunity, like pertussis
– SEIR : mumps-like, with virus incubation
(E = Exposed)
….………….
• Underlying contact-network – ‘who-can-infect-
whom’
AUT '12 Prakash and Faloutsos 2012 23
24. Related Work
R. M. Anderson and R. M. May. Infectious Diseases of Humans. Oxford University Press, 1991.
A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks.
All are about either:
Cambridge University Press, 2010.
F. M. Bass. A new product growth for model consumer durables. Management Science,
15(5):215–227, 1969.
D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real
networks. ACM TISSEC, 10(4), 2008.
• Structured
D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly
Connected World. Cambridge University Press, 2010.
A. Ganesh, L. Massoulie, and D. Towsley. The effect of network topology in spread of
topologies (cliques,
epidemics. IEEE INFOCOM, 2005.
Y. Hayashi, M. Minoura, and J. Matsukubo. Recoverable prevalence in growing scale-free
block-diagonals,
hierarchies, random)
networks and the effective immunization. arXiv:cond-at/0305549 v2, Aug. 6 2003.
H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42, 2000.
H. W. Hethcote and J. A. Yorke. Gonorrhea transmission dynamics and control. Springer
Lecture Notes in Biomathematics, 46, 1984.
J. O. Kephart and S. R. White. Directed-graph epidemiological models of computer viruses.
IEEE Computer Society Symposium on Research in Security and Privacy, 1991.
J. O. Kephart and S. R. White. Measuring and modeling computer virus prevalence. IEEE
Computer Society Symposium on Research in Security and Privacy, 1993.
• Specific virus
R. Pastor-Santorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical
Review Letters 86, 14, 2001. propagation models
………
………
• Static graphs
………
AUT '12 Prakash and Faloutsos 2012 24
25. Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
AUT '12 Prakash and Faloutsos 2012 25
26. How should the answer look
like?
• Answer should depend on:
– Graph
– Virus Propagation Model (VPM)
• But how??
– Graph – average degree? max. degree? diameter?
– VPM – which parameters?
– How to combine – linear? quadratic? exponential?
βd avg + δ diameter ? ( β 2 d 2 avg − δd avg ) / d max ? …..
AUT '12 Prakash and Faloutsos 2012 26
27. Static Graphs: Our Main Result
• Informally, w/ Deepay
Chakrabarti
For,
any arbitrary topology (adjacency
matrix A) λ
any virus propagation model (VPM) in
standard literature CVPM
the epidemic threshold depends only
•
No
8.on the λ, first eigenvalue of A, and epidemic if
9.some constant CVPM determined by the
, CVPM
virus propagation model
In Prakash+ ICDM 2011 (Selected among best
AUT '12 Prakash and Faloutsos 2012 27
28. Our thresholds for some models
• s = effective strength
• s < 1 : below threshold
Models Effective Strength Threshold (tipping
(s) point)
SIS, SIR, SIRS, SEIR β
s=λ.
δ
βγ
s=1
SIV, SEIV s=λ. δ (γ +θ )
β1v2 + β2ε
SI 1 I 2 V1 V2 (H.I.V.) s = λ . v2 ( ε + v1 )
AUT '12 Prakash and Faloutsos 2012 28
29. Our result: Intuition for λ
“Official” definition: “Un-official” Intuition
• Let A be the adjacency • λ ~ # paths in the graph
matrix. Then λ is the root
with the largest magnitude of
the characteristic polynomial k
A
of A [det(A – xI)].
≈λ
k
. u
• Doesn’t give much intuition!
k
A (i, j) = # of paths i j
of length k
AUT '12 Prakash and Faloutsos 2012 29
30. Largest Eigenvalue (λ)
better connectivity higher λ
λ≈2 λ= N λ = N-1
λ≈2 λ= 31.67 λ= 999
N = 1000
AUT '12 Prakash and Faloutsos 2012 N nodes
30
31. Examples: Simulations – SIR
(mumps)
Time ticks Effective Strength
(a) Infection profile (b) “Take-off” plot
PORTLAND graph: synthetic population,
AUT '12 31 million links, 6 million nodes
Prakash and Faloutsos 2012 31
32. Examples: Simulations – SIRS
(pertusis)
Time ticks Effective Strength
(a) Infection profile (b) “Take-off” plot
PORTLAND graph: synthetic population,
AUT '12 31 million links, 6 million nodes
Prakash and Faloutsos 2012 32
33. Models and more models
Model Used for
SIR Mumps
SIS Flu
SIRS Pertussis
SEIR Chicken-pox
……..
SICR Tuberculosis
MSIR Measles
SIV Sensor Stability
SI 1 I 2 V1 V2 H.I.V.
……….
33
36. Special case: H.I.V.
SI 1 I 2 V1 V2
“Non-terminal”
“Terminal”
Multiple Infectious,
Vigilant states
36
37. Ingredient 2: NLDS+Stability
• View as a NLDS
– discrete time
– non-linear dynamical system (NLDS)
S
.
.
Probability vector
.
Specifies the state I
size
mN x 1 of the system at time
t size N (number of V
.
. nodes in the
. 37
graph)
38. Ingredient 2: NLDS + Stability
• View as a NLDS
– discrete time
– non-linear dynamical system (NLDS)
.
.
Non-linear
.
function
size
mN x 1 Explicitly gives the
evolution of system
.
.
. 38
39. Ingredient 2: NLDS + Stability
• View as a NLDS
– discrete time
– non-linear dynamical system (NLDS)
• Threshold Stability of NLDS
39
40. Special case: SIR
S
S
size
I
3N x 1 I
R
R
= probability that node
i is not attacked by
any of its infectious
NLDS
neighbors
40
42. Stability for SIR
Stable Unstable
under threshold above threshold
42
43. See paper for
full proof
General VPM
structure Model-based
λ * CVPM <
1
Graph-based
Topology
and stability
43
44. Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
AUT '12 Prakash and Faloutsos 2012 44
45. See paper for
full proof
General VPM
structure Model-based
λ * CVPM < 1
Graph-based
Topology and
stability
AUT '12 Prakash and Faloutsos 2012 45
46. Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
AUT '12 Prakash and Faloutsos 2012 46
47. Dynamic Graphs: Epidemic?
DAY Alternating behaviors
(e.g., work)
adjacency 8
matrix
8
AUT '12 Prakash and Faloutsos 2012 47
48. Dynamic Graphs: Epidemic?
NIGHT Alternating behaviors
(e.g., home)
adjacency 8
matrix
8
AUT '12 Prakash and Faloutsos 2012 48
49. Model Description
Healthy
N2
• SIS model
Prob. β
– recovery rate δ N1 X Prob Prob. δ
.β
– infection rate β
Infected N3
• Set of T arbitrary graphs
day N N
night , weekend…..
N N
AUT '12 Prakash and Faloutsos 2012 49
50. Our result: Dynamic Graphs
Threshold
• Informally, NO epidemic if
eig (S) = <1
Single number! S =
Largest eigenvalue of
The system matrix S
In Prakash+, ECML-PKDD 2010
AUT '12 Prakash and Faloutsos 2012 50
52. Footprint (#
infected @
“steady state”)
“Take-off” plots
Synthetic MIT Reality
EPIDEMIC
Our EPIDEMIC Our
threshold threshold
NO EPIDEMIC NO EPIDEMIC
(log scale)
AUT '12 Prakash and Faloutsos 2012 52
53. Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
AUT '12 Prakash and Faloutsos 2012 53
54. Competing Contagions
iPhone v Android Blu-ray v HD-DVD
Biological common flu/avian flu, pneumococcal inf
AUT '12 Prakash and Faloutsos 2012 54
55. A simple model
• Modified flu-like
• Mutual Immunity (“pick one of the two”)
• Susceptible-Infected1-Infected2-Susceptible
Virus 1 Virus 2
AUT '12 Prakash and Faloutsos 2012 55
56. Question: What happens in the
end?
Number of green: virus 1
Infections red: virus 2
Footprint @ Steady State
Footprint @ Steady State
= ?
ASSUME:
Virus 1 is stronger than Virus
AUT '12 Prakash and Faloutsos 2012 56
57. Question: What happens in the
end? Footprint @ Steady State
Number of green: virus 1 Footprint @ Steady State
Infections red: virus 2
Strength
Strength
??
= 2
Strength
Strength
ASSUME:
Virus 1 is stronger than Virus
AUT '12 Prakash and Faloutsos 2012 57
58. Answer: Winner-Takes-All
Number of green: virus 1
Infections red: virus 2
ASSUME:
Virus 1 is stronger than Virus
AUT '12 Prakash and Faloutsos 2012 58
59. Our Result: Winner-Takes-All
Given our model, and any graph, the
weaker virus always dies-out completely
1. The stronger survives only if it is above threshold
2. Virus 1 is stronger than Virus 2, if:
strength(Virus 1) > strength(Virus 2)
4. Strength(Virus) = λ β / δ same as before!
In Prakash+ WWW 2012
AUT '12 Prakash and Faloutsos 2012 59
61. Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
AUT '12 Prakash and Faloutsos 2012 61
62. Full Static Immunization
Given: a graph A, virus prop. model and budget k;
Find: k ‘best’ nodes for immunization (removal).
?
?
k=2
?
?
AUT '12 Prakash and Faloutsos 2012 62
63. Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
– Full Immunization (Static Graphs)
– Fractional Immunization
AUT '12 Prakash and Faloutsos 2012 63
64. Challenges
• Given a graph A, budget k,
Q1 (Metric) How to measure the ‘shield-
value’ for a set of nodes (S)?
Q2 (Algorithm) How to find a set of k nodes
with highest ‘shield-value’?
AUT '12 Prakash and Faloutsos 2012 64
65. Proposed vulnerability measure
λ
λ is the epidemic threshold
“Safe” “Vulnerable” “Deadly”
Increasing λ
AUT '12
Increasing vulnerability
Prakash and Faloutsos 2012 65
66. A1: “Eigen-Drop”: an ideal shield
value
Eigen-Drop(S)
Δ λ = λ - λs
9 9
11 9
10
Δ 10
1
1
4 4
8 8
2 2
7 3 7
3
5 5
6 6
Original Graph Without {2, 6} 66
AUT '12 Prakash and Faloutsos 2012
67. (Q2) - Direct Algorithm too
expensive!
• Immunize k nodes which maximize Δ λ
S = argmax Δ λ
• Combinatorial!
• Complexity:
– Example:
• 1,000 nodes, with 10,000 edges
• It takes 0.01 seconds to compute λ
• It takes 2,615 years to find 5-best nodes!
AUT '12 Prakash and Faloutsos 2012 67
68. A2: Our Solution
• Part 1: Shield Value
– Carefully approximate Eigen-drop (Δ λ)
– Matrix perturbation theory
• Part 2: Algorithm
– Greedily pick best node at each step
– Near-optimal due to submodularity
• NetShield (linear complexity)
– O(nk2+m) n = # nodes; m = # edges
In Tong, Prakash+ ICDM 2010
AUT '12 Prakash and Faloutsos 2012 68
69. Experiment: Immunization
Log(fraction of
quality
infected
nodes)
PageRank
Betweeness (shortest path)
Degree
Lower Acquaintance
is NetShield Eigs (=HITS)
better Time
AUT '12 Prakash and Faloutsos 2012 69
70. Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
– Full Immunization (Static Graphs)
– Fractional Immunization
AUT '12 Prakash and Faloutsos 2012 70
71. Fractional Immunization of Networks
B. Aditya Prakash, Lada Adamic, Theodore
Iwashyna (M.D.), Hanghang Tong, Christos
Faloutsos
Under review
AUT '12 Prakash and Faloutsos 2012 71
72. Fractional Asymmetric
Immunization
Drug-resistant Bacteria
(like XDR-TB)
Hospital Another
Hospital
AUT '12 Prakash and Faloutsos 2012 72
73. Fractional Asymmetric
Immunization
Drug-resistant Bacteria
(like XDR-TB)
Hospital Another
Hospital
AUT '12 Prakash and Faloutsos 2012 73
74. Fractional Asymmetric
Immunization
Problem: Given k units of disinfectant,
how to distribute them to maximize
hospitals saved?
Hospital Another
Hospital
AUT '12 Prakash and Faloutsos 2012 74
75. Our Algorithm “SMART-
ALLOC”
~6x [US-MEDICARE NETWORK 2005]
fewer! • Each circle is a hospital, ~3000 hospitals
• More than 30,000 patients transferred
CURRENT PRACTICE SMART-ALLOC 75
AUT '12 Prakash and Faloutsos 2012
76. Running Time
Wall-Clock
> 1 week
Time
≈
> 30,000x
speed-up!
Lower
is 14 secs
better Simulations SMART-ALLOC
AUT '12 Prakash and Faloutsos 2012 76
77. Lower Experiments
is
better
PENN-NETWORK SECOND-LIFE
~5 x ~2.5 x
AUT '12 K = 200 Prakash and Faloutsos 2012 K = 2000 77
79. References
• Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks (B. Aditya
Prakash, Deepayan Chakrabarti, Michalis Faloutsos, Nicholas Valler, Christos Faloutsos) -
In IEEE ICDM 2011, Vancouver (Invited to KAIS Journal Best Papers of ICDM.)
• Virus Propagation on Time-Varying Networks: Theory and Immunization Algorithms (B.
Aditya Prakash, Hanghang Tong, Nicholas Valler, Michalis Faloutsos and Christos
Faloutsos) – In ECML-PKDD 2010, Barcelona, Spain
• Epidemic Spreading on Mobile Ad Hoc Networks: Determining the Tipping Point (Nicholas
Valler, B. Aditya Prakash, Hanghang Tong, Michalis Faloutsos and Christos Faloutsos) –
In IEEE NETWORKING 2011, Valencia, Spain
• Winner-takes-all: Competing Viruses or Ideas on fair-play networks (B. Aditya Prakash,
Alex Beutel, Roni Rosenfeld, Christos Faloutsos) – In WWW 2012, Lyon
• On the Vulnerability of Large Graphs (Hanghang Tong, B. Aditya Prakash, Tina Eliassi-
Rad and Christos Faloutsos) – In IEEE ICDM 2010, Sydney, Australia
• Fractional Immunization of Networks (B. Aditya Prakash, Lada Adamic, Theodore
Iwashyna, Hanghang Tong, Christos Faloutsos) - Under Submission
• Rise and Fall Patterns of Information Diffusion: Model and Implications (Yasuko
Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos) - Under
Submission
http://www.cs.cmu.edu/~badityap/
AUT '12 Prakash and Faloutsos 2012 79
80. Propagation on Large Networks
B. Aditya Prakash
Christos Faloutsos
Analysis Policy/Action Data
AUT '12 Prakash and Faloutsos 2012 80