The network inference problem consists of reconstructing the edge set of a network given traces representing the chronology of infection times as epidemics spread through the network. This problem is a paradigmatic representative of prediction tasks in machine learning that require deducing a latent structure from observed patterns of activity in a network, which often requires an unrealistically large amount of resources (e.g., available data or computational time). A fundamental question is to understand which properties we can predict with a reasonable degree of accuracy given the available resources, and which we cannot. We define the trace complexity as the number of distinct traces required to achieve high fidelity in reconstructing the topology of the unobserved network or, more generally, some of its properties. We give algorithms that are competitive with, while being simpler and more efficient than, existing network inference approaches. Moreover, we show that our algorithms are nearly optimal, by proving an information-theoretic lower bound on the number of traces that an optimal inference algorithm requires for performing this task in the general case. Given these strong lower bounds, we turn our attention to special cases, such as trees and bounded-degree graphs, and to property-recovery tasks, such as reconstructing the degree distribution without inferring the network. We show that these problems require a much smaller (and more realistic) number of traces, making them potentially solvable in practice. By Bruno Abrahao, Flavio Chierichetti, Robert Kleinberg, and Alessandro Panconesi.
Trace Complexity of Network Inference
1. Trace Complexity of Network Inference
Bruno Abrahao (Cornell)
Flavio Chierichetti (Sapienza)
Robert Kleinberg (Cornell)
Alessandro Panconesi (Sapienza)
Wednesday, August 14, 13
3. Influence and diffusion on networks
• Network inference: find influencers, improve marketing, prevent disease outbreaks, and forecast crimes
5. The Network Inference Problem
• Learning each edge independently
  - [Adar, Adamic '05]
• MLE-inspired approaches
  - [Gomez-Rodriguez, Leskovec, Krause '10]
  - [Gomez-Rodriguez, Balduzzi, Schölkopf '11]
  - [Myers, Leskovec '11]
  - [Du et al. '12]
• Information-theoretic ← Our work
  - [Netrapalli, Sanghavi '12]
  - [Gripon, Rabbat '13]
6. The Network Inference Problem
• The relationship between the amount of data and the performance of inference algorithms is not well understood.
What can be inferred? What amount of resources is required? How hard is the inference task?
7. Our goal
• Provide a rigorous foundation for network inference:
1. develop a measure that relates the amount of data to the performance of algorithms
2. give information-theoretic performance guarantees
3. develop more efficient algorithms
8. We assume an underlying cascade model
(Figure: an example network on nodes s, a, b, c, d, e.)
• A cascade starts at a source node s at time t = 0.0.
• When a node becomes infected, each of its incident edges transmits the infection independently with probability p (Pr{H} = p, a coin flip per edge).
• A transmitting edge delivers the infection after an incubation time drawn from Exp(λ).
• The resulting trace records each infected node together with its infection time, e.g.:
  Node s, t=0.0 → Node c, t=0.345 → Node a, t=1.236 → Node b, t=1.705
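The cascade model above is easy to simulate in order to generate synthetic traces. A minimal sketch, assuming an adjacency-list representation and per-edge-attempt coin flips (the function and variable names are illustrative, not from the paper):

```python
import heapq
import random

def simulate_trace(adj, p, lam, source):
    """Generate one trace from the cascade model: the infection starts at
    `source` at time 0.0; each infection attempt along an edge succeeds
    independently with probability p and, if it succeeds, arrives after an
    Exp(lam) incubation time.  Returns (node, time) pairs sorted by time."""
    infected = {source: 0.0}
    heap = []  # (arrival_time, node) events
    for v in adj[source]:
        if random.random() < p:
            heapq.heappush(heap, (random.expovariate(lam), v))
    while heap:
        t, u = heapq.heappop(heap)
        if u in infected:
            continue  # u was already reached by an earlier event
        infected[u] = t
        for v in adj[u]:
            if v not in infected and random.random() < p:
                heapq.heappush(heap, (t + random.expovariate(lam), v))
    return sorted(infected.items(), key=lambda kv: kv[1])
```

With p = 1 on a path s–a–b, the infection must proceed in order s, a, b, matching the trace format shown on the slide.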
17. Traces and cascades
• Each cascade generates one trace
• Random cascade: starts at a node chosen uniformly at random (an assumption in some of our models)
• Traces do not directly reflect the underlying network over which the cascade propagates
Trace: Node s, t=0.0 → Node c, t=0.345 → Node a, t=1.236 → Node b, t=1.705
How much structural information is contained in a trace?
18. Our Research Question I
How many traces do we need to reconstruct the underlying network?
We call this measure the trace complexity of the problem.
19. Our Research Question II
How does trace length play a role in inference? As we keep scanning the trace, it becomes less and less informative.
Trace: Node s, t=0.0 → Node c, t=0.345 → Node a, t=1.236 → Node b, t=1.705
The first edge (s, c) is unambiguous, but a may have been infected by s or by c, and b by any of s, c, or a.
23. The head of the trace
• First-Edge Algorithm
  - Infers the edge corresponding to the first two nodes in each trace (and ignores the rest of the trace)
  - E.g., from the trace Node s, t=0.0 → Node c, t=0.345 → Node a, t=1.236 → ... it keeps only the edge (s, c)
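The First-Edge rule described above can be sketched in a few lines, assuming each trace is a time-sorted list of (node, time) pairs:

```python
def first_edge(traces):
    """First-Edge: from each trace, infer only the edge between the first
    two infected nodes, ignoring the rest of the trace."""
    edges = set()
    for trace in traces:
        if len(trace) >= 2:
            u, v = trace[0][0], trace[1][0]
            edges.add(frozenset((u, v)))  # undirected edge {u, v}
    return edges
```

The rule is sound because the second node in a trace can only have been infected by the source, so every reported edge is a true positive.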
25. Contributions
1. The head of traces
   • First-Edge is close to the best we can do for exact reconstruction: Ω(nΔ^(1−ε)) traces are necessary
2. The tail of traces
   • We give algorithms using exponentially fewer traces
     - trees: O(log n)
     - bounded-degree graphs: O(poly(Δ) log n)
3. Inferring properties without reconstructing the network itself
   - degree distribution: O(n)
26. How many traces do we need for exact reconstruction of general graphs?
27. Lower bound for exact reconstruction of general graphs
Consider G0 = Kn and G1 = Kn − {a, b}, i.e., the complete graph and the complete graph with the single edge {a, b} removed.
1. We choose the unknown graph from {G0, G1}
2. Run random cascades on the chosen graph
28. Lower bound for exact reconstruction of general graphs
Given a set of ℓ random traces T1, ..., Tℓ, Bayes' rule can tell us which of the two alternatives G0 or G1 is the most likely.
29. Lower bound for exact reconstruction of general graphs
Lemma. Let ℓ < n^(2−ε), for any small positive constant ε. Then, with probability 1 − o(1) over the random traces T1, ..., Tℓ, the posterior Pr{G0 | T1, ..., Tℓ} lies in [1/2 − o(1), 1/2 + o(1)].
30. Lower bound for exact reconstruction of general graphs
Corollary. Let Δ be the largest degree of a node in the network. If ℓ < nΔ^(1−ε), any algorithm will fail to reconstruct the graph with high probability: Ω(nΔ^(1−ε)) traces are necessary.
31. The head of the trace
First-Edge reconstructs the graph with O(nΔ log n) traces.
First-Edge: O(nΔ log n) vs. lower bound: Ω(nΔ^(1−ε))
First-Edge is close to the best we can do for exact reconstruction!
32. Can we reconstruct special families of graphs using fewer traces?
33. The tail of the trace
• Contains useful information for reconstructing special families of graphs
• We give algorithms for inference using exponentially fewer traces:
  - trees: O(log n)
  - bounded-degree graphs: O(poly(Δ) log n)
34. Maximum Likelihood Tree Estimation
We can perfectly reconstruct trees with high probability using O(log n) traces.
Take ℓ traces.
1. Set c(u, v) as the median of the observations |t(u) − t(v)| over all traces.
• If (u, v) ∈ E, then {u, v} is the only route of infection between u and v, so the incubation time between u and v is a sample of Exp(λ): c(u, v) < 1/λ with probability approaching 1 exponentially fast in ℓ.
• Otherwise*, c(u, v) > 1/λ with probability approaching 1 exponentially fast in ℓ. (*Step 3 omitted.)
The probability that all these events happen is at least 1 − 1/n^c using ℓ ≥ c · log n traces.
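Step 1 of the tree estimator can be sketched as follows — a simplified version that thresholds the median inter-infection gap at 1/λ and omits the slide's step 3; the function name and trace format are illustrative assumptions:

```python
import statistics

def infer_tree_edges(traces, lam):
    """Median-based tree test: for each pair (u, v) seen together in
    traces, set c(u, v) to the median of |t(u) - t(v)|; keep the pairs
    with c(u, v) < 1/lam as inferred edges.  For a tree edge this gap is
    a single Exp(lam) sample (median ln(2)/lam < 1/lam); for a non-edge
    it is a sum of at least two incubation times, so its median is larger."""
    gaps = {}  # (u, v) with u < v  ->  list of observed |t(u) - t(v)|
    for trace in traces:
        t = dict(trace)
        nodes = sorted(t)
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                gaps.setdefault((u, v), []).append(abs(t[u] - t[v]))
    return {pair for pair, diffs in gaps.items()
            if statistics.median(diffs) < 1.0 / lam}
```

On synthetic traces from a path s–a–b with λ = 1, the medians concentrate on the correct side of the threshold already for a few hundred traces, in line with the exponential concentration claimed on the slide.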
43. Local MLE for inferring bounded-degree graphs
• Think of the potential neighbor sets of u as "forecasters" predicting the infection time of u, given their own infection times
• Identify the most accurate one using a proper scoring rule
Trace complexity: O(poly(Δ) log n)
44. Can we recover properties of a network without paying the full price of network reconstruction?
45. Obtaining network properties more cheaply
• Useful for reasoning about the behavior of processes that take place in the network:
  • Robustness [Cohen et al. '00]
  • Network evolution [Leskovec, Kleinberg, Faloutsos '05]
  • ...
We can infer the degree distribution with high probability with O(n) traces, versus the Ω(nΔ^(1−ε)) lower bound for reconstructing the whole network.
52. Reconstructing the degree distribution
Let d be the degree of the source s, and let t1, t2, ..., tℓ be the times of the first infection in ℓ traces started at s. Each ti is the minimum of d independent Exp(λ) clocks, i.e., a sample of Exp(dλ).
• T = Σ_{i=1}^{ℓ} ti is Erlang(ℓ, dλ)
• Output: d̂ = ℓ / (λT)
Using the Poisson tail bound Pr{Erlang(n, λ) < z} = Pr{Pois(z·λ) ≥ n}, we achieve a (1 + ε)-approximation with probability 1 − δ using Ω(ε⁻² ln(1/δ)) traces.
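The estimator itself is a one-liner. A minimal sketch, assuming we observe, in each trace started at s, the waiting time until s's first neighbor is infected (names are illustrative):

```python
def estimate_degree(first_infection_times, lam):
    """Estimate the degree d of a known source s.  Each waiting time until
    the first infection is the minimum of d Exp(lam) clocks, i.e. a sample
    of Exp(d*lam); the sum T of ell such samples is Erlang(ell, d*lam),
    so the maximum-likelihood estimate is d_hat = ell / (lam * T)."""
    ell = len(first_infection_times)
    T = sum(first_infection_times)
    return ell / (lam * T)
```

For a source of true degree 5 with λ = 1, a couple of thousand simulated waiting times put the estimate within a few percent of 5, consistent with the ε⁻² dependence above.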
54. Reconstructing the degree distribution: experiments
• Using 10n traces on three networks: Barabási–Albert (1024 nodes), Facebook-Rice Undergraduate (1220 nodes), and Facebook-Rice Graduate (503 nodes)
55. Building on the First-Edge algorithm
• First-Edge is close to optimal, but
  • naive and too conservative: it ignores most of the trace information
  • predictable performance: at most as many true-positive edges as the number of traces (and no false positives)
56. Could we discover more true positives if we are willing to take more (calculated) risks?
58. First-Edge+
• Idea: 1. Reconstruct the degree distribution
        2. Guess edges by exploiting the memoryless property
Suppose the source s, with degree ds, is infected at time t0, and node u, with degree du, at time t1. At time t1 there are (ds − 1) + (du − 1) edges waiting, and by memorylessness any of these is equally likely to be the first to finish. So when a third node v becomes infected at time t2:
• s infected v with probability p(s,v) = (ds − 1) / (ds + du − 2)
• u infected v with probability p(u,v) = (du − 1) / (ds + du − 2)
Infer (x, y) if p(x,y) ≥ 0.5.
Given a larger trace prefix u1, ..., uk (u1 is the source): p(ui, u_{k+1}) ≈ d_{ui} / Σ_j d_{uj}
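The attribution rule above can be sketched directly from the two formulas (function names are illustrative, not from the paper):

```python
def p_infector(ds, du):
    """Probability that the source s (degree ds), rather than u (degree du),
    infected the third node: of the (ds-1) + (du-1) exponential clocks
    racing at time t1, each is equally likely to finish first."""
    return (ds - 1) / (ds + du - 2)

def guess_edge(s, u, v, ds, du):
    """First-Edge+ rule for the two-node case: attribute the new node v
    to whichever of s, u has attribution probability >= 0.5."""
    return (s, v) if p_infector(ds, du) >= 0.5 else (u, v)

def attribution_probs(degrees):
    """Approximate rule for a longer infected prefix u1..uk:
    p(ui, u_{k+1}) ~ d_ui / sum_j d_uj."""
    total = sum(degrees)
    return [d / total for d in degrees]
```

Note the trade-off the slides describe: unlike First-Edge, these guesses can be wrong, but they let the algorithm claim many more edges per trace.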
72. Conclusions
• Our results have direct implications for the design of network inference algorithms
• We provide a rigorous analysis of the relationship between the amount of data and the performance of algorithms
• We give algorithms that are competitive with, while being simpler and more efficient than, existing approaches
73. Open questions and challenges
• Performance guarantees for approximate reconstruction
• Trace complexity under other distributions of incubation times
• Bounded-degree network inference has trace complexity polynomial in Δ, but running time exponential in Δ
  - Can we optimize the algorithm?
• Other network properties that can be recovered without reconstructing the network
74. Trace Complexity of Network Inference
Bruno Abrahao (Cornell)
Flavio Chierichetti (Sapienza)
Robert Kleinberg (Cornell)
Alessandro Panconesi (Sapienza)
Complete version including all proofs:
www.arxiv.org/abs/1308.2954
or
http://www.cs.cornell.edu/~abrahao