TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
variational bayes in biophysics
1. bio+stats vbem/networks hierarchical
variational and hierarchical modeling
for biological data
chris wiggins
columbia
april 23, 2012
chris.wiggins@columbia.edu 4/23/12
Chris Wiggins
• APAM: Department of Applied Physics and Applied Mathematics;
• C2B2: Center for Computational Biology and Bioinformatics;
• CISB: Columbia University Initiative in Systems Biology
• ISDE: Institute for Data Sciences and Engineering
Columbia University
September 28, 2012
2. bio+stats vbem/networks hierarchical biological challenges inference model selection
thanks. . .
- jake hofman (vbmod,vbfret)
- jonathan bronson (vbfret)
- jan-willem van de meent (hfret)
- ruben gonzalez (vbfret, hfret)
for more info:
- vbfret.sourceforge.net
- vbmod.sourceforge.net
- hfret.sourceforge.net (soon)
chris.wiggins@columbia.edu 4/23/12
BMC bioinformatics, 2010;
PNAS 2009;
Biophysical Journal 2009;
3. bio+stats vbem/networks hierarchical
1 biology and statistics
genomics
generative modeling
2 variational/biological networks
variational Bayesian expectation maximization
inference
model selection
3 hierarchical/time series
biological challenges
inference
model selection
chris.wiggins@columbia.edu 4/23/12
8. introduction formulation results extensions motivation history
introduction:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Hartwell, Hopfield, Leibler and Murray
NATURE|VOL 402 | SUPP | 2
DECEMBER 1999 | www.nature.com
9. introduction formulation results extensions motivation history
motivation:
community detection in networks
social networks
biological networks
problem: over-fitting/resolution limit
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
10. introduction formulation results extensions motivation history
history:
by community
math/cs: spectral methods (Fiedler ’74, Shi + Malik ’00)
math/cs: clustering generally (Taskar, Koller, Getoor)
physics: modularity
common thread: test w/ stochastic block model (’76, ’83)
ergo: use as inference tool (Hastings 0604429,
Newman+Liecht 061148)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
11. introduction formulation results extensions generative model max likelihood max evidence algo
formulation:
generative model
maximum likelihood
maximum evidence
complexity control. . .
variational/mft. . .
algorithm
in physics: “test hamiltonian”
in ML “variational bayesian methods” (Jordan, Mackay)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
12. introduction formulation results extensions generative model max likelihood max evidence algo
generative model:
foreach node roll K-sided die with bias π to choose
zi {1, . . . , K}
foreach edge flip coin with bias ϑ+ if zi = zj , else ϑ−
draw edge if coin lands heads up
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Stochastic block models (Holland, Laskey, Leinhardt 1983; Wang and Wong, 1987)
i≠j
zi zj
Aij
π
θ
13. introduction formulation results extensions generative model max likelihood max evidence algo
generative model. . . (bis)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Die rolling, coin flipping, and priors: where counts are:
non-edges within
modules
edges within
modules
edges between
modules
non-edges
between modules
nodes in each
module
14. introduction formulation results extensions generative model max likelihood max evidence algo
max likelihood:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
H ⇥ ln p(A, ⌦z|⌦⇤, ⌦⇥) =
i,j
(JLAij JG) zi,zj +
K
µ=1
hµ
N
i=1
zi,µ
JG ⇥ ln ⇥c/⇥d
JL ⇥ ln(1 ⇥d)/(1 ⇥c) + JG
hµ ⇥ ln µ
Extends Newman (2004, 2006), Hastings (2006), Bornholdt & Reichardt (2006)
•Die rolling, coin flipping <-> infinite-range spin-glass Potts model:
15. introduction formulation results extensions generative model max likelihood max evidence algo
formulation:
generative model
maximum likelihood
maximum evidence
complexity control. . .
variational/mft. . .
algorithm
in physics: “test hamiltonian”
in ML “variational bayesian methods” (Jordan, Mackay)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
16. introduction formulation results extensions generative model max likelihood max evidence algo
formulation:
generative model
maximum likelihood
maximum evidence
complexity control. . .
variational/mft. . .
algorithm
in physics: “test hamiltonian”
in ML “variational bayesian methods” (Jordan, Mackay)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
17. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Increasing complexity
18. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
http://research.microsoft.com/~minka/statlearn/demo/
19. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
20. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
cf. “BIC” Schwartz, 1978
21. introduction formulation results extensions generative model max likelihood max evidence algo
generative model:
foreach node roll K-sided die with bias π to choose
zi {1, . . . , K}
foreach edge flip coin with bias ϑ+ if zi = zj , else ϑ−
draw edge if coin lands heads up
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Stochastic block models (Holland, Laskey, Leinhardt 1983; Wang and Wong, 1987)
i≠j
zi zj
Aij
π
θ
22. introduction formulation results extensions generative model max likelihood max evidence algo
generative model:
foreach node roll K-sided die with bias π to choose
zi {1, . . . , K}
foreach edge flip coin with bias ϑ+ if zi = zj , else ϑ−
draw edge if coin lands heads up
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Stochastic block models (Holland, Laskey, Leinhardt 1983; Wang and Wong, 1987)
i≠j
zi zj
Aij
π
θ c
n
23. introduction formulation results extensions generative model max likelihood max evidence algo
generative model. . . (bis)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Die rolling, coin flipping, and priors: where counts are:
non-edges within
modules
edges within
modules
edges between
modules
non-edges
between modules
nodes in each
module
24. introduction formulation results extensions generative model max likelihood max evidence algo
generative model. . . (bis)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Die rolling, coin flipping, and priors: where counts are:
non-edges within
modules
edges within
modules
edges between
modules
non-edges
between modules
nodes in each
module
25. introduction formulation results extensions generative model max likelihood max evidence algo
max likelihood:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Extends Newman (2004, 2006), Hastings (2006), Bornholdt & Reichardt (2006)
•Die rolling, coin flipping <-> infinite-range spin-glass Potts model:
•Infer distributions over spin assignments, coupling constants, and
chemical potentials and find number of occupied spin states
JG ⇥ ln ⇥c/⇥d
JL ⇥ ln(1 ⇥d)/(1 ⇥c) + JG
hµ ⇥ ln µ
•Die rolling, coin flipping <-> infinite-range spin-glass Potts model:
•Infer distributions over spin assignments, coupling constants, and
chemical potentials and find number of occupied spin states
H ⇥ ln p(A, ⌦z|⌦⇤, ⌦⇥) =
i,j
(JLAij JG) zi,zj +
K
µ=1
hµ
N
i=1
zi,µ
26. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Extends Newman (2004, 2006), Hastings (2006), Bornholdt & Reichardt (2006)
•Die rolling, coin flipping <-> infinite-range spin-glass Potts model:
•Infer distributions over spin assignments, coupling constants, and
chemical potentials and find number of occupied spin states
H ⇥ ln p(A, ⌦z|⌦⇤, ⌦⇥) =
i,j
(JLAij JG) zi,zj +
K
µ=1
hµ
N
i=1
zi,µ
JG ⇥ ln ⇥c/⇥d
JL ⇥ ln(1 ⇥d)/(1 ⇥c) + JG
hµ ⇥ ln µ
p(A|K) =
⇥z
⇥
d⌦
⇥
d⌦⇥ p(A,⌦z,⌦⇥, ⌦) =
⇥z
⇥
d⌦
⇥
d⌦⇥ e H
p(⌦)p(⌦⇥)
27. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Extends Newman (2004, 2006), Hastings (2006), Bornholdt & Reichardt (2004 & 2006)
•Die rolling, coin flipping <-> infinite-range spin-glass Potts model:
•Infer distributions over spin assignments, coupling constants, and
chemical potentials and find number of occupied spin states
H ⇥ ln p(A, ⌦z|⌦⇤, ⌦⇥) =
i,j
(JLAij JG) zi,zj +
K
µ=1
hµ
N
i=1
zi,µ
JG ⇥ ln ⇥c/⇥d
JL ⇥ ln(1 ⇥d)/(1 ⇥c) + JG
hµ ⇥ ln µ
p(A|K) =
⇥z
⇥
d⌦
⇥
d⌦⇥ p(A,⌦z,⌦⇥, ⌦) =
⇥z
⇥
d⌦
⇥
d⌦⇥ e H
p(⌦)p(⌦⇥)
Can do integrals,
but sum is
intractable, O(KN);
use mean-field
28. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
• Gibbs’/Jensen’s inequality (log of expected value bounds expected value of log) for any distribution q
p(A|K) =
⇥z
⇥
d⌦
⇥
d⌦⇥ p(A,⌦z,⌦⇥, ⌦) =
⇥z
⇥
d⌦
⇥
d⌦⇥ e H
p(⌦)p(⌦⇥)
Variational Bayes (MacKay, Jordan, Ghahramani, Jaakola, Saul 1999; cf. Feynman 1972)
29. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
why would you do this? (A1):
Beal, 2003
30. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
why would you do this? (A2):
Beal, 2003
31. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
why would you do this? (A3):
Beal, 2003
32. introduction formulation results extensions generative model max likelihood max evidence algo
max evidence:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
• Gibbs’/Jensen’s inequality (log of expected value bounds expected value of log) for any distribution q
Variational Bayes (MacKay, Jordan, Ghahramani, Jaakola, Saul 1999; cf. Feynman 1972)
• F is a functional of q; find approximation to posterior by optimizing approximation to
evidence
• Take q(z, π, θ)=q(z)q(π)q(θ); Qiμ is probability node i in module μ where expected counts
are:
33. introduction formulation results extensions generative model max likelihood max evidence algo
algo:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
where expected counts
are:
34. introduction formulation results extensions generative model max likelihood max evidence algo
algo:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
where expected counts
are:
35. introduction formulation results extensions generative model max likelihood max evidence algo
algo:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
36. introduction formulation results extensions generative model max likelihood max evidence algo
algo:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
suggests hard limit in step 3; sparse in step 1
37. introduction formulation results extensions run time consistency good vs easy real data
results:
run time
consistency
required plot: good vs. easy
real data
karate
biology
american football
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
38. introduction formulation results extensions run time consistency good vs easy real data
run time:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
• Main loop runtime for 104 nodes in MATLAB ~30 seconds
39. introduction formulation results extensions run time consistency good vs easy real data
consistency:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
40. introduction formulation results extensions run time consistency good vs easy real data
consistency:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
2
4
6
8
10
12
θ
N=8, K=2, distribution after 2 iterations
p(θ+
)
p(θ
−
)
41. introduction formulation results extensions run time consistency good vs easy real data
consistency:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
• K=4?
• Automatic complexity control: probability of occupation for extraneous modules
goes to zero
42. introduction formulation results extensions run time consistency good vs easy real data
consistency:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
• K=4?
• Automatic complexity control: probability of occupation for extraneous modules
goes to zero
44. introduction formulation results extensions run time consistency good vs easy real data
consistency:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
The “resolution limit” problem
10 12 14 16 18 20
8
10
12
14
16
18
20
Ktrue
K*
K
*
=Ktrue
Variational Bayes
Modularity optimization
10 12 14 16 18 20
0.72
0.74
0.76
0.78
0.8
0.82
0.84
Ktrue
GNmodularity
Resolution limit problem on ring of 4−node cliques
Single−clique communities (correct)
Double−clique communities (incorrect)
GN modularity (Clauset’s algorithm)
Girvan-Newman modularity or Potts model w/ fixed parameters suffers from a resolution limit,
where size of detected modules depends on network size
Fortunato et. al. (2007), Kumpula et. al. (2007),
45. introduction formulation results extensions run time consistency good vs easy real data
good vs easy:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
47. introduction formulation results extensions run time consistency good vs easy real data
real data:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
APS march meeting 2008
superconductivity
(experimentalists)
Nanotubes, Graphene
superconductivity
(theorists)
48. introduction formulation results extensions
extensions:
model extensions
full SBM (done)
hierarchical model p(Aij = 1|zi = zi ± 1)
hierarchical modeling (ensemble of graphs)
p(D|u, K) = ΠL
i dϑi p(D|ϑi )p(ϑi |u, K)
Rd
embedding (latent are real)
more ‘rigorous’ SOM?
‘correct’ for degree (allow variable affinity) (cf. Bader, Karrer)
algorithm extensions
BP (see earlier talks)
map-reduce
cvmod (model selection via cross validation)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
49. introduction formulation results extensions
full SBM. . .
probability of edge depends only on block membership:
p(Aij |zi = µ, zj = ν) = ϑµν
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
50. introduction formulation results extensions
full SBM. . .
probability of edge depends only on block membership:
p(Aij |zi = µ, zj = ν) = ϑµν
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
• Nodes belong to “blocks” of
varying size
• Roll die for assignment of
nodes to blocks
• Probability of edge between two
nodes depends only on block
membership
• Flip (one of K2) coins for edges
• Result: mixture of Erdos-Renyi
graphs
0 20 40 60 80 100 120
0
20
40
60
80
100
120
nz = 2275
adjacency matrix
51. introduction formulation results extensions
full SBM. . .
probability of edge depends only on block membership:
p(Aij |zi = µ, zj = ν) = ϑµν
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
vs
52. introduction formulation results extensions
full SBM. . .
probability of edge depends only on block membership:
p(Aij |zi = µ, zj = ν) = ϑµν
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
0 20 40 60 80 100 120
0
20
40
60
80
100
120
nz = 2803
adjacency matrix
53. introduction formulation results extensions
full SBM. . .
probability of edge depends only on block membership:
p(Aij |zi = µ, zj = ν) = ϑµν
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
vs
54. introduction formulation results extensions
full SBM. . .
probability of edge depends only on block membership:
p(Aij |zi = µ, zj = ν) = ϑµν
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
>> vbsbm_vs_vbmod(0)
running vbmod ...
Elapsed time is 1.136925 seconds.
running vbsbm ...
Elapsed time is 1.398904 seconds.
Fmod=13089.158019 Fsbm=13144.445782
vbmod wins
55. introduction formulation results extensions
full SBM. . .
probability of edge depends only on block membership:
p(Aij |zi = µ, zj = ν) = ϑµν
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
>> vbsbm_vs_vbmod(0.25)
running vbmod ...
Elapsed time is 1.557298 seconds.
running vbsbm ...
Elapsed time is 1.759527 seconds.
Fmod=20457.142416 Fsbm=19457.306022
vbsbm wins
56. introduction formulation results extensions
full SBM. . .
probability of edge depends only on block membership:
p(Aij |zi = µ, zj = ν) = ϑµν
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
>> vbsbm_vs_vbmod(0.5)
running vbmod ...
Elapsed time is 2.624886 seconds.
running vbsbm ...
Elapsed time is 1.440242 seconds.
Fmod=26133.351210 Fsbm=23921.797625
vbsbm wins
57. introduction formulation results extensions
full SBM. . .
probability of edge depends only on block membership:
p(Aij |zi = µ, zj = ν) = ϑµν
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
• Using same framework we can compare the
unconstrained and full stochastic block models via p(D|M,K*)
0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
perturbation to constrained model
winpercentageforunconstrainedmodel
0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
perturbation to constrained model
winpercentageforunconstrainedmodel
0 20 40 60 80 100 120
0
20
40
60
80
100
120
nz = 2100
adjacency matrix
0 20 40 60 80 100 120
0
20
40
60
80
100
120
nz = 2048
adjacency matrix
0 20 40 60 80 100 120
0
20
40
60
80
100
120
nz = 2108
adjacency matrix
58. introduction formulation results extensions
extensions:
model extensions
full SBM (done)
hierarchical model p(Aij = 1|zi = zi ± 1)
hierarchical modeling (ensemble of graphs)
p(D|u, K) = ΠL
i dϑi p(D|ϑi )p(ϑi |u, K)
Rd
embedding (latent are real)
more ‘rigorous’ SOM?
‘correct’ for degree (allow variable affinity) (cf. Bader, Karrer)
algorithm extensions
BP (see earlier talks)
map-reduce
cvmod (model selection via cross validation)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
59. introduction formulation results extensions
extensions:
model extensions
full SBM (done)
hierarchical model p(Aij = 1|zi = zi ± 1)
hierarchical modeling (ensemble of graphs)
p(D|u, K) = ΠL
i dϑi p(D|ϑi )p(ϑi |u, K)
Rd
embedding (latent are real)
more ‘rigorous’ SOM?
‘correct’ for degree (allow variable affinity) (cf. Bader, Karrer)
algorithm extensions
BP (see earlier talks)
map-reduce
cvmod (model selection via cross validation)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
60. introduction formulation results extensions
extensions:
model extensions
full SBM (done)
hierarchical model p(Aij = 1|zi = zi ± 1)
hierarchical modeling (ensemble of graphs)
p(D|u, K) = ΠL
i dϑi p(D|ϑi )p(ϑi |u, K)
Rd
embedding (latent are real)
more ‘rigorous’ SOM?
‘correct’ for degree (allow variable affinity) (cf. Bader, Karrer)
algorithm extensions
BP (see earlier talks)
map-reduce
cvmod (model selection via cross validation)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
61. introduction formulation results extensions
extensions:
model extensions
full SBM (done)
hierarchical model p(Aij = 1|zi = zi ± 1)
hierarchical modeling (ensemble of graphs)
p(D|u, K) = ΠL
i dϑi p(D|ϑi )p(ϑi |u, K)
Rd
embedding (latent are real)
more ‘rigorous’ SOM?
‘correct’ for degree (allow variable affinity) (cf. Bader, Karrer)
algorithm extensions
BP (see earlier talks)
map-reduce
cvmod (model selection via cross validation)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Stochastic block models (Holland, Laskey, Leinhardt 1983; Wang and Wong, 1987)
i≠j
zi
zj
Aij
π
θ c
n
62. introduction formulation results extensions
extensions:
model extensions
full SBM (done)
hierarchical model p(Aij = 1|zi = zi ± 1)
hierarchical modeling (ensemble of graphs)
p(D|u, K) = ΠL
i dϑi p(D|ϑi )p(ϑi |u, K)
Rd
embedding (latent are real)
more ‘rigorous’ SOM?
‘correct’ for degree (allow variable affinity) (cf. Bader, Karrer)
algorithm extensions
BP (see earlier talks)
map-reduce
cvmod (model selection via cross validation)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Stochastic block models (Holland, Laskey, Leinhardt 1983; Wang and Wong, 1987)
i≠j
zi
zj
Aij
π
θ c
n
L
63. introduction formulation results extensions
extensions:
model extensions
full SBM (done)
hierarchical model p(Aij = 1|zi = zi ± 1)
hierarchical modeling (ensemble of graphs)
p(D|u, K) = ΠL
i dϑi p(D|ϑi )p(ϑi |u, K)
Rd
embedding (latent are real)
more ‘rigorous’ SOM?
‘correct’ for degree (allow variable affinity) (cf. Bader, Karrer)
algorithm extensions
BP (see earlier talks)
map-reduce
cvmod (model selection via cross validation)
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
64. introduction formulation results extensions
for more info. . .
code: MATLAB & python (inc. “full” SBM) (vbmod.sf.net)
paper: arxiv 08 / prl 08
Hofman soon to come (not by me)
code in C++, inc. full ‘vblabel propagation’ algo
twitter-scale analysis
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
66. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Jan-Willem van de Meent, Ruben Gonzalez, Chris Wiggins
Columbia University
72. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Tinoco and Gonzalez, Genes Dev, 2011
73. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Unbound
EF-G bound
Tinoco and Gonzalez, Genes Dev, 2011 Fei et al, PNAS, 2009
(short-lived GS1 states correspond to an EF-G + GDPNP binding event)
74. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Unbound
EF-G bound
Tinoco and Gonzalez, Genes Dev, 2011 Fei et al, PNAS, 2009
75. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
1. Identify states
2. Estimate Kinetic Rates
3. Average over many time series
4. Detect subpopulations
80. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
FRET SignalHistogram
Idea: Find probability of belonging to each state
82. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Log-Likelihood
L = log p(x θ) = log
z
p(x, z θ)
Expectation Maximization
1. calculate p(z | x, θi)
2. calculate θi+1 from p(z | x, θi)
83. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Learned Truth
Accurate for occupancy of states,
not so good for rate estimates
85. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
probability of state depends on previous state
p(zt+ =l zt =k) = Akl
93. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
2 States 3 States
94. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Log-Likelihood
L = log p(x θ) = log
z
p(x, z θ)
Log-Evidence
L = log p(x u) = log
z
∫ dθ p(x, z θ)p(θu)
Log-Evidence
95. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Log-Likelihood
L = log p(x θ) = log
z
p(x, z θ)
Log-Evidence
L = log p(x u) = log
z
∫ dθ p(x, z θ)p(θu)
Prior
96. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Log-Likelihood
L = log p(x θ) = log
z
p(x, z θ)
Log-Evidence
L = log p(x u) = log
z
∫ dθ p(x, z θ)p(θu)
Ensemble
97. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Log-Likelihood
L = log p(x θ) = log
z
p(x, z θ)
Log-Evidence
best model has highest average likelihood
L = log p(x u) = log
z
∫ dθ p(x, z θ)p(θu)
Log-Evidence
98. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Log-Evidence
31
L = log p(x u) = log
z
∫ dθ p(x, z θ)p(θu)
Lower Bound
L =
z
∫ dθ q(z)q(θ w)log
p(x, z, θ u)
q(z)q(θ w)
≥ log p(x u)
q(z)q(θ w) p(z, θ x)
99. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
31
Lower bound tight for true posterior
L =
z
∫ dθ p(z, θ x)log
p(x, z, θ u)
p(z, θ x)
=
z
∫ dθ p(z, θ x)log[p(x u)]
= log p(x u)
L = log p(x u) − Dkl [q(z)q(θ w) p(z, θ x)]
117. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Unbound
EF-G bound
Tinoco and Gonzalez, Genes Dev, 2011 Fei et al, PNAS, 2009
118. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
ξntkl = p(znt = k, znt+ = l xn)
1. Run mixture model on posterior counts
p(ξnA) =
tkl
Aξntkl
kl
p(ξn um) = ∫ dA p(Aum)p(ξn A)
2. Rerun with M x K block-diagonal form
uA
=
uA
uA
uA
M
130. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Low Noise, UnderfittedInf Out - Inf In
131. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Low Noise, CorrectOut vs In
132. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
Low Noise, OverfittedInf Out - Inf In
133. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
High Noise, UnderfittedInf Out - Inf In
134. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
High Noise, CorrectInf Out - Inf In
135. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
High Noise, OverfittedInf Out - Inf In
136. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
High Noise, OverfittedInf Out - Inf In
137. bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
the future, in progress:
X
138. bio+stats vbem/networks hierarchical biological challenges inference model selection
thanks. . .
- jake hofman (vbmod,vbfret)
- jonathan bronson (vbfret)
- jan-willem van de meent (hfret)
- ruben gonzalez (vbfret, hfret)
for more info:
- vbfret.sourceforge.net
- vbmod.sourceforge.net
- hfret.sourceforge.net (soon)
chris.wiggins@columbia.edu 4/23/12
BMC bioinformatics, 2010;
PNAS 2009;
Biophysical Journal 2009;
139. traditional role of statistics in biophysics
“if your experiment needs
statistics, you ought to
have done a better
experiment”
-lord rutherford