Likelihood free computational statistics
Pierre Pudlo
Université Montpellier 2
Institut de Mathématiques et Modélisation de Montpellier (I3M)
Institut de Biologie Computationnelle
Labex NUMEV
17/04/2015
Contents
1 Approximate Bayesian computation
2 ABC model choice
3 Bayesian computation with empirical likelihood
Intractable likelihoods
Problem
How to perform a Bayesian analysis when the likelihood f(y|φ) is intractable?
Example 1. Gibbs random fields
f(y|φ) ∝ exp(−H(y, φ))
is known up to the normalizing constant
Z(φ) = Σ_y exp(−H(y, φ))
Example 2. Neutral population
genetics
Aim. Infer demographic parameters on
the past of some populations based on
the trace left in genomes of individuals
sampled from current populations.
Latent process (past history of the
sample) ∈ space of high dimension.
If y is the genetic data of the sample,
the likelihood is
f(y|φ) = ∫_Z f(y, z | φ) dz
Typically, dim(Z) ≫ dim(Y).
No hope to compute the likelihood with
clever Monte Carlo algorithms?
Coralie Merle, Raphaël Leblois and
François Rousset
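To make Example 1 concrete, here is a minimal Python sketch of why Z(φ) is intractable: it enumerates every configuration of a tiny two-colour Ising field. The energy H and the grid size are illustrative assumptions, not the models used in the talk.

```python
import itertools
import numpy as np

def ising_H(y, phi):
    """Energy H(y, phi) = -phi * sum of nearest-neighbour products on a +/-1 grid."""
    return -phi * (np.sum(y[:, :-1] * y[:, 1:]) + np.sum(y[:-1, :] * y[1:, :]))

def ising_Z(n, phi):
    """Z(phi) by brute-force enumeration of all 2**(n*n) configurations."""
    total = 0.0
    for spins in itertools.product([-1, 1], repeat=n * n):
        y = np.array(spins).reshape(n, n)
        total += np.exp(-ising_H(y, phi))
    return total

print(ising_Z(3, phi=0.5))  # 2**9 = 512 terms: still feasible
# A 16x16 grid would already require 2**256 terms.
```

The enumeration is exact on a 3×3 grid, but the number of terms doubles with every extra site, which is exactly why f(y|φ) cannot be evaluated on realistic lattices.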
A bend via importance sampling
If y is the genetic data of the sample,
the likelihood is
f(y|φ) = ∫_Z f(y, z | φ) dz
We are trying to compute this integral
with importance sampling.
Actually z = (z1, . . . , zT) is a
measure-valued Markov chain, stopped at a
given optional time T, with y = zT; hence
f(y|φ) = ∫ 1{y = zT} f(z1, . . . , zT | φ) dz
Importance sampling introduces an
auxiliary distribution q(dz | φ):
f(y|φ) = ∫ 1{y = zT} [ f(z | φ) / q(z | φ) ] q(z | φ) dz
where f(z | φ)/q(z | φ) is the weight of z
and q(z | φ) the sampling distribution.
The most efficient q is the conditional
distribution of the Markov chain
knowing that zT = y, but it is even harder to
compute than f(y | φ).
Any other Markovian q is
inefficient, as the variance of the weight
grows exponentially with T (see the toy
illustration below).
Need a clever q: see the seminal paper
of Stephens and Donnelly (2000)
And resampling algorithms. . .
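A toy illustration of this weight degeneracy, assuming a Gaussian random walk rather than the measure-valued coalescent chain of the talk; the step standard deviations are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relative_weight_variance(T, n_sim=10_000, sigma_f=1.0, sigma_q=1.2):
    """Relative variance of IS weights for a length-T random-walk path.

    Target f: Gaussian increments N(0, sigma_f^2); proposal q: N(0, sigma_q^2).
    The path weight is the product over t of f(dz_t) / q(dz_t).
    """
    dz = rng.normal(0.0, sigma_q, size=(n_sim, T))   # paths drawn from q
    log_w = np.sum(np.log(sigma_q / sigma_f)
                   - 0.5 * dz**2 / sigma_f**2
                   + 0.5 * dz**2 / sigma_q**2, axis=1)
    w = np.exp(log_w - log_w.max())                  # rescaled for numerical stability
    return w.var() / w.mean() ** 2                   # invariant to the rescaling

for T in (10, 50, 100, 200):
    print(T, relative_weight_variance(T))
```

The printed relative variance blows up roughly geometrically in T, which is the degeneracy that clever proposals and resampling are designed to fight.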
Approximate Bayesian computation
Idea
Infer the conditional distribution of φ given yobs from simulations of the joint π(φ)f(y|φ)
ABC algorithm
A) Generate a large set of (φ, y)
from the Bayesian model
π(φ) f(y|φ)
B) Keep the particles (φ, y) such
that d(η(yobs), η(y)) ≤ ε
C) Return the φ’s of the kept
particles
Curse of dimensionality: y is replaced
by some numerical summaries η(y)
Stage A) is computationally heavy!
We end up rejecting almost all
simulations, except those falling in the
neighborhood of η(yobs); a minimal sketch
of this rejection sampler follows after this slide.
Sequential ABC algorithms try to avoid
drawing φ in areas of low π(φ|y).
An auto-calibrated ABC-SMC
sampler with Mohammed Sedki,
Jean-Michel Marin, Jean-Marie
Cornuet and Christian P. Robert
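A minimal sketch of steps A)–C), assuming a toy Gaussian model; the prior, simulator and summary below are illustrative assumptions, not the population-genetics models of the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

def abc_rejection(y_obs, prior_sample, simulate, summary, n_sim=100_000, keep=0.001):
    """ABC rejection: keep the phi whose simulated summaries are closest to eta(y_obs)."""
    phi = prior_sample(n_sim)                              # A) simulate from the prior
    eta = np.array([summary(simulate(p)) for p in phi])
    dist = np.abs(eta - summary(y_obs))                    # distance between summaries
    eps = np.quantile(dist, keep)                          # epsilon keeps a fixed fraction
    return phi[dist <= eps]                                # B) + C) accepted particles

n = 50
y_obs = rng.normal(2.0, 1.0, size=n)
post = abc_rejection(
    y_obs,
    prior_sample=lambda N: rng.normal(0.0, 10.0, size=N),  # prior pi(phi)
    simulate=lambda p: rng.normal(p, 1.0, size=n),         # model f(y | phi)
    summary=np.mean,                                       # eta(y)
)
print(post.mean(), post.std())  # location and spread of the ABC posterior sample
```

Even in this toy case only 0.1% of the 100,000 simulations survive step B), which is the computational burden that sequential samplers mitigate.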
ABC sequential sampler
How to calibrate ε1 ≥ ε2 ≥ · · · ≥ εT and T to be efficient?
The auto-calibrated ABC-SMC sampler developed with Mohammed Sedki,
Jean-Michel Marin, Jean-Marie Cornuet and Christian P. Robert
ABC target
Three levels of approximation of the
posterior π(φ | yobs):
1 the ABC posterior distribution
π(φ | η(yobs))
2 approximated with a kernel of
bandwidth ε (or with k-nearest
neighbours):
π(φ | d(η(y), η(yobs)) ≤ ε)
3 a Monte Carlo error:
sample size N < ∞
See, e.g., our review with J.-M. Marin,
C. Robert and R. Ryder
If η(y) are not sufficient statistics,
π(φ | yobs) ≠ π(φ | η(yobs))
Information regarding yobs might be
lost!
Curse of dimensionality:
cannot have both ε small, N large
when η(y) is of large dimension
Post-processing of Beaumont et al.
(2002) with local linear regression.
But the lack of sufficiency might still be
problematic. See Robert et al. (2011)
for model choice.
ABC model choice
ABC model choice
A) Generate a large set of
(m, φ, y) from the Bayesian
model, π(m)πm(φ) fm(y|φ)
B) Keep the particles (m, φ, y)
such that d(η(y), η(yobs)) ≤ ε
C) For each m, return
pm(yobs) = proportion of m
among the kept particles
Likewise, if η(y) is not sufficient for the
model choice issue,
π(m | y) ≠ π(m | η(y))
It might be difficult to design
informative η(y).
Toy example.
Model 1: yi iid ∼ N(φ, 1)
Model 2: yi iid ∼ N(φ, 2)
Same prior on φ (whatever the model)
& uniform prior on model index
η(y) = y1 + · · · + yn is sufficient to
estimate φ in both models
But η(y) carries no information
regarding the variance (hence the
model choice issue)
Other examples in Robert et al. (2011)
In population genetics. Might be
difficult to find summary statistics that
help discriminate between models
(= possible historical scenarios on the
sampled populations)
ABC model choice
ABC model choice
A) Generate a large set of
(m, φ, y) from the Bayesian
model π(m)πm(φ) fm(y|φ)
B) Keep the particles (m, φ, y)
such that d(η(y), η(yobs)) ≤ ε
C) For each m, return
pm(yobs) = proportion of m
among the kept particles
If ε is tuned so that the number of kept
particles is k, then pm is a k-nearest
neighbor estimate of
E[ 1{M = m} | η(yobs) ]
Approximating the posterior
probabilities of model m is a
regression problem where
the response is 1{M = m},
the covariates are the summary
statistics η(y),
the loss is L2 (conditional
expectation)
The preferred method to approximate
the posterior probabilities in DIYABC
is a local multinomial regression.
Ticklish if dim(η(y)) is large, or if the
summary statistics are highly correlated.
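A minimal k-NN sketch on the toy example of the previous slide (priors and sample sizes are illustrative assumptions); comparing a sum-only summary with one that adds the sample variance makes the sufficiency issue visible.

```python
import numpy as np

rng = np.random.default_rng(2)

def abc_model_choice_knn(y_obs, summary, n_sim=50_000, k=200, n=50):
    """k-nearest-neighbour estimate of pi(m | eta(y_obs)) for the two Gaussian models."""
    m = rng.integers(1, 3, size=n_sim)            # uniform prior on the model index
    phi = rng.normal(0.0, 10.0, size=n_sim)       # same prior on phi in both models
    sd = np.where(m == 1, 1.0, np.sqrt(2.0))      # model 1: variance 1, model 2: variance 2
    y = rng.normal(phi[:, None], sd[:, None], size=(n_sim, n))
    eta = np.array([np.atleast_1d(summary(row)) for row in y])
    d = np.linalg.norm(eta - np.atleast_1d(summary(y_obs)), axis=1)
    kept = m[np.argsort(d)[:k]]                   # the k nearest particles
    return {mm: float(np.mean(kept == mm)) for mm in (1, 2)}

y_obs = rng.normal(1.0, np.sqrt(2.0), size=50)    # data actually drawn from model 2
print(abc_model_choice_knn(y_obs, summary=lambda y: y.sum()))              # sum only
print(abc_model_choice_knn(y_obs, summary=lambda y: (y.mean(), y.var())))  # + variance
```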
Choosing between hidden random fields
Choosing between dependency
graphs: 4 or 8 neighbours?
Models. α, β ∼ prior
z | β ∼ Potts on G4 or G8 with interaction β
y | z, α ∼ Π_i P(yi | zi, α)
How to summarize the noisy y?
Without noise (directly observed field),
sufficient statistics exist for the model
choice issue.
With Julien Stoehr and Lionel Cucala
a method to design new summary
statistics
Based on a clustering of the observed
data on the possible dependency graphs:
number of connected components,
size of the largest connected
component, . . .
A sketch of these summaries follows below.
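A hedged sketch of such graph-dependent summaries using scipy's connected-component labelling; the binary field below is a random placeholder, whereas in practice it would be a clustering of the observed data.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(3)
z = rng.integers(0, 2, size=(64, 64))          # placeholder two-colour field

G4 = ndimage.generate_binary_structure(2, 1)   # 4-neighbour dependency graph
G8 = ndimage.generate_binary_structure(2, 2)   # 8-neighbour dependency graph

def cc_summaries(field, structure):
    """Number of connected components and size of the largest one."""
    labels, n_comp = ndimage.label(field, structure=structure)
    sizes = np.bincount(labels.ravel())[1:]    # label 0 is the background
    return n_comp, int(sizes.max())

print(cc_summaries(z, G4))   # summaries computed under G4 ...
print(cc_summaries(z, G8))   # ... differ from those computed under G8
```

Computing the same field's summaries under both graphs yields statistics whose contrast carries information on the dependency structure.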
Machine learning to analyse machine simulated data
ABC model choice
A) Generate a large set of
(m, φ, y) from π(m)πm(φ) fm(y|φ)
B) Infer (anything?) about
m | η(y) with machine learning
methods
In this machine learning perspective:
the (iid) simulations of A) form the
training set
yobs becomes a new data point
With J.-M. Marin, J.-M. Cornuet, A.
Estoup, M. Gautier and C. P. Robert
Predicting m is a classification
problem
Computing π(m|η(y)) is a
regression problem
It is well known that classification is
much simpler than regression
(the dimension of the object we infer is smaller).
Why compute π(m|η(y)) if we know
that
π(m|y) ≠ π(m|η(y))?
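A minimal sketch of this machine-learning view with scikit-learn's random forest, reusing the toy two-model Gaussian setup as a stand-in for population-genetics simulations; the three summaries are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)

# A) the simulations form the training set
n_sim, n = 20_000, 50
m = rng.integers(1, 3, size=n_sim)
phi = rng.normal(0.0, 10.0, size=n_sim)
sd = np.where(m == 1, 1.0, np.sqrt(2.0))
y = rng.normal(phi[:, None], sd[:, None], size=(n_sim, n))
eta = np.column_stack([y.mean(axis=1), y.var(axis=1), np.median(y, axis=1)])

clf = RandomForestClassifier(n_estimators=500, oob_score=True)
clf.fit(eta, m)
print(1.0 - clf.oob_score_)      # out-of-bag estimate of the prior misclassification error

# B) y_obs becomes a new data point to classify
y_obs = rng.normal(1.0, np.sqrt(2.0), size=n)
eta_obs = np.array([[y_obs.mean(), y_obs.var(), np.median(y_obs)]])
print(clf.predict(eta_obs))      # the selected model
```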
An example with random forest on human SNP data
Out of Africa
6 scenarios, 6 models
Observed data. 4 populations, 30
individuals per population; 10,000
genotyped SNPs from the 1000 Genomes
Project
Random forest trained on 40,000
simulations (112 summary statistics)
predicts the model which supports
a single out-of-Africa colonization
event,
a secondary split between European
and Asian lineages and
a recent admixture for Americans
with African origin
Confidence in the selected model?
Example (continued)
Benefits of random forests?
1 Can find the relevant statistics in a
large set of statistics (112) to
discriminate models
2 Lower prior misclassification error
(≈ 6%) than other methods (ABC, i.e.
k-nn ≈ 18%)
3 Supply a similarity measure to
compare η(y) and η(yobs)
Confidence in the selected model?
Compute the average of the
misclassification error over an ABC
approximation of the predictive (∗). Here,
≤ 0.1%
(∗) π(m, φ, y | ηobs) = π(m | ηobs)πm(φ | ηobs)fm(y | φ)
Another approximation of the likelihood
What if both
the likelihood is intractable and
one cannot simulate a dataset in a reasonable amount of time to resort to ABC?
First answer: use pseudo-likelihoods
such as the pairwise composite likelihood
fPCL(y | φ) = Π_{i<j} f(yi, yj | φ)
Maximum composite likelihood
estimators φ̂(y) are suitable estimators
But they cannot substitute for a true likelihood
in a Bayesian framework:
this leads to credible intervals which are
too narrow, i.e. over-confidence in φ̂(y), see
e.g. Ribatet et al. (2012)
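A minimal sketch of a pairwise composite likelihood, assuming an exchangeable Gaussian toy model (unit variances, common correlation φ) so that each pair (yi, yj) is bivariate normal; this is an illustrative stand-in, not the max-stable or genetic models where fPCL is usually deployed.

```python
import numpy as np
from scipy.stats import multivariate_normal

def pairwise_composite_loglik(y, phi):
    """log fPCL(y | phi) = sum over i < j of log f(y_i, y_j | phi)."""
    cov = np.array([[1.0, phi], [phi, 1.0]])
    n = len(y)
    pairs = np.array([(y[i], y[j]) for i in range(n) for j in range(i + 1, n)])
    return multivariate_normal(mean=[0.0, 0.0], cov=cov).logpdf(pairs).sum()

rng = np.random.default_rng(5)
n, true_phi = 20, 0.3
L = np.linalg.cholesky(true_phi * np.ones((n, n)) + (1 - true_phi) * np.eye(n))
y = L @ rng.normal(size=n)                 # one draw from the exchangeable model

grid = np.linspace(0.01, 0.9, 90)          # maximum composite likelihood estimate
print(max(grid, key=lambda p: pairwise_composite_loglik(y, p)))
```

The maximizer is a sensible point estimate of φ, but treating fPCL as a genuine likelihood in Bayes' formula would understate the posterior spread, which is the over-confidence issue above.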
Our proposal with Kerrie Mengersen and
Christian P. Robert:
use the empirical likelihood of Owen
(2001, 2011)
It relies on iid blocks in the dataset y to
reconstruct a likelihood
& permits likelihood ratio tests;
confidence intervals are correct
Original aim of Owen: remove parametric
assumptions
Bayesian computation via empirical likelihood
With empirical likelihood, the parameter φ
is defined as
(∗) E[ h(yb, φ) ] = 0
where
yb is one block of y,
E is the expected value under
the true distribution of the block yb,
h is a known function
E.g., if φ is the mean of an iid sample,
h(yb, φ) = yb − φ
In population genetics, what is (∗) when φ collects
dates of population splits,
population sizes, etc.?
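A minimal sketch of Owen's empirical likelihood in the simplest case where φ is the mean and h(yb, φ) = yb − φ. The Lagrange-dual computation below is standard, but the data and tolerances are illustrative assumptions; in the population-genetics application h is the pairwise composite score of the next slide.

```python
import numpy as np
from scipy.optimize import brentq

def empirical_loglik_mean(y, phi, tol=1e-8):
    """Log empirical likelihood ratio for the mean constraint sum p_i (y_i - phi) = 0.

    The optimal weights are p_i = 1 / (n * (1 + lam * h_i)) with h_i = y_i - phi
    and lam solving sum h_i / (1 + lam * h_i) = 0.
    """
    h = y - phi
    if h.min() >= 0 or h.max() <= 0:
        return -np.inf                     # phi outside the convex hull of the data
    lo = -1.0 / h.max() + tol              # bracket keeping every 1 + lam*h_i positive
    hi = -1.0 / h.min() - tol
    lam = brentq(lambda l: np.sum(h / (1.0 + l * h)), lo, hi)
    return -np.sum(np.log1p(lam * h))      # equals sum_i log(n * p_i)

rng = np.random.default_rng(6)
y = rng.normal(2.0, 1.0, size=100)
for phi in (1.8, 2.0, 2.2):                # the ratio peaks near the sample mean
    print(phi, empirical_loglik_mean(y, phi))
```

Plugging this empirical likelihood in place of f(y|φ) inside a standard sampler gives the kind of Bayesian computation the talk proposes.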
Bayesian computation via empirical likelihood
A block = genetic data at a given locus
h(yb, φ) is the pairwise composite score
function, which we can compute explicitly in
many situations:
h(yb, φ) = ∇φ log fPCL(yb | φ)
Benefits.
much faster than ABC (no need to
simulate fake data)
same accuracy as ABC, or even
better precision: no loss of information
through summary statistics
An experiment
Evolutionary scenario (tree figure): populations POP 0, POP 1 and POP 2
diverging from the MRCA, with split times τ1 and τ2
Dataset:
50 genes per population,
100 microsat. loci
Assumptions:
Ne identical over all populations
φ = log10(θ, τ1, τ2)
non-informative prior
Comparison of ABC and EL (figure):
histogram = EL,
curve = ABC,
vertical line = “true” parameter