Bayesian Probabilistic Numerical Methods (Part I)
Chris. J. Oates
Newcastle University
Alan Turing Institute
August 2017 @ SAMSI
The SAMSI Working Group on Probabilistic Numerics
François-Xavier Briol (Warwick), Oksana Chkrebtii (Ohio State), Jon Cockayne (Warwick), Mark Girolami (Imperial), Philipp Hennig (MPI Tübingen)
Han Cheng Lie (FU Berlin), Houman Owhadi (Caltech), Florian Schaefer (Caltech), Andrew Stuart (Caltech), Tim Sullivan (FU Berlin)
Motivation
Consider the task of solving the PDE:
−∆u = f on Ω
u = 0 on ∂Ω
Given an approximate solution $u_n$ we can obtain an a posteriori error bound: for all $\beta > 0$ and all $y$,
$\|\nabla(u - u_n)\|^2 \le (1+\beta)\,\|\nabla u_n - \nabla y\|^2 + \frac{1+\beta}{\beta}\, C_\Omega^2\, \|\Delta y + f\|^2 .$
The right-hand side, the “deviation majorant”, does not involve $u$; here $C_\Omega = \mathrm{diam}(\Omega)$.
Babuška and Rheinboldt (1978) A posteriori error estimates for the finite element method. Cited 1378.
Ainsworth and Oden (2011) A posteriori error estimation in finite element analysis. Cited 2252.
Problem: evaluating $\|\Delta y + f\|$ requires a quadrature, and in a pipeline our computational budget will be limited.
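As a rough numerical illustration in one dimension, the sketch below evaluates both sides of the bound on Ω = (0, 1), reading it as ‖∇(u − u_n)‖² ≤ (1 + β)‖∇u_n − ∇y‖² + ((1 + β)/β) C_Ω² ‖Δy + f‖². The test problem (u = sin πx), the crude approximation u_n = 4x(1 − x), the choice of y, and the midpoint quadrature rule are all assumptions made for illustration and are not taken from the talk. Every norm is itself approximated by a quadrature over m nodes, which is the cost the slide points to.

```python
import numpy as np

def l2_norm_sq(values, h):
    """Midpoint-rule approximation of the squared L2(0, 1) norm."""
    return float(np.sum(values ** 2) * h)

def deviation_majorant(du_n, dy, d2y, f, beta, c_omega, h):
    """Right-hand side of the bound; it never uses the exact solution u."""
    flux_term = (1.0 + beta) * l2_norm_sq(du_n - dy, h)
    residual_term = (1.0 + beta) / beta * c_omega ** 2 * l2_norm_sq(d2y + f, h)
    return flux_term + residual_term

# Quadrature nodes: midpoints of m equal cells on (0, 1).
m = 1000
h = 1.0 / m
x = (np.arange(m) + 0.5) * h

# Hypothetical test problem: -u'' = f with u(0) = u(1) = 0 and u = sin(pi x).
f = np.pi ** 2 * np.sin(np.pi * x)
du_exact = np.pi * np.cos(np.pi * x)          # used only to check the bound

# Hypothetical approximation u_n = 4 x (1 - x) and auxiliary y = 0.9 sin(pi x).
du_n = 4.0 - 8.0 * x
dy = 0.9 * np.pi * np.cos(np.pi * x)
d2y = -0.9 * np.pi ** 2 * np.sin(np.pi * x)

c_omega = 1.0                                  # diam(Omega) for Omega = (0, 1)
true_error = l2_norm_sq(du_exact - du_n, h)
bounds = {beta: deviation_majorant(du_n, dy, d2y, f, beta, c_omega, h)
          for beta in (0.1, 1.0, 10.0)}

for beta, bound in bounds.items():
    print(f"beta={beta}: error={true_error:.4f} <= majorant={bound:.4f}")
```

The bound holds for every β, so in practice one would also minimise over β and y; that search is omitted here.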
[Diagram: f → numerical solution of the PDE −∆u = f]
Computational pipelines are efficient precisely because “going back is not allowed”.
=⇒ a posteriori error bounds are precluded.
Information-based complexity viewpoint:
$A_n(f) = \begin{bmatrix} f(x_1) \\ \vdots \\ f(x_n) \end{bmatrix}$, based on a computational budget of size $n$.
Consider a numerical solution $u_n$, based on the information $A_n(f)$.
Problem: It is impossible to get a computable bound on $\|u - u_n\|$ based only on $A_n(f)$.
Proof: One cannot distinguish between $f^1, f^2 \in H^{-1}(D)$ such that
$f^1 = 0$ on $D = [0, 1]$,
$f^2 = \frac{2}{(b-a)(2-a-b)} \, 1[a < x < b]$ with $\{x_1, \dots, x_n\} \cap (a, b) = \emptyset$,
since $A_n(f^1) = A_n(f^2)$.
Yet these yield wildly different solutions:
$u^1(x) = 0$, so $\|u^1\|_2 = 0$;
$u^2(x) = \begin{cases} x, & 0 < x < a \\ x - \frac{(x-a)^2}{(b-a)(2-a-b)}, & a < x < b \\ \frac{(a+b)(1-x)}{2-a-b}, & b < x < 1 \end{cases}$ so $\|u^2\|_2 \ge \frac{a^{3/2}}{3^{1/2}}$.
Moral: A posteriori error analysis requires global information on $f$, such as $\|f\|$.
How to proceed when global information cannot be obtained?
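The counterexample in the proof can be checked numerically. The sketch below assumes the underlying problem is −u″ = f on (0, 1) with the piecewise solution given in the proof; the values of a and b, the evaluation nodes, and the finite-difference solver are illustrative choices rather than anything specified in the talk.

```python
import numpy as np

def f2(x, a, b):
    """Right-hand side supported on (a, b), as in the proof."""
    return np.where((x > a) & (x < b), 2.0 / ((b - a) * (2.0 - a - b)), 0.0)

def u2(x, a, b):
    """Closed-form solution stated in the proof."""
    c = (b - a) * (2.0 - a - b)
    middle = x - (x - a) ** 2 / c
    right = (a + b) * (1.0 - x) / (2.0 - a - b)
    return np.where(x < a, x, np.where(x < b, middle, right))

def solve_poisson_1d(f_interior, h):
    """Second-order finite differences for -u'' = f, u(0) = u(1) = 0."""
    m = f_interior.size
    lap = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h ** 2
    return np.linalg.solve(lap, f_interior)

a, b = 0.4, 0.6
nodes = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])   # none of them lies in (a, b)

# The information operator cannot tell f^1 = 0 and f^2 apart.
info_f1 = np.zeros_like(nodes)
info_f2 = f2(nodes, a, b)

# Compare the closed form with an independent finite-difference solve.
grid = np.linspace(0.0, 1.0, 401)
h = grid[1] - grid[0]
u_fd = solve_poisson_1d(f2(grid[1:-1], a, b), h)
fd_error = float(np.max(np.abs(u_fd - u2(grid[1:-1], a, b))))

# L2 norm of u^2 against the lower bound a^(3/2) / 3^(1/2).
l2_norm_u2 = float(np.sqrt(np.sum(u2(grid, a, b) ** 2) * h))
lower_bound = a ** 1.5 / np.sqrt(3.0)

print(np.array_equal(info_f1, info_f2), fd_error, l2_norm_u2, lower_bound)
```

Moving the interval (a, b) between any two nodes gives the same information vector, so no bound computed from these values alone can separate the two solutions.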
Idea: Exploit domain-specific subjective prior belief.
Indeed, if we model $f$ as a draw from a distribution $P_{f,n}$, then $f$ has statistical properties which can be leveraged.
=⇒ statistical analogue of a posteriori error analysis
=⇒ statistical local error indicators, etc.
How to select $P_{f,n}$? To be useful, $P_{f,n}$ should depend on $A_n(f)$.
A natural (Bayesian) approach:
$P_{f,n} \propto \underbrace{P_f}_{\text{“prior”}} \times \underbrace{\delta_{A_n(f)}}_{\text{“likelihood”}}$
Randomness used as an allegorical device to represent epistemic uncertainty.
Bayesian Probabilistic Numerical Methods
In a Bayesian probabilistic numerical method:
a prior measure $P_f$ is placed on $f$;
a posterior measure $P_{f,n}$ is defined as the “restriction of $P_f$ to those functions $f$ for which $A_n(f) = a$, e.g. $A_n(f) = [f(x_1), \dots, f(x_n)]^\top = a$, is satisfied” (needs to be formalised);
these are equivalent to prior and posterior measures $P_u$ and $P_{u,n}[a]$ on the solution space of the PDE.
=⇒ principled and general uncertainty quantification for numerical methods.
=⇒ probabilistic quantification of numerical error that can be propagated forward.
Example
Consider again the linear PDE
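$-\Delta u = f^1$ on $\Omega$
$u = f^2$ on $\partial\Omega$.
Place a Gaussian prior $P_u$ on $u$ and condition on
$A_n(f) = [\,\cdots,\ f^1(x_i),\ \cdots,\ f^2(x'_j),\ \cdots\,]^\top = a$, with evaluation points $x_i \in \Omega$ and $x'_j \in \partial\Omega$.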
=⇒ Gaussian conditional distribution Pu,n[a].
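A minimal one-dimensional sketch of this conditioning step follows. It assumes Ω = (0, 1), a zero-mean Gaussian process prior on u with a squared-exponential kernel, and a fixed length-scale; none of these choices comes from the talk, and the function names are hypothetical. Because −d²/dx² is linear, the pair (u, −u″) is jointly Gaussian, so the posterior follows from the usual Gaussian conditioning formula with differentiated kernels.

```python
import numpy as np

def _diff(x, y):
    return x[:, None] - y[None, :]

def k_uu(x, y, ell):
    """Cov(u(x), u(y)) for the squared-exponential kernel."""
    r = _diff(x, y)
    return np.exp(-r ** 2 / (2 * ell ** 2))

def k_uL(x, y, ell):
    """Cov(u(x), -u''(y)): minus the second derivative of the kernel in y."""
    r = _diff(x, y)
    return (1 / ell ** 2 - r ** 2 / ell ** 4) * np.exp(-r ** 2 / (2 * ell ** 2))

def k_LL(x, y, ell):
    """Cov(-u''(x), -u''(y)): fourth derivative of the kernel."""
    r = _diff(x, y)
    poly = 3 / ell ** 4 - 6 * r ** 2 / ell ** 6 + r ** 4 / ell ** 8
    return poly * np.exp(-r ** 2 / (2 * ell ** 2))

def condition_on_pde(x_int, f1_vals, x_bnd, f2_vals, x_star, ell=0.3, jitter=1e-8):
    """Posterior mean and covariance of u at x_star given A_n(f) = a."""
    gram = np.block([
        [k_LL(x_int, x_int, ell), k_uL(x_bnd, x_int, ell).T],
        [k_uL(x_bnd, x_int, ell), k_uu(x_bnd, x_bnd, ell)],
    ])
    gram = gram + jitter * np.eye(gram.shape[0])   # numerical stabilisation
    cross = np.hstack([k_uL(x_star, x_int, ell), k_uu(x_star, x_bnd, ell)])
    a = np.concatenate([f1_vals, f2_vals])
    mean = cross @ np.linalg.solve(gram, a)
    cov = k_uu(x_star, x_star, ell) - cross @ np.linalg.solve(gram, cross.T)
    return mean, cov

# Hypothetical test problem: -u'' = pi^2 sin(pi x), u(0) = u(1) = 0.
x_int = np.linspace(0.0, 1.0, 12)[1:-1]
x_bnd = np.array([0.0, 1.0])
x_star = np.linspace(0.0, 1.0, 101)
mean, cov = condition_on_pde(
    x_int, np.pi ** 2 * np.sin(np.pi * x_int), x_bnd, np.zeros(2), x_star
)
max_err = float(np.max(np.abs(mean - np.sin(np.pi * x_star))))
print(max_err)
```

The diagonal of `cov` plays the role of a pointwise, probabilistic error indicator; how well it is calibrated depends on the prior, which is chosen arbitrarily here.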
Outline of the Research
Bayesian Probabilistic Numerical Methods
Cockayne, Oates, Sullivan, Girolami (2017)
arXiv:1702.03673
1. Elicit the Abstract Structure . . .
2. Establish Well-Definedness, Existence and Uniqueness of $P_{u,n}[a]$
3. Characterise the Optimal Information Operator $A_n$ ← next
4. Algorithms to Sample from $P_{u,n}[a]$
5. Extend to Pipelines of Computation
Optimal Information
Consider an information operator
$A_n(f) = \begin{bmatrix} f(x_1) \\ \vdots \\ f(x_n) \end{bmatrix}.$
The aim is to select locations $x_1, \dots, x_n$ that are optimal in the sense that
$\{x_1, \dots, x_n\} \in \arg\inf \iint L\{P_{u,n}[A_n(f)](\omega),\, u(f)\} \, d\omega \, dP_f .$
=⇒ $L$ is a loss function on the solution space of the PDE that must be specified.
$L(u, u') = |u - u'|$ corresponds to the Wasserstein metric $d(P_{u,n}, \delta(u))$.
=⇒ not equivalent to the Bayes risk from decision theory.
=⇒ optimal information for Bayesian probabilistic numerics ≠ Bayesian decision theory.
For comparison, replacing the posterior $P_{u,n}[A_n(f)](\omega)$ by a decision rule $d[A_n(f)]$ gives the criterion
$\{x_1, \dots, x_n\} \in \arg\inf \iint L\{d[A_n(f)],\, u(f)\} \, d\omega \, dP_f .$
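An adversary picks a card at random and our goal is to ascertain whether the suit of their card was ♥ under 0-1 loss, i.e. $f \sim \mathrm{uniform}(\{♥, ♦, ♣, ♠\})$. Consider two possible experiments:
Experiment “Q: Is it red?”
Bayes’ risk (maximum a posteriori point estimate): $\tfrac14$
Probabilistic numerics risk (full posterior):
$\underbrace{\tfrac14}_{\heartsuit} \cdot \underbrace{p(\neg\heartsuit \mid A(\heartsuit))}_{1/2} + \underbrace{\tfrac14}_{\diamondsuit} \cdot \underbrace{p(\heartsuit \mid A(\diamondsuit))}_{1/2} + \underbrace{\tfrac14}_{\clubsuit} \cdot \underbrace{p(\heartsuit \mid A(\clubsuit))}_{0} + \underbrace{\tfrac14}_{\spadesuit} \cdot \underbrace{p(\heartsuit \mid A(\spadesuit))}_{0} = \tfrac14$
Experiment “Q: Is it ♠?”
Bayes’ risk (maximum a posteriori point estimate): $\tfrac14$
Probabilistic numerics risk (full posterior):
$\underbrace{\tfrac14}_{\heartsuit} \cdot \underbrace{p(\neg\heartsuit \mid A(\heartsuit))}_{2/3} + \underbrace{\tfrac14}_{\diamondsuit} \cdot \underbrace{p(\heartsuit \mid A(\diamondsuit))}_{1/3} + \underbrace{\tfrac14}_{\clubsuit} \cdot \underbrace{p(\heartsuit \mid A(\clubsuit))}_{1/3} + \underbrace{\tfrac14}_{\spadesuit} \cdot \underbrace{p(\heartsuit \mid A(\spadesuit))}_{0} = \tfrac13$
The Bayes’ risk is $\tfrac14$ for both experiments since “¬♥” is always a posterior mode.
=⇒ optimal information for Bayesian probabilistic numerics ≠ Bayesian decision theory.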
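The two risks in the card example can be reproduced with exact arithmetic. In the sketch below the function names are hypothetical, ties in the posterior are broken towards “not ♥” (matching the remark that “¬♥” is always a posterior mode), and the probabilistic numerics risk is taken to be the expected 0-1 loss of a single draw from the posterior.

```python
from fractions import Fraction

SUITS = ["hearts", "diamonds", "clubs", "spades"]
PRIOR = {suit: Fraction(1, 4) for suit in SUITS}

def posterior_hearts(question, answer):
    """Posterior probability of hearts given the answer to a yes/no question."""
    consistent = [s for s in SUITS if question(s) == answer]
    total = sum(PRIOR[s] for s in consistent)
    return PRIOR["hearts"] / total if "hearts" in consistent else Fraction(0)

def bayes_risk(question):
    """Expected 0-1 loss of the MAP estimate of 'is the suit hearts?'."""
    risk = Fraction(0)
    for truth in SUITS:
        p_hearts = posterior_hearts(question, question(truth))
        guess_hearts = p_hearts > Fraction(1, 2)   # ties go to "not hearts"
        if guess_hearts != (truth == "hearts"):
            risk += PRIOR[truth]
    return risk

def pn_risk(question):
    """Expected 0-1 loss when the output is a draw from the full posterior."""
    risk = Fraction(0)
    for truth in SUITS:
        p_hearts = posterior_hearts(question, question(truth))
        p_wrong = 1 - p_hearts if truth == "hearts" else p_hearts
        risk += PRIOR[truth] * p_wrong
    return risk

def is_red(suit):
    return suit in ("hearts", "diamonds")

def is_spade(suit):
    return suit == "spades"

for name, question in [("Is it red?", is_red), ("Is it spades?", is_spade)]:
    print(name, bayes_risk(question), pn_risk(question))
```

Under the Bayes risk the two experiments tie, whereas under the probabilistic numerics risk “Is it red?” is strictly better, which is the discrepancy the slide highlights.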
Average Case Analysis ↔ (∗) Bayesian Decision Theory ↔ (?) Bayesian Probabilistic Numerical Methods
(∗) Kadane and Wasilkowski (1985) Average Case ε-Complexity in Computer Science: A Bayesian View.
[Figure: risk sets for decision theory and for Bayesian probabilistic numerics, with contours of constant average risk; the Bayes rule (decision theory) and the optimal point (Bayesian probabilistic numerics) are marked.]
Theorem
Let $u(f)$ be the quantity of interest. Assume that $u(f)$ belongs to an inner-product space with associated norm $\|\cdot\|$ and consider the canonical loss
$L(u(f), u(f')) = \|u(f) - u(f')\|^2 .$
Then optimal information for Bayesian probabilistic numerics = Bayesian decision theory (= average case analysis).
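One way to see why squared-error loss makes the two notions coincide is the following sketch. It assumes finite second moments and introduces notation that is not in the slides: $a = A_n(f)$, the posterior mean $m(a) = \mathbb{E}[u(f) \mid a]$, and a draw $U \sim P_{u,n}[a]$ that is independent of $u(f)$ given $a$. This is an outline under those assumptions, not the proof from the preprint.

```latex
\begin{align*}
\mathbb{E}\big[\|U - u(f)\|^2 \mid a\big]
  &= \mathbb{E}\big[\|U - m(a)\|^2 \mid a\big]
     - 2\,\mathbb{E}\big[\langle U - m(a),\, u(f) - m(a)\rangle \mid a\big]
     + \mathbb{E}\big[\|u(f) - m(a)\|^2 \mid a\big] \\
  &= 2\,\mathbb{E}\big[\|u(f) - m(a)\|^2 \mid a\big],
\end{align*}
% The cross term vanishes because U and u(f) are conditionally independent
% given a and both have conditional mean m(a); the first and last terms agree
% because U and u(f) share the same conditional distribution.
\begin{equation*}
\underbrace{\mathbb{E}\,\|U - u(f)\|^2}_{\text{probabilistic numerics risk}}
  \;=\; 2\,\underbrace{\mathbb{E}\,\|u(f) - m(A_n(f))\|^2}_{\text{Bayes risk of the posterior mean}} .
\end{equation*}
```

Since the two risks differ only by a constant factor, any information operator $A_n$ that minimises one also minimises the other.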
Example
For the linear PDE
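$-\Delta u = f^1$ on $\Omega$
$u = f^2$ on $\partial\Omega$
we can consider a loss function
$L(u, u') = \|u - u'\|^2_{L^2(\Omega)} .$
Corollary
The points $\{\dots, x_i, \dots, x'_j, \dots\}$ are asymptotically optimal iff $h_1 \vee h_2 = O(n^{-1/2})$ where
$h_1 = \max_{x \in \Omega} \min_i \|x - x_i\|_2 , \qquad h_2 = \max_{x \in \partial\Omega} \min_j \|x - x'_j\|_2 ,$
with $x_i \in \Omega$ the points at which $f^1$ is evaluated and $x'_j \in \partial\Omega$ those at which $f^2$ is evaluated.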
Wendland (2005) Scattered Data Approximation.
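The quantities h1 and h2 are fill distances, and they can be estimated by replacing the outer maximum with a maximum over a dense sample of the domain. The sketch below does this for h1 on the unit square with a cell-centred grid of n = m² points; the domain, the design, and the sampling resolution are assumptions chosen for illustration. The same function gives h2 if it is passed samples of ∂Ω and the boundary points.

```python
import numpy as np

def fill_distance(domain_samples, points):
    """Largest distance from any domain sample to its nearest design point."""
    diffs = domain_samples[:, None, :] - points[None, :, :]
    nearest = np.linalg.norm(diffs, axis=-1).min(axis=1)
    return float(nearest.max())

def cell_centred_grid(m):
    """m x m grid of cell centres in the unit square (n = m**2 points)."""
    g = (np.arange(m) + 0.5) / m
    xx, yy = np.meshgrid(g, g)
    return np.column_stack([xx.ravel(), yy.ravel()])

# Dense sample of Omega = [0, 1]^2, including the corners.
t = np.linspace(0.0, 1.0, 51)
tx, ty = np.meshgrid(t, t)
omega_samples = np.column_stack([tx.ravel(), ty.ravel()])

# h1 * sqrt(n) should stay constant if h1 = O(n^{-1/2}).
scaled = {}
for m in (4, 8, 16):
    n = m * m
    h1 = fill_distance(omega_samples, cell_centred_grid(m))
    scaled[n] = h1 * np.sqrt(n)
    print(n, h1, scaled[n])
```

For this design the fill distance is half a cell diagonal, so h1·√n stays at 1/√2 as n grows, consistent with the O(n^{-1/2}) rate in the corollary.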
Conclusion
In Part I it has been argued that:
Efficient large-scale computation precludes popular a posteriori methods.
Probabilistic numerical methods provide a principled alternative framework.
Optimal information for Bayesian probabilistic numerical methods is not always equivalent to
optimal information in Bayesian decision theory.
Full details (Parts I and II) can be found in the preprint:
Bayesian Probabilistic Numerical Methods
Cockayne, Oates, Sullivan, Girolami (2017)
arXiv:1702.03673
Thank you for your attention!
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
The Statistical and Applied Mathematical Sciences Institute
 

Mehr von The Statistical and Applied Mathematical Sciences Institute (20)

Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
 
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
 
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
 
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
 
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
 
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
 
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
 
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
 
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
 
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
 
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
 
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
 
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
 
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
 
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
 
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
 
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
 
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
 

Kürzlich hochgeladen

Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
AnaAcapella
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
KarakKing
 

Kürzlich hochgeladen (20)

How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applied Mathematics Opening Workshop, Bayesian Probabilistic Numerical Methods (Part I) - Chris Oates, Aug 29, 2017

  • 1. Bayesian Probabilistic Numerical Methods (Part I). Chris J. Oates, Newcastle University and Alan Turing Institute. August 2017 @ SAMSI.
  • 2. The SAMSI Working Group on Probabilistic Numerics: François-Xavier Briol (Warwick), Oksana Chkrebtii (Ohio State), Jon Cockayne (Warwick), Mark Girolami (Imperial), Philipp Hennig (MPI Tübingen), Han Cheng Lie (FU Berlin), Houman Owhadi (Caltech), Florian Schaefer (Caltech), Andrew Stuart (Caltech), Tim Sullivan (FU Berlin).
  • 3. Motivation. Consider the task of solving the PDE $-\Delta u = f$ on $\Omega$, $u = 0$ on $\partial\Omega$. Given an approximate solution $u_n$ we can obtain an a posteriori error bound: $\forall \beta, y$: $\|u - u_n\|^2 \le (1 + \beta)\,\|u_n - y\|^2 + \frac{1 + \beta}{\beta}\, C_\Omega^2\, \|\Delta y + f\|^2$, where $C_\Omega = \mathrm{diam}(\Omega)$. The right-hand side, the "deviation majorant", does not involve $u$. Babuška and Rheinboldt (1978) A posteriori error estimates for the finite element method. Cited 1378. Ainsworth and Oden (2011) A posteriori error estimation in finite element analysis. Cited 2252. Problem: $\|\Delta y + f\|$ is a quadrature, and in a pipeline our computational budget will be limited.
  • 7. [Diagram: a pipeline stage "Numerical sol'n of PDE $-\Delta u = f$" with input $f$.] Computational pipelines are efficient precisely because "going back is not allowed". ⟹ a posteriori error bounds are precluded.
  • 9. Information-based complexity viewpoint: $A_n(f) = (f(x_1), \dots, f(x_n))^\top$, based on a computational budget of size $n$. Consider a numerical solution $u_n$ based on the information $A_n(f)$. Problem: It is impossible to get a computable bound on $\|u - u_n\|$ based only on $A_n(f)$.
  • 12. Proof: One cannot distinguish between $f^1, f^2 \in H^{-1}(D)$ such that $f^1 = 0$ on $D = [0, 1]$ and $f^2 = \frac{2}{(b - a)(2 - a - b)}\, 1[a < x < b]$ with $\{x_1, \dots, x_n\} \cap (a, b) = \emptyset$, since $A_n(f^1) = A_n(f^2)$. Yet these yield wildly different solutions: $u^1(x) = 0$, with $\|u^1\|_2 = 0$, and $u^2(x) = x$ for $0 < x < a$; $u^2(x) = x - \frac{(x - a)^2}{(b - a)(2 - a - b)}$ for $a < x < b$; $u^2(x) = \frac{(a + b)(1 - x)}{2 - a - b}$ for $b < x < 1$, with $\|u^2\|_2 \ge a^{3/2} / 3^{1/2}$. Moral: A posteriori error analysis requires global information on $f$, such as $\|f\|$. How to proceed when global information cannot be obtained?
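The counterexample in the proof can be checked numerically. The sketch below is illustrative only: the gap `(a, b)`, the node set and the midpoint-rule resolution are hypothetical choices, not values from the talk. It confirms that the point evaluations of $f^1$ and $f^2$ coincide, that $u^2$ satisfies $-u'' = f^2$ inside the gap, and that $\|u^2\|_2$ exceeds the stated lower bound.

```python
def f2(x, a, b):
    """Right-hand side f^2: a bump supported on the open interval (a, b)."""
    return 2.0 / ((b - a) * (2.0 - a - b)) if a < x < b else 0.0


def u2(x, a, b):
    """Piecewise solution of -u'' = f^2 on (0, 1) with u(0) = u(1) = 0."""
    if x <= a:
        return x
    if x < b:
        return x - (x - a) ** 2 / ((b - a) * (2.0 - a - b))
    return (a + b) * (1.0 - x) / (2.0 - a - b)


a, b = 0.40, 0.45                         # hypothetical gap that no node falls into
nodes = [i / 10 for i in range(11)]       # hypothetical evaluation nodes x_1, ..., x_n

info_f1 = [0.0 for _ in nodes]            # A_n(f^1), since f^1 = 0 everywhere
info_f2 = [f2(x, a, b) for x in nodes]    # A_n(f^2): every node misses the bump

# Second-difference check of -u'' = f^2 at a point inside (a, b).
x0, h = 0.42, 1e-3
residual = -(u2(x0 + h, a, b) - 2.0 * u2(x0, a, b) + u2(x0 - h, a, b)) / h**2 - f2(x0, a, b)

# L2 norm of u^2 on (0, 1) by the midpoint rule.
m = 10000
l2_norm = (sum(u2((i + 0.5) / m, a, b) ** 2 for i in range(m)) / m) ** 0.5
lower_bound = a**1.5 / 3**0.5
```

Because `info_f1 == info_f2`, any method that sees only these evaluations must return the same output for both problems, although the two solutions differ by at least `lower_bound` in norm.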
  • 16. Idea: Exploit domain-specific subjective prior belief. Indeed, if we model $f$ as a draw from a distribution $P_{f,n}$, then $f$ has statistical properties which can be leveraged. ⟹ statistical analogue of a posteriori error analysis ⟹ statistical local error indicators, etc. How to select $P_{f,n}$? To be useful, $P_{f,n}$ should depend on $A_n(f)$. A natural (Bayesian) approach: $P_{f,n} \propto P_f \times \delta_{A_n(f)}$, where $P_f$ is the "prior" and $\delta_{A_n(f)}$ the "likelihood". Randomness is used as an allegorical device to represent epistemic uncertainty.
  • 23. Bayesian Probabilistic Numerical Methods. In a Bayesian probabilistic numerical method: a prior measure $P_f$ is placed on $f$; a posterior measure $P_{f,n}$ is defined as the "restriction of $P_f$ to those functions $f$ for which $A_n(f) = a$, e.g. $A_n(f) = (f(x_1), \dots, f(x_n))^\top = a$, is satisfied" (needs to be formalised); this is equivalent to prior and posterior measures $P_u$ and $P_{u,n}[a]$ on the solution space of the PDE. ⟹ principled and general uncertainty quantification for numerical methods. ⟹ probabilistic quantification of numerical error that can be propagated forward.
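To make the "restriction of the prior to $A_n(f) = a$" concrete, here is a minimal sketch under assumptions that are not from the talk: the prior on $f$ is a zero-mean Gaussian process with a squared-exponential kernel (function `kernel`, length-scale `ell`), and the nodes and data values are hypothetical. Conditioning on exact point evaluations then gives a Gaussian posterior whose mean interpolates the data and whose variance vanishes at the nodes.

```python
import math


def kernel(x, y, ell=0.3):
    """Squared-exponential covariance (an illustrative choice of prior)."""
    return math.exp(-0.5 * ((x - y) / ell) ** 2)


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def posterior(x, nodes, a):
    """Mean and variance at x of the prior conditioned on f(nodes) = a."""
    K = [[kernel(xi, xj) for xj in nodes] for xi in nodes]
    kx = [kernel(x, xi) for xi in nodes]
    mean = sum(k * w for k, w in zip(kx, solve(K, a)))
    var = kernel(x, x) - sum(k * w for k, w in zip(kx, solve(K, kx)))
    return mean, var


nodes = [0.0, 0.5, 1.0]        # hypothetical nodes x_1, ..., x_n
a = [1.0, 1.25, 2.0]           # hypothetical information a = A_n(f)

mean_at_node, var_at_node = posterior(0.5, nodes, a)   # at a node
mean_between, var_between = posterior(0.25, nodes, a)  # between nodes
```

The remaining variance between nodes (`var_between`) is the kind of statistical error indicator that can be propagated forward through a pipeline.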
  • 28. Example. Consider again the linear PDE $-\Delta u = f^1$ on $\Omega$, $u = f^2$ on $\partial\Omega$. Place a Gaussian prior $P_u$ and condition on $A_n(f) = (\dots, f^1(\cdot), \dots, f^2(\cdot), \dots)^\top = a$. ⟹ Gaussian conditional distribution $P_{u,n}[a]$.
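As a sketch of why the conditional distribution is again Gaussian, assume (beyond what the slide states) a zero-mean Gaussian process prior $u \sim \mathcal{GP}(0, k)$ with a sufficiently smooth kernel $k$, interior nodes $x_1, \dots, x_m$, boundary nodes $x_{m+1}, \dots, x_n$, and an invertible matrix $K$; the symbols $\ell_i$, $K$ and $\kappa$ are introduced here for illustration. Since each piece of information is a linear functional of $u$, the standard Gaussian conditioning formula applies:

```latex
\begin{align*}
\ell_i(u) &= -\Delta u(x_i) \quad (i \le m), &
\ell_i(u) &= u(x_i) \quad (i > m), &
a_i &= \ell_i(u), \\
K_{ij} &= \ell_i^{x}\, \ell_j^{x'}\, k(x, x'), &
\kappa_i(x) &= \ell_i^{x'}\, k(x, x'), \\
P_{u,n}[a] &= \mathcal{GP}\!\left( \kappa(\cdot)^\top K^{-1} a,\;
  k(x, x') - \kappa(x)^\top K^{-1} \kappa(x') \right).
\end{align*}
```

Here $\ell_i^{x}$ denotes $\ell_i$ applied to the kernel in its argument $x$.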
  • 29. Outline of the Research. Bayesian Probabilistic Numerical Methods, Cockayne, Oates, Sullivan, Girolami (2017) arXiv:1702.03673. 1. Elicit the Abstract Structure . . . 2. Establish Well-Definedness, Existence and Uniqueness of $P_{u,n}[a]$. 3. Characterise the Optimal Information Operator $A_n$ (← next). 4. Algorithms to Sample from $P_{u,n}[a]$. 5. Extend to Pipelines of Computation.
  • 30. Optimal Information. Consider an information operator $A_n(f) = (f(x_1), \dots, f(x_n))^\top$. The aim is to select locations $x_1, \dots, x_n$ that are optimal in the sense that $\{x_1, \dots, x_n\} \in \arg\inf \iint L\{P_{u,n}[A_n(f)](\omega), u(f)\} \, d\omega \, dP_f$. ⟹ $L$ is a loss function on the solution space of the PDE that must be specified. $L(u, u') = |u - u'|$ corresponds to the Wasserstein metric $d(P_{u,n}, \delta(u))$. ⟹ not equivalent to the Bayes risk from decision theory. ⟹ optimal information for Bayesian probabilistic numerics ≠ Bayesian decision theory.
  • 32. Optimal Information. Consider an information operator $A_n(f) = (f(x_1), \dots, f(x_n))^\top$. The aim is to select locations $x_1, \dots, x_n$ that are optimal in the sense that $\{x_1, \dots, x_n\} \in \arg\inf \iint L\{d[A_n(f)], u(f)\} \, d\omega \, dP_f$. ⟹ $L$ is a loss function on the solution space of the PDE that must be specified. $L(u, u') = |u - u'|$ corresponds to the Wasserstein metric $d(P_{u,n}, \delta(u))$. ⟹ not equivalent to the Bayes risk from decision theory. ⟹ optimal information for Bayesian probabilistic numerics ≠ Bayesian decision theory.
  • 40. An adversary picks a card at random and our goal is to ascertain whether the suit of their card was ♥ under 0-1 loss. i.e. f ∼ uniform({♥, ♦, ♣, ♠}). Consider two possible experiments: Experiment Bayes’ risk Probabilistic numerics risk (maximum a posteriori (full posterior) point estimate) Q: Is it red? 1 4 1 4 ♥ · p(¬♥|A(♥)) 1 2 + 1 4 ♦ · p(♥|A(♦)) 1 2 + 1 4 ♣ · p(♥|A(♣)) 0 + 1 4 ♠ · p(♥|A(♠)) 0 = 1 4 Q: Is it ♠? 1 4 since“¬♥” always a posterior mode. =⇒ optimal information for Bayesian probabilistic numerics = Bayesian decision theory.
  • 41. An adversary picks a card at random and our goal is to ascertain whether the suit of their card was ♥ under 0-1 loss. i.e. f ∼ uniform({♥, ♦, ♣, ♠}). Consider two possible experiments: Experiment Bayes’ risk Probabilistic numerics risk (maximum a posteriori (full posterior) point estimate) Q: Is it red? 1 4 1 4 ♥ · p(¬♥|A(♥)) 1 2 + 1 4 ♦ · p(♥|A(♦)) 1 2 + 1 4 ♣ · p(♥|A(♣)) 0 + 1 4 ♠ · p(♥|A(♠)) 0 = 1 4 Q: Is it ♠? 1 4 1 4 ♥ · p(¬♥|A(♥)) 2 3 since“¬♥” always a posterior mode. =⇒ optimal information for Bayesian probabilistic numerics = Bayesian decision theory.
  • 42. An adversary picks a card at random and our goal is to ascertain whether the suit of their card was ♥ under 0-1 loss. i.e. f ∼ uniform({♥, ♦, ♣, ♠}). Consider two possible experiments: Experiment Bayes’ risk Probabilistic numerics risk (maximum a posteriori (full posterior) point estimate) Q: Is it red? 1 4 1 4 ♥ · p(¬♥|A(♥)) 1 2 + 1 4 ♦ · p(♥|A(♦)) 1 2 + 1 4 ♣ · p(♥|A(♣)) 0 + 1 4 ♠ · p(♥|A(♠)) 0 = 1 4 Q: Is it ♠? 1 4 1 4 ♥ · p(¬♥|A(♥)) 2 3 + 1 4 ♦ · p(♥|A(♦)) 1 3 since“¬♥” always a posterior mode. =⇒ optimal information for Bayesian probabilistic numerics = Bayesian decision theory.
  • 43. An adversary picks a card at random and our goal is to ascertain whether the suit of their card was ♥ under 0-1 loss. i.e. f ∼ uniform({♥, ♦, ♣, ♠}). Consider two possible experiments: Experiment Bayes’ risk Probabilistic numerics risk (maximum a posteriori (full posterior) point estimate) Q: Is it red? 1 4 1 4 ♥ · p(¬♥|A(♥)) 1 2 + 1 4 ♦ · p(♥|A(♦)) 1 2 + 1 4 ♣ · p(♥|A(♣)) 0 + 1 4 ♠ · p(♥|A(♠)) 0 = 1 4 Q: Is it ♠? 1 4 1 4 ♥ · p(¬♥|A(♥)) 2 3 + 1 4 ♦ · p(♥|A(♦)) 1 3 + 1 4 ♣ · p(♥|A(♣)) 1 3 since“¬♥” always a posterior mode. =⇒ optimal information for Bayesian probabilistic numerics = Bayesian decision theory.
  • 44. An adversary picks a card at random and our goal is to ascertain whether the suit of their card was ♥ under 0-1 loss. i.e. f ∼ uniform({♥, ♦, ♣, ♠}). Consider two possible experiments: Experiment Bayes’ risk Probabilistic numerics risk (maximum a posteriori (full posterior) point estimate) Q: Is it red? 1 4 1 4 ♥ · p(¬♥|A(♥)) 1 2 + 1 4 ♦ · p(♥|A(♦)) 1 2 + 1 4 ♣ · p(♥|A(♣)) 0 + 1 4 ♠ · p(♥|A(♠)) 0 = 1 4 Q: Is it ♠? 1 4 1 4 ♥ · p(¬♥|A(♥)) 2 3 + 1 4 ♦ · p(♥|A(♦)) 1 3 + 1 4 ♣ · p(♥|A(♣)) 1 3 + 1 4 ♠ · p(♥|A(♠)) 0 = 1 3 since“¬♥” always a posterior mode. =⇒ optimal information for Bayesian probabilistic numerics = Bayesian decision theory.
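  • 45. An adversary picks a card at random, i.e. $f \sim \mathrm{uniform}(\{\heartsuit, \diamondsuit, \clubsuit, \spadesuit\})$, and our goal is to ascertain whether the suit of their card was $\heartsuit$ under 0-1 loss. Consider two possible experiments, comparing the Bayes' risk (maximum a posteriori point estimate) with the probabilistic numerics risk (full posterior):
    Q: Is it red? Bayes' risk: $\tfrac14$. Probabilistic numerics risk: $\underbrace{\tfrac14}_{\heartsuit} \cdot \underbrace{p(\neg\heartsuit \mid A(\heartsuit))}_{1/2} + \underbrace{\tfrac14}_{\diamondsuit} \cdot \underbrace{p(\heartsuit \mid A(\diamondsuit))}_{1/2} + \underbrace{\tfrac14}_{\clubsuit} \cdot \underbrace{p(\heartsuit \mid A(\clubsuit))}_{0} + \underbrace{\tfrac14}_{\spadesuit} \cdot \underbrace{p(\heartsuit \mid A(\spadesuit))}_{0} = \tfrac14$.
    Q: Is it $\spadesuit$? Bayes' risk: $\tfrac14$. Probabilistic numerics risk: $\underbrace{\tfrac14}_{\heartsuit} \cdot \underbrace{p(\neg\heartsuit \mid A(\heartsuit))}_{2/3} + \underbrace{\tfrac14}_{\diamondsuit} \cdot \underbrace{p(\heartsuit \mid A(\diamondsuit))}_{1/3} + \underbrace{\tfrac14}_{\clubsuit} \cdot \underbrace{p(\heartsuit \mid A(\clubsuit))}_{1/3} + \underbrace{\tfrac14}_{\spadesuit} \cdot \underbrace{p(\heartsuit \mid A(\spadesuit))}_{0} = \tfrac13$.
    The Bayes' risk is $\tfrac14$ in both experiments since "$\neg\heartsuit$" is always a posterior mode. $\implies$ optimal information for Bayesian probabilistic numerics $\neq$ Bayesian decision theory.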
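The card example can be checked by direct enumeration. The following is a minimal sketch, not code from the talk: the function names (`posterior`, `bayes_risk`, `pn_risk`) and the string labels for the suits are assumptions made for illustration, and ties in the point estimate are resolved towards "not hearts", in line with the remark that "¬♥" is always a posterior mode.

```python
from fractions import Fraction

# Hypothetical labels for the four suits; the prior is uniform as on the slide.
SUITS = ["hearts", "diamonds", "clubs", "spades"]
PRIOR = {s: Fraction(1, 4) for s in SUITS}


def posterior(experiment, answer):
    """Posterior over suits given the yes/no answer of an experiment."""
    consistent = [s for s in SUITS if experiment(s) == answer]
    total = sum(PRIOR[s] for s in consistent)
    return {s: PRIOR[s] / total for s in consistent}


def bayes_risk(experiment):
    """Expected 0-1 loss of the MAP estimate of the event 'suit is hearts'."""
    risk = Fraction(0)
    for truth in SUITS:
        post = posterior(experiment, experiment(truth))
        p_heart = post.get("hearts", Fraction(0))
        guess_heart = p_heart > Fraction(1, 2)  # ties go to "not hearts"
        risk += PRIOR[truth] * (guess_heart != (truth == "hearts"))
    return risk


def pn_risk(experiment):
    """Expected 0-1 loss when the output is a draw from the full posterior."""
    risk = Fraction(0)
    for truth in SUITS:
        post = posterior(experiment, experiment(truth))
        p_heart = post.get("hearts", Fraction(0))
        # Probability that a posterior draw disagrees with the truth.
        p_wrong = 1 - p_heart if truth == "hearts" else p_heart
        risk += PRIOR[truth] * p_wrong
    return risk


def is_red(suit):
    return suit in ("hearts", "diamonds")


def is_spade(suit):
    return suit == "spades"


for name, experiment in [("Is it red?", is_red), ("Is it a spade?", is_spade)]:
    print(name, "Bayes risk:", bayes_risk(experiment), "PN risk:", pn_risk(experiment))
```

Under these assumptions the two experiments tie on Bayes' risk but differ on probabilistic numerics risk, so only the latter criterion expresses a preference for asking "Is it red?".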
  • 48. Average Case Analysis ↔ (∗) Bayesian Decision Theory ↔ (?) Bayesian Probabilistic Numerical Methods. (∗) Kadane and Wasilkowski (1985) Average Case ε-Complexity in Computer Science: A Bayesian View. [Figure: risk set (decision theory) and risk set (Bayesian probabilistic numerics), with contours of constant average risk; marked points: Bayes rule (decision theory) and optimal (Bayesian probabilistic numerics).] Theorem: Let $u(f)$ be the quantity of interest. Assume that $u(f)$ belongs to an inner-product space with associated norm $\|\cdot\|$ and consider the canonical loss $L(u(f), u(f')) = \|u(f) - u(f')\|^2$. Then optimal information for Bayesian probabilistic numerics = Bayesian decision theory (= average case analysis).
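One way to see why squared-error loss makes the two notions of optimality coincide is the following sketch. It is the standard variance argument and not necessarily the proof given in the preprint. The notation is introduced here for illustration: $a = A(f)$ is the observed information, $u = u(f)$ is the true quantity of interest, $u'$ is a draw from the posterior that is conditionally independent of $u$ given $a$ with the same conditional distribution, and $\bar{u}_a = \mathbb{E}[u \mid a]$ is the posterior mean, which is the Bayes rule under this loss.

```latex
\begin{align*}
\mathbb{E}\bigl[\|u' - u\|^2 \mid a\bigr]
  &= \mathbb{E}\bigl[\|u' - \bar{u}_a\|^2 \mid a\bigr]
   + \mathbb{E}\bigl[\|u - \bar{u}_a\|^2 \mid a\bigr]
   - 2\,\mathbb{E}\bigl[\langle u' - \bar{u}_a,\; u - \bar{u}_a \rangle \mid a\bigr] \\
  &= 2\,\mathbb{E}\bigl[\|u - \bar{u}_a\|^2 \mid a\bigr].
\end{align*}
```

The cross term vanishes because $u'$ and $u$ are conditionally independent given $a$ and both have conditional mean $\bar{u}_a$. Averaging over $a$, the probabilistic numerics risk is twice the Bayes risk of the posterior mean, so under these assumptions the same information operator minimises both.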
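  • 51. Example: For the linear PDE $-\Delta u = f_1$ on $\Omega$, $u = f_2$ on $\partial\Omega$, we can consider a loss function $L(u, u') = \|u - u'\|^2_{L^2(\Omega)}$. Corollary: The points $\{\ldots\}$ are asymptotically optimal iff $h_1 \vee h_2 = O(n^{-1/2})$, where $h_1 = \max_{x \in \Omega} \min \|x - \cdot\|_2$ and $h_2 = \max_{x \in \partial\Omega} \min \|x - \cdot\|_2$, each inner minimum being taken over the selected points. Wendland (2005) Scattered Data Approximation.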
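The quantity $h_1$ is a fill distance in the sense of Wendland (2005). The sketch below approximates it for one simple, hypothetical design that is not taken from the talk: $\Omega = [0,1]^2$ with $n = m^2$ points at the centres of an $m \times m$ grid of cells. The helper names (`fill_distance`, `centred_grid`, `dense_grid`) are assumptions made for illustration, the maximum over $\Omega$ is replaced by a maximum over a finite candidate grid, and the boundary term $h_2$ is not treated.

```python
import math


def fill_distance(points, candidates):
    """Largest distance from any candidate location to its nearest design point."""
    return max(min(math.dist(x, p) for p in points) for x in candidates)


def centred_grid(m):
    """m*m design points at the centres of an m-by-m partition of the unit square."""
    return [((i + 0.5) / m, (j + 0.5) / m) for i in range(m) for j in range(m)]


def dense_grid(k):
    """(k+1)*(k+1) candidate locations used to approximate the max over the domain."""
    return [(i / k, j / k) for i in range(k + 1) for j in range(k + 1)]


candidates = dense_grid(64)
for m in (2, 4, 8):
    points = centred_grid(m)
    n = len(points)
    h = fill_distance(points, candidates)
    # If h = O(n^{-1/2}), then h * sqrt(n) should stay bounded as n grows.
    print(f"n = {n:3d}  h = {h:.4f}  h * sqrt(n) = {h * math.sqrt(n):.4f}")
```

For this design the product $h \sqrt{n}$ stays constant, so the interior fill distance decays at the $n^{-1/2}$ rate that appears in the corollary for a two-dimensional domain.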
  • 52. Conclusion. In Part I it has been argued that: (i) efficient large-scale computation precludes popular a posteriori methods; (ii) probabilistic numerical methods provide a principled alternative framework; (iii) optimal information for Bayesian probabilistic numerical methods is not always equivalent to optimal information in Bayesian decision theory. Full details (Parts I and II) can be found in the preprint: Cockayne, Oates, Sullivan, Girolami (2017) Bayesian Probabilistic Numerical Methods. arXiv:1702.03673. Thank you for your attention!