The document describes how to use k-nearest neighbours and KD-trees to create a computationally cheap approximation ˆπa of an expensive-to-evaluate target distribution π. This approximation enables delayed acceptance within a Metropolis-Hastings or pseudo-marginal Metropolis-Hastings algorithm, reducing the computational cost per iteration. Specifically, it describes:
1) Using an inverse-distance-weighted average of the k nearest neighbours' π values to define the approximation ˆπa.
2) How delayed acceptance preserves the stationary distribution while mixing somewhat more slowly than standard MH.
3) Storing the evaluated π values in a KD-tree to enable fast look-up of the k nearest neighbours.
Fast generic MCMC for targets with expensive likelihoods
C. Sherlock (Lancaster), A. Golightly (Ncl) & D. Henderson (Ncl); this talk by: Jere Koskela (Warwick)
Monte Carlo Techniques, Paris, July 2016
Motivation
Metropolis-Hastings (MH) algorithms create a realisation θ(1), θ(2), . . . from a Markov chain with stationary density π(θ).
For each θ(i) we evaluate π(θ(i)) - and then discard it.
Pseudo-marginal MH algorithms create a realisation of ˆπ(θ(i), u(i)) - and then discard it.
In many applications evaluating π(θ) or ˆπ(θ, u) is computationally expensive.
We would like to reuse these values to create a more efficient MH algorithm that still targets the correct stationary distribution.
We will focus on the [pseudo-marginal] random walk Metropolis (RWM) with a Gaussian proposal:

    θ′ | θ ∼ N(θ, λ²V).
This talk
Creating an approximation to π(θ′): k-NN.
Using the approximation: delayed-acceptance [PsM]MH.
Storing the values: KD-trees.
Algorithm: adaptive da-[PsM]MH; ergodicity.
Choice of β = P(fixed kernel).
Simulation study.
Creating an approximation
At iteration n of the MH we have π(θ(1)), π(θ(2)), . . . , π(θ(n)), and we would like to create ˆπa(θ′) ≈ π(θ′).
A Gaussian process fitted to log π? [problems with the cost of fitting and evaluating as n ↑]
A weighted average of the k-nearest-neighbour π values:
(i) Fitting cost: 0 (actually O(n⁰), i.e. O(1)).
(ii) Per-iteration cost: O(n).
(iii) Accuracy ↑ with n.
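As a concrete sketch of the weighted average above (not the authors' code; the function name, the choice of Euclidean distance, and the tie-breaking constant `eps` are illustrative assumptions), a brute-force O(n) version might look like:

```python
import math

def pi_hat_a(theta, thetas, pis, k=5, eps=1e-12):
    """Inverse-distance-weighted average of the k nearest stored pi values.

    thetas: list of previously visited points (tuples of length d).
    pis:    list of the corresponding (expensive) pi evaluations.
    This is the brute-force O(n) look-up; the KD-tree later in the talk
    brings it down to O(log n).
    """
    dists = [math.dist(theta, t) for t in thetas]
    nearest = sorted(range(len(thetas)), key=lambda i: dists[i])[:k]
    weights = [1.0 / (dists[i] + eps) for i in nearest]  # inverse-distance weights
    num = sum(w * pis[i] for w, i in zip(weights, nearest))
    return num / sum(weights)
```

When θ coincides with a stored point, the inverse-distance weight dominates and ˆπa returns (almost exactly) the stored π value, so the approximation interpolates the table.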
Delayed acceptance MH (1)
(Christen and Fox, 2005). Suppose we have a computationally cheap approximation to the posterior, ˆπa(θ).
Define αda(θ; θ′) := α1(θ; θ′) α2(θ; θ′), where

    α1 := 1 ∧ [ˆπa(θ′) q(θ | θ′)] / [ˆπa(θ) q(θ′ | θ)]   and   α2 := 1 ∧ [π(θ′)/ˆπa(θ′)] / [π(θ)/ˆπa(θ)].

Detailed balance (with respect to π(θ)) is still preserved with αda because

    π(θ) q(θ′ | θ) αda(θ; θ′) = [ˆπa(θ) q(θ′ | θ) α1] × [π(θ)/ˆπa(θ)] α2,

and each factor is symmetric in θ and θ′: the first by the standard MH argument applied to ˆπa, the second by the same argument applied to the ratio π/ˆπa.
But this algorithm mixes worse than the equivalent MH algorithm (Peskun, 1973; Tierney, 1998).
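A quick numerical sanity check of the detailed-balance identity (a sketch; the toy target, approximation, and proposal densities are illustrative choices, not from the talk):

```python
import math

def alpha_da(x, y, pi, pi_a, q):
    """Two-stage delayed-acceptance probability alpha1 * alpha2.
    q(a, b) = proposal density of b given a."""
    a1 = min(1.0, (pi_a(y) * q(y, x)) / (pi_a(x) * q(x, y)))
    a2 = min(1.0, (pi(y) / pi_a(y)) / (pi(x) / pi_a(x)))
    return a1 * a2

# Toy target, crude approximation, and a symmetric Gaussian-type proposal.
pi = lambda t: math.exp(-t * t / 2.0)
pi_a = lambda t: math.exp(-abs(t))
q = lambda a, b: math.exp(-((b - a) ** 2))

# Detailed balance: pi(x) q(x,y) alpha_da(x,y) == pi(y) q(y,x) alpha_da(y,x).
x, y = 0.3, 1.1
lhs = pi(x) * q(x, y) * alpha_da(x, y, pi, pi_a, q)
rhs = pi(y) * q(y, x) * alpha_da(y, x, pi, pi_a, q)
assert abs(lhs - rhs) < 1e-12
```

The identity holds however crude πa is; accuracy of πa affects only how often Stage 2 rejects, not the stationary distribution.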
Delayed-acceptance [PsM]MH (2)
Using αda = α1(θ; θ′) α2(θ; θ′) mixes worse, but CPU time per iteration can be much reduced.
Accept ⇔ accept at Stage 1 (w.p. α1) and accept at Stage 2 (w.p. α2).
α1 is quick to calculate.
There is no need to calculate α2 if we reject at Stage 1.
If ˆπa is accurate then α2 ≈ 1.
If ˆπa is also cheap then the (RWM) algorithm can afford large jump proposals: bold proposals that Stage 1 rejects cost almost nothing, since no expensive evaluation is made.
Delayed-acceptance PMMH:

    α2 := 1 ∧ [ˆπ(θ′, u′)/ˆπa(θ′)] / [ˆπ(θ, u)/ˆπa(θ)].
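A minimal one-dimensional sketch of this two-stage step (our illustration, not the authors' code; it assumes log densities and a symmetric Gaussian proposal, and the expensive `log_pi` is only called once Stage 1 accepts):

```python
import math, random

def da_rwm_step(theta, log_pi, log_pi_a, lam):
    """One delayed-acceptance RWM step; returns the next state of the chain."""
    prop = theta + lam * random.gauss(0.0, 1.0)        # symmetric proposal
    # Stage 1: cheap screen using the approximation only.
    a1 = math.exp(min(0.0, log_pi_a(prop) - log_pi_a(theta)))
    if random.random() >= a1:
        return theta                                    # cheap rejection
    # Stage 2: correct using the expensive target (first expensive call).
    a2 = math.exp(min(0.0, (log_pi(prop) - log_pi_a(prop))
                         - (log_pi(theta) - log_pi_a(theta))))
    return prop if random.random() < a2 else theta
```

The chain still targets exp(log_pi); the approximation only decides which proposals are worth an expensive evaluation.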
Cheap and accurate approximation?
We use an inverse-distance-weighted average of the π values from the k nearest neighbours.
But the cost of each look-up is still O(n).
k-nn and the binary tree
Imagine a table with n values:

    θ1(1)  θ2(1)  · · ·  θd(1)  |  π(θ(1))
    θ1(2)  θ2(2)  · · ·  θd(2)  |  π(θ(2))
    · · ·
    θ1(n)  θ2(n)  · · ·  θd(n)  |  π(θ(n))

Look-up of the k nearest neighbours to some θ′ is O(n).
If d = 1 we could sort the list or create a standard binary tree, splitting at the median, then at the quartiles (q1, q3), then at the octiles (oct1, oct3, oct5, oct7), and so on, for O(log n) look-up.
For d > 1 a solution is the KD-tree.
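To make the KD-tree idea concrete, here is a minimal pure-Python sketch (our illustration, not the authors' generic C code; the node layout and names are assumptions): build by splitting on the median of one coordinate per level, cycling through the d coordinates, and answer k-NN queries by descending the near side first and pruning the far side when the splitting plane is further than the current k-th best distance.

```python
import heapq

def build_kdtree(points, depth=0):
    """Build a KD-tree from a list of (theta, pi) pairs, theta a d-tuple."""
    if not points:
        return None
    d = len(points[0][0])
    axis = depth % d
    points = sorted(points, key=lambda p: p[0][axis])  # split on the median
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def knn(node, theta, k, heap=None):
    """k nearest neighbours of theta; O(log n) on a balanced tree.
    Maintains a max-heap (via negated distances) of the k best so far."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    pt, axis = node["point"], node["axis"]
    dist = sum((a - b) ** 2 for a, b in zip(theta, pt[0])) ** 0.5
    if len(heap) < k:
        heapq.heappush(heap, (-dist, pt))
    elif dist < -heap[0][0]:
        heapq.heapreplace(heap, (-dist, pt))
    diff = theta[axis] - pt[0][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    knn(near, theta, k, heap)
    # Descend the far side only if the splitting plane is closer than the
    # current k-th best distance (or we still have fewer than k candidates).
    if len(heap) < k or abs(diff) < -heap[0][0]:
        knn(far, theta, k, heap)
    return heap
```

A balanced tree gives O(log n) queries; inserting points as the chain runs (as in the adaptive scheme below) requires occasional rebalancing or leaf-splitting, which this sketch omits.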
KD-tree (d = 2)
[Figure: a KD-tree over d = 2 coordinates; the root splits on m1 := m{θ1}, the median of the first coordinate.]
m{S} = a branch node splitting the set S at the median of the coordinate indicated by d-split.
[L] = a leaf node holding at most 2b − 1 points.
The KD-tree is useful if n/(3b/2) > 2^d.
Adaptive k-nn using a KD-tree
Initial, training run of n0 iterations. Build the initial KD-tree.
Main run: 'every time' π(θ′) is evaluated, add (θ′, π(θ′)) to the KD-tree.
Set-up is O(n0 (log n0)²); updating is O(log n); evaluation is O(log n); and accuracy ↑ as the MCMC progresses.
(Provided the tree stays balanced.)
Refinements
Training dataset ⇒ a better distance metric: transform θ to be approximately isotropic before creating the tree or adding a new node.
Minimum distance ε: if ∃ θ in the tree s.t. ‖θ′ − θ‖ < ε then
(i) MH: ignore the new value.
(ii) PMMH: combine ˆπ(y | θ′, u′) with ˆπ(y | θ, u) (running average).
Adaptive Algorithm
Components:
A fixed [pseudo-marginal] kernel P([θ, u]; [dθ′, du′]).
An adaptive [pseudo-marginal] DA kernel Pγ([θ, u]; [dθ′, du′]).
Both P and Pγ generate ˆπ(θ′, u′) in the same way.
A fixed probability β ∈ (0, 1].
A sequence of probabilities pn → 0.
Algorithm: at the start of iteration n, the chain is at [θ, u] and the DA kernel would be Pγ.
1. Sample [θ′, u′] from P w.p. β, or from Pγ w.p. 1 − β.
2. W.p. pn 'choose a new γ': update the kernel to include all relevant information gathered since the kernel was last updated.
Set pn = 1/(1 + c·in), where in = # expensive evaluations so far.
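The two numbered steps can be sketched as a generator (a sketch only; `fixed_step`, `make_da_step`, and `refresh` are hypothetical callables standing in for P, Pγ, and the information-gathering step):

```python
import random

def adaptive_da_chain(theta0, fixed_step, make_da_step, refresh, beta, c, n_iters):
    """Skeleton of the adaptive scheme above (names are illustrative).
    fixed_step(theta)  -> (theta', n_expensive)  # fixed kernel P
    make_da_step(info) -> da_step                # builds P_gamma from stored info
    refresh()          -> info                   # all info since last update
    """
    theta, i_n = theta0, 0
    da_step = make_da_step(refresh())
    for n in range(n_iters):
        if random.random() < beta:
            theta, used = fixed_step(theta)      # fixed kernel, w.p. beta
        else:
            theta, used = da_step(theta)         # adaptive DA kernel, w.p. 1 - beta
        i_n += used                              # count expensive evaluations
        p_n = 1.0 / (1.0 + c * i_n)              # diminishing adaptation
        if random.random() < p_n:
            da_step = make_da_step(refresh())    # 'choose a new gamma'
        yield theta
```

Because pn → 0 as the expensive-evaluation count grows, adaptation slows down over time, which is what the ergodicity argument below relies on.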
Ergodicity
Assumptions on the fixed kernel:
1. Minorisation: there is a density ν(θ) and δ > 0 such that q(θ′ | θ) α(θ; θ′) > δ ν(θ′), where α is the acceptance rate from the idealised version of the fixed MH algorithm.
2. Bounded weights: the support of W := ˆπ(θ, U)/π(θ) is uniformly (in θ) bounded above by some w < ∞.
Theorem: subject to Assumptions 1 and 2, the adaptive pseudo-marginal algorithm is ergodic.
NB: for DAMH, as opposed to DAPMMH, only the minorisation assumption is required.
Choice of β
DARWM is more efficient when λ > ˆλRWM; the resulting lower α1 does not hurt mixing.
But low α1 ⇒ fewer expensive evaluations of π [or of ˆπ]:

    P(expensive) = β + (1 − β) α1.

If α1 ≪ 1, most of the expensive evaluations come from the fixed kernel and much of the benefit of the adaptive kernel is lost.
Fixing β ∝ α1 (with α1 obtained from the training run) avoids this, but the guaranteed worst-case total-variation distance (TVD) from π after n iterations gets larger.
Consider instead a fixed computational budget ≈ a fixed number of expensive evaluations. This preserves the provable worst-case TVD from π.
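To see the effect numerically (a trivial sketch; the function name is ours):

```python
def p_expensive(beta, alpha1):
    """Probability that an iteration triggers an expensive evaluation:
    w.p. beta the fixed kernel is used (always expensive); otherwise the
    DA kernel is used and is expensive only when Stage 1 accepts (w.p. alpha1)."""
    return beta + (1.0 - beta) * alpha1
```

For example, with α1 = 0.01 and β = 0.1, P(expensive) = 0.1 + 0.9 × 0.01 = 0.109, so roughly 92% of the expensive evaluations come from the fixed kernel; shrinking β towards the order of α1 restores the balance.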
Examples
Lotka-Volterra MJP: daPMRWM with d = 5.
LNA approximation to an autoregulatory system: daRWM with d = 10.
RWM: θ′ ∼ N(θ, λ² ˆΣ), where ˆΣ is obtained from the training run (which also gives the pre-map).
Scaling λ [and number of particles, m] chosen to be optimal for the RWM.
n0 = 10000 (from the training run), b = 10, c = 0.001.

    Efficiency = (min over j = 1, . . . , d of ESSj) / CPU time.
Results: LV
RelESS = efficiency of DA[PM]RWM / efficiency of the optimal RWM.
[Figure: two panels against ξ ∈ [1, 4]; left panel RelESS (roughly 3 to 7), right panel α1 (roughly 0.00 to 0.20).]
ξ = 1 corresponds to the DA using the scaling that is optimal for the standard RWM algorithm, i.e. DA scaling = ξ × ˆλRWM.
LV: further experiments

    c        Tree size   ˆα1       ˆα2     Rel. mESS
    0.0001   41078       0.00772   0.339   7.28
    0.001    40256       0.00915   0.276   6.80
    0.01     43248       0.0121    0.204   4.67
    ∞        10000       0.0175    0.136   3.46

Using a list rather than the KD-tree reduced the efficiency by a factor of ≈ 2.
Summary
Store the expensive evaluations of π or ˆπ in a KD-tree.
Use a delayed-acceptance [PsM]RWM algorithm.
The adaptive algorithm converges, subject to conditions.
Need β ∝ α1 for the adaptive portion to play a part.
Improvement in efficiency by a factor of between 4 and 7.
Easy-to-use, generic C code for the KD-tree is available.
Limitations: need n ≫ 2^d for the KD-tree to give a worthwhile speedup and for adequate coverage.
Other: could use the k nearest neighbours to estimate gradient and curvature; even to fit a local GP?
Thank you!