Stability of adaptive random-walk Metropolis algorithms

Matti Vihola
matti.vihola@iki.fi

University of Jyväskylä, Department of Mathematics and Statistics

Joint work with Eero Saksman (University of Helsinki)

Paris, May 5th 2011





Adaptive MCMC



Adaptive MCMC algorithms have been successfully applied to 'automatise' the tuning of proposal distributions.
In general, they are based on a collection of Markov kernels {Pθ}θ∈Θ, each having π as its unique invariant distribution.
The aim is to find a good kernel (a good value of θ) automatically.
The stochastic approximation framework is the most common way to construct such algorithms [Andrieu and Robert, 2001].






Stochastic approximation framework

One starts with certain (X1, Θ1) ≡ (x1, θ1) ∈ Rd × Re and, for n ≥ 2,

    Xn ∼ PΘn−1(Xn−1, · )
    Θn = Θn−1 + γn H(Θn−1, Xn),

where (γn)n≥2 is a non-negative step size sequence that vanishes, and the function H : Re × Rd → Re determines the adaptation.
The adaptation aims at Θn → θ∗ such that h(θ∗) = 0, where

    h(θ) := ∫ H(θ, x) π(dx).
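In plain Python, the generic loop looks roughly as follows; this is only a sketch, where sample_kernel(theta, x) is a hypothetical stand-in for drawing from Pθ(x, ·) and H is the adaptation function:

    import numpy as np

    def stochastic_approximation(sample_kernel, H, x1, theta1, n_iter,
                                 gamma=lambda n: n ** -0.75):
        """Generic adaptive MCMC loop of the stochastic approximation type."""
        x, theta = np.asarray(x1, float), np.asarray(theta1, float)
        samples = [x.copy()]
        for n in range(2, n_iter + 1):
            x = np.asarray(sample_kernel(theta, x), float)  # X_n ~ P_{Theta_{n-1}}(X_{n-1}, .)
            theta = theta + gamma(n) * H(theta, x)          # Theta_n = Theta_{n-1} + gamma_n H(.)
            samples.append(x.copy())
        return np.array(samples), theta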




Stability? I


Usually, one can identify a Lyapunov function w : Re → [0, ∞) such that

    ⟨∇w(θ), h(θ)⟩ < 0    whenever h(θ) ≠ 0.

This suggests that, "on average", there should be a "drift" of Θn towards a θ∗ such that h(θ∗) = 0.
To have sufficient "averaging", the sequence of Markov kernels PΘn should have nice properties.
To have nice properties on PΘn, the sequence (Θn)n≥1 should behave nicely. . .




Stability? II



The usual way to overcome the stability issue is to enforce stability, or to use the adaptive reprojections approach of Andrieu and Moulines [2006], Andrieu, Moulines, and Priouret [2005].
There is a strong belief (based on overwhelming empirical evidence) that many algorithms are intrinsically stable (and do not require stabilisation).






Adaptive Metropolis algorithm I

The AM algorithm [Haario, Saksman, and Tamminen, 2001]:
    Define X1 ≡ x1 ∈ Rd such that π(x1) > 0.
    Choose parameters t > 0 and ε ≥ 0.
For n = 2, 3, . . .

    Yn = Xn−1 + Σn−1^{1/2} Wn,   where Wn ∼ N(0, I), and
    Xn = Yn with probability α(Xn−1, Yn), and Xn = Xn−1 otherwise,

where Σn−1 := t² Cov(X1, . . . , Xn−1) + εI, and I ∈ Rd×d stands for the identity matrix.
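Concretely, one AM iteration could be sketched as below; log_pi and rng are assumed helpers (an unnormalised log-target and a NumPy Generator), and the acceptance probability is the usual Metropolis ratio α(x, y) = min{1, π(y)/π(x)}:

    import numpy as np

    def am_step(log_pi, x, Sigma, rng):
        """One random-walk Metropolis step with proposal covariance Sigma."""
        w = rng.standard_normal(x.size)
        y = x + np.linalg.cholesky(Sigma) @ w        # Y_n = X_{n-1} + Sigma_{n-1}^{1/2} W_n
        log_alpha = min(0.0, log_pi(y) - log_pi(x))  # log alpha(X_{n-1}, Y_n)
        return y if np.log(rng.random()) < log_alpha else x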




Adaptive Metropolis algorithm II


Cov(· · · ) is some consistent covariance estimator (not necessarily the standard sample covariance!).
In what follows, consider the definition Cov(X1, . . . , Xn) := Sn, where (M1, S1) ≡ (x1, s1) with s1 ∈ Rd×d symmetric and positive definite, and

    Mn = (1 − γn)Mn−1 + γn Xn
    Sn = (1 − γn)Sn−1 + γn (Xn − Mn−1)(Xn − Mn−1)^T.
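A sketch of this recursion (function and variable names are mine):

    import numpy as np

    def update_mean_cov(m, s, x, gamma):
        """M_n = (1-g) M_{n-1} + g X_n and
        S_n = (1-g) S_{n-1} + g (X_n - M_{n-1})(X_n - M_{n-1})^T."""
        dx = x - m                                           # deviation from the previous mean
        s_new = (1.0 - gamma) * s + gamma * np.outer(dx, dx)
        m_new = m + gamma * dx                               # equals (1-g) m + g x
        return m_new, s_new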






Example run of AM

[Three scatter plots: the sampled points after 10, 100 and 1000 iterations.]

Figure: The first 10, 100 and 1000 samples of the AM algorithm started with s1 = (0.01)² I, t = 2.38/√2, ε = 0 and ηn := n−1.




Adaptive scaling Metropolis algorithm I



This algorithm, essentially proposed by Gilks, Roberts, and Sahu [1998] and Andrieu and Robert [2001], adjusts the size of the proposal jumps, trying to attain a given mean acceptance rate.
    Define X1 ≡ x1 ∈ Rd such that π(x1) > 0.
    Let Θ1 ≡ θ1 > 0.
    Define the desired mean acceptance rate α∗ ∈ (0, 1). (Usually α∗ = 0.44 or α∗ = 0.234. . . )






Adaptive scaling Metropolis algorithm II


For n = 2, 3, . . ., iterate

    Yn = Xn−1 + Θn−1 Wn,   where Wn ∼ N(0, I), and
    Xn = Yn with probability α(Xn−1, Yn), and Xn = Xn−1 otherwise,
    log Θn = log Θn−1 + γn (α(Xn−1, Yn) − α∗).
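A sketch of one ASM iteration, reusing the assumed log_pi helper from the AM sketch:

    import numpy as np

    def asm_step(log_pi, x, theta, gamma, rng, alpha_star=0.234):
        y = x + theta * rng.standard_normal(x.size)       # Y_n = X_{n-1} + Theta_{n-1} W_n
        alpha = np.exp(min(0.0, log_pi(y) - log_pi(x)))   # alpha(X_{n-1}, Y_n)
        x_new = y if rng.random() < alpha else x
        theta_new = theta * np.exp(gamma * (alpha - alpha_star))  # log-scale update
        return x_new, theta_new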






Example run of ASM

[Three scatter plots: the sampled points after 10, 100 and 1000 iterations.]

Figure: The first 10, 100 and 1000 samples of the ASM algorithm started with θ1 = 0.01 and using γn = n−3/4.




Adaptive scaling within adaptive Metropolis

The AM and ASM algorithms can be used simultaneously [Atchadé and Fort, 2010, Andrieu and Thoms, 2008].

    Yn = Xn−1 + Θn−1 Σn−1^{1/2} Wn,   where Wn ∼ N(0, I), and
    Xn = Yn with probability α(Xn−1, Yn), and Xn = Xn−1 otherwise,
    log Θn = log Θn−1 + γn (α(Xn−1, Yn) − α∗)
    Mn = (1 − γn)Mn−1 + γn Xn
    Sn = (1 − γn)Sn−1 + γn (Xn − Mn−1)(Xn − Mn−1)^T
    Σn = t² Sn + εI.
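Combining the sketches above, one iteration of this scheme might read (a sketch with assumed names; t and eps are the AM parameters):

    import numpy as np

    def aswam_step(log_pi, x, theta, m, s, gamma, t, rng,
                   eps=1e-6, alpha_star=0.234):
        sigma = t ** 2 * s + eps * np.eye(x.size)        # Sigma_{n-1} = t^2 S_{n-1} + eps I
        w = rng.standard_normal(x.size)
        y = x + theta * np.linalg.cholesky(sigma) @ w    # scaled AM proposal
        alpha = np.exp(min(0.0, log_pi(y) - log_pi(x)))
        x_new = y if rng.random() < alpha else x
        theta_new = theta * np.exp(gamma * (alpha - alpha_star))
        dx = x_new - m
        m_new = m + gamma * dx
        s_new = (1.0 - gamma) * s + gamma * np.outer(dx, dx)
        return x_new, theta_new, m_new, s_new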





Robust adaptive Metropolis algorithm I



   The RAM algorithm is a multidimensional ‘extension’ of the ASM
   adaptation mechanism [Vihola, 2010b].
       Define X1 ≡ x1 ∈ Rd such that π(x1 ) > 0.
       Let Σ1 ≡ s1 ∈ Rd×d be symmetric and positive definite.
       Define the desired mean acceptance rate α∗ ∈ (0, 1).






Robust adaptive Metropolis algorithm II


For n = 2, 3, . . ., iterate

    Yn = Xn−1 + Σn−1^{1/2} Wn,   where Wn ∼ N(0, I), and
    Xn = Yn with probability α(Xn−1, Yn), and Xn = Xn−1 otherwise,
    Σn = Σn−1 + γn (α(Xn−1, Yn) − α∗) Σn−1^{1/2} (Wn Wn^T / ‖Wn‖²) (Σn−1^{1/2})^T,

where Σn will be symmetric and positive definite whenever γn (α(Xn−1, Yn) − α∗) > −1.
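A sketch of the covariance update (names are mine; in practice one maintains a Cholesky factor and performs a rank-one update of order O(d²), cf. Vihola [2010b]):

    import numpy as np

    def ram_update(sigma, w, alpha, gamma, alpha_star=0.234):
        """Sigma_n = Sigma_{n-1} + g (alpha - alpha*) L (w w^T / |w|^2) L^T,
        where L L^T = Sigma_{n-1} (a Cholesky factorisation)."""
        L = np.linalg.cholesky(sigma)
        u = (L @ w) / np.linalg.norm(w)       # u u^T = L (w w^T / |w|^2) L^T
        eta = gamma * (alpha - alpha_star)    # eta > -1 keeps Sigma_n positive definite
        return sigma + eta * np.outer(u, u)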





Robust adaptive Metropolis algorithm III




Figure: The RAM update with γn (α(Xn−1, Yn) − α∗) = ±0.8, resp. The ellipsoids defined by Σn−1 (Σn) are drawn in solid (dashed), and the vector Σn−1^{1/2} Wn / ‖Wn‖ as a dot.





Example run of RAM

[Three scatter plots: the sampled points after 10, 100 and 1000 iterations.]

Figure: The first 10, 100 and 1000 samples of the RAM algorithm started with Σ1 = 0.01 × I and using γn = min{1, 2 · n−3/4}.




RAM vs. ASwAM



Computationally, they are of similar complexity (Cholesky update of order O(d²)).
The RAM adaptation does not require that π has a finite second moment (or any moment at all).
In RAM, the adaptation is 'direct', while estimating the covariance is 'indirect' (as it also requires a mean estimate).
There are also situations where ASwAM seems to be more efficient than RAM. . .






Previous conditions on validity I

What conditions are sufficient to guarantee that a law of large numbers (LLN) holds:

    (1/n) Σ_{k=1}^{n} f(Xk) → ∫ f(x)π(x)dx   as n → ∞?

Key conditions due to Roberts and Rosenthal [2007]:

Diminishing adaptation (DA): The effect of the adaptation becomes smaller and smaller,

    "‖PΘn − PΘn−1‖ → 0   as n → ∞."

Containment (C): The kernels PΘn have all the time sufficiently good mixing properties:

    sup_{n≥1} sup_{k≥m} ‖PΘn^k(Xn, · ) − π‖ → 0   as m → ∞.




Previous conditions on validity II


DA is usually easier to verify.
Containment is often tightly related to the stability of the process (Θn)n≥1.
Establishing containment (or stability) is often difficult before showing that a LLN holds. . .
The typical solution has been to enforce containment.
    In the case of the AM, ASM and RAM algorithms, this means that the eigenvalues of Σn, Θn are constrained to be within [a, b] for some constants 0 < a ≤ b < ∞, as sketched below.
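For illustration, such a constraint on a covariance parameter could be enforced by projecting its spectrum after every update (a sketch, not taken from the cited papers):

    import numpy as np

    def project_spectrum(sigma, a, b):
        """Clip the eigenvalues of a symmetric matrix into [a, b]."""
        vals, vecs = np.linalg.eigh(sigma)
        return (vecs * np.clip(vals, a, b)) @ vecs.T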






Previous conditions on validity III

There are also other results in the literature on the ergodicity of adaptive MCMC given conditions similar to DA and C.
    The original work on AM [Haario, Saksman, and Tamminen, 2001].
    Atchadé and Rosenthal [2005] analyse the ASM algorithm following the original mixingale approach.
    The results on the adaptive reprojections approach: Andrieu and Moulines [2006], Andrieu, Moulines, and Priouret [2005].
    The recent paper by Atchadé and Fort [2010] is based on a resolvent kernel approach.





New results I



   Key ideas in the new results:
       Containment is, in fact, not a necessary condition for LLN.
       That is, the ergodic properties of PΘn can become more and
       more unfavourable, and still a (strong) LLN can hold.
       The adaptation mechanism may include a strong explicit
       ‘drift’ away from ‘bad values’ of θ.






General ergodicity result I


Assume K1 ⊂ K2 ⊂ · · · ⊂ Re are increasing subsets of the adaptation space. Assume that the adaptation follows the stochastic approximation dynamic

    Xn ∼ PΘn−1(Xn−1, · )
    Θn = Θn−1 + γn H(Θn−1, Xn),

and that the adaptation parameter satisfies Θn ∈ Kn for all n ≥ 1.
    One may also enforce Θn ∈ Kn . . .
Consider the following conditions, for constants c < ∞ and 0 ≤ ε ≪ 1:




General ergodicity result II

(A1) Drift and minorisation:
    There is a drift function V : Rd → [1, ∞) such that for all n ≥ 1 and all θ ∈ Kn

        Pθ V(x) ≤ λn V(x) + 1Cn(x) bn   and   (1)
        Pθ(x, A) ≥ 1Cn(x) δn νn(A),   (2)

    where Cn ⊂ Rd are Borel sets, δn, λn ∈ (0, 1) and bn < ∞ are constants, and νn is concentrated on Cn. Furthermore, the constants are polynomially bounded so that

        (1 − λn)^{−1} ∨ δn^{−1} ∨ bn ≤ c n^ε.






General ergodicity result III


(A2) Continuity:
    For all n ≥ 1 and any r ∈ (0, 1], there is c' = c'(r) ≥ 1 such that for all θ, θ' ∈ Kn,

        ‖Pθ f − Pθ' f‖_{V^r} ≤ c' n^ε ‖f‖_{V^r} |θ − θ'|.

(A3) Bound for the adaptation function:
    There is a β ∈ [0, 1/2] such that for all n ≥ 1, θ ∈ Kn and x ∈ Rd,

        |H(θ, x)| ≤ c n^ε V^β(x).





General ergodicity result IV


Theorem (Saksman and Vihola [2010])
Assume (A1)–(A3) hold and let f be a function with ‖f‖_{V^α} < ∞ for some α ∈ (0, 1 − β). Assume ε < κ∗^{−1} [(1/2) ∧ (1 − α − β)], where κ∗ ≥ 1 is an independent constant, and that Σ_{k=1}^{∞} k^{κ∗ε−1} γk < ∞. Then,

    lim_{n→∞} (1/n) Σ_{k=1}^{n} f(Xk) = ∫ f(x)π(x)dx   almost surely.






Verifiable assumptions I

Definition (Strongly super-exponential target)
The target density π is continuously differentiable and has regular tails that decay super-exponentially,

    lim sup_{|x|→∞} (x/|x|) · (∇π(x)/|∇π(x)|) < 0   and
    lim_{|x|→∞} (x/|x|^ρ) · ∇ log π(x) = −∞,

with some ρ > 1.
(The "super-exponential case", ρ = 1, is due to Jarner and Hansen [2000], who ensure the geometric ergodicity of a nonadaptive random-walk Metropolis process.)
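For instance, the standard Gaussian π(x) ∝ exp(−|x|²/2) is strongly super-exponential: ∇ log π(x) = −x, so (x/|x|) · (∇π(x)/|∇π(x)|) ≡ −1 and (x/|x|^ρ) · ∇ log π(x) = −|x|^{2−ρ} → −∞ for any ρ ∈ (1, 2).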



Verifiable assumptions II




Figure: The condition lim sup_{|x|→∞} (x/|x|) · (∇π(x)/|∇π(x)|) < 0 implies that there is an ε > 0 such that for all sufficiently large |x|, the angle α between x and −∇π(x) satisfies α < π/2 − ε.




AM when π has unbounded support



Saksman and Vihola [2010]: the first ergodicity results for the original AM algorithm, without upper bounding the eigenvalues λ(Σn).
Use the proposal covariance Σn = t² Sn + εI, with ε > 0.
SLLN and CLT for strongly super-exponential π and functions satisfying supx |f(x)|π^{−p}(x) < ∞ for some p ∈ (0, 1).






AM without the covariance lower bound I

Vihola [2011b] contains partial results on the case Σn = t² Sn, that is, without lower bounding the eigenvalues λ(Σn).
    Analysed first an 'adaptive random walk': AM run with the 'flat target π ≡ 1',

        Xn+1 = Xn + t Sn^{1/2} Wn+1
        Mn+1 = (1 − γn+1)Mn + γn+1 Xn+1
        Sn+1 = (1 − γn+1)Sn + γn+1 (Xn+1 − Mn)².

    Ten sample paths of this process (univariate) started at x0 = 0, s0 = 1 and with t = 0.01 and ηn = n−1:





AM without the covariance lower bound II

[Figure: ten sample paths of Xn (top) and Sn (bottom, log scale from 10^−4 to 10^0) over the first 10 000 iterations.]
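This experiment is straightforward to reproduce; a minimal sketch (my own function name and defaults):

    import numpy as np

    def flat_target_am(n_iter=10_000, t=0.01, x0=0.0, s0=1.0, seed=0):
        """Univariate AM with flat target pi = 1: every proposal is accepted."""
        rng = np.random.default_rng(seed)
        x, m, s = x0, x0, s0
        xs, ss = [x], [s]
        for n in range(2, n_iter + 1):
            eta = 1.0 / n                               # eta_n = n^{-1}
            x = x + t * np.sqrt(s) * rng.standard_normal()
            s = (1.0 - eta) * s + eta * (x - m) ** 2    # uses the previous mean M_{n-1}
            m = (1.0 - eta) * m + eta * x
            xs.append(x); ss.append(s)
        return np.array(xs), np.array(ss)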





AM without the covariance lower bound III

[Figure: a longer run of the same process, 2 × 10^6 iterations: Xn (top, reaching values of several hundred) and Sn (bottom, log scale from 10^−5 to 10^5).]




AM without the covariance lower bound IV


It is shown that Sn → ∞ almost surely.
The speed of growth is E[Sn] ∼ exp(t Σ_{i=2}^{n} √ηi).
(In particular, E[Sn] ∼ e^{2t√n} when ηn = n−1.)
Using the same techniques, one can show the stability (and ergodicity) of AM run with a univariate Laplace target π.
These results have little direct practical value, but they indicate that the AM covariance parameter Sn does not tend to collapse.






AM with a fixed proposal component I

Instead of lower bounding the eigenvalues λ(Σn), employ a fixed proposal component with probability β ∈ (0, 1) [Roberts and Rosenthal, 2009].
This corresponds to an algorithm where the proposals Yn are generated by

    Yn = Xn−1 + Σ0^{1/2} Wn with probability β, and Yn = Xn−1 + Σn−1^{1/2} Wn otherwise,

with some fixed symmetric and positive definite Σ0.
In other words, one employs a mixture of 'adaptive' and 'nonadaptive' Markov kernels:

    P( Xn ∈ A | X1, . . . , Xn−1 ) = (1 − β) PΣn−1(Xn−1, A) + β PΣ0(Xn−1, A).
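Drawing such a proposal amounts to a simple mixture (a sketch with assumed names):

    import numpy as np

    def mixture_proposal(x, sigma_adapt, sigma0, beta, rng):
        """With probability beta use the fixed covariance Sigma_0,
        otherwise the adapted covariance Sigma_{n-1}."""
        sigma = sigma0 if rng.random() < beta else sigma_adapt
        return x + np.linalg.cholesky(sigma) @ rng.standard_normal(x.size)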



AM with a fixed proposal component II
Vihola [2011b] shows that, for example with a bounded and compactly supported π or with a super-exponential π, the eigenvalues λ(Σn) are bounded away from zero.
    Having a strongly super-exponential target, SLLN (resp. CLT) hold for functions satisfying supx |f(x)|π^{−p}(x) < ∞ for some p ∈ (0, 1) (resp. p ∈ (0, 1/2)).
    Explicit upper and lower bounds for Σn are unnecessary.
    The mixture proposal may be better in practice than the lower bound εI.
Also Bai, Roberts, and Rosenthal [2008] analyse this algorithm.
    A completely different approach, with different assumptions.
    Also exponentially decaying π are considered; in this case the fixed covariance Σ0 must be large enough.


Ergodicity of unconstrained ASM algorithm
Vihola [2011a] shows two results for the unmodified adaptation, without any (upper or lower) bounds for Θn. Assume:
    the desired acceptance rate α∗ ∈ (0, 1/2);
    the adaptation weights satisfy Σ γn² < ∞ (e.g. γn = n−γ with γ ∈ (1/2, 1]).
Two cases:
 1. π is bounded, bounded away from zero on its support, and the support X = {x : π(x) > 0} is compact and has a smooth boundary.
    Then, SLLN holds for bounded functions.
 2. Suppose a strongly super-exponential target having tails with uniformly smooth contours.
    Then, SLLN holds for functions satisfying supx |f(x)|π^{−p}(x) < ∞ for some p ∈ (0, 1).


Ergodicity of ASM within AM




The results in [Vihola, 2011a] also cover the adaptive scaling within adaptive Metropolis algorithm.
However, one has to assume that the eigenvalues of the covariance factor Σn are bounded within [a, b] for some constants 0 < a ≤ b < ∞.






Final remarks I


Current results show that some adaptive MCMC algorithms are intrinsically stable, requiring no additional stabilisation structures.
    Easier for practitioners; fewer parameters to 'tune.'
    They show that the methods are fairly 'safe' to apply.
The results are related to the more general question of the stability of Robbins-Monro stochastic approximation with Markovian noise.
    Manuscript in preparation with C. Andrieu. . .
Fort, Moulines, and Priouret [2010] also consider interacting MCMC (e.g. the equi-energy sampler) and extend the approach in [Saksman and Vihola, 2010].




Final remarks II

It is not necessary to use Gaussian proposal distributions. All the above results apply to elliptical proposals satisfying a particular tail decay condition. For example, the results apply with multivariate Student distributions of the form

    qc(z) ∝ (1 + ‖c^{−1/2} z‖²)^{−(d+p)/2},

where p > 0 [Vihola, 2011a]; see the sampling sketch after this list.
Current results apply only to targets with rapidly decaying tails. It is important to establish similar results for heavy-tailed targets.
Overall, there is a need for more general yet practically verifiable conditions to check the validity of the methods.
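A sketch of drawing such an elliptical Student increment (names are mine; the scaling convention of the shape matrix c may differ from the one on the slide by a constant factor):

    import numpy as np

    def student_increment(c, p, rng):
        """Gaussian with shape matrix c, scaled by an independent
        chi-square mixing variable with p degrees of freedom."""
        g = np.linalg.cholesky(c) @ rng.standard_normal(c.shape[0])
        u = rng.chisquare(p)
        return g * np.sqrt(p / u)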



Final remarks III



There is also free software available for testing several adaptive random-walk MCMC algorithms, including the new RAM approach [Vihola, 2010a]:
http://iki.fi/mvihola/grapham/




                                   Thank you!








References I

C. Andrieu and É. Moulines. On the ergodicity properties of some adaptive MCMC algorithms. Ann. Appl. Probab., 16(3):1462–1505, 2006.

C. Andrieu and C. P. Robert. Controlled MCMC for optimal sampling. Technical Report Ceremade 0125, Université Paris Dauphine, 2001.

C. Andrieu and J. Thoms. A tutorial on adaptive MCMC. Statist. Comput., 18(4):343–373, Dec. 2008.

C. Andrieu, É. Moulines, and P. Priouret. Stability of stochastic approximation under verifiable conditions. SIAM J. Control Optim., 44(1):283–312, 2005.






References II

Y. Atchadé and G. Fort. Limit theorems for some adaptive MCMC algorithms with subgeometric kernels. Bernoulli, 16(1):116–154, Feb. 2010.

Y. F. Atchadé and J. S. Rosenthal. On adaptive Markov chain Monte Carlo algorithms. Bernoulli, 11(5):815–828, 2005.

Y. Bai, G. O. Roberts, and J. S. Rosenthal. On the containment condition for adaptive Markov chain Monte Carlo algorithms. Preprint, July 2008. URL http://probability.ca/jeff/research.html.

G. Fort, E. Moulines, and P. Priouret. Convergence of adaptive MCMC algorithms: ergodicity and law of large numbers. Technical report, 2010.






References III

W. R. Gilks, G. O. Roberts, and S. K. Sahu. Adaptive Markov chain Monte Carlo through regeneration. J. Amer. Statist. Assoc., 93(443):1045–1054, 1998.

H. Haario, E. Saksman, and J. Tamminen. An adaptive Metropolis algorithm. Bernoulli, 7(2):223–242, 2001.

S. F. Jarner and E. Hansen. Geometric ergodicity of Metropolis algorithms. Stochastic Process. Appl., 85:341–361, 2000.

G. O. Roberts and J. S. Rosenthal. Coupling and ergodicity of adaptive Markov chain Monte Carlo algorithms. J. Appl. Probab., 44(2):458–475, 2007.

G. O. Roberts and J. S. Rosenthal. Examples of adaptive MCMC. J. Comput. Graph. Statist., 18(2):349–367, 2009.





References IV
E. Saksman and M. Vihola. On the ergodicity of the adaptive Metropolis algorithm on unbounded domains. Ann. Appl. Probab., 20(6):2178–2203, Dec. 2010.

M. Vihola. Grapham: Graphical models with adaptive random walk Metropolis algorithms. Comput. Statist. Data Anal., 54(1):49–54, Jan. 2010a.

M. Vihola. Robust adaptive Metropolis algorithm with coerced acceptance rate. Preprint, arXiv:1011.4381v1, Nov. 2010b.

M. Vihola. On the stability and ergodicity of adaptive scaling Metropolis algorithms. Preprint, arXiv:0903.4061v3, Apr. 2011a.

M. Vihola. Can the adaptive Metropolis algorithm collapse without the covariance lower bound? Electron. J. Probab., 16:45–75, 2011b.

Learning spline-based curve models (Laure Amate)
 
Omiros' talk on the Bernoulli factory problem
Omiros' talk on the  Bernoulli factory problemOmiros' talk on the  Bernoulli factory problem
Omiros' talk on the Bernoulli factory problem
 

Stability of adaptive random-walk Metropolis algorithms

  • 6. Adaptive Metropolis algorithm I
    The AM algorithm [Haario, Saksman, and Tamminen, 2001]:
    Define X_1 ≡ x_1 ∈ R^d such that π(x_1) > 0. Choose parameters t > 0 and ε ≥ 0. For n = 2, 3, . . .
        Y_n = X_{n−1} + Σ_{n−1}^{1/2} W_n, where W_n ~ N(0, I), and
        X_n = Y_n with probability α(X_{n−1}, Y_n), and X_n = X_{n−1} otherwise,
    where Σ_{n−1} := t² Cov(X_1, . . . , X_{n−1}) + ε I, and I ∈ R^{d×d} stands for the identity matrix.
  • 7. Adaptive Metropolis algorithm II
    Cov(···) is some consistent covariance estimator (not necessarily the standard sample covariance!). In what follows, consider the definition Cov(X_1, . . . , X_n) := S_n, where (M_1, S_1) ≡ (x_1, s_1) with s_1 ∈ R^{d×d} symmetric and positive definite, and
        M_n = (1 − γ_n) M_{n−1} + γ_n X_n
        S_n = (1 − γ_n) S_{n−1} + γ_n (X_n − M_{n−1})(X_n − M_{n−1})^T.
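    For concreteness, a minimal Python sketch of the AM recursion above (not the authors' reference implementation; the standard Gaussian target, the step sequence η_n = n^{−1} and all numeric settings are illustrative choices):

        import numpy as np

        def am_sampler(log_pi, x1, n_iter, d, t=2.38 / np.sqrt(2), eps=1e-6, seed=0):
            """Adaptive Metropolis (AM) sketch with the recursive estimates M_n, S_n."""
            rng = np.random.default_rng(seed)
            x = np.asarray(x1, dtype=float)
            m, s = x.copy(), np.eye(d)                 # (M_1, S_1) = (x_1, s_1)
            chain = [x.copy()]
            for n in range(2, n_iter + 1):
                gamma = 1.0 / n                        # step size eta_n = n^{-1}
                sigma = t**2 * s + eps * np.eye(d)     # Sigma_{n-1} = t^2 S_{n-1} + eps I
                y = x + np.linalg.cholesky(sigma) @ rng.standard_normal(d)
                if np.log(rng.uniform()) < log_pi(y) - log_pi(x):   # accept w.p. alpha
                    x = y
                s = (1 - gamma) * s + gamma * np.outer(x - m, x - m)  # uses M_{n-1}
                m = (1 - gamma) * m + gamma * x
                chain.append(x.copy())
            return np.array(chain)

        # Example run: standard Gaussian target in d = 2
        out = am_sampler(lambda z: -0.5 * (z @ z), x1=np.zeros(2), n_iter=5000, d=2)
        print(out.mean(axis=0))

    Note the update order: S_n is formed with the previous mean M_{n−1} before M_n is refreshed, exactly as in the recursion above.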
  • 8. Example run of AM
    Figure: The first 10, 100 and 1000 samples of the AM algorithm started with s_1 = (0.01)² I, t = 2.38/√2, ε = 0 and η_n := n^{−1}.
  • 9. Adaptive scaling Metropolis algorithm I
    This algorithm, essentially proposed by Gilks, Roberts, and Sahu [1998] and Andrieu and Robert [2001], adjusts the size of the proposal jumps and tries to attain a given mean acceptance rate.
    Define X_1 ≡ x_1 ∈ R^d such that π(x_1) > 0. Let Θ_1 ≡ θ_1 > 0. Define the desired mean acceptance rate α* ∈ (0, 1). (Usually α* = 0.44 or α* = 0.234. . . )
  • 10. Adaptive scaling Metropolis algorithm II
    For n = 2, 3, . . ., iterate
        Y_n = X_{n−1} + Θ_{n−1} W_n, where W_n ~ N(0, I), and
        X_n = Y_n with probability α(X_{n−1}, Y_n), and X_n = X_{n−1} otherwise,
        log Θ_n = log Θ_{n−1} + γ_n (α(X_{n−1}, Y_n) − α*).
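    A minimal sketch of the ASM recursion under the same illustrative assumptions (Gaussian target, γ_n = n^{−3/4}):

        import numpy as np

        def asm_sampler(log_pi, x1, n_iter, d, theta1=1.0, alpha_star=0.234, seed=0):
            """Adaptive scaling Metropolis (ASM) sketch: Theta_n is adapted on the
            log scale towards the desired mean acceptance rate alpha_star."""
            rng = np.random.default_rng(seed)
            x = np.asarray(x1, dtype=float)
            log_theta = np.log(theta1)
            for n in range(2, n_iter + 1):
                gamma = n ** (-0.75)                              # gamma_n = n^{-3/4}
                y = x + np.exp(log_theta) * rng.standard_normal(d)
                alpha = min(1.0, float(np.exp(log_pi(y) - log_pi(x))))
                if rng.uniform() < alpha:
                    x = y
                log_theta += gamma * (alpha - alpha_star)         # the ASM update
            return x, np.exp(log_theta)

        _, scale = asm_sampler(lambda z: -0.5 * (z @ z), np.zeros(2), 5000, d=2)
        print("adapted scale:", scale)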
  • 11. Example run of ASM
    Figure: The first 10, 100 and 1000 samples of the ASM algorithm started with θ_1 = 0.01 and using γ_n = n^{−3/4}.
  • 12. Adaptive scaling within adaptive Metropolis
    The AM and ASM algorithms can be used simultaneously [Atchadé and Fort, 2010, Andrieu and Thoms, 2008]:
        Y_n = X_{n−1} + Θ_{n−1} Σ_{n−1}^{1/2} W_n, where W_n ~ N(0, I), and
        X_n = Y_n with probability α(X_{n−1}, Y_n), and X_n = X_{n−1} otherwise,
        log Θ_n = log Θ_{n−1} + γ_n (α(X_{n−1}, Y_n) − α*)
        M_n = (1 − γ_n) M_{n−1} + γ_n X_n
        S_n = (1 − γ_n) S_{n−1} + γ_n (X_n − M_{n−1})(X_n − M_{n−1})^T
        Σ_n = t² S_n + ε I.
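    A sketch combining the two adaptation rules exactly as listed above; again the target and the tuning constants are placeholders:

        import numpy as np

        def aswam_sampler(log_pi, x1, n_iter, d, t=1.0, eps=1e-6,
                          alpha_star=0.234, seed=0):
            """ASwAM sketch: Theta_n tunes the global scale while (M_n, S_n)
            track the shape of the target."""
            rng = np.random.default_rng(seed)
            x = np.asarray(x1, dtype=float)
            m, s, log_theta = x.copy(), np.eye(d), 0.0
            for n in range(2, n_iter + 1):
                gamma = n ** (-0.75)
                chol = np.linalg.cholesky(t**2 * s + eps * np.eye(d))  # Sigma_{n-1}^{1/2}
                y = x + np.exp(log_theta) * (chol @ rng.standard_normal(d))
                alpha = min(1.0, float(np.exp(log_pi(y) - log_pi(x))))
                if rng.uniform() < alpha:
                    x = y
                log_theta += gamma * (alpha - alpha_star)
                s = (1 - gamma) * s + gamma * np.outer(x - m, x - m)
                m = (1 - gamma) * m + gamma * x
            return x, np.exp(log_theta), s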
  • 13. Robust adaptive Metropolis algorithm I
    The RAM algorithm is a multidimensional 'extension' of the ASM adaptation mechanism [Vihola, 2010b].
    Define X_1 ≡ x_1 ∈ R^d such that π(x_1) > 0. Let Σ_1 ≡ s_1 ∈ R^{d×d} be symmetric and positive definite. Define the desired mean acceptance rate α* ∈ (0, 1).
  • 14. Robust adaptive Metropolis algorithm II
    For n = 2, 3, . . ., iterate
        Y_n = X_{n−1} + Σ_{n−1}^{1/2} W_n, where W_n ~ N(0, I), and
        X_n = Y_n with probability α(X_{n−1}, Y_n), and X_n = X_{n−1} otherwise,
        Σ_n = Σ_{n−1}^{1/2} ( I + γ_n [α(X_{n−1}, Y_n) − α*] W_n W_n^T / ‖W_n‖² ) (Σ_{n−1}^{1/2})^T,
    where Σ_n will be symmetric and positive definite whenever γ_n [α(X_{n−1}, Y_n) − α*] > −1.
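    A sketch of one RAM transition. For clarity it refactors Σ_n from scratch, an O(d³) shortcut; the actual algorithm of Vihola [2010b] maintains the Cholesky factor by a rank-one update in O(d²):

        import numpy as np

        def ram_step(x, chol, log_pi, gamma, alpha_star=0.234, rng=None):
            """One RAM transition. `chol` is a factor L with Sigma_{n-1} = L L^T."""
            rng = rng or np.random.default_rng()
            w = rng.standard_normal(x.size)
            y = x + chol @ w
            alpha = min(1.0, float(np.exp(log_pi(y) - log_pi(x))))
            if rng.uniform() < alpha:
                x = y
            # Sigma_n = L (I + gamma (alpha - alpha*) w w^T / |w|^2) L^T
            u = np.eye(x.size) + gamma * (alpha - alpha_star) * np.outer(w, w) / (w @ w)
            sig = chol @ u @ chol.T
            return x, np.linalg.cholesky((sig + sig.T) / 2)   # symmetrise for safety

        rng = np.random.default_rng(0)
        x, chol = np.zeros(2), 0.1 * np.eye(2)
        for n in range(2, 5001):
            x, chol = ram_step(x, chol, lambda z: -0.5 * (z @ z),
                               gamma=min(1.0, 2 * n ** (-0.75)), rng=rng)
        print(chol @ chol.T)   # the adapted Sigma_n

    The matrix I + c w wᵀ/‖w‖² has eigenvalues 1 and 1 + c, so positive definiteness indeed only requires c = γ_n [α(X_{n−1}, Y_n) − α*] > −1, as stated on the slide.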
  • 15. Robust adaptive Metropolis algorithm III
    Figure: The RAM update with γ_n [α(X_{n−1}, Y_n) − α*] = ±0.8, respectively. The ellipsoids defined by Σ_{n−1} (Σ_n) are drawn in solid (dashed), and the vector Σ_{n−1}^{1/2} W_n / ‖W_n‖ as a dot.
  • 16. Example run of RAM
    Figure: The first 10, 100 and 1000 samples of the RAM algorithm started with Σ_1 = 0.01 I and using γ_n = min{1, 2 · n^{−3/4}}.
  • 17. RAM vs. ASwAM
    Computationally, they are of similar complexity (a Cholesky update of order O(d²)).
    The RAM adaptation does not require that π has a finite second moment (or any moment at all).
    In RAM, the adaptation is 'direct', while estimating the covariance is 'indirect' (as it also requires a mean estimate).
    There are also situations where ASwAM seems to be more efficient than RAM. . .
  • 18. Previous conditions on validity I
    What conditions are sufficient to guarantee that a law of large numbers (LLN) holds:
        (1/n) Σ_{k=1}^n f(X_k) → ∫ f(x) π(x) dx as n → ∞?
    Key conditions due to Roberts and Rosenthal [2007]:
    Diminishing adaptation (DA): the effect of the adaptation becomes smaller and smaller, "‖P_{Θ_n} − P_{Θ_{n−1}}‖ → 0 as n → ∞."
    Containment (C): the kernels P_{Θ_n} have all the time sufficiently good mixing properties:
        sup_{n≥1} sup_{k≥m} ‖P_{Θ_n}^k (X_n, ·) − π‖ → 0 as m → ∞.
  • 19. Previous conditions on validity II
    DA is usually easier to verify. Containment is often tightly related to the stability of the process (Θ_n)_{n≥1}.
    Establishing containment (or stability) is often difficult before showing that a LLN holds. . .
    A typical solution has been to enforce containment. In the case of the AM, ASM and RAM algorithms, this means that the eigenvalues of Σ_n (or the scale Θ_n) are constrained to lie within [a, b] for some constants 0 < a ≤ b < ∞.
  • 20. Previous conditions on validity III
    There are also other results in the literature on the ergodicity of adaptive MCMC given conditions similar to DA and C:
    the original work on AM [Haario, Saksman, and Tamminen, 2001];
    Atchadé and Rosenthal [2005], who analyse the ASM algorithm following the original mixingale approach;
    the results on the adaptive reprojections approach [Andrieu and Moulines, 2006, Andrieu, Moulines, and Priouret, 2005];
    the recent paper by Atchadé and Fort [2010], which is based on a resolvent kernel approach.
  • 21. New results I
    Key ideas in the new results:
    Containment is, in fact, not a necessary condition for a LLN. That is, the ergodic properties of P_{Θ_n} can become more and more unfavourable, and still a (strong) LLN can hold.
    The adaptation mechanism may include a strong explicit 'drift' away from 'bad values' of θ.
  • 22. General ergodicity result I
    Assume K_1 ⊂ K_2 ⊂ · · · ⊂ R^e are increasing subsets of the adaptation space. Assume that the adaptation follows the stochastic approximation dynamic
        X_n ~ P_{Θ_{n−1}}(X_{n−1}, ·)
        Θ_n = Θ_{n−1} + γ_n H(Θ_{n−1}, X_n),
    and that the adaptation parameter satisfies Θ_n ∈ K_n for all n ≥ 1. (One may also enforce Θ_n ∈ K_n. . . )
    Consider the following conditions for constants c < ∞ and 0 ≤ ε ≤ 1:
  • 23. General ergodicity result II
    (A1) Drift and minorisation: there is a drift function V : R^d → [1, ∞) such that for all n ≥ 1 and all θ ∈ K_n,
        P_θ V(x) ≤ λ_n V(x) + 1_{C_n}(x) b_n   (1)
        P_θ(x, A) ≥ 1_{C_n}(x) δ_n ν_n(A),   (2)
    where C_n ⊂ R^d are Borel sets, δ_n, λ_n ∈ (0, 1) and b_n < ∞ are constants, and ν_n is concentrated on C_n. Furthermore, the constants are polynomially bounded so that
        (1 − λ_n)^{−1} ∨ δ_n^{−1} ∨ b_n ≤ c n^ε.
  • 24. General ergodicity result III
    (A2) Continuity: for all n ≥ 1 and any r ∈ (0, 1], there is c' = c'(r) ≥ 1 such that for all θ, θ' ∈ K_n,
        ‖P_θ f − P_{θ'} f‖_{V^r} ≤ c' n^ε ‖f‖_{V^r} |θ − θ'|.
    (A3) Bound for the adaptation function: there is a β ∈ [0, 1/2] such that for all n ≥ 1, θ ∈ K_n and x ∈ R^d,
        |H(θ, x)| ≤ c n^ε V^β(x).
  • 25. General ergodicity result IV
    Theorem (Saksman and Vihola [2010]). Assume (A1)–(A3) hold and let f be a function with ‖f‖_{V^α} < ∞ for some α ∈ (0, 1 − β). Assume ε < κ_*^{−1} [(1/2) ∧ (1 − α − β)], where κ_* ≥ 1 is an independent constant, and that Σ_{k=1}^∞ k^{κ_* ε − 1} γ_k < ∞. Then,
        lim_{n→∞} (1/n) Σ_{k=1}^n f(X_k) = ∫ f(x) π(x) dx   almost surely.
  • 26. Verifiable assumptions I
    Definition (strongly super-exponential target). The target density π is continuously differentiable and has regular tails that decay super-exponentially:
        lim sup_{|x|→∞} (x/|x|) · (∇π(x)/|∇π(x)|) < 0 and
        lim_{|x|→∞} (x/|x|^ρ) · ∇ log π(x) = −∞,
    with some ρ > 1. (The "super-exponential case", ρ = 1, is due to Jarner and Hansen [2000], who ensure the geometric ergodicity of a nonadaptive random-walk Metropolis process.)
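    As a quick numerical sanity check of the definition (not from the slides): for a standard Gaussian target, log π(x) = −‖x‖²/2 up to a constant, so ∇ log π(x) = −x and (x/|x|^ρ) · ∇ log π(x) = −|x|^{2−ρ}, which tends to −∞ for any ρ ∈ (1, 2):

        import numpy as np

        # Evaluate the radial quantity (x / |x|^rho) . grad log pi(x) along a ray,
        # for the Gaussian case grad log pi(x) = -x and rho = 1.5.
        rho = 1.5
        for r in [1e1, 1e2, 1e3]:
            x = np.array([r, 0.0])                        # a point at radius r
            grad_log_pi = -x                              # Gaussian case
            print(r, (x @ grad_log_pi) / np.linalg.norm(x) ** rho)
        # prints -3.16, -10.0, -31.6: increasingly negative, as the definition requires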
  • 27. Verifiable assumptions II
    Figure: The condition lim sup_{|x|→∞} (x/|x|) · (∇π(x)/|∇π(x)|) < 0 implies that there is an ε > 0 such that for any sufficiently large |x|, the angle α < π/2 − ε.
  • 28. AM when π has unbounded support
    Saksman and Vihola [2010]: the first ergodicity results for the original AM algorithm, without upper bounding the eigenvalues λ(Σ_n).
    Use the proposal covariance Σ_n = t² S_n + ε I, with ε > 0.
    SLLN and CLT for strongly super-exponential π and for functions satisfying sup_x |f(x)| π^{−p}(x) < ∞ for some p ∈ (0, 1).
  • 29. AM without the covariance lower bound I
    Vihola [2011b] contains partial results on the case Σ_n = t² S_n, that is, without lower bounding the eigenvalues λ(Σ_n).
    Analysed first is an 'adaptive random walk': AM run with the 'flat target' π ≡ 1,
        X_{n+1} = X_n + t S_n^{1/2} W_{n+1}
        M_{n+1} = (1 − γ_{n+1}) M_n + γ_{n+1} X_{n+1}
        S_{n+1} = (1 − γ_{n+1}) S_n + γ_{n+1} (X_{n+1} − M_n)².
    Ten sample paths of this process (univariate), started at x_0 = 0, s_0 = 1 and with t = 0.01 and η_n = n^{−1}, are shown in the figures below:
  • 30. AM without the covariance lower bound II
    [Figure: sample paths of X_n and S_n (log scale) over the first 10,000 iterations.]
  • 31. AM without the covariance lower bound III
    [Figure: the same sample paths of X_n and S_n (log scale) continued over 2 × 10^6 iterations.]
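    The experiment in the two figures above can be reproduced schematically as follows (a sketch; on a flat target every proposal is accepted, so the Metropolis step disappears):

        import numpy as np

        # 'Adaptive random walk': AM with flat target pi = 1, univariate,
        # x_0 = 0, s_0 = 1, t = 0.01, step sequence eta_n = n^{-1}.
        rng = np.random.default_rng(0)
        t, n_iter = 0.01, 10_000
        for path in range(10):                       # ten sample paths
            x, m, s = 0.0, 0.0, 1.0
            for n in range(2, n_iter + 1):
                eta = 1.0 / n                        # eta_n = n^{-1}
                x = x + t * np.sqrt(s) * rng.standard_normal()
                s = (1 - eta) * s + eta * (x - m) ** 2   # uses the old mean M_{n-1}
                m = (1 - eta) * m + eta * x
            print(f"path {path}: S_n = {s:.3g}")     # grows slowly, does not collapse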
  • 32. AM without the covariance lower bound IV
    It is shown that S_n → ∞ almost surely.
    The speed of growth is E[S_n] ~ exp( t Σ_{i=2}^n √η_i ). (Particularly, E[S_n] ~ e^{2t√n} when η_n = n^{−1}.)
    Using the same techniques, one can show the stability (and ergodicity) of AM run with a univariate Laplace target π.
    These results have little direct practical value, but they indicate that the AM covariance parameter S_n does not tend to collapse.
  • 33. AM with a fixed proposal component I
    Instead of lower bounding the eigenvalues λ(Σ_n), employ a fixed proposal component with a probability β ∈ (0, 1) [Roberts and Rosenthal, 2009]. This corresponds to an algorithm where the proposals Y_n are generated by
        Y_n = X_{n−1} + Σ_0^{1/2} W_n with probability β, and
        Y_n = X_{n−1} + Σ_{n−1}^{1/2} W_n otherwise,
    with some fixed symmetric and positive definite Σ_0. In other words, one employs a mixture of 'adaptive' and 'nonadaptive' Markov kernels:
        P(X_n ∈ A | X_1, . . . , X_{n−1}) = (1 − β) P_{Σ_{n−1}}(X_{n−1}, A) + β P_{Σ_0}(X_{n−1}, A).
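    The proposal step then takes the following form (a sketch; the factorised inputs chol0 and chol_adapt are assumed to satisfy Σ = L Lᵀ):

        import numpy as np

        def mixture_proposal(x, chol_adapt, chol0, beta, rng):
            """Proposal draw for AM with a fixed component: with probability beta
            use the fixed factor chol0 (Sigma_0 = chol0 chol0^T), otherwise the
            adapted factor chol_adapt (Sigma_{n-1} = chol_adapt chol_adapt^T)."""
            w = rng.standard_normal(x.size)
            if rng.uniform() < beta:
                return x + chol0 @ w        # nonadaptive kernel P_{Sigma_0}
            return x + chol_adapt @ w       # adaptive kernel P_{Sigma_{n-1}}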
  • 34. AM with a fixed proposal component II
    Vihola [2011b] shows that, for example with a bounded and compactly supported π or with a super-exponential π, the eigenvalues λ(Σ_n) are bounded away from zero.
    For a strongly super-exponential target, the SLLN (resp. CLT) holds for functions satisfying sup_x |f(x)| π^{−p}(x) < ∞ for some p ∈ (0, 1) (resp. p ∈ (0, 1/2)).
    Explicit upper and lower bounds for Σ_n are unnecessary. The mixture proposal may be better in practice than the lower bound ε I.
    Also Bai, Roberts, and Rosenthal [2008] analyse this algorithm, with a completely different approach and different assumptions. Exponentially decaying π are considered as well; in this case the fixed covariance Σ_0 must be large enough.
  • 35. Ergodicity of the unconstrained ASM algorithm
    Vihola [2011a] shows two results for the unmodified adaptation, without any (upper or lower) bounds for Θ_n. Assume:
    the desired acceptance rate α* ∈ (0, 1/2);
    the adaptation weights satisfy Σ_n γ_n² < ∞ (e.g. γ_n = n^{−γ} with γ ∈ (1/2, 1]).
    Two cases:
    1. π is bounded, bounded away from zero on the support, and the support X = {x : π(x) > 0} is compact and has a smooth boundary. Then, the SLLN holds for bounded functions.
    2. Suppose a strongly super-exponential target having tails with uniformly smooth contours. Then, the SLLN holds for functions satisfying sup_x |f(x)| π^{−p}(x) < ∞ for some p ∈ (0, 1).
  • 36. Ergodicity of ASM within AM
    The results in [Vihola, 2011a] also cover the adaptive scaling within adaptive Metropolis algorithm. However, one has to assume that the eigenvalues of the covariance factor Σ_n are bounded within [a, b] for some constants 0 < a ≤ b < ∞.
  • 37. Final remarks I
    The current results show that some adaptive MCMC algorithms are intrinsically stable, requiring no additional stabilisation structures. This is easier for practitioners (fewer parameters to 'tune') and shows that the methods are fairly 'safe' to apply.
    The results are related to the more general question of the stability of the Robbins–Monro stochastic approximation with Markovian noise. A manuscript on this is in preparation with C. Andrieu. . .
    Fort, Moulines, and Priouret [2010] consider also interacting MCMC (e.g. the equi-energy sampler) and extend the approach in [Saksman and Vihola, 2010].
  • 38. Final remarks II
    It is not necessary to use Gaussian proposal distributions. All the above results apply to elliptical proposals satisfying a particular tail decay condition. For example, the results apply with multivariate Student distributions having the form
        q_c(z) ∝ (1 + ‖c^{−1/2} z‖²)^{−(d+p)/2},
    where p > 0 [Vihola, 2011a].
    The current results apply only to targets with rapidly decaying tails. It is important to establish similar results with heavy-tailed targets. Overall, there is a need for more general yet practically verifiable conditions to check the validity of the methods.
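    One standard way to simulate such an elliptical Student proposal is via a normal/chi-square mixture; the construction below is a textbook sampling route (not taken from the slides), with the factor L playing the role of c^{1/2}:

        import numpy as np

        def student_increment(chol, p, rng):
            """Draw Z = L N / sqrt(U) with N ~ N(0, I) and U ~ chi^2_p, so that
            the density of Z is proportional to (1 + |L^{-1} z|^2)^{-(d+p)/2}."""
            d = chol.shape[0]
            n = rng.standard_normal(d)
            u = rng.chisquare(p)
            return (chol @ n) / np.sqrt(u)

        rng = np.random.default_rng(0)
        y = np.zeros(3) + student_increment(np.eye(3), p=2.0, rng=rng)  # one proposal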
  • 39. Final remarks III
    There is also free software available for testing several adaptive random-walk MCMC algorithms, including the new RAM approach [Vihola, 2010a]: http://iki.fi/mvihola/grapham/
    Thank you!
  • 41. References I
    C. Andrieu and É. Moulines. On the ergodicity properties of some adaptive MCMC algorithms. Ann. Appl. Probab., 16(3):1462–1505, 2006.
    C. Andrieu and C. P. Robert. Controlled MCMC for optimal sampling. Technical Report Ceremade 0125, Université Paris-Dauphine, 2001.
    C. Andrieu and J. Thoms. A tutorial on adaptive MCMC. Statist. Comput., 18(4):343–373, Dec. 2008.
    C. Andrieu, É. Moulines, and P. Priouret. Stability of stochastic approximation under verifiable conditions. SIAM J. Control Optim., 44(1):283–312, 2005.
  • 42. References II
    Y. Atchadé and G. Fort. Limit theorems for some adaptive MCMC algorithms with subgeometric kernels. Bernoulli, 16(1):116–154, Feb. 2010.
    Y. F. Atchadé and J. S. Rosenthal. On adaptive Markov chain Monte Carlo algorithms. Bernoulli, 11(5):815–828, 2005.
    Y. Bai, G. O. Roberts, and J. S. Rosenthal. On the containment condition for adaptive Markov chain Monte Carlo algorithms. Preprint, July 2008. URL http://probability.ca/jeff/research.html.
    G. Fort, E. Moulines, and P. Priouret. Convergence of adaptive MCMC algorithms: ergodicity and law of large numbers. Technical report, 2010.
  • 43. References III
    W. R. Gilks, G. O. Roberts, and S. K. Sahu. Adaptive Markov chain Monte Carlo through regeneration. J. Amer. Statist. Assoc., 93(443):1045–1054, 1998.
    H. Haario, E. Saksman, and J. Tamminen. An adaptive Metropolis algorithm. Bernoulli, 7(2):223–242, 2001.
    S. F. Jarner and E. Hansen. Geometric ergodicity of Metropolis algorithms. Stochastic Process. Appl., 85:341–361, 2000.
    G. O. Roberts and J. S. Rosenthal. Coupling and ergodicity of adaptive Markov chain Monte Carlo algorithms. J. Appl. Probab., 44(2):458–475, 2007.
    G. O. Roberts and J. S. Rosenthal. Examples of adaptive MCMC. J. Comput. Graph. Statist., 18(2):349–367, 2009.
  • 44. References IV
    E. Saksman and M. Vihola. On the ergodicity of the adaptive Metropolis algorithm on unbounded domains. Ann. Appl. Probab., 20(6):2178–2203, Dec. 2010.
    M. Vihola. Grapham: Graphical models with adaptive random walk Metropolis algorithms. Comput. Statist. Data Anal., 54(1):49–54, Jan. 2010a.
    M. Vihola. Robust adaptive Metropolis algorithm with coerced acceptance rate. Preprint, arXiv:1011.4381v1, Nov. 2010b.
    M. Vihola. On the stability and ergodicity of adaptive scaling Metropolis algorithms. Preprint, arXiv:0903.4061v3, Apr. 2011a.
    M. Vihola. Can the adaptive Metropolis algorithm collapse without the covariance lower bound? Electron. J. Probab., 16:45–75, 2011b.