                  Bootstrap resampling with independent data
                             Some theory and applications


                                    Jan Galkowski
                             empirical_bayesian@ieee.org


                                 21st November 2012


  Why do we care about Uncertainty Quantification (UQ)?


            - Inherent variability of mechanisms and measurements.
            - Statistics are themselves random variables, e.g., the sample mean, the
              sample standard deviation, the sample median.
            - For simple distributions, knowing the first few moments suffices to
              characterize them.
            - For complicated or mixture distributions . . . ?
            - We often pretend distributions of measurements are stationary. Are they?
              Diagnostics! What to do if they are not?
            - More will be said about the latter in the next lecture, on bootstrapping
              with dependent data.






  Sketch of How the Bootstrap Works




                  Original sample (n = 8), by index:

                      index:   1   2   3   4   5   6   7   8
                      value:   8   6  11   6  11   5  11   2

                  The empirical distribution places mass (count / n) on each observed
                  value, and every bootstrap draw samples from it, with replacement:

                      value:         2      5      6      8     11
                      probability: 0.125  0.125  0.250  0.125  0.375

                  Two bootstrap resamples of size 8:

                      resample 1:  11   8   2  11   5   8   8  11
                      resample 2:   2  11   2   8  11   5  11   2
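
                  In R, the mechanics above take only a few lines. A minimal sketch
                  using the sample from this slide; the statistic bootstrapped here,
                  the mean, is an arbitrary choice for illustration:

                      x <- c(8, 6, 11, 6, 11, 5, 11, 2)      # original sample
                      table(x) / length(x)                   # empirical distribution

                      set.seed(1)                            # reproducibility
                      B <- 10000                             # bootstrap replications
                      boot.means <- replicate(B, mean(sample(x, replace = TRUE)))

                      sd(boot.means)                         # bootstrap SE of the mean
                      quantile(boot.means, c(0.025, 0.975))  # percentile interval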






  Sketch of Why the Bootstrap Works


                  [Figure: "Growth in Number of Possible Bootstrap Resamplings with
                  Sample Size"; log10 of the number of resamplings (0 to 400) against
                  size of sample, N (0 to 200).]
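
                  The counting behind the figure is easy to reproduce. Which quantity
                  the original curve plots is not recoverable from this transcript, so
                  the sketch below shows two common choices on the log scale: ordered
                  resamples, n^n, and distinct resamples (multisets), C(2n-1, n):

                      n <- 1:200

                      log10.ordered  <- n * log10(n)                 # log10 of n^n
                      log10.distinct <- lchoose(2*n - 1, n) / log(10) # log10 of C(2n-1, n)

                      plot(n, log10.ordered, type = "l",
                           xlab = "size of sample, N", ylab = "log10(count)",
                           main = "Possible Bootstrap Resamplings")
                      lines(n, log10.distinct, lty = 2)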





  Test Data Model, Easy Case


                  [Figure: "Synthetic Gaussians for Easy Test Case"; density (0.00 to
                  0.12) against x (10 to 50), showing two well-separated Gaussian
                  densities.]






  Boxplot of well-separated Gaussians


                  [Figure: boxplots, "Simple case of two well separated Gaussians";
                  sample value (10 to 30) by Contributing Distribution (Left, Right).]






  Histogram of same well-separated Gaussians


                  [Figure: paired histograms, "Left Sample of Two Well-separated
                  Gaussians" and "Right Sample of Two Well-separated Gaussians";
                  density (0.00 to 0.25) against sample value (10 to 40).]






  Standard t-test on well-separated Gaussians




                                                   values       values
                  Group                   n        average      s.d.
                  1: left Gaussians       24       19.540998    4.867407
                  2: right Gaussians      35       28.471698    2.820691
                  3: POOLED               59       24.838871    3.782281 (adjusted for d.o.f.)

                  (\bar{Y}_2 - \bar{Y}_1) - 0.0 = 8.930700,   SE(\bar{Y}_2 - \bar{Y}_1) = 1.002398

                  two-sided t-statistic = (8.930700 - 0.0) / 1.002398 = 8.909336

                  P = 0.0000      t-statistic = 8.90934 = t_{57}(0.0000)
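
                  A minimal R sketch of the same pooled two-sample test. The vectors
                  left and right below are placeholders with roughly the slide's
                  moments; the actual samples are not reproduced in this transcript:

                      set.seed(2)
                      left  <- rnorm(24, mean = 19.5, sd = 4.9)   # placeholder data
                      right <- rnorm(35, mean = 28.5, sd = 2.8)   # placeholder data

                      # pooled-variance two-sample t-test, 24 + 35 - 2 = 57 d.o.f.
                      t.test(right, left, var.equal = TRUE)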


  Bootstrap characterization of well-separated Gaussians

                  [Figure: "Two well-separated Gaussians, differences after 10,000
                  bootstrap replications"; density (0.0 to 0.5) against the difference
                  between Right Gaussians and Left Gaussians (6 to 12). Curves marked:
                  z t-test, z t-test +/- se, z bootstrap, z bootstrap +/- se.]






  Bootstrap test of well-separated Gaussians




                                                   values       values
                  Group                   n        average      s.d.
                  1: resampled Left       24       19.540998    4.867407
                  2: resampled Right      35       28.471698    2.820691
                  3: POOLED               59       24.838871    3.782281 (adjusted for d.o.f.)

                  Recall original (\bar{Y}_2 - \bar{Y}_1) - 0.0 = 8.930700
                  baseline two-sided t-statistic: P = 0.0000, t = 8.90934 = t_{57}(0.0000)
                  bootstrap, resampled Left, two-sided t-statistic: P = 0.0001; 0 of 10000 exceed baseline t
                  bootstrap, resampled Right, t-statistic: P = 0.0002; 1 of 10000 exceed baseline t
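
                  One way to produce a bootstrap P-value of this kind is the null
                  bootstrap of Efron & Tibshirani (An Introduction to the Bootstrap,
                  ch. 16): shift both groups to a common mean so the null holds,
                  resample, and count how often the resampled t exceeds the baseline.
                  A sketch, reusing the left and right vectors from the earlier sketch;
                  it may differ in detail from the procedure behind the slide's numbers:

                      tstat  <- function(a, b) t.test(b, a, var.equal = TRUE)$statistic
                      t.base <- tstat(left, right)       # baseline t on original samples

                      # centre each group at the pooled mean so H0 (equal means) holds
                      pooled <- c(left, right)
                      left0  <- left  - mean(left)  + mean(pooled)
                      right0 <- right - mean(right) + mean(pooled)

                      set.seed(3)
                      B <- 10000
                      t.star <- replicate(B, tstat(sample(left0,  replace = TRUE),
                                                   sample(right0, replace = TRUE)))

                      mean(abs(t.star) >= abs(t.base))   # achieved significance level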


  Test Data Model, Hard Case


                  [Figure: "Synthetic Gaussians for Hard Test Case"; density (0.00 to
                  0.06) against x (10 to 50), showing two well-mixed Gaussian
                  densities.]






  Boxplot of well-mixed Gaussians


                  [Figure: boxplots, "Hard case of two well mixed Gaussians"; sample
                  value (10 to 40) by Contributing Distribution (Left, Right).]






  Histogram of same well-mixed Gaussians


                  [Figure: histogram, "Two well-mixed Gaussians"; density (0.00 to
                  0.15) against the pooled sample of Right and Left mixtures of
                  Gaussians (10 to 40).]






  Standard t-test on mixture distribution of Gaussians




                                                   values       values
                  Group                   n        average      s.d.
                  1: left Gaussians       24       22.635617    6.534154
                  2: right Gaussians      35       27.558842    6.140611
                  3: POOLED               59       25.556174    6.302367 (adjusted for d.o.f.)

                  (\bar{Y}_2 - \bar{Y}_1) - 0.0 = 4.923225,   SE(\bar{Y}_2 - \bar{Y}_1) = 1.670283

                  two-sided t-statistic = (4.923225 - 0.0) / 1.670283 = 2.947539

                  P = 0.0023      t-statistic = 2.94754 = t_{57}(0.0023)


  Resampling from each population separately

                  [Figure: paired histograms, "Bootstrap Resampling Pseudo-Populations
                  from Left Case of Two well-mixed Gaussians" and "Bootstrap Resampling
                  Pseudo-Populations from Right Case of Two well-mixed Gaussians"
                  (20,000 resamplings each); density (0.0 to 0.4) against resampled
                  value (15 to 40).]






  What a pooled bootstrap resampling population looks like

                  [Figure: histogram, "Bootstrap Resampling Pseudo-Populations from
                  Pooled Case of Two well-mixed Gaussians"; density (0.00 to 0.25)
                  against pooled resampled value (15 to 40).]






  Bootstrap test of well-mixed Gaussians




                                                   values       values
                  Group                   n        average      s.d.
                  1: resampled Left       24       22.635617    6.534154
                  2: resampled Right      35       27.558842    6.140611
                  3: POOLED               59       25.556174    6.302367 (adjusted for d.o.f.)

                  Recall original (\bar{Y}_2 - \bar{Y}_1) - 0.0 = 4.923225
                  baseline two-sided t-statistic: P = 0.0023, t = 2.94754 = t_{57}(0.0023)
                  bootstrap, resampled Left, two-sided t-statistic: P = 0.1328; 2656 of 20000 exceed baseline t
                  bootstrap, resampled Right, t-statistic: P = 0.1331; 2662 of 20000 exceed baseline t


  Resampled t-values

                  [Figure: paired histograms, "Bootstrap resampled t-values and
                  Baseline from Left Sample of Well-mixed Gaussians" and "Bootstrap
                  resampled t-values and Baseline from Right Sample of Well-mixed
                  Gaussians"; density (0.0 to 0.4) against t (0 to 6), with the
                  baseline t-value = 2.9475 marked in each panel.]






  A little Bootstrap theory: Glivenko-Cantelli


               Empirical distribution function: for i.i.d. X_1, X_2, \ldots, X_n \sim F,

                           F_n(x) \overset{\text{def}}{=} \frac{1}{n} \sum_{k=1}^{n} \mathbb{I}_{\{X_k \le x\}}

               Glivenko-Cantelli Theorem: as n \to \infty,

                           \sup_x |F_n(x) - F(x)| \to 0

               almost surely (abbreviated "a.s.").


    Given

               \Omega_0 \overset{\text{def}}{=} \{\omega : \lim_{n \to \infty} X_n \text{ exists}\}
                        = \{\omega : (\limsup_{n \to \infty} X_n) - (\liminf_{n \to \infty} X_n) = 0\},

    if P(\Omega_0) = 1, we say that X_n converges almost surely. [See Durrett,
    Probability: Theory and Examples, 4th edition, 2010, pp. 15-16.]
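
    A quick numerical illustration of the theorem in R (a sketch; the standard
    normal and the grid are arbitrary choices, and the supremum is approximated
    on the grid):

        set.seed(7)
        for (n in c(10, 100, 1000, 10000)) {
          x  <- rnorm(n)
          Fn <- ecdf(x)                        # empirical distribution function
          g  <- seq(-4, 4, length.out = 2001)  # comparison grid
          cat(n, " sup|Fn - F| ~", max(abs(Fn(g) - pnorm(g))), "\n")
        }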


  Proof of Glivenko-Cantelli Theorem
  After R. Durrett, 2010, 4th edition, Probability: Theory and Examples, 76-77



       1    Fix x. Let Z_n = \mathbb{I}_{\{X_n \le x\}}.

       2    The Z_n are i.i.d. with E[Z_n] = P(X_n \le x) = F(x), so the strong law
            gives F_n(x) = \frac{1}{n} \sum_{k=1}^{n} Z_k \to F(x) a.s.; the same
            argument applied to \mathbb{I}_{\{X_n < x\}}, with expectation
            P(X_n < x) = F(x^-) = \lim_{y \uparrow x} F(y), gives
            F_n(x^-) \to F(x^-) a.s.

       3    For 1 \le i \le k - 1, let x_{i,k} = \inf\{y : F(y) \ge i/k\}.

       4    Pointwise convergence of F_n(x) and F_n(x^-) means we can pick W_k so that

                    n \ge W_k \Rightarrow |F_n(x_{i,k}) - F(x_{i,k})| < 1/k
                    n \ge W_k \Rightarrow |F_n(x_{i,k}^-) - F(x_{i,k}^-)| < 1/k

            for 1 \le i \le k - 1.

       5    Let x_{0,k} = -\infty and x_{k,k} = \infty. Then the last inequalities
            remain valid for i = 0 and i = k.

       6    Given u with x_{i-1,k} \le u < x_{i,k}, 1 \le i \le k, and n \ge W_k,
            monotonicity of F_n and of F together with
            F(x_{i,k}^-) - F(x_{i-1,k}) \le 1/k give

                F_n(u) \le F_n(x_{i,k}^-) \le F(x_{i,k}^-) + 1/k \le F(x_{i-1,k}) + 2/k \le F(u) + 2/k

                F_n(u) \ge F_n(x_{i-1,k}) \ge F(x_{i-1,k}) - 1/k \ge F(x_{i,k}^-) - 2/k \ge F(u) - 2/k

       7    Consequently, \sup_u |F_n(u) - F(u)| \le 2/k for all n \ge W_k; since k
            was arbitrary, the result is proved.



  A little Bootstrap theory: Edgeworth expansions



             Edgeworth expansion:
                   1   \hat{\theta} is constructed from a sample of size n.
                   2   \theta_0 is a "parameter".
                   3   \sqrt{n}(\hat{\theta} - \theta_0) \sim N(0, \sigma^2), asymptotically.
                   4   P\{\sqrt{n}\,\frac{\hat{\theta} - \theta_0}{\sigma} \le x\}
                       = \Phi(x) + n^{-1/2} p_1(x)\phi(x) + \cdots + n^{-j/2} p_j(x)\phi(x) + \cdots
                   5   \phi(x) is the N(0,1) density and \Phi(x) = \int_{-\infty}^{x} \phi(u)\,du.

             Then letting \hat{A}(\bar{X}^*) = \frac{g(\bar{X}^*) - g(\bar{X})}{h(\bar{X})},
             where the superscript * denotes the corresponding bootstrap estimate of a
             sample statistic, suppose
                   1   the characteristic function \chi of X satisfies
                       \limsup_{|t| \to \infty} |\chi(t)| < 1 (Cramér's condition), and
                   2   the asymptotic variance of \frac{g(\bar{X}^*) - g(\bar{X})}{h(\bar{X})} is 1.
             Then
                   3   P\{\sqrt{n}\,\frac{g(\bar{X}^*) - g(\bar{X})}{h(\bar{X})} \le x\}
                       = \Phi(x) + \sum_{j=1}^{\nu} n^{-j/2} \pi_j(x)\phi(x) + o(n^{-\nu/2}),
                   4   where \pi_j is a polynomial of degree 3j - 1, odd for even j and
                       even for odd j, with coefficients depending on the moments of X
                       up to order j + 2.




  Bumpus 1898 data


                  [Figure: boxplots, "Sparrow Survival after Severe Snowstorm (Bumpus,
                  1898, Woods Hole, MA)"; humerus length in inches (0.66 to 0.78) by
                  status of sparrow (PERISHED, SURVIVED).]






  t-test results from Bumpus 1898 data




                                                   Humerus      Humerus
                                                   average      s.d.
                  Group                   n        (inches)     (inches)
                  1: perished             24       0.727917     0.023543
                  2: survived             35       0.738000     0.019839
                  3: POOLED               59       0.733898     0.021411 (adjusted for d.o.f.)

                  (\bar{Y}_2 - \bar{Y}_1) - 0.0 = 0.010083,   SE(\bar{Y}_2 - \bar{Y}_1) = 0.005674

                  two-sided t-statistic = (0.010083 - 0.0) / 0.005674 = 1.776998

                  P = 0.0405      t-statistic = 1.77700 = t_{57}(0.0405)


  Bootstrap results from Bumpus 1898 data




                                                   values       values
                  Group                   n        average      s.d.
                  1: perished             24       0.727917     0.023543
                  2: survived             35       0.738000     0.019839
                  3: POOLED               59       0.733898     0.021411 (adjusted for d.o.f.)

                  Recall original (\bar{Y}_2 - \bar{Y}_1) - 0.0 = 0.010083
                  baseline two-sided t-statistic: P = 0.0405, t = 1.77700 = t_{57}(0.0405)
                  bootstrap, perished, two-sided t-statistic: P = 0.2711; 2710 of 10000 exceed baseline t
                  bootstrap, survived, t-statistic: P = 0.2564; 2563 of 10000 exceed baseline t


  Can go elsewhere: Stroke risk and proportions



                                          strokes [8]      subjects
                   aspirin group          119              11,037
                   placebo group           98              11,034

    How to bootstrap this problem (a sketch in R follows the note below) . . .
         1      Create an "aspirin sample" of length 11,037 having 119 ones and 11,037 - 119 zeros.
         2      Create a "placebo sample" of length 11,034 having 98 ones and 11,034 - 98 zeros.
         3      Draw with replacement a sample of length 11,037 from the "aspirin sample".
         4      Draw with replacement a sample of length 11,034 from the "placebo sample".
         5      Calculate the ratio of the proportion of ones in the draw from the "aspirin
                sample" to the proportion of ones in the draw from the "placebo sample".
         6      Repeat the drawing and the calculation a large number of times, say, 1,000.
         7      Calculate whatever sample statistics you like from the resulting population.


        [8] From the New York Times of 27th January 1987, as quoted by Efron and Tibshirani in An Introduction to the Bootstrap, 1998.
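
    A minimal R sketch of the recipe above (10,000 replications, matching the
    next slide, rather than the 1,000 suggested in step 6; variable names are
    illustrative):

        set.seed(1987)
        aspirin <- c(rep(1, 119), rep(0, 11037 - 119))   # step 1
        placebo <- c(rep(1,  98), rep(0, 11034 -  98))   # step 2

        B <- 10000
        risk.ratio <- replicate(B, {                     # steps 3-6
          a <- sample(aspirin, replace = TRUE)
          p <- sample(placebo, replace = TRUE)
          mean(a) / mean(p)      # stroke risk, aspirin relative to placebo
        })

        median(risk.ratio)                               # step 7
        quantile(risk.ratio, c(0.025, 0.975))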


  Bootstrap display of proportions in stroke risk

                  [Figure: histogram, "Resampled proportions for stroke risk using
                  10000 bootstrap replicas"; counts (0 to 300) against the ratio
                  aspirin/placebo (1.0 to 2.0); annotated "median risk ratio = 1.2140,
                  p = 0.548".]






  Can go elsewhere: In lieu of cross-validation



             Cross-validation is a bag of techniques for trying to assess prediction error.

             There are other "bags of techniques" as well; one will be mentioned later,
             but a review must await a lecture on predictive inference.


                     y + \delta y = [f + \delta f](x + \delta x)
                        \;\Longrightarrow\; \delta y = f(\delta x) + \delta f(x + \delta x)


             This is gotten wrong a lot. It is important to separate variation in the
             data from variation in the model.

             We present the standard set of methods, and then offer one Bootstrap-based
             alternative.




  Cross-validation: A review
  Classical methods




             Split-half:
                   1   Randomly pick half of the data from the sample, without replacement.
                   2   Fit to that half.
                   3   Test the fit on the half not selected.
                   4   Report a measure like MSE for the test.

             K-fold: Let F = \{k : k = 1, \ldots, K\}.
                   1   Divide the sample into K randomly picked disjoint subsets.
                   2   For each j \in F, fit to all subsets i \in F with i \ne j.
                   3   Then test the fit obtained on subset j.
                   4   Report a measure like MSE by averaging over all K of the "folds".
                       (A sketch follows below.)

             Leave-one-out cross-validation: K-fold with K = n, each fold a single
             observation. (Not covered here.)
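
             A compact R sketch of K-fold cross-validation for a linear model (the
             formula and data frame in the usage line are placeholders):

                 kfold.mse <- function(formula, data, K = 10) {
                   fold <- sample(rep(1:K, length.out = nrow(data)))  # random disjoint folds
                   err  <- numeric(K)
                   for (j in 1:K) {
                     fit    <- lm(formula, data = data[fold != j, ])  # fit on all folds but j
                     yhat   <- predict(fit, newdata = data[fold == j, ])
                     y      <- model.response(model.frame(formula, data[fold == j, ]))
                     err[j] <- mean((y - yhat)^2)                     # test MSE on fold j
                   }
                   mean(err)
                 }

                 # e.g.: kfold.mse(salinity ~ temperature + pressure, data = argo)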






  Cross-validation: A review
  Bootstrap-related methods




             Bootstrap validation:
                   1   Take B Bootstrap resamplings from the sample.
                   2   For each resampling, fit the model to the resampling, then evaluate the model on the
                       resampling and on the entire original sample.
                   3   Note the difference between performance on the entire original sample and performance
                       on the resampling in each case. Each of these differences is called an estimate of optimism.
                   4   Average the optimism values, and then add that average to the average squared residual of the
                       original fit to get an estimate of prediction error. (A sketch follows the note below.)

             Two versions of Bootstrap-based cross-validation are illustrated here: the first is the "ordinary
             Bootstrap" realization of validation [9]; the second, the "0.632+ bootstrap", is a refined version
             designed to replace classical cross-validation.



        [9] B. Efron, G. Gong, "A leisurely look at the Bootstrap, the Jackknife, and cross-validation", The American Statistician, 37(1), 1983, 36-48, as realized by F. E. Harrell, Jr, in the R package rms.
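
             A sketch of the optimism estimate described above, for a linear model.
             This follows the Efron & Gong outline; it is not the rms implementation:

                 optimism.boot <- function(formula, data, B = 100) {
                   y        <- model.response(model.frame(formula, data))
                   fit.full <- lm(formula, data = data)
                   mse.app  <- mean(residuals(fit.full)^2)    # apparent error, full sample

                   opt <- replicate(B, {
                     idx     <- sample(nrow(data), replace = TRUE)   # one resampling
                     fit.b   <- lm(formula, data = data[idx, ])
                     mse.in  <- mean((y[idx] - fitted(fit.b))^2)     # error on the resampling
                     mse.out <- mean((y - predict(fit.b, newdata = data))^2)  # on the original
                     mse.out - mse.in                                # estimate of optimism
                   })

                   mse.app + mean(opt)    # estimated prediction error
                 }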


  Data to model: Salinity as predicted by temperature, pressure, latitude, month
  ARGO float, platform 1900764 (WHOI, IFREMER)



                  [Figure: scatter plot, "ARGO Float, Platform 1900764 (WHOI,
                  IFREMER)"; Adjusted Temperature (Celsius, 5 to 25) against Salinity
                  Adjusted (psu, 35.0 to 37.0).]



  The ARGO Floats System






  Location of collection: Equatorial Atlantic, near Africa
  Canary Current dominates at surface, flows southwest






  Location of collection: Near Cape Verde Islands






  Models to be compared: Linear regression to predict Salinity
  Two competitive models




                         Table:    salinity = f (temperature, pressure, month, latitude, intercept)


                                      Estimate          Std. Error            t value           Pr(>|t|)
                    (Intercept)      33.58993            0.04224           795.28623            0.00000
                  temperature         0.11911            0.00082           144.46335            0.00000
                      pressure        0.00014            0.00001            11.87072            0.00000
                         month       -0.00069            0.00061            -1.14065            0.25404
                        latitude      0.02783            0.00194            14.36736            0.00000



                             Table:     salinity = f (temperature, pressure, month, latitude)


                                     Estimate          Std. Error             t value           Pr(>|t|)
                  temperature         0.35984           0.00595             60.43834            0.00000
                     pressure         0.00347           0.00008             40.91654            0.00000
                       month         -0.06683           0.00466            -14.33299            0.00000
                      latitude        1.43767           0.00606            237.19820            0.00000



  Models to be compared: Linear regression to predict Salinity
  Two more competitive models




                            Table:   salinity = f (temperature, pressure, month, intercept)


                                    Estimate         Std. Error            t value            Pr(>|t|)
                    (Intercept)    34.14529           0.01719          1986.78239             0.00000
                  temperature       0.11940           0.00083           143.49286             0.00000
                      pressure      0.00014           0.00001            12.02606             0.00000
                         month      0.00106           0.00060             1.76639             0.07736




                                  Table:   salinity = f (temperature, pressure, month)


                                    Estimate         Std. Error            t value            Pr(>|t|)
                   temperature      1.71963           0.00403           427.12858             0.00000
                      pressure      0.02196           0.00008           264.55418             0.00000
                        month       0.14220           0.01147            12.40126             0.00000




  Models to be compared: Linear regression to predict Salinity
  Still two more competitive models




                                  Table:   salinity = f (temperature, pressure, intercept)


                                     Estimate         Std. Error              t value        Pr(>|t|)
                    (Intercept)     34.14888           0.01707            2000.88558         0.00000
                  temperature        0.11951           0.00083             144.02016         0.00000
                      pressure       0.00014           0.00001              12.17462         0.00000




                                     Table:     salinity = f (temperature, pressure)


                                     Estimate          Std. Error            t value         Pr(>|t|)
                   temperature       1.75765            0.00263           668.68046          0.00000
                      pressure       0.02246            0.00007           308.89198          0.00000






  Variation by season: Salinity as predicted by temperature, pressure, latitude, month
  ARGO float, platform 1900764 (WHOI, IFREMER)



                  [Figure: scatter plot, "ARGO Float, Platform 1900764 (WHOI,
                  IFREMER), showing dependency by month"; Adjusted Temperature
                  (Celsius, 5 to 25) against Salinity Adjusted (psu, 35.0 to 37.0).]



  Split-Half cross-validation example
  Also called “Hold Out cross-validation” or “Split-Sample cross-validation”




                                        Table: Split-half cross-validation summary

                                                                                                MSE
                         temperature, pressure, month, latitude; with intercept             0.219357
                                 temperature, pressure, month; with intercept               0.180320
                                         temperature, pressure; with intercept              0.133647
                      temperature, pressure, month, latitude; without intercept            10.743651
                             temperature, pressure, month; without intercept               48.300449
                                      temperature, pressure; without intercept             33.191302






  K-fold Cross-validation example




                              Table: 10-fold cross-validation summary

                                                                                  MSE
                     temperature, pressure, month, latitude; with intercept   0.209468
                             temperature, pressure, month; with intercept     0.211403
                                     temperature, pressure; with intercept    0.211403
                  temperature, pressure, month, latitude; without intercept   1.625414
                         temperature, pressure, month; without intercept      4.071300
                                  temperature, pressure; without intercept    4.099267






  Bootstrap validation instead



        Table: Bootstrap Validation summary (100 resamplings, Efron & Gong Bootstrap)
                                                                                     MSE
                        temperature, pressure, month, latitude; with intercept   0.043891
                                temperature, pressure, month; with intercept     0.044472
                                        temperature, pressure; with intercept    0.044584
                     temperature, pressure, month, latitude; without intercept   0.043893
                            temperature, pressure, month; without intercept      0.044507
                                     temperature, pressure; without intercept    0.044723



             Table: Bootstrap Validation summary (100 resamplings, 0.632+ Bootstrap)
                                                                                     MSE
                        temperature, pressure, month, latitude; with intercept   0.043733
                                temperature, pressure, month; with intercept     0.044760
                                        temperature, pressure; with intercept    0.044618
                     temperature, pressure, month, latitude; without intercept   0.043833
                            temperature, pressure, month; without intercept      0.044626
                                     temperature, pressure; without intercept    0.044672



  Model Selection and Comparison: Information Criteria
  Something to be taught another day ... lecture on predictive inference




                        Table: Information Criteria Judging Quality of Several Models
                                                                                       AIC        AICc         BIC     R2
          temperature, pressure, month, latitude; with intercept                  -3086.41    -3086.40    -3042.75   0.91
                  temperature, pressure, month; with intercept                    -2883.86    -2883.86    -2847.48   0.91
                          temperature, pressure; with intercept                   -2882.74    -2882.74    -2853.64   0.91
       temperature, pressure, month, latitude; without intercept                      0.00        0.00        0.00   1.00
              temperature, pressure, month; without intercept                         0.00        0.00        0.00   0.99
                       temperature, pressure; without intercept                       0.00        0.00        0.00   0.99




                   Table: Relative Importance of Various Attributes in Several Models

                                                                                temperature    pressure     month    latitude
      temperature+pressure+month+latitude+intercept model                           0.5916       0.4048    0.0006     0.0030
              temperature+pressure+month+intercept model                            0.5935       0.4057    0.0008     0.0000
                     temperature+pressure+intercept model                           0.5942       0.4058    0.0000     0.0000




  Computation




             R
             SEQBoot, special-purpose and free
             Nothing in the GNU Scientific Library, unfortunately
             For non-delicate inferences, you can simply write the sampling yourself
             Kleiner, Talwalkar, Sarkar, Jordan, "The Big Data Bootstrap", 2012






  When do these fail?




             They fail principally when data are co-correlated or dependent.
             Special techniques exist for dependent data: next lecture!
             They fail when trying to characterize extremes of distributions, such as
             the minimum or maximum: there is an insufficient number of samples in the tails.
             Performance deteriorates when the sampling distribution of the population
             is not smooth, in a formal sense.






  Bibliography

             M. R. Chernick, Bootstrap Methods: A Guide for Practitioners and Researchers, 2nd edition, 2008, John Wiley & Sons.

             A. C. Davison, D. V. Hinkley, Bootstrap Methods and their Application, first published 1997, 11th printing, 2009, Cambridge University Press.

             B. Efron, R. J. Tibshirani, An Introduction to the Bootstrap, 1993, Chapman & Hall/CRC.

             P. I. Good, Resampling Methods: A Practical Guide to Data Analysis, 2006.

             B. Efron, The Jackknife, the Bootstrap and Other Resampling Plans, CBMS-NSF Regional Conference Series in Applied Mathematics, 1982.

             P. Hall, The Bootstrap and Edgeworth Expansion, Springer-Verlag, 1992.

             K. Singh, G. J. Babu, "On the asymptotic optimality of the bootstrap", Scandinavian Journal of Statistics, 17, 1990, 1-9.

             A. Kleiner, A. Talwalkar, P. Sarkar, M. I. Jordan, "The Big Data Bootstrap", in J. Langford and J. Pineau (eds.), Proceedings of the 29th International Conference on Machine Learning (ICML), Edinburgh, UK, 2012.

             A. Kleiner, A. Talwalkar, P. Sarkar, M. I. Jordan, "A scalable bootstrap for massive data", http://arxiv.org/abs/1112.5016, last accessed 16th November 2012.

             B. Efron, R. Tibshirani, "Improvements on cross-validation: The 0.632+ Bootstrap method", JASA, 92(438), June 1997, 548-560.
J. T. Galkowski                                           Bootstrapping Independent Data                                                           44 / 44

Weitere ähnliche Inhalte

Empfohlen

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Empfohlen (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Tutorial on Bootstrap resampling with independent data

  • 1. empirical bayesian@ieee.org Bootstrap resampling with independent data Some theory and applications Jan Galkowski 21st November 2012 (last rev November 21, 2012) J. T. Galkowski Bootstrapping Independent Data 1 / 44
  • 2. empirical bayesian@ieee.org Why do we care about Uncertainty Qualification (UQ)? Inherent variability of mechanisms, measurements Statistics are themselves random variables, e.g., sample mean, sample standard deviation, sample median For simple distributions, knowing the first few moments suffices to characterize them For complicated or mixture distributions . . . ? We often pretend distributions of measurements are stationary. Are they? Diagnostics! What to do if aren’t? Will have more to say about latter in next lecture on bootstrapping with dependent data J. T. Galkowski Bootstrapping Independent Data 2 / 44
  • 3. empirical bayesian@ieee.org Sketch of How the Bootstrap Works 8 6 11 6 11 5 11 2 1 2 3 4 5 6 7 8 11 8 2 11 5 8 8 11 2 0.125 2 0.125 2 0.125 2 0.125 2 0.125 2 0.125 2 0.125 2 0.125 5 0.125 5 0.125 5 0.125 5 0.125 5 0.125 5 0.125 5 0.125 5 0.125 6 0.250 6 0.250 6 0.250 6 0.250 6 0.250 6 0.250 6 0.250 6 0.250 8 0.125 8 0.125 8 0.125 8 0.125 8 0.125 8 0.125 8 0.125 8 0.125 11 0.375 11 0.375 11 0.375 11 0.375 11 0.375 11 0.375 11 0.375 11 0.375 2 11 2 8 11 5 11 2 2 0.125 2 0.125 2 0.125 2 0.125 2 0.125 2 0.125 2 0.125 2 0.125 5 0.125 5 0.125 5 0.125 5 0.125 5 0.125 5 0.125 5 0.125 5 0.125 6 0.250 6 0.250 6 0.250 6 0.250 6 0.250 6 0.250 6 0.250 6 0.250 8 0.125 8 0.125 8 0.125 8 0.125 8 0.125 8 0.125 8 0.125 8 0.125 11 0.375 11 0.375 11 0.375 11 0.375 11 0.375 11 0.375 11 0.375 11 0.375 J. T. Galkowski Bootstrapping Independent Data 3 / 44
  • 4. Sketch of Why the Bootstrap Works. [Figure: "Growth in Number of Possible Bootstrap Resamplings with Sample Size"; log10 of the number of possible resamplings versus sample size N from 0 to 200, rising steadily into the hundreds.]
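    The curve itself is not recoverable from the slide, but the combinatorics behind it are standard: a sample of n distinct values admits n^n ordered resamplings, and choose(2n − 1, n) distinct resamples if order is ignored. A quick check in R (an illustrative sketch, not the slide's exact computation):

        n <- c(10, 50, 100, 200)
        n * log10(n)                     # log10 of n^n: about 10, 85, 200, 460
        lchoose(2 * n - 1, n) / log(10)  # log10 of choose(2n-1, n): about 5, 29, 59, 119

    Either way the count explodes, which is why bootstrap distributions are approximated by Monte Carlo rather than enumerated.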
  • 5. Test Data Model, Easy Case. [Figure: "Synthetic Gaussians for Easy Test Case"; two well-separated density curves over x from 10 to 50, densities up to about 0.12.]
  • 6. Boxplot of well-separated Gaussians. [Figure: "Simple case of two well separated Gaussians"; boxplots of sample value (about 10 to 30) for the Left and Right contributing distributions.]
  • 7. Histogram of same well-separated Gaussians. [Figure: paired histograms, "Left Sample of Two Well-separated Gaussians" and "Right Sample of Two Well-separated Gaussians", densities up to about 0.25 over the range 10 to 40.]
  • 8. Standard t-test on well-separated Gaussians.

        Group                 n    average      s.d.
        1: left Gaussians    24   19.540998   4.867407
        2: right Gaussians   35   28.471698   2.820691
        3: POOLED            59   24.838871   3.782281 (adjusted for d.o.f.)

    (Ȳ2 − Ȳ1) − 0.0 = 8.930700, SE(Ȳ2 − Ȳ1) = 1.002398.
    Two-sided t-statistic = 8.930700 / 1.002398 = 8.909336 = t57(0.0000), P = 0.0000.
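    The same pooled test is one line in R; left and right here stand for the two synthetic samples, which the deck does not reproduce:

        # Pooled two-sample t test (var.equal = TRUE gives 24 + 35 - 2 = 57 d.o.f.)
        t.test(right, left, var.equal = TRUE, alternative = "two.sided")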
  • 9. Bootstrap characterization of well-separated Gaussians. [Figure: "Two well-separated Gaussians, differences after 10,000 bootstrap replications"; density (up to about 0.5) of the difference between the Right and Left Gaussians over 6 to 12, with vertical markers at z(t test), z(t test) ± se(t test), z(bootstrap), and z(bootstrap) ± se(bootstrap).]
  • 10. Bootstrap test of well-separated Gaussians.

        Group                 n    average      s.d.
        1: resampled Left    24   19.540998   4.867407
        2: resampled Right   35   28.471698   2.820691
        3: POOLED            59   24.838871   3.782281 (adjusted for d.o.f.)

    Recall the original (Ȳ2 − Ȳ1) − 0.0 = 8.930700.
    Baseline two-sided t-statistic: 8.90934 = t57(0.0000), P = 0.0000.
    Bootstrap, resampled Left, two-sided t-statistic: P = 0.0001, 0 of 10000 exceed the baseline t.
    Bootstrap, resampled Right, t-statistic: P = 0.0002, 1 of 10000 exceed the baseline t.
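    The table does not fully pin down the resampling scheme, so the sketch below uses the standard null-resampling recipe for a bootstrap test of equal means: shift both groups to the pooled mean (imposing the null), resample each, and count how often the resampled t statistic exceeds the observed one. left and right are again the original samples:

        t.stat <- function(a, b) {
          (mean(b) - mean(a)) / sqrt(var(a) / length(a) + var(b) / length(b))
        }
        # Impose the null hypothesis: both groups centered on the pooled mean
        m <- mean(c(left, right))
        left0  <- left  - mean(left)  + m
        right0 <- right - mean(right) + m
        B <- 10000
        t.star <- replicate(B,
          t.stat(sample(left0, replace = TRUE), sample(right0, replace = TRUE)))
        t.obs <- t.stat(left, right)
        mean(abs(t.star) >= abs(t.obs))   # bootstrap estimate of the P-value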
  • 11. Test Data Model, Hard Case. [Figure: "Synthetic Gaussians for Hard Test Case"; two heavily overlapping density curves over x from 10 to 50, densities up to about 0.06.]
  • 12. Boxplot of well-mixed Gaussians. [Figure: "Hard case of two well mixed Gaussians"; boxplots of sample value (about 10 to 40) for the Left and Right contributing distributions.]
  • 13. Histogram of same well-mixed Gaussians. [Figure: "Two well-mixed Gaussians"; histogram of the pooled sample of the Right and Left mixture components over 10 to 40, density up to about 0.15.]
  • 14. Standard t-test on mixture distribution of Gaussians.

        Group                 n    average      s.d.
        1: left Gaussians    24   22.635617   6.534154
        2: right Gaussians   35   27.558842   6.140611
        3: POOLED            59   25.556174   6.302367 (adjusted for d.o.f.)

    (Ȳ2 − Ȳ1) − 0.0 = 4.923225, SE(Ȳ2 − Ȳ1) = 1.670283.
    Two-sided t-statistic = 4.923225 / 1.670283 = 2.947539 = t57(0.0023), P = 0.0023.
  • 15. Resampling from each population separately. [Figure: "Bootstrap Resampling Pseudo-Populations from the Left Case of Two well-mixed Gaussians" and the same for the Right case; histograms of 20,000 resamplings each, over 15 to 40, densities up to about 0.4.]
  • 16. What a pooled bootstrap resampling population looks like. [Figure: "Bootstrap Resampling Pseudo-Populations from the Pooled Case of Two well-mixed Gaussians"; histogram of the pooled resampling over 15 to 40, density up to about 0.25.]
  • 17. Bootstrap test of well-mixed Gaussians.

        Group                 n    average      s.d.
        1: resampled Left    24   22.635617   6.534154
        2: resampled Right   35   27.558842   6.140611
        3: POOLED            59   25.556174   6.302367 (adjusted for d.o.f.)

    Recall the original (Ȳ2 − Ȳ1) − 0.0 = 4.923225.
    Baseline two-sided t-statistic: 2.94754 = t57(0.0023), P = 0.0023.
    Bootstrap, resampled Left, two-sided t-statistic: P = 0.1328, 2656 of 20000 exceed the baseline t.
    Bootstrap, resampled Right, t-statistic: P = 0.1331, 2662 of 20000 exceed the baseline t.
  • 18. Resampled t-values. [Figure: "Bootstrap resampled t-values and Baseline" for the Left and Right samples of well-mixed Gaussians; histograms of t over 0 to 6, densities up to about 0.4, with the baseline t-value = 2.9475 marked in each panel.]
  • 19. A little Bootstrap theory: Glivenko-Cantelli.
    Empirical distribution function: for i.i.d. X1, X2, . . . , Xn ∼ F,

        Fn(x) := (1/n) Σ_{k=1}^{n} I_{Xk ≤ x}.

    Glivenko-Cantelli Theorem: as n → ∞,

        sup_x |Fn(x) − F(x)| → 0

    almost surely (abbreviated "a.s."). Given

        Ω0 := {ω : lim_{n→∞} Xn exists} = {ω : (lim sup_{n→∞} Xn) − (lim inf_{n→∞} Xn) = 0},

    if P(Ω0) = 1, we say that Xn converges almost surely. (See Durrett, 4th edition, 2010, pp. 15-16.)
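    A numerical illustration of the theorem (a sketch; the Kolmogorov-Smirnov statistic reported by ks.test is exactly sup_x |Fn(x) − F(x)|):

        set.seed(42)
        for (n in c(10, 100, 1000, 10000)) {
          x <- rnorm(n)   # i.i.d. draws from F = standard normal
          cat(n, ks.test(x, "pnorm")$statistic, "\n")   # sup distance shrinks with n
        }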
  • 20. Proof of Glivenko-Cantelli Theorem (after R. Durrett, Probability: Theory and Examples, 4th edition, 2010, pp. 76-77).
    1. Fix x. Let Zn = I_{Xn ≤ x}.
    2. The Zn are i.i.d. with E[Zn] = P(Xn ≤ x) = F(x), so the strong law gives Fn(x) = (1/n) Σ_{k=1}^{n} Zk → F(x) a.s.; the same argument applied to I_{Xn < x}, whose mean is F(x−) = lim_{y↑x} F(y), gives Fn(x−) → F(x−) a.s.
    3. For 1 ≤ i ≤ k − 1, let x_{i,k} = inf{y : F(y) ≥ i/k}.
    4. Pointwise convergence of Fn(x) and Fn(x−) means we can pick Wk so that n ≥ Wk implies |Fn(x_{i,k}) − F(x_{i,k})| < 1/k and |Fn(x_{i,k}−) − F(x_{i,k}−)| < 1/k for 1 ≤ i ≤ k − 1.
    5. Let x_{0,k} = −∞ and x_{k,k} = ∞; the last inequalities remain valid for i = 0 and i = k.
    6. Given x_{i−1,k} ≤ u < x_{i,k} with 1 ≤ i ≤ k and n ≥ Wk, monotonicity of Fn and of F together with F(x_{i,k}−) − F(x_{i−1,k}) ≤ 1/k yield

        Fn(u) ≤ Fn(x_{i,k}−) ≤ F(x_{i,k}−) + 1/k ≤ F(x_{i−1,k}) + 2/k ≤ F(u) + 2/k,
        Fn(u) ≥ Fn(x_{i−1,k}) ≥ F(x_{i−1,k}) − 1/k ≥ F(x_{i,k}−) − 2/k ≥ F(u) − 2/k.

    7. Consequently sup_u |Fn(u) − F(u)| ≤ 2/k. Result proved.
  • 21. A little Bootstrap theory: Edgeworth expansions.
    Edgeworth expansion:
    1. θ̂ is constructed from a sample of size n.
    2. θ0 is a "parameter".
    3. √n (θ̂ − θ0) ∼ N(0, σ²).
    4. P{√n (θ̂ − θ0)/σ ≤ x} = Φ(x) + n^{−1/2} p1(x) φ(x) + · · · + n^{−j/2} pj(x) φ(x) + · · ·
    5. φ(x) is the N(0, 1) density and Φ(x) = ∫_{−∞}^{x} φ(u) du.
    Then, letting Â(X̄*) = (g(X̄*) − g(X̄)) / h(X̄), where s* denotes the bootstrap estimate corresponding to a sample statistic s, suppose:
    1. the characteristic function χ of X satisfies lim sup_{|t|→∞} |χ(t)| < 1 (Cramér's condition);
    2. the asymptotic variance of (g(X̄*) − g(X̄)) / h(X̄) is 1.
    3. Then P{√n (g(X̄*) − g(X̄)) / h(X̄) ≤ x} = Φ(x) + Σ_{j=1}^{ν} n^{−j/2} πj(x) φ(x) + o(n^{−ν/2}),
    4. where πj is a polynomial of degree 3j − 1, odd for even j and even for odd j, with coefficients a function of the moments of X up to order j + 2.
  • 22. Bumpus 1898 data. [Figure: "Sparrow Survival after Severe Snowstorm (Bumpus, 1898, Woods Hole, MA)"; boxplots of humerus length in inches (about 0.66 to 0.78) for sparrows that PERISHED versus SURVIVED.]
  • 23. t-test results from Bumpus 1898 data.

        Group         n    Humerus average (inches)   Humerus s.d. (inches)
        1: perished  24    0.727917                   0.023543
        2: survived  35    0.738000                   0.019839
        3: POOLED    59    0.733898                   0.021411 (adjusted for d.o.f.)

    (Ȳ2 − Ȳ1) − 0.0 = 0.010083, SE(Ȳ2 − Ȳ1) = 0.005674.
    Two-sided t-statistic = 0.010083 / 0.005674 = 1.776998 = t57(0.0405), P = 0.0405.
  • 24. Bootstrap results from Bumpus 1898 data.

        Group         n    average     s.d.
        1: perished  24   0.727917   0.023543
        2: survived  35   0.738000   0.019839
        3: POOLED    59   0.733898   0.021411 (adjusted for d.o.f.)

    Recall the original (Ȳ2 − Ȳ1) − 0.0 = 0.010083.
    Baseline two-sided t-statistic: 1.77700 = t57(0.0405), P = 0.0405.
    Bootstrap, perished, two-sided t-statistic: P = 0.2711, 2710 of 10000 exceed the baseline t.
    Bootstrap, survived, t-statistic: P = 0.2564, 2563 of 10000 exceed the baseline t.
  • 25. Can go elsewhere: Stroke risk and proportions.

        Group           strokes   subjects
        aspirin group       119     11,037
        placebo group        98     11,034

    (From the New York Times of 27th January 1987, as quoted by Efron and Tibshirani in An Introduction to the Bootstrap.)
    How to bootstrap this problem (a sketch in R follows):
    1. Create an "aspirin sample" of length 11,037 having 119 ones and 11,037 − 119 zeros.
    2. Create a "placebo sample" of length 11,034 having 98 ones and 11,034 − 98 zeros.
    3. Draw with replacement a sample of length 11,037 from the "aspirin sample".
    4. Draw with replacement a sample of length 11,034 from the "placebo sample".
    5. Calculate the ratio of the proportion of ones in the draw from the "aspirin sample" to the proportion of ones in the draw from the "placebo sample".
    6. Repeat the drawing and calculation a large number of times, say, 1000.
    7. Calculate whatever sample statistics you like from the resulting population.
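    The seven steps above, written out in R (illustrative; the seed and B are arbitrary):

        set.seed(101)
        aspirin <- c(rep(1, 119), rep(0, 11037 - 119))   # step 1
        placebo <- c(rep(1,  98), rep(0, 11034 -  98))   # step 2
        B <- 10000
        risk.ratio <- replicate(B, {                     # steps 3-6
          p.a <- mean(sample(aspirin, replace = TRUE))   # resampled aspirin stroke rate
          p.p <- mean(sample(placebo, replace = TRUE))   # resampled placebo stroke rate
          p.a / p.p
        })
        quantile(risk.ratio, c(0.025, 0.5, 0.975))       # step 7: median and interval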
  • 26. Bootstrap display of proportions in stroke risk. [Figure: "Resampled proportions for stroke risk using 10000 bootstrap replicas"; histogram of the ratio of aspirin to placebo stroke proportions, roughly 1.0 to 2.0, counts up to about 300; median risk ratio = 1.2140, p = 0.548.]
  • 27. Can go elsewhere: In lieu of cross-validation.
    Cross-validation is a bag of techniques for trying to assess prediction error. There are other "bags of techniques" as well; one is mentioned later, but a review must await a lecture on predictive inference.

        y + δy = [f + δf](x + δx)  ⟹  δy = f(δx) + δf(x + δx)   (for linear f)

    This is gotten wrong a lot. It is important to separate variation in the data (δx) from variation in the model (δf). Will present the standard set of methods, then offer one Bootstrap-based alternative.
  • 28. Cross-validation: A review. Classical methods.
    Split-half:
    1. Randomly pick half the data from the sample, without replacement.
    2. Fit to that half.
    3. Test the fit on the half not selected.
    4. Report a measure like MSE for the test.
    K-fold (let F = {k : k = 1, . . . , K}; a sketch in R follows this list):
    1. Divide the sample into K randomly picked disjoint subsets.
    2. For each j ∈ F, fit to all subsets i ≠ j, i ∈ F.
    3. Then test the fit obtained on subset j.
    4. Report a measure like MSE by averaging over all K of the "folds".
    Leave-one-out cross-validation: K-fold with K = n, the sample size. (Not covered here.)
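    A sketch of K-fold cross-validation for a linear model; the data frame d and response y are placeholders, not objects from the deck:

        K <- 10
        n <- nrow(d)
        fold <- sample(rep(1:K, length.out = n))     # random disjoint folds
        mse <- numeric(K)
        for (j in 1:K) {
          fit <- lm(y ~ ., data = d[fold != j, ])    # fit on every fold but j
          pred <- predict(fit, newdata = d[fold == j, ])
          mse[j] <- mean((d$y[fold == j] - pred)^2)  # test on fold j
        }
        mean(mse)                                    # MSE averaged over the K folds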
  • 29. Cross-validation: A review. Bootstrap-related methods.
    Bootstrap validation (sketched in code below):
    1. Take B Bootstrap resamplings from the sample.
    2. For each resampling, fit the model to the resampling, then evaluate the fitted model on the resampling and on the entire original sample.
    3. Note the difference between performance on the entire original sample and performance on the resampling in each case. Each of these values is called an estimate of optimism.
    4. Average the optimism values, then add that average to the apparent residual squared error of the original fit to get an estimate of prediction error.
    Two versions of Bootstrap-based cross-validation are illustrated here: the "ordinary Bootstrap" realization of validation (B. Efron, G. Gong, "A leisurely look at the Bootstrap, the Jackknife, and cross-validation", The American Statistician, 37(1), 1983, 36-48, as realized by F. E. Harrell, Jr, in the R package rms), and the "0.632+ bootstrap", a refined version designed to replace classical cross-validation.
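    Steps 1-4, sketched by hand for a linear model; d and y are placeholders as before, and rms::validate automates this for models fit with that package's own fitting functions:

        apparent <- mean(residuals(lm(y ~ ., data = d))^2)      # error of original fit
        B <- 100
        optimism <- replicate(B, {
          i <- sample(nrow(d), replace = TRUE)                  # step 1: one resampling
          fit <- lm(y ~ ., data = d[i, ])                       # step 2: fit to it
          err.boot <- mean(residuals(fit)^2)                    # error on the resampling
          err.full <- mean((d$y - predict(fit, newdata = d))^2) # error on full sample
          err.full - err.boot                                   # step 3: optimism
        })
        apparent + mean(optimism)                               # step 4: prediction error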
  • 30. Data to model: Salinity as predicted by temperature, pressure, latitude, and month. [Figure: "ARGO Float, Platform 1900764 (WHOI, IFREMER)"; scatter of Adjusted Temperature (Celsius, 5 to 25) against Adjusted Salinity (psu, 35.0 to 37.0).]
  • 31. The ARGO Floats System. [Figure only.]
  • 32. Location of collection: Equatorial Atlantic, near Africa. The Canary Current dominates at the surface, flowing southwest.
  • 33. Location of collection: Near Cape Verde Islands.
  • 34. Models to be compared: Linear regression to predict Salinity. Two competitive models.

    Table: salinity = f(temperature, pressure, month, latitude, intercept)
                       Estimate   Std. Error     t value   Pr(>|t|)
        (Intercept)    33.58993      0.04224   795.28623    0.00000
        temperature     0.11911      0.00082   144.46335    0.00000
        pressure        0.00014      0.00001    11.87072    0.00000
        month          -0.00069      0.00061    -1.14065    0.25404
        latitude        0.02783      0.00194    14.36736    0.00000

    Table: salinity = f(temperature, pressure, month, latitude)
                       Estimate   Std. Error     t value   Pr(>|t|)
        temperature     0.35984      0.00595    60.43834    0.00000
        pressure        0.00347      0.00008    40.91654    0.00000
        month          -0.06683      0.00466   -14.33299    0.00000
        latitude        1.43767      0.00606   237.19820    0.00000
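    In R, the with- and without-intercept pairs on these slides differ only by a "- 1" in the model formula; the variable and data-frame names below are assumed, since the deck does not show its code:

        m1 <- lm(salinity ~ temperature + pressure + month + latitude,     data = argo)
        m2 <- lm(salinity ~ temperature + pressure + month + latitude - 1, data = argo)
        summary(m1)   # coefficient tables like those above
        summary(m2)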
  • 35. Models to be compared: Linear regression to predict Salinity. Two more competitive models.

    Table: salinity = f(temperature, pressure, month, intercept)
                       Estimate   Std. Error      t value   Pr(>|t|)
        (Intercept)    34.14529      0.01719   1986.78239    0.00000
        temperature     0.11940      0.00083    143.49286    0.00000
        pressure        0.00014      0.00001     12.02606    0.00000
        month           0.00106      0.00060      1.76639    0.07736

    Table: salinity = f(temperature, pressure, month)
                       Estimate   Std. Error      t value   Pr(>|t|)
        temperature     1.71963      0.00403    427.12858    0.00000
        pressure        0.02196      0.00008    264.55418    0.00000
        month           0.14220      0.01147     12.40126    0.00000
  • 36. Models to be compared: Linear regression to predict Salinity. Still two more competitive models.

    Table: salinity = f(temperature, pressure, intercept)
                       Estimate   Std. Error      t value   Pr(>|t|)
        (Intercept)    34.14888      0.01707   2000.88558    0.00000
        temperature     0.11951      0.00083    144.02016    0.00000
        pressure        0.00014      0.00001     12.17462    0.00000

    Table: salinity = f(temperature, pressure)
                       Estimate   Std. Error      t value   Pr(>|t|)
        temperature     1.75765      0.00263    668.68046    0.00000
        pressure        0.02246      0.00007    308.89198    0.00000
  • 37. Variation by season: Salinity as predicted by temperature, pressure, latitude, and month. [Figure: "ARGO Float, Platform 1900764 (WHOI, IFREMER), showing dependency by month"; the same temperature-salinity scatter as before, coded by month.]
  • 38. Split-Half cross-validation example. Also called "Hold Out cross-validation" or "Split-Sample cross-validation".

    Table: Split-half cross-validation summary
        Model                                                        MSE
        temperature, pressure, month, latitude; with intercept       0.219357
        temperature, pressure, month; with intercept                 0.180320
        temperature, pressure; with intercept                        0.133647
        temperature, pressure, month, latitude; without intercept   10.743651
        temperature, pressure, month; without intercept             48.300449
        temperature, pressure; without intercept                    33.191302
  • 39. K-fold cross-validation example.

    Table: 10-fold cross-validation summary
        Model                                                        MSE
        temperature, pressure, month, latitude; with intercept       0.209468
        temperature, pressure, month; with intercept                 0.211403
        temperature, pressure; with intercept                        0.211403
        temperature, pressure, month, latitude; without intercept    1.625414
        temperature, pressure, month; without intercept              4.071300
        temperature, pressure; without intercept                     4.099267
  • 40. Bootstrap validation instead.

    Table: Bootstrap validation summary (100 resamplings, Efron & Gong Bootstrap)
        Model                                                        MSE
        temperature, pressure, month, latitude; with intercept       0.043891
        temperature, pressure, month; with intercept                 0.044472
        temperature, pressure; with intercept                        0.044584
        temperature, pressure, month, latitude; without intercept    0.043893
        temperature, pressure, month; without intercept              0.044507
        temperature, pressure; without intercept                     0.044723

    Table: Bootstrap validation summary (100 resamplings, 0.632+ Bootstrap)
        Model                                                        MSE
        temperature, pressure, month, latitude; with intercept       0.043733
        temperature, pressure, month; with intercept                 0.044760
        temperature, pressure; with intercept                        0.044618
        temperature, pressure, month, latitude; without intercept    0.043833
        temperature, pressure, month; without intercept              0.044626
        temperature, pressure; without intercept                     0.044672
  • 41. Model Selection and Comparison: Information Criteria. Something to be taught another day, in a lecture on predictive inference.

    Table: Information criteria judging quality of several models
        Model                                                            AIC       AICc       BIC     R²
        temperature, pressure, month, latitude; with intercept      -3086.41   -3086.40  -3042.75   0.91
        temperature, pressure, month; with intercept                -2883.86   -2883.86  -2847.48   0.91
        temperature, pressure; with intercept                       -2882.74   -2882.74  -2853.64   0.91
        temperature, pressure, month, latitude; without intercept       0.00       0.00      0.00   1.00
        temperature, pressure, month; without intercept                 0.00       0.00      0.00   0.99
        temperature, pressure; without intercept                        0.00       0.00      0.00   0.99

    Table: Relative importance of various attributes in several models
        Model                                             temperature   pressure    month   latitude
        temperature+pressure+month+latitude+intercept          0.5916     0.4048   0.0006     0.0030
        temperature+pressure+month+intercept                   0.5935     0.4057   0.0008     0.0000
        temperature+pressure+intercept                         0.5942     0.4058   0.0000     0.0000
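    For reference, the three criteria for a fitted lm, as they might be computed in R; AICc is not in base R, so the usual small-sample correction is applied by hand, and m1 is the assumed fit from the earlier sketch:

        ic <- function(fit) {
          k <- length(coef(fit)) + 1   # parameters, counting the error variance
          n <- nobs(fit)
          c(AIC  = AIC(fit),
            AICc = AIC(fit) + 2 * k * (k + 1) / (n - k - 1),
            BIC  = BIC(fit))
        }
        ic(m1)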
  • 42. Computation.
    - R.
    - SEQBoot: special purpose and free.
    - Nothing in the GNU Scientific Library, unfortunately.
    - For non-delicate inferences, can simply write the sampling yourself (see the sketch below).
    - Kleiner, Talwalkar, Sarkar, Jordan, "The Big Data Bootstrap", 2012.
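    For the write-it-yourself route in R, the recommended boot package (bundled with R) wraps the resampling loop; a minimal sketch, with x any numeric sample:

        library(boot)
        # The statistic must accept the data and a vector of resampled indices
        med <- function(data, idx) median(data[idx])
        b <- boot(data = x, statistic = med, R = 10000)
        b                           # bootstrap bias and standard error
        boot.ci(b, type = "perc")   # percentile confidence interval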
  • 43. When do these fail?
    - Principally when the data are co-correlated or dependent. Special techniques for dependent data: next lecture!
    - When trying to characterize extremes of distributions, like the minimum or maximum: insufficient number of samples.
    - Performance deteriorates when the sampling distribution of the population is not smooth, in a formal sense.
  • 44. Bibliography.
    - M. R. Chernick, Bootstrap Methods: A Guide for Practitioners and Researchers, 2nd edition, 2008, John Wiley & Sons.
    - A. C. Davison, D. V. Hinkley, Bootstrap Methods and their Application, first published 1997, 11th printing 2009, Cambridge University Press.
    - B. Efron, R. J. Tibshirani, An Introduction to the Bootstrap, 1993, Chapman & Hall/CRC.
    - P. I. Good, Resampling Methods: A Practical Guide to Data Analysis, 2006.
    - B. Efron, The Jackknife, the Bootstrap and Other Resampling Plans, CBMS-NSF Regional Conference Series in Applied Mathematics, 1982.
    - P. Hall, The Bootstrap and Edgeworth Expansion, Springer-Verlag, 1992.
    - K. Singh, G. J. Babu, "On the asymptotic optimality of the bootstrap", Scandinavian Journal of Statistics, 17, 1990, 1-9.
    - A. Kleiner, A. Talwalkar, P. Sarkar, M. I. Jordan, "The Big Data Bootstrap", in J. Langford and J. Pineau (eds.), Proceedings of the 29th International Conference on Machine Learning (ICML), Edinburgh, UK, 2012.
    - A. Kleiner, A. Talwalkar, P. Sarkar, M. I. Jordan, "A scalable bootstrap for massive data", http://arxiv.org/abs/1112.5016, last accessed 16th November 2012.
    - B. Efron, R. Tibshirani, "Improvements on cross-validation: The .632+ bootstrap method", JASA, 92(438), June 1997, 548-560.