
CAR Models for Agricultural data


  1. CAR Models for Agricultural data. Margaret Donald. Joint work with Clair Alston, Chris Strickland, Rick Young, and Kerrie Mengersen. Modern Spatial Statistics Conference in honour of Julian Besag, QUT, May 12, 2011.
  2. Outline: Modelling agricultural data in three spatial dimensions and in a fourth dimension, time.
     1. Data
     2. Modelling in 3 spatial dimensions: 2.1 Random effects, 2.2 Treatment effects, 2.3 Results
     3. Modelling in 4 dimensions: 3.1 Model, 3.2 Results
     4. Selected References
  3. Agriculture: Modelling in 3 dimensions. Data
     - The data are moisture measurements from a field experiment designed to determine the cropping method least likely to lead to salinification.
     - Each record consists of treatment, moisture value, row, column, x co-ordinate, y co-ordinate, depth and date: 108 sites × 15 depths (1620 points) on each of 56 days, or 90720 moisture observations.
     - The treatments were:
       - Long fallowing (3 phases, each a treatment)
       - Continuous cropping (1 treatment)
       - Response cropping (2 treatments)
       - Pastures (lucerne, lucerne mixture)
       - Pastures (native grasses)
     - The purpose was to determine the difference between long fallowing and response cropping.
  4. Data. Figure: Site treatments
  5. Methods: CAR or kriging
     - Kriging models are slow to converge, because at each MCMC iteration they involve all the data and require matrix inversions. Here, with a complex regression model, they failed to converge.
     - Conditional autoregressive (CAR) models deal with spatial autocorrelation using the notion of a neighbour. They are thought of as 'areal' models, are easy to use, are flexible, and are appropriate for dealing with localised spatial similarity.
     - CAR models have been shown to be closely related to kriging (Rue and Tjelmeland, 2002; Hrafnkelsson and Cressie, 2003; Besag and Mondal, 2005; Lindgren et al., 2010).
  6. Methods: Model for a single day. For site $i \in I$ at depth $d \in D$, the model is
     $$y_{id} = \mu_{j(i)d} + \psi_{id} + \epsilon_{id},$$
     where $\mu_{j(i)d}$ is the effect of treatment $j$ (the treatment at site $i$) at depth $d$, $\psi_{id}$ is the spatial residual at $(i, d)$, and $\epsilon_{id}$ is the unstructured residual at $(i, d)$, with $\epsilon_{id} \sim N(0, \sigma^2)$. Relabel $\psi_{id}$ as $\psi_s$, where $s \in I \times D$ indexes the points of the 3-dimensional lattice. The conditional distribution of the spatial residual $\psi_s$, given its neighbours $\psi_k$, is then
     $$\psi_s \mid \psi_k,\ k \in \partial s \ \sim\ N\!\left(\sum_{k \in \partial s} \frac{w_{sk}\,\psi_k}{w_{s+}},\ \frac{\gamma^2}{w_{s+}}\right),$$
     where $w_{s+} = \sum_{k \in \partial s} w_{sk}$.
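The conditional distribution on this slide can be sketched in a few lines. This is only an illustration: the function name `car_conditional` and the numbers are hypothetical, and $w_{s+}$ is taken to be the sum of the neighbour weights.

```python
def car_conditional(neighbour_values, neighbour_weights, gamma2):
    """Return (mean, variance) of psi_s given its neighbours.

    neighbour_values  : psi_k for each k in the neighbourhood of s
    neighbour_weights : w_sk for the same k
    gamma2            : the spatial variance parameter gamma^2
    """
    w_plus = sum(neighbour_weights)  # w_{s+}
    mean = sum(w * v for w, v in zip(neighbour_weights, neighbour_values)) / w_plus
    return mean, gamma2 / w_plus


# Hypothetical point with four neighbours, all with weight 1.
mean, var = car_conditional([0.1, 0.3, 0.2, 0.4], [1, 1, 1, 1], gamma2=0.04)
```

With unit weights the conditional mean is the plain average of the neighbouring residuals, and the conditional variance shrinks as the number of neighbours grows.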
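  7. Random effects. A single spatial variance across all depths is forced on us when
     $$\psi_s \mid \psi_k,\ k \in \partial s \ \sim\ N\!\left(\sum_{k \in \partial s} \frac{w_{sk}\,\psi_k}{w_{s+}},\ \frac{\gamma^2}{w_{s+}}\right)$$
     and neighbours include depth neighbours. If, however, we define neighbourhoods only within a layer, then there are two possibilities: a common spatial variance,
     $$\psi_{id} \mid \psi_{kd},\ k \in \partial i \ \sim\ N\!\left(\sum_{k \in \partial i} \frac{w_{ik}\,\psi_{kd}}{w_{i+}},\ \frac{\gamma^2}{w_{i+}}\right),$$
     or a spatial variance for each depth,
     $$\psi_{id} \mid \psi_{kd},\ k \in \partial i \ \sim\ N\!\left(\sum_{k \in \partial i} \frac{w_{ik}\,\psi_{kd}}{w_{i+}},\ \frac{\gamma_d^2}{w_{i+}}\right).$$
     Similarly, $\epsilon_{id} \sim N(0, \sigma^2)$, or perhaps $\epsilon_{id} \sim N(0, \sigma_d^2)$.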
  8. Random effects (continued). To specify a CAR model, we nominate
     - which sites are neighbours
     - a weight for each pair of neighbours
     Here we use weights of 0 and 1, and use the DIC to determine the choice of neighbours.
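A minimal sketch of how 0/1 neighbourhoods might be nominated on a regular lattice, contrasting a layered neighbourhood (neighbours within the same depth layer only) with one that also includes depth neighbours. The function `lattice_neighbours` and the 3 × 4 grid are hypothetical; the talk does not give the field layout.

```python
def lattice_neighbours(n_rows, n_cols, n_depths, include_depth=False):
    """Map each (row, col, depth) point to the list of its weight-1 neighbours."""
    steps = [(-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0)]  # at most 4 horizontal
    if include_depth:
        steps += [(0, 0, -1), (0, 0, 1)]                    # plus at most 2 in depth
    neighbours = {}
    for r in range(n_rows):
        for c in range(n_cols):
            for d in range(n_depths):
                neighbours[(r, c, d)] = [
                    (r + dr, c + dc, d + dd)
                    for dr, dc, dd in steps
                    if 0 <= r + dr < n_rows
                    and 0 <= c + dc < n_cols
                    and 0 <= d + dd < n_depths
                ]
    return neighbours


# Hypothetical 3 x 4 grid of sites with 15 depths.
layered = lattice_neighbours(3, 4, 15)
full = lattice_neighbours(3, 4, 15, include_depth=True)
```

Any pair not listed has weight 0. Each candidate neighbourhood defines a different CAR model, and those models can then be compared by DIC.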
  9. Treatment effects: functional description. Modelling the treatment effect
     - There are 15 depth measurements for each treatment site.
     - The treatment effect is a function of depth that is smooth and continuous.
     - Choices: orthogonal polynomials, or splines (linear, cubic, cubic radial bases).
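As a rough illustration of two of these choices, the sketch below builds one design-matrix row for a linear spline (truncated-line basis) and for a cubic radial basis. The exact parameterisation used in the talk is not stated, so the basis forms, function names and knot positions here are assumptions.

```python
def linear_spline_basis(z, knots):
    """Row [1, z, (z - k1)+, ..., (z - kK)+] of a linear spline design matrix."""
    return [1.0, z] + [max(z - k, 0.0) for k in knots]


def cubic_radial_basis(z, knots):
    """Row [1, z, |z - k1|^3, ..., |z - kK|^3] of a cubic radial design matrix."""
    return [1.0, z] + [abs(z - k) ** 3 for k in knots]


knots = [3.0, 6.0, 9.0, 12.0]  # hypothetical knots on the depth index 1..15
linear_row = linear_spline_basis(7.0, knots)
radial_row = cubic_radial_basis(7.0, knots)
```

Because both bases take a continuous depth `z`, they can be evaluated at a latent true depth as well as at the nominal depth index.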
  10. Treatment effects: errors-in-measurement model. The true depth $z$ is interval-censored and related to the observed depth index $d$:
     $$z_d \mid d \sim N(d, \sigma_z^2)\, I(z_{d-1}, z_{d+1}), \quad d = 2, 3, \ldots, 14,$$
     $$z_1 \mid d = 1 \sim N(1, \sigma_z^2)\, I(0, z_2),$$
     $$z_{15} \mid d = 15 \sim N(15, \sigma_z^2)\, I(z_{14}, 16),$$
     where $\sigma_z \sim \text{Half-Cauchy}(1)$. The treatment effect is found as a function of $z$ for each site and nominal depth $d$.
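The ordering constraint in this model can be illustrated by drawing each depth from its normal distribution truncated to the interval set by its current neighbours. This sketch samples from the prior conditionals only (a full sampler would also involve the likelihood), uses simple rejection sampling, and fixes $\sigma_z$ at a hypothetical value; all names are illustrative.

```python
import random


def truncated_normal(mean, sd, lower, upper, rng):
    """Draw from N(mean, sd^2) restricted to (lower, upper) by rejection."""
    while True:
        x = rng.gauss(mean, sd)
        if lower < x < upper:
            return x


def sweep_depths(z, sd, rng):
    """One pass updating each true depth z_d given its current neighbours."""
    n = len(z)  # 15 nominal depths, d = 1..15
    for i in range(n):
        d = i + 1
        lower = z[i - 1] if i > 0 else 0.0
        upper = z[i + 1] if i < n - 1 else float(n + 1)
        z[i] = truncated_normal(d, sd, lower, upper, rng)
    return z


rng = random.Random(0)
z = sweep_depths([float(d) for d in range(1, 16)], sd=0.3, rng=rng)
```

The truncation keeps the latent depths strictly increasing and inside (0, 16), which is what the indicator terms in the model enforce.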
  11. Results for half the field, or 810 measurements. Choosing the neighbourhood and variance structures.
     Table: Comparing spatial residual modelling. The fixed component is identical for all models (orthogonal polynomial of degree 8).

     | Description | pD | DIC |
     |---|---|---|
     | Null model: no spatial residuals | 81 | -2690 |
     | Linear CAR (maximum 2 neighbours) | 264 | -2811 |
     | CAR (maximum 4 neighbours) | 358 | -2990 |
     | CAR (maximum 8 neighbours) | 320 | -2930 |
     | AR(1), AR(1) at each depth | 436 | -2789 |
     | CAR (max 4 horiz, 2 depth neighbours)* | 109 | -2752 |
     | CAR (max 4 horiz)* | 110 | -2960 |
     | CAR (max 4 horiz)** | 121 | -2766 |
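For readers unfamiliar with the two columns, pD and DIC are related as in Spiegelhalter et al. (2002): pD is the posterior mean deviance minus the deviance at the posterior mean of the parameters, and DIC is the posterior mean deviance plus pD. The sketch below uses made-up deviance draws, chosen only so that the arithmetic lands on the same pD and DIC as the null-model row; they are not values from the analysis.

```python
def dic_summary(deviance_samples, deviance_at_posterior_mean):
    """Return (pD, DIC) from MCMC deviance draws."""
    d_bar = sum(deviance_samples) / len(deviance_samples)  # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean               # effective number of parameters
    return p_d, d_bar + p_d


# Illustrative numbers only (not from the talk).
p_d, dic = dic_summary([-2773.0, -2771.0, -2769.0], deviance_at_posterior_mean=-2852.0)
```

Smaller DIC indicates the preferred model, so a more negative value in the table is better.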
  12. Results for half the field, or 810 measurements. Choosing the treatment effect function.
     Table: Comparing 'fixed' modelling: 4-neighbour CAR with 15 depth variances.

     | Type | Degree/Knots | pD | DIC |
     |---|---|---|---|
     | Orthogonal poly | 6 | 297 | -2970 |
     | Orthogonal poly | 8 | 358 | -2990 |
     | Orthogonal poly | 10 | 371 | -2967 |
     | Linear spline | 4 | 318 | -2923 |
     | Linear spline (+error in depth) | 4 | 369 | -3002 |
     | Linear spline (+error in depth) | 5 | 401 | -2999 |
     | Cubic radial bases | 5 | 327 | -2954 |
     | Cubic radial bases (+error in depth) | 5 | 368 | -3013 |
  13. Results for half the field, or 810 measurements. Figure: Treatment effect: cubic radial bases (depth jittered to allow CIs to be seen).
  14. Results for half the field, or 810 measurements. Figure: 'Fixed' part: linear spline treatment effects, depth measured with error, and 95% credible intervals; CAR model, sites 1-54, December 22, 1998. Depth differences are those implied by the errors-in-measurement model.
  15. Results for half the field, or 810 measurements. Figure: 95% credible intervals for the contrast for the fixed part of the cubic radial bases model with errors in depth.
  16. Results for half the field, or 810 measurements. Figure: 95% CI for the ratio of the square root of the spatial variance to that of the unstructured variance at the fifteen depths: cubic radial bases model with errors-in-measurement for depth.
  17. Conclusions from modelling in three dimensions. From this modelling we concluded that layered CAR models, where the neighbours of a point belong to the same horizontal depth layer, best model the spatially structured variation. They are also easier to define and faster to run than a CAR model based on all three dimensions.
  18. Four Dimensional Analysis of Agricultural Data. Model for agricultural data which includes time: considerations.
     - In moving to four dimensions, it was clear that a model such as
       $$y_{tid} = f_{j(i)}(t, d) + \psi_{id} + \eta_t + \epsilon_{tid},$$
       with spatial effects common across time ($\psi_{id}$) and time residuals ($\eta_t$) common across sites and depths, was unlikely to describe the data well.
     - We wanted to use the full field, 108 × 15 = 1620 data points per day, rather than the 54 × 15 = 810 of the three-dimensional modelling, and that implied the need for a different computing platform.
     - Preliminary modelling of 5 days of the full dataset, which used pyMCMC (Strickland, 2010) and a block-updating Gibbs sampler, firmed the view that the data might best be modelled (initially) by repeated use of the daily model.
  19. Model for agricultural data which includes time: model. Let $y_{tid}$ be the response variable measured on date $t$, at site $i$ (of $I$ plot sites in the horizontal plane), at depth $d$ ($d = 1, \ldots, 15$). Let $j$ be the treatment at site $i$. Then
     $$y_{tid} = f_{tj}(d) + \psi_{tid} + \epsilon_{tid}, \qquad \epsilon_{tid} \sim N(0, \sigma_{td}^2),$$
     with
     $$f_{tj}(d) = \alpha_{tjd},$$
     $$\psi_{tid} \mid \psi_{ti'd},\ i \neq i' \ \sim\ N\!\left(\rho_t \sum_{i' \in \partial i} \frac{\psi_{ti'd}}{n_i},\ \frac{\tau_{td}^2}{n_i}\right), \qquad (1)$$
     where $n_i$ is the number of sites adjacent to site $i$, and $i' \in \partial i$ denotes that site $i'$ is a neighbour of site $i$. $\rho_t$ is common across all depths for a given date $t$. $f_{tj}(d)$ indicates that a function of $d$ is estimated for each treatment and date.
  20. Results from four dimensional model. Figure: Long fallowing vs Response cropping at all depths. Saturated model. Point estimates from the MCMC iterates (full model, Method 1).
  21. Results from four dimensional model. Figure: Long fallowing vs Response cropping. Saturated model. Contour graph (Day by Depth) from the point estimates from the MCMC iterates of the full model.
  22. Results from four dimensional model. Figure: Long fallowing vs Response cropping at depth 100 for all trial dates. Saturated model. Summary of MCMC iterates from the full model for the contrast.
  23. Results from four dimensional model. Figure: Square root of variances and 95% credible intervals at depth 100 cm.
  24. Results from four dimensional model. Figure: Square root of variances and 95% credible intervals at depth 220 cm.
  25. Results from four dimensional model. Figure: Square root of unstructured variance: Days by Depth (contour graph).
  26. Results from four dimensional model. Figure: Square root of spatial variance: Days by Depth (contour graph).
  27. Timeseries modelling. Here we use the 840 (56 days × 15 depths) contrast estimates from Method 1. Let $Y_t$ (or the vector $Y$) represent the contrast estimate at time $t$ and depth $d$. Several models are fitted. The first is a regression model which assumes that errors are not autocorrelated:
     $$Y \sim N(X\beta, V), \qquad 1/V \sim \text{Gamma}(10^{-6}, 10^{-6}), \qquad (2)$$
     where $X$ is a design matrix of time-varying covariates, such as log(rainfall + 1) and interactions of year (as a factor) with sine and cosine terms with periods of a year and a half year.
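One way such a design-matrix row might be assembled is sketched below. The function `design_row`, the day-of-year input and the period of 365.25 days are assumptions made for illustration, and the year-by-harmonic interactions mentioned on the slide are left out for brevity.

```python
import math


def design_row(day, rainfall_mm, period=365.25):
    """One row of X: intercept, log(rainfall + 1), yearly and half-yearly harmonics."""
    angle = 2.0 * math.pi * day / period
    return [
        1.0,
        math.log(rainfall_mm + 1.0),
        math.sin(angle), math.cos(angle),              # period of one year
        math.sin(2.0 * angle), math.cos(2.0 * angle),  # period of half a year
    ]


row = design_row(day=0, rainfall_mm=0.0)
```

Adding 1 before taking the logarithm keeps days without rain well defined, since log(0 + 1) = 0.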
  28. Timeseries modelling. The local level state-space model (random walk) was fitted:
     $$Y_t = \mu_t + \nu_t, \quad \nu_t \sim N(0, V),$$
     $$\mu_t = \mu_{t-1} + \omega_t, \quad \omega_t \sim N(0, W),$$
     $$1/V \sim \text{Gamma}(10^{-6}, 10^{-6}), \quad 1/W \sim \text{Gamma}(10^{-6}, 10^{-6}). \qquad (3)$$
     An alternative set of priors for $V$ and $W$ was also used:
     $$1/V \sim \text{Gamma}(10^{-4}, 10^{-4}), \quad 1/W \sim \text{Gamma}(10^{-4}, 10^{-4}).$$
     In a further version of this model, t-distributions with 10 degrees of freedom were substituted for the normal distributions for the observation and state errors.
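To show how the local level model behaves, the sketch below runs the standard Kalman filter recursions for fixed, known $V$ and $W$. This is a swapped-in illustration, not the fitting method of the talk, where $V$ and $W$ have Gamma priors on their precisions and are estimated by MCMC. The function name and the diffuse starting values are assumptions.

```python
def local_level_filter(y, V, W, m0=0.0, C0=1e6):
    """Filtered means and variances of mu_t for known V and W."""
    m, C = m0, C0
    means, variances = [], []
    for obs in y:
        R = C + W              # variance of mu_t after the random-walk step
        K = R / (R + V)        # Kalman gain
        m = m + K * (obs - m)  # filtered mean of mu_t
        C = (1.0 - K) * R      # filtered variance of mu_t
        means.append(m)
        variances.append(C)
    return means, variances


# Hypothetical constant series: the filtered level should settle at 1.
means, variances = local_level_filter([1.0] * 50, V=1.0, W=0.1)
```

The ratio of $W$ to $V$ controls how quickly the estimated level follows the observations, which is why the priors on the two precisions matter for the fitted models.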
  29. Timeseries modelling. An alternative formulation (Lunn et al., 2000) of the random walk model using CAR neighbourhood models was also fitted. This permitted neighbours to be weighted, and allowed a correction for the unequal time intervals. Weighted random walk models of order 1 (RW1) and order 2 (RW2) were fitted:
     $$Y_t = \mu_t + \omega_t + \psi_t, \quad \omega_t \sim N(0, V),$$
     $$\psi_t \mid \psi_{t'},\ t \neq t' \ \sim\ N\!\left(\sum_{t' \in \partial t} \frac{w_{t'}\,\psi_{t'}}{w_+},\ \frac{W}{w_+}\right),$$
     $$\text{where } w_{t'} = 1/|t - t'| \text{ and } w_+ = \sum_{t' \in \partial t} w_{t'}, \qquad (4)$$
     with $V$, $W$ defined as in Equation 3. The weight used is the reciprocal of the distance between neighbours on the time scale.
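A small sketch of the weighted first-order conditional in Equation 4, assuming that the RW1 neighbours of a time point are simply the observation times immediately before and after it. The function name and the example times are hypothetical.

```python
def weighted_rw1_conditional(times, psi, t_index, W):
    """Mean and variance of psi_t given its neighbours in time (RW1)."""
    neighbours = [i for i in (t_index - 1, t_index + 1) if 0 <= i < len(times)]
    # Inverse time-interval weights: closer neighbours count for more.
    weights = [1.0 / abs(times[t_index] - times[i]) for i in neighbours]
    w_plus = sum(weights)
    mean = sum(w * psi[i] for w, i in zip(weights, neighbours)) / w_plus
    return mean, W / w_plus


# Unequally spaced times: the gaps are 1 and 2 time units.
mean, var = weighted_rw1_conditional([0.0, 1.0, 3.0], [0.0, 99.0, 3.0], t_index=1, W=0.3)
```

With two neighbours, inverse-interval weights make the conditional mean the linear interpolation between them, which is how the weighting corrects for unequal time intervals.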
  30. Some results for timeseries models.
     Table: Summary of DICs for Contrast 1 (Long fallowing vs Response cropping) at depth 140, Prior 2.

     | Model | pD | DIC |
     |---|---|---|
     | Regression | 30 | -377 |
     | AR(1) | 4 | -343 |
     | AR(1)(12) | -2 | -355 |
     | AR(2) | 5 | -342 |
     | RW(1) | 36 | -379 |
     | RW(1) (weighted) | 40 | -392 * |
     | RW(1) (t10 distribution) | 39 | -378 |
     | RW(2) | 23 | -373 |
     | RW(2) (weighted) | 43 | -395 * |
     | RW(1) (1768 time points) | 49 | -304 |
  31. Some results for timeseries models.
     Table: DICs for Long fallowing vs Response cropping: 1st order autoregressive models vs simple regression model.

     | Depth | AR1 with rainfall: pD | DIC | AR1: pD | DIC | AR1+5: pD | DIC | Regression(28): pD | DIC |
     |---|---|---|---|---|---|---|---|---|
     | 100 | 5 | -279 | 4 | -278 | 9 | -289 | 30 | -315 |
     | 120 | 5 | -301 | 4 | -303 | 9 | -306 | 30 | -344 |
     | 140 | 5 | -341 | 4 | -343 | 9 | -342 | 30 | -377 |
     | 160 | 5 | -386 | 4 | -386 | 9 | -386 | 30 | -425 |
     | 180 | 5 | -414 | 4 | -414 | 9 | -410 | 30 | -433 |
     | 200 | 5 | -449 | 4 | -450 | 9 | -444 | 30 | -449 |
     | 220 | 5 | -457 | 4 | -458 | 9 | -450 | 30 | -455 |

     - AR1 with rainfall. Covariate: log(rainfall+1)
     - AR1+5. Covariates: log(rainfall+1), sin(x), cos(x), sin(2x), cos(2x), x=date/2π
     - Regression(28). Covariates: x, x*x, x*x*x, year*(sin(x), cos(x), sin(2x), cos(2x))
  32. Some results for timeseries models.
     Table: DICs for Long fallowing vs Response cropping: random walk model comparisons, using Prior 2.

     | Depth | RW1: pD | DIC | RW1 (W): pD | DIC | RW2: pD | DIC | RW2 (W): pD | DIC |
     |---|---|---|---|---|---|---|---|---|
     | 100 | 47 | -342 | 47 | -346 * | 21 | -332 | 38 | -340 |
     | 120 | 39 | -348 | 42 | -360 * | 26 | -323 | 40 | -360 * |
     | 140 | 36 | -379 | 40 | -392 * | 23 | -373 | 43 | -395 * |
     | 160 | 34 | -413 | 38 | -424 * | 25 | -417 | 43 | -419 |
     | 180 | 32 | -433 | 36 | -434 | 25 | -439 * | 43 | -419 |
     | 200 | 28 | -457 * | 34 | -448 | 24 | -458 * | 42 | -424 |
     | 220 | 28 | -461 * | 35 | -452 | 24 | -463 * | 43 | -427 |

     (W): inverse time interval weights. * indicates the better models.
  33. Some results for timeseries models: RW1 model. Figure: Long fallowing vs Response cropping at depth 140 for all trial days. Random Walk (1).
  34. Conclusions
     - Repeated use of the daily model was a useful way of modelling the contrast. It allowed an unfettered evolution of the spatial and unstructured variances.
     - To permit the fitting of a full space-time model, the treatment parameters need to be reparametrised as a set of contrasts taking up all the treatment degrees of freedom. This would allow the evolution of the contrast over time within a full spatio-temporal model.
     - From this preliminary work, random walks for the spatial variation, for $\rho_t$, for the contrast and for all other time-varying parameters should give an adequate description of the data.
     - A subsample of the MCMC iterates for each contrast estimate can and should be used to model the evolution of the contrast and its variance over time.
  35. So what is new? We have introduced
     - the layered CAR model for three (spatial) dimensional data, in combination with complex regression models;
     - a complex time by space interaction model, through the repeated use of the layered CAR model, thereby providing a model where space and time effects are additive in neither the fixed nor the error components, and where the outcome of interest is a function of the response variable.
  36. Selected References
     - Banerjee, S., Carlin, B. P. and Gelfand, A. E.: 2004, Hierarchical modeling and analysis for spatial data, Monographs on statistics and applied probability, Chapman & Hall, Boca Raton, London, New York, Washington D.C.
     - Besag, J. E.: 1974, Spatial interaction and the statistical analysis of lattice systems (with discussion), J. R. Statist. Soc. B 36(2), 192–236.
     - Besag, J. E. and Mondal, D.: 2005, First-order intrinsic autoregressions and the de Wijs process, Biometrika 92(4), 909–920.
     - Besag, J., York, J. and Mollié, A.: 1991, Bayesian image restoration with applications in spatial statistics (with discussion), Annals of the Institute of Statistical Mathematics 43, 1–59.
     - Cressie, N. A. C.: 1993, Statistics for spatial data, Wiley series in probability and mathematical statistics: Applied probability and statistics, John Wiley, New York.
     - Gelfand, A. E. and Vounatsou, P.: 2003, Proper multivariate conditional autoregressive models for spatial data analysis, Biostatistics 4(1), 11–25.
  37. Selected References (continued)
     - Hrafnkelsson, B. and Cressie, N.: 2003, Hierarchical modeling of count data with application to nuclear fall-out, Environmental and Ecological Statistics 10, 179–200.
     - Lindgren, F., Rue, H. and Lindström, J.: 2010, An explicit link between Gaussian fields and Gaussian Markov random fields: The SPDE approach, Journal of the Royal Statistical Society Series B, to appear.
     - Lunn, D. J., Thomas, A., Best, N. and Spiegelhalter, D.: 2000, WinBUGS - a Bayesian modelling framework: Concepts, structure, and extensibility, Statistics and Computing 10(4), 325–337.
     - Ngo, L. and Wand, M.: 2004, Smoothing with mixed model software, Journal of Statistical Software 9, 1–56.
     - Rue, H. and Held, L.: 2005, Gaussian Markov random fields: Theory and Applications, Chapman & Hall/CRC, Boca Raton.
     - Rue, H. and Tjelmeland, H.: 2002, Fitting Gaussian Markov random fields to Gaussian fields, Scandinavian Journal of Statistics 29(1), 31–49.
     - Spiegelhalter, D. J., Best, N. G., Carlin, B. P. and van der Linde, A.: 2002, Bayesian measures of model complexity and fit, Journal of the Royal Statistical Society, Series B (Statistical Methodology) 64(4), 583–639.
  38. Thank you for listening. Questions?
  39. DICs, Priors and Fit. Priors for Equations 2-5.
     Table: Various priors used for the precisions of the timeseries models of Method 2.

     | Prior | Precision for observational error | Precision for random walk error* |
     |---|---|---|
     | Prior 1 | ∼ Gamma(.000001, .000001) | ∼ Gamma(.000001, .000001) |
     | Prior 2 | ∼ Gamma(.0001, .0001) | ∼ Gamma(.0001, .0001) |
     | Prior 3 | mean τ | ∼ Gamma(.000001, .000001) |
     | Prior 4 | total ∗ r | total ∗ (1 − r) |
     | Prior 5 | ∼ Gamma(.000001, .000001) | mean τ |

     total ∼ Gamma(a, b), r ∼ Beta(1, 1); a, b calculated via the method of moments from the mean and 95% CI for the posterior in Method 1.
  40. Priors for Equations 2-5 (continued).
     Table: Constants for priors 3-5 for the precisions of the timeseries models of Method 2.

     | Depth (cm) | Mean τ | a | b |
     |---|---|---|---|
     | 100 | 1395 | 6.934 | .004971 |
     | 120 | 1759 | 6.024 | .003425 |
     | 140 | 2241 | 12.413 | .005538 |
     | 160 | 3019 | 52.316 | .017327 |
     | 180 | 3226 | 87.249 | .027045 |
     | 200 | 3201 | 180.410 | .056354 |
     | 220 | 2175 | 82.412 | .037894 |
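The tabulated constants are consistent with a shape-rate Gamma(a, b), whose mean is a/b. The sketch below checks that ratio against the listed mean τ and shows a generic method-of-moments mapping from a mean and standard deviation to (a, b). How the standard deviation was derived from the 95% CI is not stated in the talk, so `gamma_from_moments` is an assumed, generic form rather than the authors' exact calculation.

```python
def gamma_from_moments(mean, sd):
    """Shape a and rate b of a Gamma with the given mean and standard deviation."""
    a = (mean / sd) ** 2
    b = mean / sd ** 2
    return a, b


# (depth in cm, mean tau, a, b) as listed in the table of constants.
constants = [
    (100, 1395, 6.934, 0.004971),
    (120, 1759, 6.024, 0.003425),
    (140, 2241, 12.413, 0.005538),
    (160, 3019, 52.316, 0.017327),
    (180, 3226, 87.249, 0.027045),
    (200, 3201, 180.410, 0.056354),
    (220, 2175, 82.412, 0.037894),
]

# Implied prior mean a / b at each depth, for comparison with mean tau.
implied_means = {depth: a / b for depth, _, a, b in constants}
```

A larger shape a at a given mean gives a tighter prior, so the priors at 160 to 220 cm are considerably more informative than those at 100 to 140 cm.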
  41. Model Comparisons and the DIC.
     Table: Summary of DICs for Contrast 1 (Long fallowing vs Response cropping) at depth 140.

     | Model | Prior 1: pD | Prior 1: DIC | Prior 2: pD | Prior 2: DIC |
     |---|---|---|---|---|
     | Regression | | | 30 | -377 |
     | AR(1) | 4 | -343 | 4 | -343 |
     | AR(1)(12) | -2 | -356 | -2 | -355 |
     | AR(2) | 4 | -343 | 5 | -342 |
     | RW(1) | 69 | -435 | 36 | -379 |
     | RW(1) (weighted) | 73 | -468 * | 40 | -392 * |
     | RW(1) (t10 distribution) | 73 | -450 | 39 | -378 |
     | RW(2) | 20 | -370 | 23 | -373 |
     | RW(2) (weighted) | 26 | -390 | 43 | -395 * |

     RW(1) (1768 time points): pD 49, DIC -304 (Prior 5).
  42. Model Comparisons and the DIC.
     Table: $R^2$, pD and DIC for the RW(1) weighted models using priors 3-5.

     | Depth | Prior 3: R² | pD | DIC | Prior 4: R² | pD | DIC | Prior 5: R² | pD | DIC |
     |---|---|---|---|---|---|---|---|---|---|
     | 100 | 33% | 13 | -255 | 80% | 36 | -258 | 99% | 100 | -411 |
     | 120 | 23% | 9 | -277 | 79% | 35 | -279 | 99% | 97 | -421 |
     | 140 | 12% | 6 | -299 | 80% | 35 | -271 | 99% | 94 | -433 |
     | 160 | 16% | 5 | -323 | 83% | 35 | -251 | 100% | 90 | -446 |
     | 180 | 19% | 5 | -332 | 85% | 34 | -246 | 99% | 89 | -448 |
     | 200 | 27% | 4 | -337 | 86% | 34 | -239 | 99% | 89 | -448 |
     | 220 | 18% | 3 | -318 | 82% | 34 | -225 | 99% | 94 | -434 |
