David Clifford: Uncertainty of soil attribute estimates based on disaggregation
1. Uncertainty in the measurement and mapping of soil attributes
Estimation and reporting of uncertainty
Presentation by Dr David Clifford (CSIRO)
2. Uncertainty Quantification
• What it is and how it arises
• Why we need to care about it
• Importance / relevance for TERN
• Incorporation as part of TERN infrastructure
Goal: How can we understand, calculate and communicate the uncertainty of our products?
3. Sources of uncertainty
Positional errors
Field vs lab measurement
Raw data – gold standard or derived quantity
Conversion to standard depths
Models:
– covariate layers
– transformation of data
– structure
– estimation and prediction
– multiple outputs & joint uncertainty
4. Reality Check
Despite our best efforts, we don’t know everything
• We know a lot about some soil processes
• We know a lot about some regions of Australia
• Our knowledge is not uniform
• Our uncertainty estimates should reflect this
5. The need for quantifying uncertainty
Avoiding the illusion of certainty
Improve the quality of inference based on TERN products
Help highlight gaps in our knowledge
‘Error Budget’ - relative contributions of sources
Decide where / how to spend future funding
6. Reporting of uncertainty
Previously reported via a scale – an implicit map-wide approach
Map our estimates of uncertainty
Different products have different uncertainties
Global Soil Map workshop on uncertainty – USDA, Nebraska (Aug 2012)
12. Merging TERN Products
Approaches for merging:
• Picking the most accurate approach
• Bayesian Model Averaging
• Equal weighting of the two predictions
• Variance weighting
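Variance weighting, the last option above, can be sketched as follows. All numbers and names are hypothetical, chosen only to illustrate the idea: each prediction is weighted by the inverse of its variance, so the more accurate product dominates the merged estimate.

```python
import numpy as np

def merge_predictions(pred_a, var_a, pred_b, var_b):
    """Inverse-variance weighting of two independent predictions.

    Weights are proportional to 1/variance, so the more accurate
    prediction dominates. The merged variance is always smaller
    than either input variance.
    """
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    merged = (w_a * pred_a + w_b * pred_b) / (w_a + w_b)
    merged_var = 1.0 / (w_a + w_b)
    return merged, merged_var

# Two hypothetical soil-carbon predictions (%) at one location:
# digital soil mapping vs disaggregation, with their variances
merged, merged_var = merge_predictions(1.8, 0.04, 2.2, 0.16)
print(merged, merged_var)
```

With these illustrative inputs the merged value (1.88) sits closer to the more precise prediction, and the merged variance (0.032) is below both input variances, which is the point of the approach.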
15. Conclusions
Uncertainty is important to quantify
The project raises new challenges in modelling and understanding uncertainty
Drawing on diverse data resources
Bringing different components of the project together to form the best soil predictions
Strategically improve future products
Speaker notes
Presenting on behalf of a number of people on uncertainty. Others involved include the members of the broader TERN soils team and especially Dr M. Dobbie.
Up-front message on what this talk will cover – I’ll talk about what uncertainty quantification (UQ) is, where uncertainty comes from, why we need to care about it, and how TERN is addressing UQ in this project.
Before talking about why uncertainty is a concern to our project and how we are dealing with it, talk about where uncertainty enters into our soil attribute products – the short answer is at each stage of the process. We want a comprehensive accounting of this uncertainty, coupled with a pragmatic approach for handling it: which sources should be tracked, which can be ignored, which are more important, etc.
Highlight how we are learning about the gaps in our knowledge, and that tracking and accounting for the uncertainty highlights some of these gaps. Don’t want to start off on a downer saying we haven’t a clue – so get the pitch for this slide right.
Why do we need to quantify uncertainty? Introduce an example of someone wanting to run APSIM for a paddock in a particular location. They have measurements of all the key parameters there but lack information on soil moisture. The TERN products we are creating – either via digital soil mapping or disaggregation – will provide such an estimate, and will in fact give a distribution for what we think the value is at that point. So in places where we are very certain about moisture content, APSIM can be run with a narrow range of values. For places where we are less certain, we would need to consider a broader range of productivity simulations to capture the range of possible scenarios. The TERN infrastructure is for the first time providing us with the capability to deliver this kind of explicit quantification of uncertainty, which will be available to the broader research community and which will be updateable as more soils data is collected in the future.
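The APSIM example can be illustrated with a toy Monte Carlo sketch. Nothing here is APSIM: the response function, the moisture distribution, and every number are invented stand-ins showing how a distribution for an input (rather than a point value) propagates into a range of simulated outcomes.

```python
import numpy as np

rng = np.random.default_rng(2012)

def crop_model(soil_moisture):
    # Stand-in for an APSIM run: a made-up linear yield response
    # to volumetric soil moisture (t/ha; purely illustrative)
    return 3.0 + 4.0 * soil_moisture

# A soil product delivers a distribution for moisture at this paddock,
# not a single value: here a hypothetical mean 0.25, sd 0.05
moisture_draws = rng.normal(loc=0.25, scale=0.05, size=2000)

# Run the model over the draws; the spread of outputs reflects
# the uncertainty in the input
yields = crop_model(moisture_draws)
low, high = np.percentile(yields, [5, 95])
```

Where the moisture estimate is precise, `low` and `high` bracket a narrow band of productivity; where it is uncertain, the band widens and more scenarios must be considered.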
TERN is funding new science on the understanding of uncertainty. Through our involvement with the Global Soil Map community we have helped define standards for the reporting of uncertainty – from reporting variances or confidence intervals for particular properties at specific locations and depths, through to summarising how properties co-vary at the same locations, or how one property varies across space or down through the soil depth profile.
If time permits – talk about a non-technical yet concrete example: data imputation. Because we can’t have a talk about stats without some plots of data – we have seen how uncertainty comes into play with the disaggregation work by Nathan. Here is an example of how we can add to the work RVR is doing by bringing in additional data through imputation, to take advantage of a rich data source for the central part of Australia – an area often missed by our other sampling efforts.
Archived samples at two depths. N, P and K values imputed by NMR Spectroscopy – can we impute what values are present at intermediate depths. Drawing on rich data in ACLEP and ASRIS, together with work from Sydney Uni on depth splines we can do this – and we can track uncertainty through the process too.
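The idea of imputing an intermediate depth while tracking uncertainty can be sketched in miniature. The real work uses the Sydney Uni depth splines; here a simple linear interpolation stands in for the spline, and all depths, values, and standard deviations are hypothetical.

```python
import numpy as np

# Measured N (%) at two archived sample depths (hypothetical values)
depths = np.array([10.0, 60.0])   # cm
n_vals = np.array([0.20, 0.08])   # imputed N at each depth
n_sds  = np.array([0.02, 0.03])   # imputation uncertainty (sd)

target = 30.0  # intermediate depth (cm) we want to impute

# Linear interpolation stands in for the depth-spline fit
w = (depths[1] - target) / (depths[1] - depths[0])
n_hat = w * n_vals[0] + (1 - w) * n_vals[1]

# Propagate variance through the linear combination,
# assuming the two depth measurements are independent
n_var = w**2 * n_sds[0]**2 + (1 - w)**2 * n_sds[1]**2
```

The key point survives the simplification: the imputed value comes with its own variance, so uncertainty is tracked through the process rather than discarded.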
The approach to take depends on the origin of the products. If you have one set of data and two models, then one idea is to pick the best – but across these large spatial regions it’s not likely that one approach will consistently be best, and anyway, if the second-best is close you would be wise to take those values into consideration. That’s where Bayesian Model Averaging comes in: averaging across the forms of the models, using weights based on how well each model fits the data. When you have independent data sources, then you also need to merge the predictions. Use equal weights if you have equal faith in both approaches. Better yet, weight them by the variance associated with the predictions, placing more weight on the more accurate prediction. You end up with a merged prediction that has lower variance.
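The "weights based on how well each model fits the data" can be sketched with one common approximation to Bayesian Model Averaging weights, based on BIC. The log-likelihoods, parameter counts, and sample size below are hypothetical, chosen only to show the mechanics.

```python
import numpy as np

# Hypothetical fits of two competing models to the same data
log_lik  = np.array([-120.3, -121.8])  # maximised log-likelihoods
n_params = np.array([4, 6])            # parameters per model
n_obs = 150                            # observations

# BIC penalises fit by model complexity
bic = n_params * np.log(n_obs) - 2.0 * log_lik

# Approximate BMA weights: exp(-ΔBIC/2), normalised to sum to 1
delta = bic - bic.min()
weights = np.exp(-0.5 * delta)
weights /= weights.sum()
```

With these made-up numbers nearly all the weight falls on the first model, but when two models fit comparably well the weights spread out, and both predictions contribute to the average, as the note suggests.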