Predictive link between crop traits and eco-climatic description of the original collecting site. Presented at the Botany Seminar series (http://botanyseminars.blogspot.com) on 25 February 2011 at the Department of Agriculture and Ecology, Faculty of Life Sciences, Copenhagen University, Denmark. URL: http://botanyseminars.blogspot.com/2011/01/25-february-dag-endresen.html
Endresen, D.T.F. (2010). Predictive association between trait data and ecogeographic data for Nordic barley landraces. Crop Science 50 (6): 2418-2430. doi: 10.2135/cropsci2010.03.0174
Endresen, D.T.F., K. Street, M. Mackay, A. Bari, and E. De Pauw (2011). Predictive Association between Biotic Stress Traits and Eco-Geographic Data for Wheat and Barley Landraces. Crop Science 51 (5): 2036-2055. doi: 10.2135/cropsci2010.12.0717
Endresen, D.T.F., K. Street, M. Mackay, A. Bari, A. Amri, E. De Pauw, K. Nazari, and A. Yahyaoui (2012). Sources of Resistance to Stem Rust (Ug99) in Bread Wheat and Durum Wheat Identified Using Focused Identification of Germplasm Strategy (FIGS). Crop Science [Online first] doi: 10.2135/cropsci2011.08.0427; Published online 8 Dec. 2011.
Endresen, D.T.F. (2011). Utilization of Plant Genetic Resources: A Lifeboat to the Gene Pool [PhD Thesis]. Copenhagen University, Faculty of Life Sciences, Department of Agriculture and Ecology. Available at: http://goo.gl/pYa9x (PDF 37 MB). ISBN: 978-91-628-8268-6.
Botany Seminar on trait data mining using FIGS, Copenhagen (25 Feb 2011)
1. Predictive link between crop traits and ecoclimatic description of the original collecting site
2. TOPICS: The Nordic Gene Bank (1979); Nordic Genetic Resource Center (2008); Gap analysis: to complete gene bank collections for maximum genetic diversity; Trait mining with FIGS: predictive link between climate data and trait data; Case studies: morphological traits in Nordic barley, biotic stress traits in wheat and barley, blind prediction of stem rust (Ug99) in bread wheat landraces. Image: Wheat at Alnarp, June 2010.
3. Nordic Gene Bank (1979-2007) Seed containers through the years Seed store (with household freezers)
4. Nordic Genetic Resource Center (2008) Nordic Genetic Resource Center, photo: January 2011 Seed store (with household freezers) Seed drying room
6. Origin versus USE (seed requests). SESTO distribution and georeferenced accessions: red dots are the georeferenced collecting places; countries are colored by accessions DISTRIBUTED. Genebank material primarily originates from the Nordic region; seed requests come primarily from the same region.
7. Distributed network of genebank locations in the Nordic region (map locations including clonal archives)
10. The Svalbard Global Seed Vault (2008) is operated by NordGen. Data portal online at http://www.nordgen.org/sgsv. Photo: inside the vault on 27 February 2008, Ola Westengen, Johan Bäckman and Simon Jeppson.
11. Svalbard Seed Vault by country of origin. Status after three years of operation: February 2011
12. Gap analysis: Identify gaps in the gene bank collections to maximize the conserved genetic diversity.
14. Gap analysis to complement genebank collections. Objectives of gap analysis: Advise the planning of new collecting/gathering expeditions; identify relevant areas where the crop species is predicted to be present; focus on areas least well represented in the genebank collection (maximize diversity). See for example http://gisweb.ciat.cgiar.org/GapAnalysis/ for more information.
15. FIGS – Focused Identification of Germplasm Strategy Climate layers from the ICARDA ecoclimatic database (De Pauw, 2003)
16. Challenges for improved utilization of genetic resources for crop improvement: large gene bank collections; limited screening capacity.
17. A needle in a haystack. Scientists and plant breeders want a few hundred germplasm accessions to evaluate for a particular trait. How does the scientist select a small subset likely to have the useful trait? Slide modified from slides by Ken Street, ICARDA FIGS team.
18. Objectives of FIGS: Using climate data for prediction of crop traits BEFORE the field trials. Identification of landraces with a higher probability of holding an interesting trait property.
19. Assumption: The climate at the original source location, where the crop landrace was developed during long-term traditional cultivation, is correlated to the trait property. Aim: To build a computer model explaining the crop trait score from the climate data.
20. Climate effect during the cultivation process. Wild relatives are shaped by the environment. Primitive cultivated crops are shaped by local climate and humans. Traditional cultivated crops (landraces) are shaped by climate and humans. Modern cultivated crops are mostly shaped by humans (plant breeders). Perhaps future crops are shaped in the molecular laboratory…?
21. Predictive link between eco-geography and traits. It is possible that the human-mediated selection of landraces contributes to the link between ecogeography and traits. During traditional cultivation the farmer selects for, and introduces, germplasm that improves the suitability of the landrace to the local conditions.
22. Origin of FIGS: Michael Mackay (1986, 1990, 1995). Illustration by Mackay (1995).
23. Climate data: WorldClim. The climate data can be extracted from the WorldClim dataset, http://www.worldclim.org/ (Hijmans et al., 2005). Data from weather stations worldwide are combined into a continuous surface layer. Climate data for each landrace are extracted from this surface layer at its collecting site. Precipitation: 20 590 stations. Temperature: 7 280 stations.
24. Climate data. Layers used in this study: precipitation (rainfall), maximum temperatures, minimum temperatures. Some of the other layers available: potential evapotranspiration (water loss), agro-climatic zone (UNESCO classification), soil classification (FAO Soil map), aridity (dryness). (Mean values for month and year.) Eddy De Pauw (ICARDA, 2008)
25. Data for the simulation model. Training set: for the initial calibration or training step. Calibration set: further calibration, tuning step. Often cross-validation on the training set is used to reduce the consumption of raw data. Test set: for the model validation or goodness-of-fit testing; external data, not used in the model calibration.
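The three data sets described on this slide can be sketched as a simple random split. This is only an illustration under stated assumptions: the accession identifiers, the function name `three_way_split`, the fixed seed, and the choice of three roughly equal parts are all hypothetical and not taken from the studies presented.

```python
import random


def three_way_split(items, seed=0):
    """Shuffle items and split them into three roughly equal parts.

    Returns (training, calibration, test). The test part is meant to be
    kept aside and never used while the model is calibrated.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    first_cut = n // 3
    second_cut = 2 * n // 3
    return (
        shuffled[:first_cut],
        shuffled[first_cut:second_cut],
        shuffled[second_cut:],
    )


# Hypothetical accession identifiers, for illustration only.
accessions = [f"ACC{i:03d}" for i in range(30)]
training, calibration, test = three_way_split(accessions)
print(len(training), len(calibration), len(test))
```

When raw data are scarce, the calibration part can be dropped and replaced by cross-validation on the training part, as the slide notes.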
26. Morphological traits in Nordic barley landraces. Field observations by Agnese Kolodinska Brantestam (2002-2003). Multi-way N-PLS data analysis, Dag Endresen (2009-2010). Locations: Priekuli (LVA), Bjørke (NOR), Landskrona (SWE).
27. Multi-way data structure (for N-PLS). [Figure: three-way array with 14 samples (mode 1) × 12 months, Jan to Dec (mode 2) × 3 climate variables (mode 3: min. temperature, max. temperature, precipitation); unfolded, this gives 36 variables per sample.]
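The relation between the three-way array and the 36-variable table on this slide can be sketched as follows. The values are synthetic placeholders, the variable names are hypothetical, and the column order (all months of one climate variable, then the next) is an assumption read from the slide layout. The sketch only shows the data layout; it does not implement N-PLS.

```python
N_SAMPLES, N_MONTHS, N_VARIABLES = 14, 12, 3
VARIABLE_NAMES = ["tmin", "tmax", "prec"]  # hypothetical labels for mode 3

# cube[i][j][k]: sample i (mode 1), month j (mode 2), climate variable k (mode 3).
# Synthetic placeholder values that encode their own indices.
cube = [
    [[i * 1000 + k * 100 + j for k in range(N_VARIABLES)] for j in range(N_MONTHS)]
    for i in range(N_SAMPLES)
]

# Unfold to a two-way table: one row per sample, 3 x 12 = 36 columns,
# ordered as tmin Jan..Dec, tmax Jan..Dec, prec Jan..Dec.
unfolded = [
    [cube[i][j][k] for k in range(N_VARIABLES) for j in range(N_MONTHS)]
    for i in range(N_SAMPLES)
]
column_names = [
    f"{VARIABLE_NAMES[k]}_{j + 1:02d}"
    for k in range(N_VARIABLES)
    for j in range(N_MONTHS)
]

print(len(unfolded), len(unfolded[0]), column_names[0], column_names[-1])
```

A multi-way method works on the cube directly and so keeps the information that, for example, column `tmin_03` and column `tmax_03` describe the same month.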
28. Multi-way N-PLS results: Nordic barley landraces. Endresen (2010). Predictive association between trait data and ecogeographic data for Nordic barley landraces. Crop Science 50: 2418-2430. DOI: 10.2135/cropsci2010.03.0174
29. Stem rust in wheat landraces. Green dots indicate collecting sites for resistant wheat landraces and red dots collecting sites for susceptible landraces. USDA GRIN, trait data online: http://www.ars-grin.gov/cgi-bin/npgs/html/desc.pl?65049
30. SIMCA analysis (PCA model for each class). Example from the stem rust set. [Figure: classes modeled with 1, 2, and 3 principal components, plotted on the axes principal component 1, 2, and 3.] Illustration modified from Wise et al., 2006:201 (PLS Toolbox software manual)
31. Classification performance. Positive predictive value (PPV): PPV = True positives / (True positives + False positives); classification performance for the identification of resistant samples (positives). Positive diagnostic likelihood ratio (LR+): LR+ = sensitivity / (1 - specificity); less sensitive to prevalence than PPV.
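The two measures on this slide can be computed from a confusion matrix as in the minimal sketch below. The function names and the example counts are hypothetical and chosen only for illustration; they are not results from the studies. Sensitivity and specificity use their standard definitions, which the slide does not spell out.

```python
def positive_predictive_value(tp, fp):
    """PPV = true positives / (true positives + false positives)."""
    return tp / (tp + fp)


def positive_likelihood_ratio(tp, fp, tn, fn):
    """LR+ = sensitivity / (1 - specificity)."""
    sensitivity = tp / (tp + fn)  # share of resistant samples that were found
    specificity = tn / (tn + fp)  # share of susceptible samples correctly rejected
    return sensitivity / (1 - specificity)


# Hypothetical counts: "positive" means the accession is predicted resistant.
tp, fp, tn, fn = 30, 70, 830, 70
print(round(positive_predictive_value(tp, fp), 3))
print(round(positive_likelihood_ratio(tp, fp, tn, fn), 3))
```

Both functions divide by zero in degenerate cases (no predicted positives, or no false positives); a real analysis would need to handle these explicitly.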
32. Multivariate SIMCA results: Stem rust in wheat. Endresen, D.T.F., K. Street, M. Mackay, A. Bari, E. De Pauw (submitted). Predictive association between biotic stress traits and ecogeographic data for wheat and barley landraces. Crop Science, conditionally accepted, revision underway.
33. Net blotch in barley landraces. Green dots indicate collecting sites for resistant barley landraces and red dots collecting sites for susceptible landraces. USDA GRIN, trait data online: http://www.ars-grin.gov/cgi-bin/npgs/html/desc.pl?1041
34. Multivariate SIMCA results: Net blotch in barley. Endresen, D.T.F., K. Street, M. Mackay, A. Bari, E. De Pauw (submitted). Predictive association between biotic stress traits and ecogeographic data for wheat and barley landraces. Crop Science, conditionally accepted, revision underway.
35. Multivariate SIMCA results: Stem rust (Ug99) in wheat. Ug99 set with 4563 wheat landraces screened for Ug99 in Yemen in 2007; 10.2 % of the accessions were resistant. The true trait scores for 20% of the accessions (825 samples) were revealed. We used trait mining with SIMCA to select 500 accessions more likely to be resistant from 3728 accessions whose true scores were hidden from the person making the analysis. The FIGS set was observed to hold 25.8 % resistant samples, about 2.5 times more than expected by chance. Endresen, D.T.F., K. Street, M. Mackay, A. Bari, E. De Pauw (draft manuscript). Sources of resistance in wheat to stem rust (Ug99) identified using Focused Identification of Germplasm Strategy (FIGS).
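The "2.5 times higher than expected by chance" figure follows from the two percentages on this slide. The short calculation below uses only those reported numbers; it assumes that the 10.2 % resistance rate of the full set is the appropriate chance baseline for a random subset of 500, and the variable names are illustrative.

```python
baseline_rate = 0.102  # share of resistant accessions in the Ug99 set (from the slide)
figs_rate = 0.258      # share of resistant accessions in the FIGS subset (from the slide)
subset_size = 500      # number of accessions selected with SIMCA (from the slide)

# Ratio of the observed hit rate to the rate expected from random selection.
gain_over_chance = figs_rate / baseline_rate

# Resistant accessions expected in a random subset versus implied for the FIGS subset.
expected_by_chance = baseline_rate * subset_size
implied_in_figs_subset = figs_rate * subset_size

print(round(gain_over_chance, 2))
print(round(expected_by_chance), round(implied_in_figs_subset))
```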
36. PhD thesis: A lifeboat to the gene pool. Defense planned for 31 March, KU LIFE campus Frederiksberg. Auditorium: 3-13 / A2-70.3
37. Thanks for listening! Presented for the Botany Seminars, 25 February 2011, Department of Agriculture and Ecology, Faculty of Life Sciences, Copenhagen University. Dag Terje Filip Endresen, dag.endresen@nordgen.org
Editor's notes
Image: Wheat at Alnarp June 2010 by Dag Endresen, https://picasaweb.google.com/dag.endresen/GermplasmCrops#5497796034327520578
NOTE that the countries are colored by the distribution of accessions, while the red dots are the georeferenced collecting places. Dynamic maps linked live to SESTO, created with UMN MapServer (Dag Endresen, 2009)
Statsbygg (2008). Svalbard Global Seed Vault, Longyearbyen, Svalbard, new construction. Ferdigmelding nr 671/2008, Project no 11098. Statsbygg, Oslo, Norway. 28 p. Available at http://www.statsbygg.no/FilSystem/files/ferdigmeldinger/671_svalbard_frohvelv.pdf, verified 18 Jan 2011.
http://www.nordgen.org/sgsv/
GBIF-mediated occurrence data can be integrated with other applications such as openModeller, which generates a probability distribution using the Envelope Score algorithm.
GBIF-MAPA Mapping and Analysis Portal Application [http://gbifmapa.austmus.gov.au/mapa/]. The survey gap analysis (SGA) tool helps you design a biodiversity survey that will best complement the existing survey effort by identifying those areas least well surveyed in terms of environmental conditions. Photo: http://www.flickr.com/photos/dag_endresen/4221301525/
Illustration of trait mining with ecoclimatic GIS layers. The GIS layers included in the illustration are averages from the ICARDA ecoclimatic database: annual temperature (front), annual precipitation (middle), and winter precipitation (back) (De Pauw, 2003)
Photo: Dag Endresen. Field of sugar beet (Beta vulgaris L.) at Alnarp (June 2005). URL: http://www.flickr.com/photos/dag_endresen/4189812241/
Modern agriculture uses advanced plant varieties based on the most productive genetics. The original landraces and wild forms produce lower yields, but their greater genetic variation contains a higher diversity in, for example, resistance to disease. High-yielding modern crops are therefore vulnerable when a new disease arises.
Illustration traditional cattle farming: http://commons.wikimedia.org/wiki/File:Traditional_farming_Guinea.jpg (USAID, Public Domain)
The WorldClim dataset is described in: Hijmans, R.J., S.E. Cameron, J.L. Parra, P.G. Jones and A. Jarvis, 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25: 1965-1978. NOAA GHCN-Monthly version 2: http://www.ncdc.noaa.gov/oa/climate/ghcn-monthly/index.php. Weather stations, precipitation: 20 590; temperature: 7 280
We often divide the data for a simulation model project into three equal parts: one set for initial model calibration or training; one set for further calibration or fine tuning; and one test set for validation of the model.
GRIN database (USDA-ARS, National Plant Germplasm System, Germplasm Resources Information Network, online http://www.ars-grin.gov/npgs) USDA GRIN, trait data online: http://www.ars-grin.gov/cgi-bin/npgs/html/desc.pl?65049
GRIN database (USDA-ARS, National Plant Germplasm System, Germplasm Resources Information Network, online http://www.ars-grin.gov/npgs) USDA GRIN, trait data online: http://www.ars-grin.gov/cgi-bin/npgs/html/desc.pl?1041 Dr Harold Bockelman extracted the trait data (C&E)