Boosting Data Science in Geochemistry: We Need Global Geochemical Data Standards and Networking!

Presentation at AGU Fall Meeting 2018: Large-scale, global geochemical data syntheses like EarthChem and GEOROC have, for nearly two decades, inspired and made possible a vast range of scientific studies and new discoveries, facilitating the analysis and mining of geochemical data and creating new paradigms in geochemical data analysis such as statistical geochemistry. These syntheses provide easy access to fully integrated compilations of thousands of datasets (‘data fusion’) with millions of geochemical measurements that are accompanied by comprehensive and harmonized metadata for context and provenance to search, filter, sort, and evaluate the data.
The syntheses have been assembled and maintained through manual labor by data managers, who extract data and metadata from the text, tables, and supplements of publications for inclusion in the databases. This is a time-consuming task because of the multitude of data formats, units, normalizations, vocabularies, etc., i.e., the lack of best practices for geochemical data reporting. In order to support and advance future science endeavors that rely on access to and analysis of large volumes of geochemical data, we need to develop and implement global standards that make geochemical data not only FAIR (Findable, Accessible, Interoperable, Reusable) but also ready for data fusion. As more geochemical data systems emerge at national, programmatic, and subdomain levels in response to Open Access policies and science needs, standard protocols for exchanging geochemical data among these systems will need to be developed, implemented, and governed.
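
To make the harmonization burden concrete, here is a minimal Python sketch of one small piece of it: bringing concentrations reported in different units onto a common scale. The records, sample names, and the `to_ppm` helper are hypothetical and are not taken from EarthChem or GEOROC; only the conversion factors (1 wt% = 10,000 ppm, 1 ppb = 0.001 ppm, all by mass) are standard.

```python
# Conversion factors to parts per million by mass (ppm).
# 1 wt% = 10,000 ppm; 1 ppb = 0.001 ppm.
TO_PPM = {"wt%": 1.0e4, "ppm": 1.0, "ppb": 1.0e-3}


def to_ppm(value, unit):
    """Convert a mass concentration to ppm; reject units we do not know."""
    try:
        return value * TO_PPM[unit]
    except KeyError:
        raise ValueError(f"unsupported unit: {unit!r}") from None


# Hypothetical records, as they might be transcribed from three publications.
records = [
    {"sample": "A-1", "analyte": "Sr", "value": 0.035, "unit": "wt%"},
    {"sample": "B-7", "analyte": "Sr", "value": 410.0, "unit": "ppm"},
    {"sample": "C-3", "analyte": "Sr", "value": 295000.0, "unit": "ppb"},
]

# One harmonized table with a single, explicit unit.
harmonized = [
    {
        "sample": r["sample"],
        "analyte": r["analyte"],
        "value_ppm": to_ppm(r["value"], r["unit"]),
    }
    for r in records
]

for row in harmonized:
    print(row)
```

Units are only one of the dimensions listed above; normalizations and vocabularies need comparable mappings, which is why an agreed reporting standard would remove much of this work at the source.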

Critical is the alignment with existing standards such as the Semantic Sensor Network (SSN) ontology, a recent joint W3C and OGC standard that standardizes description of sensors, observation, sampling, and actuation, with sufficient flexibility to allow details of these elements to be defined in different domains. New initiatives within the International Council for Science and CODATA are working towards coordinating the International Science Unions to identify and endorse the more authoritative standards (including vocabularies and ontologies). These initiatives present a timely opportunity for geochemical data to ensure that they are born ‘connected’ within and across disciplines.
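
As a rough illustration of what alignment with SSN could look like, the following Python sketch describes a single geochemical measurement as a JSON-LD-style dictionary using terms from SOSA, the core module of the SSN ontology. The `ex:` identifiers, the values, and the `REQUIRED` tuple are assumptions made for this example; they are not part of any EarthChem or SSN specification, and a real profile would also have to carry units and uncertainties (for instance through a structured `sosa:hasResult` node).

```python
import json

# JSON-LD-style description of one measurement using SOSA terms.
# All "ex:" identifiers and values below are made up for illustration.
observation = {
    "@context": {
        "sosa": "http://www.w3.org/ns/sosa/",
        "ex": "http://example.org/geochem/",
    },
    "@id": "ex:observation/1",
    "@type": "sosa:Observation",
    "sosa:hasFeatureOfInterest": {"@id": "ex:sample/A-1"},
    "sosa:observedProperty": {"@id": "ex:property/Sr_mass_fraction"},
    "sosa:usedProcedure": {"@id": "ex:procedure/ICP-MS"},
    "sosa:hasSimpleResult": 350.0,
    "sosa:resultTime": "2018-12-11T00:00:00Z",
}

# A minimal completeness check chosen for this sketch (not an SSN rule).
REQUIRED = (
    "sosa:hasFeatureOfInterest",
    "sosa:observedProperty",
    "sosa:hasSimpleResult",
)
missing = [term for term in REQUIRED if term not in observation]

print(json.dumps(observation, indent=2))
print("missing terms:", missing)
```

The point of the design is that sample, property, and procedure are identified by resolvable identifiers rather than free text, so the domain-specific detail can live in geochemical vocabularies while the overall structure stays shared across disciplines.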

Published in: Data & Analytics
  1. BOOSTING DATA SCIENCE IN GEOCHEMISTRY: We Need Global Geochemical Data Standards and Networking!
     Kerstin Lehnert, Lamont-Doherty Earth Observatory, Columbia University, USA
     Lesley A Wyborn, Australian National University, Australia
     Simon J D Cox, CSIRO Land and Water, Australia
     Jens F Klump, CSIRO Earth Science Resource Engineering, Australia
     Brent McInnes, Curtin University, Australia
  2. Data Science is Happening in Geochemistry (and Mineralogy, Petrology, etc.)
     Goldschmidt 2018 Workshop “Data Science in Geochemistry”
     V21A-08: Boosting Data Science in Geochemistry: We Need Global Geochemical Data Standards and Networking!
  3. Just Reflecting on this Session ...
     ■ How much work went into assembling the data to do the data-driven research in each of the talks?
     ■ What standards were followed to compile the data? What information about uncertainties or analytical procedure was included, what terminology was used?
     ■ Can I integrate the data compiled for talk A with the one from talk B?
     ■ Can we use the tools presented in talk X with the data from talk Y?
  4. Obstacles for Data Science
     ■ Surveys in recent years show that data scientists still spend 75-80% of their time ‘data wrangling’.
       • RDA EU survey 2013 (75%)
       • Brodie 2015 (80%)
       • CrowdFlower 2017 (80%)
     Source: CrowdFlower
     Did you?
  5. Example: Data Synthesis for DECADE
     ■ 15 scientists working for 5 days
     ■ Major progress was only made with the compilation of melt inclusion geochemistry because data were discoverable in PetDB and GEOROC.
     ■ Another 2 months of effort by the EarthChem data manager were required to format & integrate data from different databases and unpublished data.
  6. Urgency for a Geochemical Data Standard
     ■ We need to be able to share & integrate data globally from multiple databases, each with its own schema.
     ■ We need to integrate & link data across disciplines (transdisciplinary).
     ■ We need to ensure compliance with FAIR data principles.
     ■ We need it to
       – Be more comprehensive with respect to data documentation,
       – Be aligned with modern standards, e.g. RDF,
       – Use, where possible, internationally endorsed vocabularies.
     ■ Above all, we need to have a more formal approval and governance.
       – We need to think of a standard specifically for both technical and 'social' reasons.
  7. A Never-Ending Story?
     IGC 2008
  8. Can We “Standardize” Geochemical Data?
     “I believe that our failure to unite our voices as geochemists has a simple origin – it is the complexity of our subject.”
  9. We Made Some Progress
     ■ Editors Roundtable recommendations with geochemical journals and databases
     ■ EarthChem XML schema
     ■ Rise of the IGSN
     Goldstein et al. 2014, published in the EarthChem Library, doi:10.1594/IEDA/100426
  10. EarthChem Data Templates
  11. EarthChemXML
      • Developed for the EarthChem Portal (ECP) in 2006
      • Locally developed XML schema for data exchange that partner data systems use to encode their database content for inclusion in the ECP database.
      • Not comprehensive with respect to metadata; uses EarthChem vocabularies (so does not align with broader community vocabularies)
      • XML format is voluminous, especially for databases with hundreds of thousands of records.
  12. EarthChem Portal
      • 22,074 publications
      • 1,054,738 samples
      • 30,059,995 analytical values
      Global Federation of Geochemical Databases:
      • PetDB
      • SedDB
      • GEOROC (Germany)
      • USGS
      • MetPetDB
      • GANSEKI (Japan)
      • Data exchange protocol: EarthChemXML
      • APIs & web services (WMS, WFS)
      • Interoperability with modeling tools
      More formal & community governed standards needed for FAIR
  13. Interoperable EarthChem Data
      DECADE Portal (beta) http://decade.iedadata.org
  18. Proliferation of Geochemical Databases
      ■ International
      ■ Science programs
      ■ Thematic
      sponsored by the State Key Lab of the Geological Processes and Mineral Resources in the China University of Geosciences
  19. ‘Long tail’ communities:
      • Analogue modelling
      • Rock physics/mechanics
      • Paleomagnetics
      • Geochemistry
      Slide contributed by Kirsten Elger, GFZ Potsdam
  20. 27 analytical labs, 4 countries:
      • Spain: 11 (41%)
      • Netherlands: 1 (4%)
      • Portugal: 3 (11%)
      • Italy: 12 (44%)
      Barcelona Workshop (Nov 2018):
      • Agreement to use EarthChem Library templates for data publications via GFZ Data Services
      • Interest in collaboration for the development of global standards for geochemical data
      Slide contributed by Kirsten Elger, GFZ Potsdam
  21. Data Standards in Geochemistry are no longer optional
      ■ Publishers & funders are demanding FAIR data.
      ■ In order to do Data Science, we need to have a global network of geochemical data that can be accessed in a standardized format.
      ■ No one can do it alone – no one organization, no one group, no one country has the required resources or expertise.
      We need to build a global geochemistry data platform together!
      “We must, indeed, all hang together or, most assuredly, we shall all hang separately.” Benjamin Franklin
  22. AGU Town Hall “Building a Global Network of Geochemical Data”
      Tuesday, Dec 11, 6:15-7:00pm
      Marriott Marquis, Independence E
      Panel:
      • Roberta Rudnick (President, Geochemical Society)
      • Catherine Chauvel (Editor, Chemical Geology)
      • Maria Uhle (NSF Program Director for International Activities)
      • Lesley Wyborn, ANU
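
The EarthChemXML slide notes that XML is voluminous for large databases. The Python sketch below gives a feel for that overhead by serializing the same 1,000 synthetic records once as XML and once as CSV and comparing the byte counts. The element and column names are invented for this illustration and are not taken from the EarthChemXML schema, so the resulting ratio only indicates the general effect of repeating tag names per value.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Synthetic records; field names are illustrative only and are NOT
# the element names of the real EarthChemXML schema.
FIELDS = ["sample", "analyte", "value", "unit"]
records = [
    {"sample": f"S-{i}", "analyte": "Sr", "value": str(100 + i), "unit": "ppm"}
    for i in range(1000)
]


def as_xml(rows):
    """Serialize rows with one element per record and per field."""
    root = ET.Element("records")
    for row in rows:
        rec = ET.SubElement(root, "record")
        for name in FIELDS:
            ET.SubElement(rec, name).text = row[name]
    return ET.tostring(root, encoding="unicode")


def as_csv(rows):
    """Serialize rows as CSV with a single header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


xml_size = len(as_xml(records).encode("utf-8"))
csv_size = len(as_csv(records).encode("utf-8"))
print(f"XML: {xml_size} bytes, CSV: {csv_size} bytes, "
      f"ratio: {xml_size / csv_size:.1f}x")
```

Size is only one consideration: a tabular format is compact but carries less structure for metadata, which is part of why the talk argues for a formally governed exchange standard instead of another locally developed format.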
