Daniel Garijo, Oscar Corcho, Khalid Belhajjame,
Lourdes Verdes-Montenegro, Julián Garrido, Raúl
Palma, Cezary Mazurek and Kristina Hettne
Ontology Engineering Group (Universidad Politécnica de Madrid)
University Paris-Dauphine
AMIGA (Instituto de Astrofísica de Andalucía)
Poznan Supercomputing and Networking Center
LUMC
dgarijo@fi.upm.es
Warsaw, May 28th 2015
Is preserving data enough?
Towards the preservation of
scientific methods
Where does data come from? Scientific workflows
Benefits:
•Sharing and reusing previous work
•Time savings: re-execution of old experiments with different parameters.
•Teaching: new students can learn existing methods in the lab
•Design for modularity, so others can reuse
•Design for standardization, reduction of heterogeneity
•Debugging of executions
•Paper writing, linking execution pipelines to publications.
•Reproducibility.
•Etc.
[Figure: Lab book, Digital log, Workflow, Experiment]
How do we preserve workflows?
Workflow repositories are great! But:
•Manual annotation and documentation
•Workflow conservation plan?
•No clear link between data and method
•How to reproduce a workflow?
Workflows keep breaking!
•Zhao et al: Why Workflows Break - Understanding and Combating Decay in
Taverna Workflows. >90 workflows analyzed
•Third party resources not available/accessible
•Missing example data
•Lack of documentation
•Incomplete metadata.
Do I have to document
everything again?? Didn’t I just
write a paper?
Our solution: Data + method = Context - Research Object
Aggregation of resources that bundles together the contents of a research work
OAI-ORE + PROV + OA
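The idea of a Research Object as an aggregation of resources can be illustrated with a minimal sketch. It is not the Research Object model itself, nor its OAI-ORE, PROV or OA vocabularies: the class names, the `kind` labels and the file names below are all hypothetical and serve only to show data and method bundled together and linked through an annotation.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    uri: str   # where the resource lives (hypothetical file names below)
    kind: str  # hypothetical label: "data", "workflow", "document", ...


@dataclass
class ResearchObject:
    title: str
    resources: list = field(default_factory=list)
    annotations: list = field(default_factory=list)  # (target_uri, note) pairs

    def aggregate(self, uri, kind):
        """Add a resource to the bundle."""
        self.resources.append(Resource(uri, kind))

    def annotate(self, target_uri, note):
        """Attach a note to a resource that is already in the bundle."""
        if all(r.uri != target_uri for r in self.resources):
            raise ValueError(f"{target_uri} is not aggregated")
        self.annotations.append((target_uri, note))

    def links_data_to_method(self):
        """True when the bundle holds both data and a workflow."""
        kinds = {r.kind for r in self.resources}
        return {"data", "workflow"} <= kinds


ro = ResearchObject("Example experiment")
ro.aggregate("workflow.t2flow", "workflow")
ro.aggregate("input.csv", "data")
ro.annotate("input.csv", "Input used by workflow.t2flow")
```

In this sketch the annotation is what records the link between data and method, which is the link the workflow repositories slide reports as missing.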
How to preserve Research Objects?
Three main ways/levels:
•Descriptive reproducibility
•Documentation
•Workflow execution reproducibility
•Can we run the workflow?
•Workflow results reproducibility
•Can we get the same results?
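The three levels above can be sketched as a small check. This is an illustration under assumptions rather than a method from the talk: the function name and the boolean inputs are hypothetical, and the rule that results can only be compared once the workflow runs is an assumption of the sketch.

```python
from enum import Enum


class Level(Enum):
    DESCRIPTIVE = "documentation is available"
    EXECUTION = "the workflow can be run"
    RESULTS = "the same results are obtained"


def levels_reached(has_docs, can_run, same_results):
    """Return the reproducibility levels supported by the given evidence."""
    reached = []
    if has_docs:
        reached.append(Level.DESCRIPTIVE)
    if can_run:
        reached.append(Level.EXECUTION)
    # Assumption: results can only be compared if the workflow actually ran.
    if can_run and same_results:
        reached.append(Level.RESULTS)
    return reached
```

For example, a workflow that is documented but depends on an unavailable third-party service would reach only the descriptive level.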
Checklists!
•Corcho et al: Checklist for workflow conservation.
•http://dx.doi.org/10.6084/m9.figshare.1285011
•40 different aspects
•Documentation
•Goals
•Results
•Metadata
•….
•Corcho et al: Checklist for a workflow conservation plan
•http://dx.doi.org/10.6084/m9.figshare.1285012
•Based on the DCC’s data management plan
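A checklist of this kind can be treated as a set of yes/no answers. The sketch below is hypothetical and does not reproduce the published checklists: it uses only the four aspects named on the slide, and the answers are made-up example values.

```python
def checklist_report(answers):
    """Return (fraction of aspects covered, sorted list of missing aspects)."""
    if not answers:
        raise ValueError("checklist is empty")
    missing = sorted(aspect for aspect, ok in answers.items() if not ok)
    covered = len(answers) - len(missing)
    return covered / len(answers), missing


# Made-up answers for the four aspects named on the slide.
answers = {
    "Documentation": True,
    "Goals": True,
    "Results": False,
    "Metadata": False,
}
coverage, missing = checklist_report(answers)
```

Listing the missing aspects, rather than reporting only a score, shows the author what still has to be documented.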
Some examples
Levels of reproducibility
Workflow conservation Plan
Conclusions
•Research Objects bundle data and methods (scientific workflows) together,
bridging the gap between them
•We need to preserve Research Objects as much as the data and the workflows
used to obtain it!
•Documentation
•Ability to execute the experiment
•Ability to obtain the same results
•Checklists are a first step towards improving the documentation, archival and
preservation of Research Objects.
http://www.researchobject.org/

PhD Thesis: Mining abstractions in scientific workflows
 
Creating abstractions from scientific workflows: PhD symposium 2015
Creating abstractions from scientific workflows: PhD symposium 2015Creating abstractions from scientific workflows: PhD symposium 2015
Creating abstractions from scientific workflows: PhD symposium 2015
 
User requirments for geospatial provenance
User requirments for geospatial provenanceUser requirments for geospatial provenance
User requirments for geospatial provenance
 

Is preserving data enough? Towards the preservation of scientific methods

  • 1. Is preserving data enough? Towards the preservation of scientific methods
      Daniel Garijo, Oscar Corcho, Khalid Belhajjame, Lourdes Verdes-Montenegro, Julián Garrido, Raúl Palma, Cezary Mazurek and Kristina Hettne
      Ontology Engineering Group (Universidad Politécnica de Madrid); University Paris-Dauphine; AMIGA (Instituto de Astrofísica de Andalucía); Poznan Supercomputing and Networking Center; LUMC
      dgarijo@fi.upm.es
      Warsaw, May 28th 2015
  • 2. Where does data come from? Scientific workflows
      Figure labels: Lab book, Digital Log, Workflow, Experiment
      Benefits:
      • Sharing and reusing previous work
      • Time savings: re-execution of old experiments with different parameters
      • Teaching: new students can learn existing methods in the lab
      • Design for modularity, so others can reuse
      • Design for standardization, reduction of heterogeneity
      • Debugging of executions
      • Paper writing: linking execution pipelines to publications
      • Reproducibility
      • Etc.
  • 3. How do we preserve workflows?
      Workflow repositories are great! But:
      • Manual annotation and documentation
      • Workflow conservation plan?
      • No clear link between data and method
      • How to reproduce a workflow?
      Workflows keep breaking!
      • Zhao et al: Why Workflows Break - Understanding and Combating Decay in Taverna Workflows. >90 workflows analyzed
      • Third-party resources not available/accessible
      • Missing example data
      • Lack of documentation
      • Incomplete metadata
      "Do I have to document everything again? Didn't I just write a paper?"
  • 4. Our solution: Data + method = Context - Research Object
      Aggregation of resources that bundles together the contents of a research work
      Figure labels: OAI-ORE + PROV + OA
  • 5. How to preserve Research Objects?
      Three main ways/levels:
      • Descriptive reproducibility: documentation
      • Workflow execution reproducibility: can we run the workflow?
      • Workflow results reproducibility: can we get the same results?
      Checklists!
      • Corcho et al: Checklist for workflow conservation
          • http://dx.doi.org/10.6084/m9.figshare.1285011
          • 40 different aspects: documentation, goals, results, metadata, …
      • Corcho et al: Checklist for a workflow conservation plan
          • http://dx.doi.org/10.6084/m9.figshare.1285012
          • Based on the DCC's data management plan
  • 6. Some examples
      Figure labels: Levels of reproducibility; Workflow conservation plan
  • 7. Conclusions
      • Research Objects help bundle data and methods (scientific workflows) and bridge the gap between them
      • We need to preserve research objects as much as the data and the workflows used to obtain it!
          • Documentation
          • Ability to execute the experiment
          • Ability to obtain the same results
      • Checklists are a first step towards improving the documentation, archival and preservation of research objects
      http://www.researchobject.org/
  • 8. Is preserving data enough? Towards the preservation of scientific methods
      Daniel Garijo, Oscar Corcho, Khalid Belhajjame, Lourdes Verdes-Montenegro, Julián Garrido, Raúl Palma, Cezary Mazurek and Kristina Hettne
      Ontology Engineering Group (Universidad Politécnica de Madrid); University Paris-Dauphine; AMIGA (Instituto de Astrofísica de Andalucía); Poznan Supercomputing and Networking Center; LUMC
      dgarijo@fi.upm.es
      Warsaw, May 28th 2015
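
The three preservation levels from slide 5 (documentation, ability to execute, ability to obtain the same results) can be illustrated with a small sketch. Everything here is hypothetical and for illustration only: the `ResearchObject` class, the role names, and the `reproducibility_level` function do not come from the slides or from any Research Object tooling, and treating the levels as cumulative is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict


class Level(IntEnum):
    """Hypothetical encoding of the three levels listed on slide 5."""
    NONE = 0
    DESCRIPTIVE = 1  # documentation is present
    EXECUTION = 2    # the workflow can be run
    RESULTS = 3      # a re-run gives the same results


@dataclass
class ResearchObject:
    """Toy aggregation of resources; not the Research Object model itself."""
    title: str
    resources: Dict[str, str] = field(default_factory=dict)  # path -> role

    def add(self, path: str, role: str) -> None:
        self.resources[path] = role

    def has_role(self, role: str) -> bool:
        return role in self.resources.values()


def reproducibility_level(ro: ResearchObject,
                          can_execute: bool,
                          same_results: bool) -> Level:
    """Return the highest level reached, assuming each level needs the one before."""
    if not ro.has_role("documentation"):
        return Level.NONE
    if not (ro.has_role("workflow") and can_execute):
        return Level.DESCRIPTIVE
    if not (ro.has_role("data") and same_results):
        return Level.EXECUTION
    return Level.RESULTS


# Example: documented and runnable, but the re-run output differs.
ro = ResearchObject("example experiment")
ro.add("README.txt", "documentation")
ro.add("pipeline.t2flow", "workflow")
ro.add("input.csv", "data")
level = reproducibility_level(ro, can_execute=True, same_results=False)
print(level.name)
```

The `can_execute` and `same_results` flags stand in for checks that a person or a tool would have to perform; the sketch only shows how a checklist outcome could be mapped onto the three levels.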