Wf4Ever:
Preserving workflows as
digital Research Objects

Stian Soiland-Reyes
myGrid, University of Manchester

EGI Community Forum 2012, Workflow Systems workshop
Leibniz Supercomputing Centre, Munich, 2012-03-28

My background

Taverna - Scientific Workflow Management System
http://www.taverna.org.uk/
   • ~85000 downloads
   • EU projects: SCAPE, BioVeL, HELIO, e-Lico, VPH-SHARE, EGI-INSPiRE…

myExperiment - Web 3.0 virtual environment, library and social network for workflows
http://www.myexperiment.org/
   • ~5000 registered users
   • ~2200 workflows
   • ~21 different systems

“A biologist would rather share their
 toothbrush than their gene name”

Mike Ashburner and others
Professor in Dept of Genetics,
University of Cambridge, UK

http://www.myexperiment.org/

• “Facebook for Scientists” ...but different to Facebook!
• A repository of research methods
• A social network of people and things
• A Social Virtual Research Environment

• A probe into researcher behaviour
• Open source (BSD) Ruby on Rails app
• REST and SPARQL, Linked Data
• Influenced BioCatalogue, MethodBox and SysMO-SEEK

myExperiment currently has 5378 members, 292 groups, 2273
workflows, 534 files and 217 packs

• Workflow Preservation
• Research Objects
• Provenance
• Recommendation
• Astronomy and Genomics

http://www.wf4ever-project.org/

Wf4Ever
Challenges

Preservation of scientific workflows in data-intensive science

» Scientific workflows enable automation of scientific methods and encourage
  best practices to be shared
» Workflows need to be preserved for
     › Reuse, fundamental for incremental scientific development
     › Method reproducibility, key for credit and publication
» Workflow preservation is complex!
» Heterogeneous types of information need to be aggregated, including
  workflows and related resources forming research objects
» Research objects need to be trusted and understandable n years from now
» Social aspects need to be addressed in order to support reuse in
  scientific communities

The R.* dimensions

Reusable. The key tenet of Research Objects is to support the sharing and
reuse of data, methods and processes.
Repurposeable. Reuse may also involve the reuse of constituent parts of
the Research Object.
Repeatable. There should be sufficient information in a Research Object to be
able to repeat the study, perhaps years later.
Reproducible. A third party can start with the same inputs and methods and
see if a prior result can be confirmed.
Replayable. Studies might involve single investigations that happen in
milliseconds or protracted processes that take years.
Referenceable. If research objects are to augment or replace traditional
publication methods, then they must be referenceable or citeable.
Revealable. Third parties must be able to audit the steps performed in the
research in order to be convinced of the validity of results.
Respectful. Explicit representations of the provenance, lineage and flow of
intellectual property.

“Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/

Wf4Ever
Forms of decay

Workflow Decay
• Service decay
     • Flux/decay/unavailability
• Data decay
     • Formats/ids/standards
• Infrastructure decay
     • Platform/resources

Experiment Decay
• Methodological changes
• New technologies
• New resources/components
• New data

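The three forms of workflow decay listed above could be recorded in a small data structure. The following is only a sketch under stated assumptions: the `Dependency` record, the `decay_report` helper and the example dependency names are hypothetical and are not part of any Wf4Ever component. The availability flag is assumed to come from some external check that is not shown here.

```python
from dataclasses import dataclass
from enum import Enum


class Decay(Enum):
    """The three forms of workflow decay named on the slide."""
    SERVICE = "service decay"
    DATA = "data decay"
    INFRASTRUCTURE = "infrastructure decay"


@dataclass(frozen=True)
class Dependency:
    """Hypothetical record of something a workflow relies on."""
    name: str
    kind: str        # "service", "data" or "infrastructure"
    available: bool  # result of an external check, not performed here


KIND_TO_DECAY = {
    "service": Decay.SERVICE,
    "data": Decay.DATA,
    "infrastructure": Decay.INFRASTRUCTURE,
}


def decay_report(dependencies):
    """Group the names of unavailable dependencies by form of decay."""
    report = {}
    for dep in dependencies:
        if not dep.available:
            report.setdefault(KIND_TO_DECAY[dep.kind], []).append(dep.name)
    return report


# Hypothetical dependencies of one workflow, for illustration only.
deps = [
    Dependency("sequence alignment web service", "service", False),
    Dependency("reference dataset", "data", True),
    Dependency("compute cluster", "infrastructure", False),
]
report = decay_report(deps)
for form, names in report.items():
    print(form.value, "->", names)
```

Experiment decay (new methods, technologies or data) is left out of the sketch because it concerns the scientific context rather than the availability of a dependency.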
Preservation, Conservation, Recreating

Preserving
Archived Record
Fixed Snapshots
Review
Rerun & Replay

Conserving
Active Instrument
Live
Rerun & Reuse
Repair & Restore

Recreating
Archived Record
Active Instrument
Live
Rebuild Recycle Repurpose

Workflow Decay
Decay at different abstraction levels

[Figure: decay at different abstraction levels; labels in the figure: Redo, Flux, Flux, Flux]
http://www.gridworkflow.org/kwfgrid/gwes/docs/

Research objects




Research Objects as Social Objects




http://purl.org/wf4ever/ro#
Research Object model core (simplified)

[Figure: simplified RO core
   Classes: ro:ResearchObject, ro:Resource, ro:Manifest, wfdesc:Workflow, ro:AggregatedAnnotation
   Properties: ore:aggregates, ore:isDescribedBy, ro:annotatesAggregatedResource]

Note: This figure shows a simplified view of the RO core.

RO specification: http://wf4ever.github.com/ro/

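To make the terms of the simplified RO core concrete, here is a minimal sketch that writes one research object as plain subject/predicate/object triples. It assumes the usual reading of the model: a research object aggregates resources, is described by a manifest, and annotations point at the aggregated resource they annotate. The `ro` and `wfdesc` namespaces come from the slides; the ORE and RDF namespace URIs are the standard ones for those vocabularies. All `http://example.org/` identifiers and the helper functions are hypothetical, and a real implementation would more likely use an RDF library than Python tuples.

```python
# Namespaces: ro and wfdesc as given on the slides, ORE and RDF are standard.
RO = "http://purl.org/wf4ever/ro#"
WFDESC = "http://purl.org/wf4ever/wfdesc#"
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers, for illustration only.
BASE = "http://example.org/ro/demo/"
research_object = BASE
manifest = BASE + "manifest.rdf"
workflow = BASE + "workflow.t2flow"
annotation = BASE + "annotations/1"

triples = {
    (research_object, RDF_TYPE, RO + "ResearchObject"),
    (research_object, ORE + "isDescribedBy", manifest),
    (manifest, RDF_TYPE, RO + "Manifest"),
    (research_object, ORE + "aggregates", workflow),
    (workflow, RDF_TYPE, RO + "Resource"),
    (workflow, RDF_TYPE, WFDESC + "Workflow"),
    (research_object, ORE + "aggregates", annotation),
    (annotation, RDF_TYPE, RO + "AggregatedAnnotation"),
    (annotation, RO + "annotatesAggregatedResource", workflow),
}


def objects(subject, predicate):
    """Return the sorted objects of all triples with this subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def annotations_of(resource):
    """Return the sorted annotations that annotate the given aggregated resource."""
    predicate = RO + "annotatesAggregatedResource"
    return sorted(s for s, p, o in triples if p == predicate and o == resource)


print(objects(research_object, ORE + "aggregates"))
print(annotations_of(workflow))
```

Keeping the annotation as an aggregated resource in its own right means it travels with the research object, which is the point of bundling workflows with their related resources.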
http://purl.org/wf4ever/ro#
Research Object model core




http://purl.org/wf4ever/wfdesc#
RO model: Workflow Description




http://purl.org/wf4ever/wfprov#
Workflow Provenance (wfprov)




Technical infrastructure

• Models: Semantic Web Encoding
    • Research Object
    • Annotation
    • Provenance
    • Evolution and Versioning
• Services: Web APIs, REST services
    • Foundational, Extension, User
    • APIs, Architecture
• Principles
    • Map into standards
    • Adopt standards
    • Lightweight components
• Ecosystem
    • Command line
    • Portal
    • Third party systems

The Wf4Ever Proposal
Services

[Figure: service layers, top to bottom]
• User Clients
• Extension Services
• Foundation Services

Wf4Ever Reference Implementation
Prototype, Dec 2011

Access & Usage Clients
• RO Portal
• RO Manager Tool
• Dropbox Client: ROBox

Data Management & Analysis Services
• Stability Evaluation
• Completeness Evaluation
• Recommender

Storage Services
• RO Digital Library

Lifecycle Services
• Taverna Workflow Mgmt System

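The slide names a Completeness Evaluation service but does not say how completeness is judged. As a rough illustration only, the sketch below assumes completeness means that every kind of resource on a checklist is aggregated at least once. The checklist, the resource names and both functions are hypothetical and do not describe the Wf4Ever service.

```python
# Hypothetical checklist of resource kinds a research object should aggregate.
REQUIRED_KINDS = ("workflow", "input data", "provenance")


def missing_kinds(aggregated, required=REQUIRED_KINDS):
    """Return required kinds that no aggregated resource provides.

    `aggregated` maps a resource name to its kind. The result keeps
    the order of the checklist.
    """
    present = set(aggregated.values())
    return [kind for kind in required if kind not in present]


def is_complete(aggregated, required=REQUIRED_KINDS):
    """A research object is complete when nothing on the checklist is missing."""
    return not missing_kinds(aggregated, required)


# Hypothetical research object content, for illustration only.
demo = {
    "workflow.t2flow": "workflow",
    "inputs.csv": "input data",
}
print(missing_kinds(demo))
print(is_complete(demo))
```

A checklist keeps the evaluation explainable: the list of missing kinds tells the author what to add rather than only reporting a pass or fail.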
Roadmap
Year 1 (Dec 2010 – Dec 2011)

» Exploration (2011)
   › Problem specification and requirements identification
   › Better understanding of workflow preservation needs from the domains
     (what does it mean to preserve a scientific workflow?)
   › Proofs of concept
   › Preliminary models, components, and integrated reference implementation
   › Result identification

Roadmap
Year 2 (Dec 2011 – Dec 2012)

» Realization/validation (2012)
   › Validate the models, architectures and software in practice
   › Distributed components with different access/security
     arrangements – forming REST APIs and specifications
   › RO Content Campaign: Generate 1000s of ROs
   › First productization phase: Stable releases of models and
     reference implementation
   › Decay monitoring and notification (why is my workflow no longer
     stable?), reacting to decay, attribution and credit support
     beyond recommendation. Detailed use of provenance
   › Execution and interoperability support (SHIWA integration)

Roadmap
Year 3 (Dec 2012 – Dec 2013)

» Exploitation (2013)
   › Final productization phase
   › Deployment in user environments and systems, enhanced with
     workflow preservation capabilities
   › RO-enabled myExperiment
   › RO-enabled Galaxy
   › RO-enabled Dataverse
   › … and more!
   › Deployment at publishers, e.g. Elsevier, Digital Science,
     GigaScience

Collaborations and impact
»   SHIWA – Sharing Interoperable Workflows
»   Publishers/journals: Elsevier, GigaScience (by BGI)
»   OpenPHACTS (nanopublications)
»   SCAPE (dataset preservation)
»   BioVeL (biodiversity - species preservation!)
»   Dataverse (data repository)
»   Galaxy (workflow system for genomics)
»   GenomeSpace (data integration platform)

Thank you!

Any Questions?

http://www.wf4ever-project.org/

This work is licensed under the Creative Commons Attribution 3.0
Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative
Commons, 444 Castro Street, Suite 900, Mountain View, California,
94041, USA.
