Research Objects
Preserving scientific data and methods

Stian Soiland-Reyes, Khalid Belhajjame
School of Computer Science, University of Manchester

myGrid NIHBI meet-up, Manchester, 2013-01-17
Agenda

» Preserving digital science
» The Research Object
     » Anatomy
     » Lifecycle
» Wf4Ever Tools
» Future developments




Computational Processes in Today’s Research

» Research is increasingly conducted in digital, online environments
  » This has led to the emergence of new digital artifacts
» In some respects, these objects can be regarded as data
  » However, some objects include a description of the research method, captured as a computational process
» Such processes encapsulate the knowledge related to the generation, (re)use and general transformation of data in experimental sciences

[Figure: Raw data → Computational process → Results]
Scientific Workflow

In this work, we focus on a particular kind of computational process called a scientific workflow.

» A scientific workflow is a precise, executable description of a scientific procedure: a series of analysis operations connected by data links
» Each operation represents the execution of a computational process
  » Can be supplied by independently developed web services
  » Can also use existing data sources that are accessible on the Web
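
The definition above (operations connected by data links) can be sketched as a small dependency graph. This is only an illustrative sketch, not how any particular workflow system executes workflows: the operation names, the toy sequence and the `run_workflow` helper are all hypothetical, and local functions stand in for what would normally be web service calls.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical operations; in a real workflow these might call web services.
def fetch_sequence(_inputs):
    return "ACGTACGT"

def gc_content(inputs):
    seq = inputs["fetch_sequence"]
    return sum(base in "GC" for base in seq) / len(seq)

def report(inputs):
    return f"GC content: {inputs['gc_content']:.2f}"

OPERATIONS = {
    "fetch_sequence": fetch_sequence,
    "gc_content": gc_content,
    "report": report,
}

# Data links: each operation maps to the operations whose output it consumes.
DATA_LINKS = {
    "fetch_sequence": set(),
    "gc_content": {"fetch_sequence"},
    "report": {"gc_content"},
}

def run_workflow(operations, data_links):
    """Run every operation once, after all operations it depends on."""
    outputs = {}
    for name in TopologicalSorter(data_links).static_order():
        inputs = {dep: outputs[dep] for dep in data_links[name]}
        outputs[name] = operations[name](inputs)
    return outputs

results = run_workflow(OPERATIONS, DATA_LINKS)
print(results["report"])  # GC content: 0.50
```

Keeping the data links separate from the operations makes the dependency on each external service explicit, which is also where the preservation problems discussed next arise.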


Preservation Challenges

The challenges stem from the executable nature of workflows and their vulnerability to the volatility of the resources required for their execution.

» Changes by third parties
  » Workflow may produce different lists at different times
  » Workflow may become inoperable
» Workflow decay: the execution of a workflow may fail or yield different results, because it depends on resources and services that are subject to independent changes, e.g., EMBL-EBI. Even workflows that depend on local resources are vulnerable.
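
One way to make workflow decay observable is to record a fingerprint of a known-good output and compare it against later runs. The sketch below assumes that approach; it is not a Wf4Ever tool, and `check_decay`, `fingerprint` and the three stand-in workflow functions are hypothetical names introduced for illustration.

```python
import hashlib

def fingerprint(output: bytes) -> str:
    """Return a SHA-256 digest that identifies a workflow output."""
    return hashlib.sha256(output).hexdigest()

def check_decay(run_workflow, recorded_fingerprint):
    """Re-run a workflow and classify the outcome against a recorded run.

    Returns "failed" if execution raises, "changed" if the output differs
    from the recorded fingerprint, and "stable" otherwise.
    """
    try:
        output = run_workflow()
    except Exception:
        return "failed"
    return "stable" if fingerprint(output) == recorded_fingerprint else "changed"

# Hypothetical stand-ins for a workflow whose remote service changes or disappears.
def original_run():
    return b"gene list v1"

def run_after_service_update():
    return b"gene list v2"

def run_after_service_removal():
    raise ConnectionError("service no longer available")

recorded = fingerprint(original_run())
statuses = [
    check_decay(run, recorded)
    for run in (original_run, run_after_service_update, run_after_service_removal)
]
print(statuses)  # ['stable', 'changed', 'failed']
```

A byte-level comparison is deliberately strict; a workflow with legitimately varying output (timestamps, ordering) would need a more tolerant comparison.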




Repeat (Within Lab) and Reproduce (Between Labs)

» Replicate / Repeat (within lab): Exactly replicate the original experiment and experimental conditions. Eliminate change. Observe.
» Reproduce (between labs): Run the experiment with differences in experimental conditions. Compare to test for the same result. Observe.
» Each lab: Materials, Methods, Instruments, Laboratory
» Publication: Data; Models, Techniques, Algorithms
» Provenance: Attribution, Credit
» Context: Investigation, Study, Experiment

Capture, Curate, Discover, Use, Reuse, Preserve
RO Architecture is Hourglass

» Astronomy, Biology, services/protocols
» Viewing, collaboration services/protocols
» Provenance, Versioning, Mim services
» ROs structured packages
» Exchange services (media specific)
» Storage services (media specific)
From Electronic Papers to Research Objects

[Figure: Electronic paper and Research Object, with the labels Scientists, Hypothesis, Experiments, Annotations, Results, Provenance and Datasets]
Research Object: A user scenario




Why research objects?

» A research object aggregates all elements deemed necessary to understand research investigations
» Promote reuse and sharing
» Enable verification of the reproducibility of results
» Trackable, versionable, referenceable
Anatomy of a research object

[Figure: RO model diagram. Classes: ro:ResearchObject, ro:Manifest, ro:Resource, ro:Folder, ro:FolderEntry, ro:SemanticAnnotation. Properties: ore:aggregates, ore:describes, ore:proxyFor, ore:proxyIn, ro:annotatesAggregatedResource, ao:body (pointing to an RDF file), and a "subclass of" link.]
Grounding Workflow-centric Research Objects Using Semantic Technologies

» Workflow-centric research objects are encoded using RDF, according to a set of publicly available ontologies
» Research objects extend the Object Reuse and Exchange (ORE) model to represent aggregation
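
As a rough illustration of ORE-style aggregation, the sketch below builds the triples of a minimal manifest and prints them as N-Triples. It uses the `ore:` and `ro:` terms named in this deck; the example URIs, the manifest location and the helper names `build_manifest` and `to_ntriples` are assumptions made for the example, and plain tuples stand in for a real RDF library.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RO = "http://purl.org/wf4ever/ro#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def build_manifest(ro_uri, resource_uris):
    """Return triples describing an RO that aggregates the given resources."""
    # Assumed manifest location, chosen for this example only.
    manifest_uri = ro_uri + ".ro/manifest.rdf"
    triples = [
        (ro_uri, RDF_TYPE, RO + "ResearchObject"),
        (manifest_uri, RDF_TYPE, RO + "Manifest"),
        (manifest_uri, ORE + "describes", ro_uri),
    ]
    for resource in resource_uris:
        triples.append((ro_uri, ORE + "aggregates", resource))
        triples.append((resource, RDF_TYPE, RO + "Resource"))
    return triples

def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)

# Hypothetical research object with two aggregated resources.
ro_uri = "http://example.org/ro/qtl-study/"
triples = build_manifest(ro_uri, [ro_uri + "workflow.t2flow", ro_uri + "results.csv"])
print(to_ntriples(triples))
```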




Grounding Workflow-centric Research Objects Using Semantic Technologies

» We use the Annotation Ontology (AO) to annotate research object resources and their relationships
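
An annotation in this style can be sketched as a handful of triples: the annotation is aggregated by the RO, points at the annotated resource, and refers to an RDF file as its body. The terms `ro:SemanticAnnotation`, `ro:annotatesAggregatedResource` and `ao:body` come from the anatomy slide; the URIs and the helpers `annotate` and `bodies_for` are hypothetical, and the exact triple pattern is an assumption rather than a quotation from the specification.

```python
AO = "http://purl.org/ao/"
ORE = "http://www.openarchives.org/ore/terms/"
RO = "http://purl.org/wf4ever/ro#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def annotate(ro_uri, annotation_uri, target_uri, body_uri):
    """Return triples for one annotation whose body is a separate RDF file."""
    return [
        (annotation_uri, RDF_TYPE, RO + "SemanticAnnotation"),
        (ro_uri, ORE + "aggregates", annotation_uri),
        (annotation_uri, RO + "annotatesAggregatedResource", target_uri),
        (annotation_uri, AO + "body", body_uri),
    ]

def bodies_for(triples, target_uri):
    """Return the body URIs of all annotations that target the given resource."""
    annotations = {
        s for s, p, o in triples
        if p == RO + "annotatesAggregatedResource" and o == target_uri
    }
    return sorted(o for s, p, o in triples if p == AO + "body" and s in annotations)

# Hypothetical research object, annotation and annotation body.
ro_uri = "http://example.org/ro/qtl-study/"
workflow_uri = ro_uri + "workflow.t2flow"
triples = annotate(
    ro_uri,
    ro_uri + ".ro/annotations/1",
    workflow_uri,
    ro_uri + ".ro/annotations/workflow-description.rdf",
)
print(bodies_for(triples, workflow_uri))
```

Keeping the body in its own RDF file means descriptions can be added or replaced without touching the annotated resource itself.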




Relating resources in a research object

[Figure: graph relating RO resources (Workflow_13, Workflow_16, QTL, Common pathways, Results, Logs, Metadata, Slides, Paper) through the relations "produces", "feeds into", "included in" and "published in"]

The provenance of the RO elements is key to understanding, comparing and debugging scientific workflows, and to verifying the validity of a claim made within the context of an RO.
Evolution of a research object

[Figure: lifecycle of a research object, from Live RO through RO snapshots to an Archived RO]

» Scientist, Live RO: "My supervisor calls me to report my work" → <<copy>> → RO snapshot
» Scientist, Live RO: "My supervisor calls me again and we decide to publish our RO+paper" → <<copy>> → RO snapshot
» "Reviews received and final version published" → <<copy, filter and curate>> → Archived RO (Librarian/Curator)
» "A new PhD student continues my work" → <<copy>> → Live RO
» Successive versions are linked by <<versionOf>>

» RO snapshot (report): identified by a URI; some metadata; some curation; mostly private (for my group)
» RO snapshot (publication): identified by a URI; some metadata; some curation; mostly private (for my group and for paper reviewers)
» Archived RO: identified by a URI; good metadata and curation; mostly public
PROV standard - Basis for evolution model

Status: Candidate Recommendation

http://www.w3.org/TR/prov-primer/
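
To show how PROV could underpin the evolution of a research object, the sketch below copies a live RO into a snapshot and records the relationship with PROV terms. It assumes that <<copy>> can be approximated by `prov:wasDerivedFrom` and <<versionOf>> by `prov:wasRevisionOf`; the actual Wf4Ever evolution vocabulary may differ. The `ResearchObject` class, the `snapshot` helper and all URIs are hypothetical.

```python
from dataclasses import dataclass, replace

PROV = "http://www.w3.org/ns/prov#"

@dataclass(frozen=True)
class ResearchObject:
    uri: str
    resources: tuple = ()
    status: str = "live"  # "live", "snapshot" or "archived"

def snapshot(live, snapshot_uri, previous=None):
    """Copy a live RO into an immutable snapshot and return it with PROV triples."""
    snap = replace(live, uri=snapshot_uri, status="snapshot")
    triples = [(snap.uri, PROV + "wasDerivedFrom", live.uri)]
    if previous is not None:
        # Assumed mapping of <<versionOf>> onto prov:wasRevisionOf.
        triples.append((snap.uri, PROV + "wasRevisionOf", previous.uri))
    return snap, triples

# First snapshot: the scientist reports the work to the supervisor.
live = ResearchObject("http://example.org/ro/live/", ("workflow.t2flow",))
v1, v1_triples = snapshot(live, "http://example.org/ro/v1/")

# The live RO keeps evolving; a second snapshot accompanies the paper.
live = replace(live, resources=live.resources + ("results.csv",))
v2, v2_triples = snapshot(live, "http://example.org/ro/v2/", previous=v1)

print(v1.resources, v2.resources)
```

Because each snapshot is a frozen copy with its own URI, later changes to the live RO do not alter what an earlier snapshot refers to.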



Wf4Ever Tools

» Customizable preservability checklists
» Portal: Browsing and annotating
» Command line tools, client libraries: https://github.com/wf4ever/
» Specifications and APIs
Current Status and Ongoing Work

» Models/spec v0.1 public: http://purl.org/wf4ever/model
  - Upcoming revision v0.2 (Q1 2013):
    • Minor additions to workflow model terms
    • “RO Terms”: upper, user-level view of an RO (hypothesis, results); many are “shortcuts” for the structured model
  - TODO: Update annotation model to Open Annotation Data Model (OAC)
  - TODO: PAV for detailed authorship provenance
» Showing, managing and sharing of Research Objects through the myExperiment web site: http://www.myexperiment.org/
Open Annotation Data Model

Status: Community Draft
“Almost final” spec: 2013-01-28
Roll-out meeting in Manchester: March 2013
http://www.openannotation.org/spec/core/
myExperiment RO support




Thank you!



http://www.wf4ever-project.org/   http://www.mygrid.org.uk/

More Related Content

Viewers also liked

2013 06-24 Wf4Ever: Annotating research objects (PPTX)
2013 06-24 Wf4Ever: Annotating research objects (PPTX)2013 06-24 Wf4Ever: Annotating research objects (PPTX)
2013 06-24 Wf4Ever: Annotating research objects (PPTX)Stian Soiland-Reyes
 
2013 06-24 Wf4Ever: Annotating research objects (PDF)
2013 06-24 Wf4Ever: Annotating research objects (PDF)2013 06-24 Wf4Ever: Annotating research objects (PDF)
2013 06-24 Wf4Ever: Annotating research objects (PDF)Stian Soiland-Reyes
 
2011-06-08 Taverna workflow system
2011-06-08 Taverna workflow system2011-06-08 Taverna workflow system
2011-06-08 Taverna workflow systemStian Soiland-Reyes
 
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTXTaverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTXStian Soiland-Reyes
 
2015-07-11 Apache Taverna - BOSC 2015
2015-07-11 Apache Taverna - BOSC 20152015-07-11 Apache Taverna - BOSC 2015
2015-07-11 Apache Taverna - BOSC 2015Stian Soiland-Reyes
 
2016-05-18-Make research reproducible again - researchobject.org
2016-05-18-Make research reproducible again - researchobject.org2016-05-18-Make research reproducible again - researchobject.org
2016-05-18-Make research reproducible again - researchobject.orgStian Soiland-Reyes
 

Viewers also liked (6)

2013 06-24 Wf4Ever: Annotating research objects (PPTX)
2013 06-24 Wf4Ever: Annotating research objects (PPTX)2013 06-24 Wf4Ever: Annotating research objects (PPTX)
2013 06-24 Wf4Ever: Annotating research objects (PPTX)
 
2013 06-24 Wf4Ever: Annotating research objects (PDF)
2013 06-24 Wf4Ever: Annotating research objects (PDF)2013 06-24 Wf4Ever: Annotating research objects (PDF)
2013 06-24 Wf4Ever: Annotating research objects (PDF)
 
2011-06-08 Taverna workflow system
2011-06-08 Taverna workflow system2011-06-08 Taverna workflow system
2011-06-08 Taverna workflow system
 
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTXTaverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
 
2015-07-11 Apache Taverna - BOSC 2015
2015-07-11 Apache Taverna - BOSC 20152015-07-11 Apache Taverna - BOSC 2015
2015-07-11 Apache Taverna - BOSC 2015
 
2016-05-18-Make research reproducible again - researchobject.org
2016-05-18-Make research reproducible again - researchobject.org2016-05-18-Make research reproducible again - researchobject.org
2016-05-18-Make research reproducible again - researchobject.org
 

Similar to 2013-01-17 Research Object

Collaboration and Sharing
Collaboration and SharingCollaboration and Sharing
Collaboration and SharingJisc
 
2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurgh2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurghJun Zhao
 
Scientific data management from the lab to the web
Scientific data management   from the lab to the webScientific data management   from the lab to the web
Scientific data management from the lab to the webJose Manuel Gómez-Pérez
 
The Research Object Initiative: Frameworks and Use Cases
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use CasesCarole Goble
 
Infrastructure and Workflow for the Formal Evaluation of Semantic Search Tech...
Infrastructure and Workflow for the Formal Evaluation of Semantic Search Tech...Infrastructure and Workflow for the Formal Evaluation of Semantic Search Tech...
Infrastructure and Workflow for the Formal Evaluation of Semantic Search Tech...Stuart Wrigley
 
OAI7 Research Objects
OAI7 Research ObjectsOAI7 Research Objects
OAI7 Research Objectsseanb
 
Data repositories -- Xiamen University 2012 06-08
Data repositories -- Xiamen University 2012 06-08Data repositories -- Xiamen University 2012 06-08
Data repositories -- Xiamen University 2012 06-08Jian Qin
 
Research Objects for FAIRer Science
Research Objects for FAIRer Science Research Objects for FAIRer Science
Research Objects for FAIRer Science Carole Goble
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research ObjectsCarole Goble
 
Acs denver dirks potenzone 30 aug2011
Acs denver dirks potenzone 30 aug2011Acs denver dirks potenzone 30 aug2011
Acs denver dirks potenzone 30 aug2011Rudy Potenzone
 
Accelrys Announces Experiment Knowledge Base (EKB) for Enterprise Lab Management
Accelrys Announces Experiment Knowledge Base (EKB) for Enterprise Lab ManagementAccelrys Announces Experiment Knowledge Base (EKB) for Enterprise Lab Management
Accelrys Announces Experiment Knowledge Base (EKB) for Enterprise Lab ManagementBIOVIA
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.orgNorman Morrison
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsGaignard Alban
 
If we build it will they come?
If we build it will they come?If we build it will they come?
If we build it will they come?myGrid team
 
Knowledge Infrastructure for Global Systems Science
Knowledge Infrastructure for Global Systems ScienceKnowledge Infrastructure for Global Systems Science
Knowledge Infrastructure for Global Systems ScienceDavid De Roure
 
Ethics reproducibility and data stewardship
Ethics reproducibility and data stewardshipEthics reproducibility and data stewardship
Ethics reproducibility and data stewardshipRussell Jarvis
 
If we build it will they come? BOSC2012 Keynote Goble
If we build it will they come? BOSC2012 Keynote GobleIf we build it will they come? BOSC2012 Keynote Goble
If we build it will they come? BOSC2012 Keynote GobleCarole Goble
 

Similar to 2013-01-17 Research Object (20)

Collaboration and Sharing
Collaboration and SharingCollaboration and Sharing
Collaboration and Sharing
 
Research Objects in Wf4Ever
Research Objects in Wf4EverResearch Objects in Wf4Ever
Research Objects in Wf4Ever
 
2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurgh2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurgh
 
Scientific data management from the lab to the web
Scientific data management   from the lab to the webScientific data management   from the lab to the web
Scientific data management from the lab to the web
 
The Research Object Initiative: Frameworks and Use Cases
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use Cases
 
Infrastructure and Workflow for the Formal Evaluation of Semantic Search Tech...
Infrastructure and Workflow for the Formal Evaluation of Semantic Search Tech...Infrastructure and Workflow for the Formal Evaluation of Semantic Search Tech...
Infrastructure and Workflow for the Formal Evaluation of Semantic Search Tech...
 
OAI7 Research Objects
OAI7 Research ObjectsOAI7 Research Objects
OAI7 Research Objects
 
Beyond the PDF 2, 2013
Beyond the PDF 2, 2013Beyond the PDF 2, 2013
Beyond the PDF 2, 2013
 
Data repositories -- Xiamen University 2012 06-08
Data repositories -- Xiamen University 2012 06-08Data repositories -- Xiamen University 2012 06-08
Data repositories -- Xiamen University 2012 06-08
 
Research Objects for FAIRer Science
Research Objects for FAIRer Science Research Objects for FAIRer Science
Research Objects for FAIRer Science
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research Objects
 
Acs denver dirks potenzone 30 aug2011
Acs denver dirks potenzone 30 aug2011Acs denver dirks potenzone 30 aug2011
Acs denver dirks potenzone 30 aug2011
 
Accelrys Announces Experiment Knowledge Base (EKB) for Enterprise Lab Management
Accelrys Announces Experiment Knowledge Base (EKB) for Enterprise Lab ManagementAccelrys Announces Experiment Knowledge Base (EKB) for Enterprise Lab Management
Accelrys Announces Experiment Knowledge Base (EKB) for Enterprise Lab Management
 
NETTAB 2012
NETTAB 2012NETTAB 2012
NETTAB 2012
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.org
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
 
If we build it will they come?
If we build it will they come?If we build it will they come?
If we build it will they come?
 
Knowledge Infrastructure for Global Systems Science
Knowledge Infrastructure for Global Systems ScienceKnowledge Infrastructure for Global Systems Science
Knowledge Infrastructure for Global Systems Science
 
Ethics reproducibility and data stewardship
Ethics reproducibility and data stewardshipEthics reproducibility and data stewardship
Ethics reproducibility and data stewardship
 
If we build it will they come? BOSC2012 Keynote Goble
If we build it will they come? BOSC2012 Keynote GobleIf we build it will they come? BOSC2012 Keynote Goble
If we build it will they come? BOSC2012 Keynote Goble
 

More from Stian Soiland-Reyes

2017-11-03 Scientific Workflow systems
2017-11-03 Scientific Workflow systems2017-11-03 Scientific Workflow systems
2017-11-03 Scientific Workflow systemsStian Soiland-Reyes
 
2017-11-03 Provenance and Research Object
2017-11-03 Provenance and Research Object2017-11-03 Provenance and Research Object
2017-11-03 Provenance and Research ObjectStian Soiland-Reyes
 
2017-07-22 Common Workflow Language Viewer
2017-07-22 Common Workflow Language Viewer2017-07-22 Common Workflow Language Viewer
2017-07-22 Common Workflow Language ViewerStian Soiland-Reyes
 
2014-10-31 Taverna 3 architecture
2014-10-31 Taverna 3 architecture2014-10-31 Taverna 3 architecture
2014-10-31 Taverna 3 architectureStian Soiland-Reyes
 
2014-10-30 Taverna as an Apache Incubator project
2014-10-30 Taverna as an Apache Incubator project2014-10-30 Taverna as an Apache Incubator project
2014-10-30 Taverna as an Apache Incubator projectStian Soiland-Reyes
 
2014-06-13 Research objects in the wild
2014-06-13 Research objects in the wild2014-06-13 Research objects in the wild
2014-06-13 Research objects in the wildStian Soiland-Reyes
 
2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)
2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)
2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)Stian Soiland-Reyes
 
2013-03-21 What can provenance do for me?
2013-03-21 What can provenance do for me?2013-03-21 What can provenance do for me?
2013-03-21 What can provenance do for me?Stian Soiland-Reyes
 
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...Stian Soiland-Reyes
 
Taverna workflow management system (2010 11-30 Bath Workflow Tools)
Taverna workflow management system (2010 11-30 Bath Workflow Tools)Taverna workflow management system (2010 11-30 Bath Workflow Tools)
Taverna workflow management system (2010 11-30 Bath Workflow Tools)Stian Soiland-Reyes
 
Bringing caBIG services together using Taverna
Bringing caBIG services together using TavernaBringing caBIG services together using Taverna
Bringing caBIG services together using TavernaStian Soiland-Reyes
 

More from Stian Soiland-Reyes (14)

2017-09-27-scholarly-html-ro
2017-09-27-scholarly-html-ro2017-09-27-scholarly-html-ro
2017-09-27-scholarly-html-ro
 
2017-11-03 Scientific Workflow systems
2017-11-03 Scientific Workflow systems2017-11-03 Scientific Workflow systems
2017-11-03 Scientific Workflow systems
 
2017-11-03 Provenance and Research Object
2017-11-03 Provenance and Research Object2017-11-03 Provenance and Research Object
2017-11-03 Provenance and Research Object
 
2017-07-22 Common Workflow Language Viewer
2017-07-22 Common Workflow Language Viewer2017-07-22 Common Workflow Language Viewer
2017-07-22 Common Workflow Language Viewer
 
2014-10-31 Taverna 3 architecture
2014-10-31 Taverna 3 architecture2014-10-31 Taverna 3 architecture
2014-10-31 Taverna 3 architecture
 
2014-10-30 Taverna 3 status
2014-10-30 Taverna 3 status2014-10-30 Taverna 3 status
2014-10-30 Taverna 3 status
 
2014-10-30 Taverna as an Apache Incubator project
2014-10-30 Taverna as an Apache Incubator project2014-10-30 Taverna as an Apache Incubator project
2014-10-30 Taverna as an Apache Incubator project
 
2014-06-13 Research objects in the wild
2014-06-13 Research objects in the wild2014-06-13 Research objects in the wild
2014-06-13 Research objects in the wild
 
2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)
2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)
2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)
 
2013-05-29 Taverna Provenance
2013-05-29 Taverna Provenance2013-05-29 Taverna Provenance
2013-05-29 Taverna Provenance
 
2013-03-21 What can provenance do for me?
2013-03-21 What can provenance do for me?2013-03-21 What can provenance do for me?
2013-03-21 What can provenance do for me?
 
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
 
Taverna workflow management system (2010 11-30 Bath Workflow Tools)
Taverna workflow management system (2010 11-30 Bath Workflow Tools)Taverna workflow management system (2010 11-30 Bath Workflow Tools)
Taverna workflow management system (2010 11-30 Bath Workflow Tools)
 
Bringing caBIG services together using Taverna
Bringing caBIG services together using TavernaBringing caBIG services together using Taverna
Bringing caBIG services together using Taverna
 

Recently uploaded

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Ensuring Technical Readiness For Copilot in Microsoft 365
2013-01-17 Research Object

  • 1. Research Objects Preserving scientific data and methods Stian Soiland-Reyes, Khalid Belhajjame School of Computer Science, Univ of Manchester myGrid NIHBI meet-up Manchester 2013-01-17
  • 2. Agenda » Preserving digital science » The Research Object » Anatomy » Lifecycle » Wf4Ever Tools » Future developments 2
  • 3. Computation Processes in Today’s Research » Research is being conducted in an increasingly digital and online environment » This has led to the emergence of new digital artifacts » In some respects, these objects can be regarded as data » However, some objects include the description of the research method that is captured as a computational process » Such processes encapsulate the knowledge related to the generation, (re)use and general transformation of data in experimental sciences [Figure labels: Raw data, Computational process, Results]
  • 4. Scientific Workflow In this work, we focus on a particular kind of computational process called scientific workflows » A scientific workflow is a precise, executable description of a scientific procedure - a series of analysis operations connected using data links » Each operation represents the execution of a computational process » Can be supplied by independently developed web services » Can also use existing data sources that are accessible on the Web
  • 5. Preservation Challenges Challenges deal with their executable aspects and their vulnerability to the volatility of the resources required for their execution » Changes by 3rd parties » Workflow may produce different lists at different times » Workflow may become inoperable » Workflow decay – The execution of the workflow may fail or yield different results, due to dependencies on resources and services subject to independent changes, e.g., EMBL-EBI. Even workflows that depend on local resources are vulnerable. 5
  • 6. Repeat / Reproduce » Within Lab / Between Labs » Materials, Methods, Instruments, Laboratory (shown on each side) » Publication, Data » Models, Techniques, Algorithms » Replicate / Repeat: Exactly replicate the original experiment and experimental conditions. Eliminate change. Observe. » Reproduce: Run experiment with differences in experimental conditions. Compare to test for same result. Observe. » Provenance, Attribution, Credit » Context: Investigation, Study, Experiment » Capture, Curate, Discover, Use, Reuse, Preserve
  • 7. RO Architecture is Hourglass » Astronomy, Biology, services/protocols » Viewing, collaboration services/protocols » Provenance, Versioning, Mim services » ROs structured packages » Exchange services (media specific) » Storage services (media specific)
  • 8. From Electronic papers to Research objects [Diagram labels: Scientists, Electronic paper, Research Object, Hypothesis, Experiments, Annotations, Results, Provenance, Datasets]
  • 9. 9
  • 10. Research Object: A user scenario 10
  • 11. Why research objects? » A research object aggregates all elements deemed necessary to understand research investigations » Promote reuse, sharing » Enable the verification of reproducibility of the results » Trackable, versionable, referenceable
  • 12. Anatomy of a research object [Diagram terms: ro:ResearchObject, ro:Manifest, ro:Resource, ro:Folder, ro:FolderEntry, ro:SemanticAnnotation, RDF file; relations: ore:aggregates, ore:describes, ore:proxyFor, ore:proxyIn, ro:annotatesAggregatedResource, ao:body, Subclass of]
  • 13. Grounding Workflow-centric Research Objects Using Semantic Technologies » Workflow-centric research objects are encoded using RDF, according to a set of ontologies that are publicly available » Research objects extend the Object Reuse and Exchange (ORE) model to represent aggregation.
  • 14. Grounding Workflow-centric Research Objects Using Semantic Technologies » We use the Annotation Ontology (AO) to annotate research object resources and their relationships.
  • 15. Relating resources in research object [Diagram: resources Workflow_16, Workflow_13, QTL, Results, Logs, Metadata, Paper, Slides and Common pathways, linked by the relations "produces", "Included in", "Feeds into" and "Published in"] » The provenance of the RO elements is key to understanding, comparing and debugging scientific workflows and to verifying the validity of a claim made within the context of an RO
  • 16. Evolution of a research object » Scientist, Live RO: My supervisor calls me to report my work <<copy>> » My supervisor calls me again and we decide to publish our RO+paper <<copy>> » Reviews received and final version published <<copy, filter and curate>> » A new PhD student continues my work <<copy>> » RO snapshot (<<versionOf>>): Identified by a URI, Some metadata, Some curation, Mostly private (for my group) » RO snapshot (<<versionOf>>): Identified by a URI, Some metadata, Some curation, Mostly private (for my group and for paper reviewers) » Librarian/Curator, Archived RO: Identified by a URI, Good metadata and curation, Mostly public
  • 17. PROV standard - Basis for evolution model Candidate Recommendation http://www.w3.org/TR/prov-primer/ 17
  • 19. Wf4Ever Tools Portal: Browsing and annotating 19
  • 20. Wf4Ever Tools Command line tools, Client libraries https://github.com/wf4ever/ 20
  • 22. Current Status and Ongoing Work » Models/spec v0.1 public: http://purl.org/wf4ever/model - Upcoming revision v0.2: (Q1 2013) • Minor additions to workflow model terms • “RO Terms” – Upper user level view of RO: hypothesis, results – many are “shortcuts” for structured model - TODO: Update annotation model to Open Annotation Data Model (OAC) - TODO: PAV for detailed authorship provenance » Showing, managing and sharing of Research Objects through myExperiment web site [3] http://www.myexperiment.org/
  • 23. Open Annotation Data Model Community Draft “Almost final” spec: 2013-01-28 Roll out meeting in Manchester: March 2013 http://www.openannotation.org/spec/core/ 23
  • 25. Thank you! http://www.wf4ever-project.org/ http://www.mygrid.org.uk/
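
The aggregation idea from slides 12 and 13 (a research object that aggregates its resources through ORE) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the resource names, the `http://example.org/ro/` base and the helper functions `aggregate` and `to_turtle` are hypothetical, and the namespace IRIs are given as commonly published for ORE and the Wf4Ever RO model, so they should be checked against http://purl.org/wf4ever/model before use.

```python
# Minimal sketch: build ORE aggregation triples for a research object and
# print them in Turtle syntax. Only the standard library is used.

PREFIXES = {
    "ore": "http://www.openarchives.org/ore/terms/",  # OAI-ORE vocabulary
    "ro": "http://purl.org/wf4ever/ro#",              # assumed RO namespace
    "": "http://example.org/ro/",                     # hypothetical base
}


def aggregate(research_object, resources):
    """Return (subject, predicate, object) triples for one aggregation."""
    triples = [(research_object, "a", "ro:ResearchObject")]
    for resource in resources:
        triples.append((research_object, "ore:aggregates", resource))
    return triples


def to_turtle(triples):
    """Serialise prefixed-name triples as simple Turtle text."""
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    lines.append("")
    lines.extend(f"{s} {p} {o} ." for s, p, o in triples)
    return "\n".join(lines)


# Hypothetical resource names: a workflow, one run of it, and an annotation.
triples = aggregate(":wro", [":workflow", ":workflow_run", ":annotation"])
turtle = to_turtle(triples)
print(turtle)
```

A real tool would more likely use an RDF library and the full RO vocabulary (manifest, folders, annotations); plain tuples are used here only to keep the aggregation relation visible.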

Editor's Notes

  1. Some of the shared digital artefacts of digital research are executable in the sense that they describe an automated process which generates results. For example, an object might contain raw…
  2. , i.e., a multi-step process to coordinate multiple components and tasks, like a script, that orchestrates the flow of data. Such as running a program, submitting a query to a database, submitting a job to a computational facility, or invoking a service over the Web to use a remote resource
  3. It is important therefore to record the provenance of workflow outputs, i.e. the sources of information and processes involved in producing a particular list. Even workflows that depend on local resources are vulnerable to changes in operating systems, data management sustainability and access to computational infrastructure. We note that workflows have many of the properties of software, such as the composition of components with external dependencies, and hence some aspects of software preservation [10] are applicable.
  4. ORE defines standards for the description and exchange of aggregations of Web resources. Using ORE, a workflow-centric research object is defined as a resource that aggregates other resources, i.e., workflow(s), provenance, other objects and annotations. For example, the RDF Turtle snippet illustrated below specifies that a research object identified by :wro aggregates a workflow template :pathway wf sp, a workflow run :pathway wf run, and an annotation :wfannot.
  5. The elements that compose a Research Object may differ from one to another, and this difference may have consequences on the level of reproducibility that can be guaranteed. At one end of the spectrum, the Research Object is represented by a paper. As we progress to the other end, the Research Object is enriched to include elements such as the workflow implementing the computation, annotations describing the experiment implemented and the hypothesis investigated, and provenance traces of past executions of the workflow. Assessing the reproducibility of computations described using electronic papers can be tedious: a paper may just sketch the method implemented by the computation in question, without delving into details that are necessary to check that the results obtained, or claimed, in the paper can be reproduced. Verifying the reproducibility of ROs at the other end of the spectrum is less difficult. The provenance trace provides data examples to re-enact the workflow and means to verify that the results of workflow executions are comparable with prior results. To ensure the preservation of a workflow and the reproducibility of its results, the RO needs to be managed and curated throughout the lifecycle of the associated workflow.
  6. We will now illustrate the research object lifecycle through a small example that shows how all the resources contained in a research object are bundled as the scientific experiment progresses. This example lifecycle is summarized graphically on the slide. A research object normally starts its life as an empty Live Research Object, with a first design of the experiments to be performed (which determines what workflows and resources will be added, by either retrieving them from an existing platform or creating them from scratch). Then the research object is filled incrementally by aggregating the workflows that are being created, reused or re-purposed, as well as datasets, documents, etc. Any of these components can be changed or removed at any point in time. In our scenario, we observe several points in time when this Live Research Object gets copied and kept as a Research Object snapshot, which aims to reflect the status of the research object at a given point in time. Such a snapshot may be useful to release the current version of the research outcome of an experiment, submit it to be peer reviewed or to be published (with the appropriate access control mechanisms), share it with supervisors or collaborators, or for acknowledgement and citation purposes. A snapshot may also contain a paper describing the research object in general and the experiment in particular, depending on the policies of the corresponding scientific communication channel, e.g., workshop, conference or journal. Such snapshots have their own identifiers, and may even be preserved, since it may be useful to be able to track the evolution of the research object over time, so as to allow, for example, retrieval of a previous state of the research object, reporting to funding agencies the evolution of the research conducted, etc. At some point in time, the research object may get published and archived, in what we know as an Archived Research Object, with a permanent identifier.
Such a version of our research object may be the result of copying our Live Research Object completely, or it may be the result of some filtering or curation process where only some parts of the information available in the aggregation are actually published for others to reuse. As illustrated in Figure 4, a user can use an existing Archived Research Object as a starting point for his or her research, e.g., to repurpose it or its parts, in which case a new Live Research Object is created based on the existing Archived Research Object. This is only one of the many potential scenarios that could be foreseen for the lifecycle of a workflow-centric research object, and we are currently defining different storyboards for their evolution. One important aspect to highlight is the fact that during its whole lifecycle, the research object is aggregating new objects. The annotation process during the lifecycle of experimentation allows the generation of sufficient metadata about the research objects to support preservation and sharing. Therefore, when a scientist decides to preserve it, most of the annotations that will be needed for that preservation process will already be available inside the research object.
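
The lifecycle described in note 6 (a Live RO that is filled incrementally, copied into snapshots, then filtered and archived, and later reused as the start of a new Live RO) can be modelled roughly as follows. This is a minimal sketch, not the Wf4Ever implementation: the class, the function names, the `copied_from` field and the example URIs and file names are all hypothetical and chosen only to mirror the wording of the note.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class ResearchObject:
    uri: str                          # each RO is identified by a URI
    state: str                        # "live", "snapshot" or "archived"
    resources: frozenset = frozenset()
    copied_from: Optional[str] = None  # hypothetical link to the source RO


def add_resource(ro: ResearchObject, resource: str) -> ResearchObject:
    """Aggregate one more resource; only a Live RO may still change."""
    if ro.state != "live":
        raise ValueError("only a Live RO accepts new resources")
    return replace(ro, resources=ro.resources | {resource})


def snapshot(live: ResearchObject, uri: str) -> ResearchObject:
    """Copy the Live RO as it is at this point in time."""
    return ResearchObject(uri, "snapshot", live.resources, live.uri)


def archive(live: ResearchObject, uri: str,
            keep: Callable[[str], bool] = lambda r: True) -> ResearchObject:
    """Copy, filter and curate: publish only the resources that pass `keep`."""
    kept = frozenset(r for r in live.resources if keep(r))
    return ResearchObject(uri, "archived", kept, live.uri)


def reuse(archived: ResearchObject, uri: str) -> ResearchObject:
    """Start a new Live RO from an existing Archived RO."""
    return ResearchObject(uri, "live", archived.resources, archived.uri)


# Example run with made-up names.
live = ResearchObject("ro/live", "live")
live = add_resource(live, "workflow.t2flow")
live = add_resource(live, "notes-private.txt")
snap = snapshot(live, "ro/snapshot-1")
live = add_resource(live, "results.csv")
archived = archive(live, "ro/archived", keep=lambda r: "private" not in r)
successor = reuse(archived, "ro/live-2")
```

Making the objects immutable means a snapshot cannot drift after it is taken, which matches the note's description of a snapshot as the status of the research object at a given point in time.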