A New Approach for Publishing Workflows: Abstractions, Standards and Linked Data

Daniel Garijo
Ontology Engineering Group, Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid

Yolanda Gil
Information Sciences Institute, University of Southern California, Marina del Rey

Date: 14/11/2011
Index of contents

1.   Background

2.   Limitations of existing approaches to workflow publication

3.   Features of our approach

     •   Publishing abstract workflows and specific workflows

     •   OPMW Ontology

     •   Linked Data Publication

4.   Workflow querying and Linked Data consumption

5.   Conclusions




Background


Typical Published Article
•   Text: Narrative of method, software packages used
•   Data: Key datasets and figures/plots
•   NOT published, loosely recorded:
     – Software: scripted codes + manual steps + notes/emails

Reproducible Article (Weaver, GenePattern GRRD, etc.)
•   Text: Narrative of method, software packages used
•   Data: Key datasets and figures/plots
•   Workflow: Workflow/scripts describing dataflow, codes, and parameters

Current issues with existing publication approaches


Reproducible Article (Weaver, GenePattern GRRD, etc.)
•   Text: Narrative of method, software packages used
•   Data: Key datasets and figures/plots
•   Workflow: Workflow/scripts describing dataflow, codes, and parameters

Only the executable workflow is published:
1. Must have the same codes to re-execute the workflow, but:
     – Codes become unavailable
          •   E.g., eHiTS was proprietary and was replaced by AutoDock Vina
     – Different labs prefer different codes
          •   E.g., R vs. MATLAB
          •   E.g., visualization in Cytoscape vs. yEd
2. Must have the same workflow framework to re-execute the workflow
     – Must have R for Weaver
3. Must import files to the local file system and workflow framework
     – Must import the bundle of workflow/data/code files to reproduce



Key Features of our approach


•   Publish an abstract workflow in addition to the executable workflow
     – Description of the workflow that is independent of the codes executed
     – Maps to the codes executed (the “executable workflow”)

•   Publish both the abstract and the executable workflow using the OPM standard
     – OPM (Open Provenance Model) is independent of the workflow framework and is widely implemented
     – Other groups can import to their own workflow framework

•   Publish data and workflows as Linked Data on the Web
     – All workflows and related files are web-accessible
     – Simple mechanism to share across local file systems
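
The split between an abstract workflow and the executable workflow that maps to it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the step names ("Docking", "Visualization") and the `bind` helper are hypothetical and are not part of WINGS or OPM; the code names are the ones mentioned on the previous slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractStep:
    """A step described by what it does, not by the code that runs it."""
    name: str
    inputs: tuple
    outputs: tuple


@dataclass
class ExecutableStep:
    """One concrete realization of an abstract step, bound to a specific code."""
    abstract: AbstractStep
    code: str  # name of the code actually executed


def bind(steps, bindings):
    """Map each abstract step to the code chosen by a particular lab."""
    return [ExecutableStep(step, bindings[step.name]) for step in steps]


# Hypothetical abstract workflow with two steps.
docking = AbstractStep("Docking", ("ligands", "receptor"), ("scores",))
plotting = AbstractStep("Visualization", ("scores",), ("plot",))
abstract_workflow = [docking, plotting]

# Two labs realize the same abstract workflow with different codes.
lab_a = bind(abstract_workflow, {"Docking": "eHits", "Visualization": "Cytoscape"})
lab_b = bind(abstract_workflow, {"Docking": "AutodockVina", "Visualization": "yEd"})

for a, b in zip(lab_a, lab_b):
    print(f"{a.abstract.name}: {a.code} | {b.code}")
```

Both executable workflows point back to the same abstract steps, which is what lets a reader compare or re-run the method even when one of the codes is no longer available.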




What is Linked Data?


1. Use URIs as names for things.

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information.

4. Include links to other URIs.




              “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
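
As a rough illustration of the four principles above, the following Python sketch prepares an HTTP lookup of a resource URI and asks for an RDF description through content negotiation. The URI is a placeholder on example.org and the `build_lookup` helper is hypothetical; the request is only constructed, not sent.

```python
from urllib.parse import urlparse
from urllib.request import Request

# Hypothetical HTTP URI naming one workflow resource (principles 1 and 2).
RESOURCE_URI = "http://example.org/resource/ProcessInstance/executionNode1"


def build_lookup(uri, media_type="text/turtle"):
    """Prepare an HTTP GET that asks for an RDF description of `uri`."""
    if urlparse(uri).scheme not in ("http", "https"):
        raise ValueError("Linked Data names should be HTTP URIs")
    return Request(uri, headers={"Accept": media_type})


request = build_lookup(RESOURCE_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
# Passing `request` to urllib.request.urlopen would perform the lookup
# (principle 3); it is not called here because the URI is a placeholder.
```

The description returned by a real server would in turn contain links to other URIs (principle 4), which a client can follow in the same way.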

High level architecture

[Figure: high-level architecture. Three WINGS installations (on a local laptop, on a shared host, on a web server), each with components labeled Core, Portal, Workflow Template, Workflow Instance and OPM export, feed a Linked Data Publication. The publication is consumed by other workflow environments, by external apps through programmatic access, and by users through interactive browsing (Pubby frontend). Stages shown along the bottom: Wings workflow generation, OPM conversion, Publication, Share, Reuse.]
Publishing the abstract workflow

[Figure: the "Comparison of dissimilar protein structures" workflow]
OPMW Ontology

[Figure: OPMW ontology example linking an abstract workflow to an executable workflow]

Abstract Workflow
•   template1: opmw:WorkflowTemplate (an opmo:OPMGraph)
•   artifact1, outputArtifact1: opmw:ArtifactTemplate (an opmv:Artifact)
•   templateNode1: opmw:ProcessTemplate (an opmv:Process)
•   absComp1: ac:AbstractComponent
•   Properties shown: opmo:hasArtifact, opmo:hasProcess, opmv:used, opmv:wasGeneratedBy, opmw:hasTemplateComponent

Executable Workflow
•   execInput1, executionOutput1: opmw:ArtifactInstance (an opmv:Artifact)
•   executionNode1: opmw:ProcessInstance (an opmv:Process)
•   specComp1: ac:SpecificComponent
•   user1: opmv:Agent
•   account1: opmw:ExecutionAccount (an opmo:Account)
•   Properties shown: opmv:used, opmv:wasGeneratedBy, opmv:wasControlledBy, opmo:account, opmw:hasSpecificComponent

Links between the two
•   opmw:hasArtifactTemplate (between an artifact instance and an artifact template)
•   opmw:hasProcessTemplate (between a process instance and a process template)
•   opmw:hasWorkflowTemplate (between the execution and the workflow template)
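
The relationships in the OPMW diagram above can be written down as plain subject/property/object triples. The sketch below does this with Python tuples and prefixed names, without any RDF library. The node and property names come from the diagram; the direction of each arrow is read off the property names and is an assumption of this sketch, as are the helper functions `objects` and `abstract_view`.

```python
# Triples transcribed from the diagram as (subject, property, object).
# Arrow directions are an assumption inferred from the property names.
TRIPLES = [
    # Abstract workflow
    ("template1", "rdf:type", "opmw:WorkflowTemplate"),
    ("templateNode1", "rdf:type", "opmw:ProcessTemplate"),
    ("templateNode1", "opmv:used", "artifact1"),
    ("outputArtifact1", "opmv:wasGeneratedBy", "templateNode1"),
    ("templateNode1", "opmw:hasTemplateComponent", "absComp1"),
    # Executable workflow
    ("executionNode1", "rdf:type", "opmw:ProcessInstance"),
    ("executionNode1", "opmv:used", "execInput1"),
    ("executionOutput1", "opmv:wasGeneratedBy", "executionNode1"),
    ("executionNode1", "opmv:wasControlledBy", "user1"),
    ("executionNode1", "opmw:hasSpecificComponent", "specComp1"),
    # Links from the executable workflow back to the abstract one
    ("executionNode1", "opmw:hasProcessTemplate", "templateNode1"),
    ("execInput1", "opmw:hasArtifactTemplate", "artifact1"),
    ("executionOutput1", "opmw:hasArtifactTemplate", "outputArtifact1"),
]


def objects(subject, prop):
    """Return every object o such that (subject, prop, o) is a known triple."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


def abstract_view(process_instance):
    """Follow the template link to find the abstract step behind a run."""
    (template,) = objects(process_instance, "opmw:hasProcessTemplate")
    return {
        "template": template,
        "abstract_component": objects(template, "opmw:hasTemplateComponent"),
        "specific_component": objects(process_instance, "opmw:hasSpecificComponent"),
    }


print(abstract_view("executionNode1"))
```

Starting from an execution node, one hop along opmw:hasProcessTemplate reaches the code-independent description of the same step, which is the mapping the ontology is meant to capture.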




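Publication of Workflows as Linked Data

[Figure: Linked Data publication. Wings exports the abstract workflow (OPM) and the executable workflow (OPM) through an OPM conversion. These are loaded through an RDF upload interface into an RDF triple store exposed by a SPARQL endpoint, while workflow data, components, etc. go to a permanent web-accessible file store. The published content is web accessible from a web browser, and other workflow frameworks can use it through OPM conversion and OPM import.]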
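
Programmatic access to the published workflows would typically go through the SPARQL endpoint shown above. The following is a minimal sketch of how a client could encode a query as an HTTP GET URL, which is the standard SPARQL protocol form. The endpoint address and the opmw namespace IRI are placeholders (assumptions, not the project's real values), and the query itself is only an example built from class and property names in the OPMW diagram.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint and namespace; substitute the real ones.
ENDPOINT = "http://example.org/sparql"
OPMW_NS = "http://example.org/opmw#"

# List process instances together with the template each one maps to.
QUERY = f"""
PREFIX opmw: <{OPMW_NS}>
SELECT ?process ?template WHERE {{
  ?process a opmw:ProcessInstance ;
           opmw:hasProcessTemplate ?template .
}}
LIMIT 10
""".strip()


def query_url(endpoint, query):
    """Encode a query as an HTTP GET URL using the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = query_url(ENDPOINT, QUERY)
print(url)

# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(decoded == QUERY)
```

Fetching the resulting URL with any HTTP client would return the query results; that step is left out because the endpoint here is a placeholder.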



Searching/Browsing Workflows as Linked Data

[Figure: screenshot of the browsing interface, annotated with:]
•   Types of search
•   Autocomplete search bar
•   Resource URI (process instance)
•   Properties
•   Specific component for this process instance

Searching/Browsing Workflows as Linked Data

[Figure: screenshots of the browsing interface, annotated with:]
•   Component name
•   Component inputs
•   Component outputs
•   Code implementations
•   Template additional metadata
•   Record of the different executions of this workflow

Conclusions



1. Publication of an abstract workflow that represents the computational method in an
   execution-independent manner.




2. Publication of the abstract workflow and the executed workflow using the OPM
   standard that is independent of the execution environment used.




3. Publication of the workflows, components, codes and datasets as Linked Data on the
   web.




Future work



•   Extensions to abstract workflow publication
     – Provide abstractions over several steps.
     – Handle incomplete provenance.



•   Create an OPMV/W3C PROV-O profile for common workflow representation.
     – Increase interoperability with other workflow representation systems.



•   Workflow reuse in different workflow systems.
     – Import and execute workflows in other workflow frameworks.




References


•   WINGS workflow system: http://seagull.isi.edu/marbles/

•   The Open Provenance Model Specification: http://openprovenance.org/

•   OPMO: http://openprovenance.org/model/opmo

•   OPMV: http://open-biomed.sourceforge.net/opmv/ns.html

•   TB Drugome Wiki (evolution of this work): http://seagull.isi.edu/wings-drugome/index.php/Main_Page

•   W3C PROV-O current ontology (draft): http://www.w3.org/2011/prov/wiki/PIL_OWL_Ontology

•   Principles of Linked Data: http://www.w3.org/DesignIssues/LinkedData.html




Acknowledgements


•   UCSD people:
     – Li Xie
     – Lei Xie
     – Sarah Kinnings
     – Phil Bourne

•   ISI people:
     – Varun Ratnakar

•   OEG people:
     – Oscar Corcho



                                     15
A new Approach for Publishing
   Workflows: Abstractions,
  Standards and Linked Data

                       Daniel Garijo
 Ontology Engineering Group, Departamento de Inteligencia
        Artificial. Universidad Politécnica de Madrid

                         Yolanda Gil
             Information Sciences and Institute
      University of Southern California, Marina del Rey
                                                          Date: 14/11/2011

Weitere ähnliche Inhalte

Ähnlich wie WORKS 11 Presentation

Status update OEG - Nov 2012
Status update OEG - Nov 2012Status update OEG - Nov 2012
Status update OEG - Nov 2012dgarijo
 
Overview Of .Net 4.0 Sanjay Vyas
Overview Of .Net 4.0   Sanjay VyasOverview Of .Net 4.0   Sanjay Vyas
Overview Of .Net 4.0 Sanjay Vyasrsnarayanan
 
ISI work
ISI workISI work
ISI workdgarijo
 
MEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop MicrosoftMEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop MicrosoftLee Stott
 
OpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid InfrastructureOpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid Infrastructurerhirschfeld
 
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructuredevopsdaysaustin
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic WebNuxeo
 
What's new in Nuxeo 5.2? - Solutions Linux 2009
What's new in Nuxeo 5.2? - Solutions Linux 2009What's new in Nuxeo 5.2? - Solutions Linux 2009
What's new in Nuxeo 5.2? - Solutions Linux 2009Stefane Fermigier
 
Integrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsIntegrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsDamien Dallimore
 
(ATS3-DEV05) Coding up Pipeline Pilot Components
(ATS3-DEV05) Coding up Pipeline Pilot Components(ATS3-DEV05) Coding up Pipeline Pilot Components
(ATS3-DEV05) Coding up Pipeline Pilot ComponentsBIOVIA
 
Spring 3 - Der dritte Frühling
Spring 3 - Der dritte FrühlingSpring 3 - Der dritte Frühling
Spring 3 - Der dritte FrühlingThorsten Kamann
 
SharePoint 2010 as a Development Platform
SharePoint 2010 as a Development PlatformSharePoint 2010 as a Development Platform
SharePoint 2010 as a Development PlatformAyman El-Hattab
 
Spring MVC framework
Spring MVC frameworkSpring MVC framework
Spring MVC frameworkMohit Gupta
 
Splunk as a_big_data_platform_for_developers_spring_one2gx
Splunk as a_big_data_platform_for_developers_spring_one2gxSplunk as a_big_data_platform_for_developers_spring_one2gx
Splunk as a_big_data_platform_for_developers_spring_one2gxDamien Dallimore
 
Hadoop World 2011: Proven Tools to Manage Hadoop Environments - Joey Jablonsk...
Hadoop World 2011: Proven Tools to Manage Hadoop Environments - Joey Jablonsk...Hadoop World 2011: Proven Tools to Manage Hadoop Environments - Joey Jablonsk...
Hadoop World 2011: Proven Tools to Manage Hadoop Environments - Joey Jablonsk...Cloudera, Inc.
 
Java EE 7 with Apache Spark for the World’s Largest Credit Card Core Systems ...
Java EE 7 with Apache Spark for the World’s Largest Credit Card Core Systems ...Java EE 7 with Apache Spark for the World’s Largest Credit Card Core Systems ...
Java EE 7 with Apache Spark for the World’s Largest Credit Card Core Systems ...Hirofumi Iwasaki
 
An Integrated Framework for Parameter-based Optimization of Scientific Workflows
An Integrated Framework for Parameter-based Optimization of Scientific WorkflowsAn Integrated Framework for Parameter-based Optimization of Scientific Workflows
An Integrated Framework for Parameter-based Optimization of Scientific Workflowsvijayskumar
 
A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.
A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.
A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.J On The Beach
 
Building and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache AirflowBuilding and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache AirflowKaxil Naik
 
HPC Web overview - Mobyle Workshop - September 28, 2012
HPC Web overview - Mobyle Workshop - September 28, 2012HPC Web overview - Mobyle Workshop - September 28, 2012
HPC Web overview - Mobyle Workshop - September 28, 2012Hervé Ménager
 

Ähnlich wie WORKS 11 Presentation (20)

Status update OEG - Nov 2012
Status update OEG - Nov 2012Status update OEG - Nov 2012
Status update OEG - Nov 2012
 
Overview Of .Net 4.0 Sanjay Vyas
Overview Of .Net 4.0   Sanjay VyasOverview Of .Net 4.0   Sanjay Vyas
Overview Of .Net 4.0 Sanjay Vyas
 
ISI work
ISI workISI work
ISI work
 
MEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop MicrosoftMEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop Microsoft
 
OpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid InfrastructureOpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid Infrastructure
 
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic Web
 
What's new in Nuxeo 5.2? - Solutions Linux 2009
What's new in Nuxeo 5.2? - Solutions Linux 2009What's new in Nuxeo 5.2? - Solutions Linux 2009
What's new in Nuxeo 5.2? - Solutions Linux 2009
 
Integrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsIntegrating Splunk into your Spring Applications
Integrating Splunk into your Spring Applications
 
(ATS3-DEV05) Coding up Pipeline Pilot Components
(ATS3-DEV05) Coding up Pipeline Pilot Components(ATS3-DEV05) Coding up Pipeline Pilot Components
(ATS3-DEV05) Coding up Pipeline Pilot Components
 
Spring 3 - Der dritte Frühling
Spring 3 - Der dritte FrühlingSpring 3 - Der dritte Frühling
Spring 3 - Der dritte Frühling
 
SharePoint 2010 as a Development Platform
SharePoint 2010 as a Development PlatformSharePoint 2010 as a Development Platform
SharePoint 2010 as a Development Platform
 
Spring MVC framework
Spring MVC frameworkSpring MVC framework
Spring MVC framework
 
Splunk as a_big_data_platform_for_developers_spring_one2gx
Splunk as a_big_data_platform_for_developers_spring_one2gxSplunk as a_big_data_platform_for_developers_spring_one2gx
Splunk as a_big_data_platform_for_developers_spring_one2gx
 
Hadoop World 2011: Proven Tools to Manage Hadoop Environments - Joey Jablonsk...
Hadoop World 2011: Proven Tools to Manage Hadoop Environments - Joey Jablonsk...Hadoop World 2011: Proven Tools to Manage Hadoop Environments - Joey Jablonsk...
Hadoop World 2011: Proven Tools to Manage Hadoop Environments - Joey Jablonsk...
 
Java EE 7 with Apache Spark for the World’s Largest Credit Card Core Systems ...
Java EE 7 with Apache Spark for the World’s Largest Credit Card Core Systems ...Java EE 7 with Apache Spark for the World’s Largest Credit Card Core Systems ...
Java EE 7 with Apache Spark for the World’s Largest Credit Card Core Systems ...
 
An Integrated Framework for Parameter-based Optimization of Scientific Workflows
An Integrated Framework for Parameter-based Optimization of Scientific WorkflowsAn Integrated Framework for Parameter-based Optimization of Scientific Workflows
An Integrated Framework for Parameter-based Optimization of Scientific Workflows
 
A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.
A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.
A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.
 
Building and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache AirflowBuilding and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache Airflow
 
HPC Web overview - Mobyle Workshop - September 28, 2012
HPC Web overview - Mobyle Workshop - September 28, 2012HPC Web overview - Mobyle Workshop - September 28, 2012
HPC Web overview - Mobyle Workshop - September 28, 2012
 

Mehr von dgarijo

FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesFOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesdgarijo
 
FAIR Workflows: A step closer to the Scientific Paper of the Future
FAIR Workflows: A step closer to the Scientific Paper of the FutureFAIR Workflows: A step closer to the Scientific Paper of the Future
FAIR Workflows: A step closer to the Scientific Paper of the Futuredgarijo
 
Towards Reusable Research Software
Towards Reusable Research SoftwareTowards Reusable Research Software
Towards Reusable Research Softwaredgarijo
 
SOMEF: a metadata extraction framework from software documentation
SOMEF: a metadata extraction framework from software documentationSOMEF: a metadata extraction framework from software documentation
SOMEF: a metadata extraction framework from software documentationdgarijo
 
A Template-Based Approach for Annotating Long-Tailed Datasets
A Template-Based Approach for Annotating Long-Tailed DatasetsA Template-Based Approach for Annotating Long-Tailed Datasets
A Template-Based Approach for Annotating Long-Tailed Datasetsdgarijo
 
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge GraphsOBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphsdgarijo
 
Towards Knowledge Graphs of Reusable Research Software Metadata
Towards Knowledge Graphs of Reusable Research Software MetadataTowards Knowledge Graphs of Reusable Research Software Metadata
Towards Knowledge Graphs of Reusable Research Software Metadatadgarijo
 
Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...dgarijo
 
WDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular DataWDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular Datadgarijo
 
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...dgarijo
 
Towards Human-Guided Machine Learning - IUI 2019
Towards Human-Guided Machine Learning - IUI 2019Towards Human-Guided Machine Learning - IUI 2019
Towards Human-Guided Machine Learning - IUI 2019dgarijo
 
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven ScienceCapturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven Sciencedgarijo
 
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...dgarijo
 
WIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting OntologiesWIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting Ontologiesdgarijo
 
Towards Automating Data Narratives
Towards Automating Data NarrativesTowards Automating Data Narratives
Towards Automating Data Narrativesdgarijo
 
Automated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific WorkflowsAutomated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific Workflowsdgarijo
 
OntoSoft: A Distributed Semantic Registry for Scientific Software
OntoSoft: A Distributed Semantic Registry for Scientific SoftwareOntoSoft: A Distributed Semantic Registry for Scientific Software
OntoSoft: A Distributed Semantic Registry for Scientific Softwaredgarijo
 
OEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology EngineeringOEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology Engineeringdgarijo
 
Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesdgarijo
 
Reproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An OverviewReproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An Overviewdgarijo
 

Mehr von dgarijo (20)

FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesFOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
 
FAIR Workflows: A step closer to the Scientific Paper of the Future
FAIR Workflows: A step closer to the Scientific Paper of the FutureFAIR Workflows: A step closer to the Scientific Paper of the Future
FAIR Workflows: A step closer to the Scientific Paper of the Future
 
Towards Reusable Research Software
Towards Reusable Research SoftwareTowards Reusable Research Software
Towards Reusable Research Software
 
SOMEF: a metadata extraction framework from software documentation
SOMEF: a metadata extraction framework from software documentationSOMEF: a metadata extraction framework from software documentation
SOMEF: a metadata extraction framework from software documentation
 
A Template-Based Approach for Annotating Long-Tailed Datasets
A Template-Based Approach for Annotating Long-Tailed DatasetsA Template-Based Approach for Annotating Long-Tailed Datasets
A Template-Based Approach for Annotating Long-Tailed Datasets
 
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge GraphsOBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
 
Towards Knowledge Graphs of Reusable Research Software Metadata
Towards Knowledge Graphs of Reusable Research Software MetadataTowards Knowledge Graphs of Reusable Research Software Metadata
Towards Knowledge Graphs of Reusable Research Software Metadata
 
Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...
 
WDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular DataWDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular Data
 
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
WORKS 11 Presentation

  • 1. A new Approach for Publishing Workflows: Abstractions, Standards and Linked Data
      Daniel Garijo, Ontology Engineering Group, Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid
      Yolanda Gil, Information Sciences Institute, University of Southern California, Marina del Rey
      Date: 14/11/2011
  • 2. Index of contents
      1. Background
      2. Limitations of existing approaches to workflow publication
      3. Features of our approach
          • Publishing abstract workflows and specific workflows
          • OPMW Ontology
          • Linked Data Publication
      4. Workflow querying and Linked Data consumption
      5. Conclusions
  • 3. Background
      Typical Published Article:
          – Text: narrative of method, software packages used
          – Data: key datasets and figures/plots
          – NOT published, loosely recorded: Software (scripted codes + manual steps + notes/emails)
      Reproducible Article (Weaver, GenePattern GRRD, etc.):
          – Text: narrative of method, software packages used
          – Data: key datasets and figures/plots
          – Workflow: workflow/scripts describing dataflow, codes, and parameters
  • 4. Current issues with existing publication approaches
      Reproducible Article (Weaver, GenePattern GRRD, etc.):
          – Text: narrative of method, software packages used
          – Data: key datasets and figures/plots
          – Workflow: workflow/scripts describing dataflow, codes, and parameters
      Only the executable workflow is published:
      1. Must have the same codes to re-execute the workflow, but:
          – Codes become unavailable
              • E.g.: eHits was proprietary and was replaced by AutoDock Vina
          – Different labs prefer different codes
              • E.g.: R vs Matlab
              • E.g.: visualization in Cytoscape vs yEd
      2. Must have the same workflow framework to re-execute the workflow
          – Must have R for Weaver
      3. Must import files to the local file system and workflow framework
          – Must import a bundle of workflow/data/code files to reproduce
  • 5. Key Features of our approach
      • Publish an abstract workflow in addition to the executable workflow
          – Description of the workflow that is independent of the codes executed
          – Maps to the codes executed (the “executable workflow”)
      • Publish both abstract and executable workflow using the OPM standard
          – OPM (Open Provenance Model) is independent of the workflow framework and is widely implemented
          – Other groups can import to their own workflow framework
      • Publish data and workflows as Linked Data on the Web
          – All workflows and related files are web-accessible
          – Simple mechanism to share across local file systems
  • 6. What is Linked Data
      1. Use URIs as names for things.
      2. Use HTTP URIs so that people can look up those names.
      3. When someone looks up a URI, provide useful information.
      4. Include links to other URIs.
      [Figure: “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”]
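
The four principles above can be illustrated with a minimal Python sketch. It is not taken from the presentation: the base URI `http://example.org/export/resource/`, the helper names, and the choice of Turtle as the requested format are all assumptions made for illustration. The lookup request is only constructed, never sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical base URI; the slides do not state the real one.
BASE = "http://example.org/export/resource/"


def resource_uri(kind: str, local_name: str) -> str:
    """Principles 1 and 2: name each thing with an HTTP URI."""
    return f"{BASE}{quote(kind)}/{quote(local_name)}"


def lookup_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Principle 3: prepare a lookup that asks for RDF through the
    HTTP Accept header. The request is built here but not sent."""
    return Request(uri, headers={"Accept": media_type})


def link(subject: str, predicate: str, obj: str) -> str:
    """Principle 4: state a link to another URI as one N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


process = resource_uri("ProcessInstance", "executionNode1")
template = resource_uri("ProcessTemplate", "templateNode1")
request = lookup_request(process)
# The predicate URI below is a placeholder, not a term from the slides.
triple = link(process, "http://example.org/vocab/relatedTo", template)
```

Because every resource has its own HTTP URI, a browser or script only needs that URI to retrieve the description and follow its links onward.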
  • 7. High level architecture
      [Architecture diagram. Three WINGS installations (on a local laptop, on a shared host, and on a web server) each contain a workflow template, a workflow instance, a core, a portal, and an OPM export. The exports feed the Linked Data publication, which is consumed by other workflow environments, by external apps through programmatic access, and by users through interactive browsing (Pubby frontend). Stages along the bottom: Wings workflow generation, OPM conversion, Publication, Share/Reuse.]
  • 8. Publishing the abstract workflow
      [Figure: the “Comparison of Dissimilar protein structures” workflow]
  • 9. OPMW Ontology
      [Diagram: the abstract workflow (left) and the executable workflow (right), described with OPMW terms on top of OPMV and OPMO.]
      – Abstract workflow: opmw:WorkflowTemplate (template1), opmw:ProcessTemplate (templateNode1), opmw:ArtifactTemplate (artifact1, outputArtifact1), ac:AbstractComponent (absComp1)
      – Executable workflow: opmw:ExecutionAccount (account1), opmw:ProcessInstance (executionNode1), opmw:ArtifactInstance (execInput1, executionOutput1), ac:SpecificComponent (specComp1), opmv:Agent (user1)
      – Properties connecting the two sides: opmw:hasWorkflowTemplate, opmw:hasProcessTemplate, opmw:hasArtifactTemplate
      – Other properties shown: opmv:used, opmv:wasGeneratedBy, opmv:wasControlledBy, opmw:hasTemplateComponent, opmw:hasSpecificComponent, opmo:account, opmo:hasArtifact, opmo:hasProcess
      – Base classes shown: opmv:Artifact, opmv:Process, opmo:OPMGraph, opmo:Account
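
To make the diagram more concrete, the following Python sketch writes a few of its nodes and edges as RDF triples and serializes them as N-Triples. It is an illustration under several assumptions: the namespace URIs for `opmw:` and `opmv:`, the `http://example.org/resource/` base for instances, and the direction of each edge (from the execution side to the template side) are not stated on the slide.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OPMW = "http://www.opmw.org/ontology/"   # assumed namespace URI
OPMV = "http://purl.org/net/opmv/ns#"    # assumed namespace URI
EX = "http://example.org/resource/"      # hypothetical instance base

TEMPLATE_LINKS = {
    OPMW + "hasProcessTemplate",
    OPMW + "hasArtifactTemplate",
    OPMW + "hasWorkflowTemplate",
}


def describe_execution():
    """Return (subject, predicate, object) triples for one execution
    step and its template, using the node names from the diagram."""
    return [
        # Abstract workflow side
        (EX + "template1", RDF_TYPE, OPMW + "WorkflowTemplate"),
        (EX + "templateNode1", RDF_TYPE, OPMW + "ProcessTemplate"),
        (EX + "artifact1", RDF_TYPE, OPMW + "ArtifactTemplate"),
        # Executable workflow side
        (EX + "account1", RDF_TYPE, OPMW + "ExecutionAccount"),
        (EX + "executionNode1", RDF_TYPE, OPMW + "ProcessInstance"),
        (EX + "execInput1", RDF_TYPE, OPMW + "ArtifactInstance"),
        # Provenance of the execution (OPMV properties)
        (EX + "executionNode1", OPMV + "used", EX + "execInput1"),
        (EX + "executionOutput1", OPMV + "wasGeneratedBy", EX + "executionNode1"),
        (EX + "executionNode1", OPMV + "wasControlledBy", EX + "user1"),
        # Links from the execution to the abstract workflow (assumed direction)
        (EX + "account1", OPMW + "hasWorkflowTemplate", EX + "template1"),
        (EX + "executionNode1", OPMW + "hasProcessTemplate", EX + "templateNode1"),
        (EX + "execInput1", OPMW + "hasArtifactTemplate", EX + "artifact1"),
    ]


def template_of(triples, instance_uri):
    """Find the template that an execution-side resource points to."""
    for s, p, o in triples:
        if s == instance_uri and p in TEMPLATE_LINKS:
            return o
    return None


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = describe_execution()
ntriples = to_ntriples(triples)
```

Keeping the template links as ordinary triples means a consumer can move from any execution trace to the method it instantiates without knowing which workflow system produced it.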
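  • 10. Publication of Workflows as Linked Data
      [Diagram: Wings passes the abstract workflow and the executable workflow through an OPM conversion step, producing RDF (OPM) for each. Other workflow frameworks can contribute through their own OPM conversion. The Linked Data publication step takes the RDF through an upload interface and an OPM import into a permanent, web-accessible triple store with a SPARQL endpoint. Data, components, etc. are kept in a web-accessible file store. Both are reachable from a web browser.]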
  • 11. Searching/Browsing Workflows as Linked Data
      [Screenshots with the following callouts:]
      – Types of search
      – Autocomplete search bar
      – Resource URI (process instance)
      – Specific component for this process instance
      – Properties
  • 12. Searching/Browsing Workflows as Linked Data
      [Screenshots with the following callouts:]
      – Component name
      – Component inputs
      – Component outputs
      – Code implementations
      – Template additional metadata
      – Record of the different executions of this workflow
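
Besides interactive browsing, the SPARQL endpoint from slide 10 allows the same "record of the different executions" to be requested programmatically. The sketch below only builds the query and the GET URL defined by the SPARQL protocol; it does not contact any server. The endpoint address, the template URI, and the `opmw:` namespace URI are assumptions, and the direction of `opmw:hasWorkflowTemplate` (from execution account to template) is inferred from the property name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://example.org/sparql"   # hypothetical endpoint address
OPMW = "http://www.opmw.org/ontology/"   # assumed namespace URI


def executions_query(template_uri: str) -> str:
    """Build a SPARQL query listing the execution accounts that
    point to the given workflow template."""
    return (
        f"PREFIX opmw: <{OPMW}>\n"
        "SELECT ?account WHERE {\n"
        f"  ?account opmw:hasWorkflowTemplate <{template_uri}> .\n"
        "}"
    )


def query_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of an HTTP GET,
    as the SPARQL protocol specifies."""
    return endpoint + "?" + urlencode({"query": query})


query = executions_query("http://example.org/resource/template1")
url = query_url(ENDPOINT, query)
```

An external application could fetch `url` with any HTTP client and request a result format such as JSON through the Accept header.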
  • 13. Conclusions
      1. Publication of an abstract workflow that represents the computational method in an execution-independent manner.
      2. Publication of the abstract workflow and the executed workflow using the OPM standard, which is independent of the execution environment used.
      3. Publication of the workflows, components, codes and datasets as Linked Data on the web.
  • 14. Future work
      • Extensions to abstract workflow publication
          – Be able to provide abstractions on several steps.
          – Incomplete provenance.
      • Create an OPMV/W3C PROV-O profile for common workflow representation.
          – Increase interoperability with other workflow representation systems.
      • Workflow reuse in different workflow systems.
          – Import and execute workflows in other workflow frameworks.
  • 15. References
      • WINGS workflow system: http://seagull.isi.edu/marbles/
      • The Open Provenance Model Specification: http://openprovenance.org/
      • OPMO: http://openprovenance.org/model/opmo
      • OPMV: http://open-biomed.sourceforge.net/opmv/ns.html
      • TB Drugome Wiki (evolution of this work): http://seagull.isi.edu/wings-drugome/index.php/Main_Page
      • W3C PROV-O current ontology (draft): http://www.w3.org/2011/prov/wiki/PIL_OWL_Ontology
      • Principles of Linked Data: http://www.w3.org/DesignIssues/LinkedData.html
  • 16. Acknowledgements
      • UCSD people: Li Xie, Lei Xie, Sarah Kinnings, Phil Bourne
      • ISI people: Varun Ratnakar
      • OEG people: Oscar Corcho