FAIR Computational
Workflows
Professor Carole Goble
The University of Manchester UK
EU Research Infrastructures ELIXIR, IBISBA, EOSC-Life
BioExcel Centre of Excellence
Software Sustainability Institute UK
FAIRDOM Consortium
carole.goble@manchester.ac.uk
SERC Swedish e-Science Research Center Annual Meeting 13 May 2022
What is a Computational Workflow?
Multi-step processes for data analytics, data
processing pipelines and simulation sweeps
Linking computational steps
• Data flow between steps
• Control flow of steps
Handle data and processing dependencies
Operating over computational infrastructure
Come in many different flavours
Drug Discovery
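The idea of steps linked by data flow and executed in dependency order can be sketched in a few lines of Python. This is only an illustration, not how any particular WfMS works: the step names (`fetch`, `clean`, `summarise`), their bodies and the `run_workflow` helper are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical steps: each receives a dict of upstream outputs keyed by step name.
def fetch(_inputs):
    return [3, 1, 2]

def clean(inputs):
    return sorted(inputs["fetch"])

def summarise(inputs):
    data = inputs["clean"]
    return {"n": len(data), "max": data[-1]}

STEPS = {"fetch": fetch, "clean": clean, "summarise": summarise}

# Data dependencies: each step maps to the steps whose outputs it consumes.
DEPENDS_ON = {"fetch": set(), "clean": {"fetch"}, "summarise": {"clean"}}

def run_workflow(steps, depends_on):
    """Run every step once, after all of the steps it depends on."""
    results = {}
    order = []
    for name in TopologicalSorter(depends_on).static_order():
        inputs = {dep: results[dep] for dep in depends_on[name]}
        results[name] = steps[name](inputs)
        order.append(name)
    return order, results

order, results = run_workflow(STEPS, DEPENDS_ON)
print(order)
print(results["summarise"])
```

A real workflow system adds what the sketch leaves out: scheduling over infrastructure, containers, retries and provenance capture.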
What is a Computational Workflow?
Encoded method less dependent on implementations
inputs
outputs
tools, CLI,
containers,
workflows
Precise
specification
Software
Execution
WfMS
Engine
Workflow
Abstraction
Access to computational infrastructure and datasets,
tool interoperability, processing portability and
optimisation, data wrangling.
Composition
different
codes,
languages,
third parties
What is a Computational Workflow?
Inter-twingled mix and matching
Scripting
environments
Interactive Electronic
Research Notebooks
Workflow
Management Systems &
execution platforms
https://s.apache.org/existing-workflow-systems
300+ Systems
General and
Specialised
Interactive & exploratory
analysis with Human in the
Loop
Production, automated,
workflow-integrated
software
Tool chaining,
Batch processing,
Job Control
What is a Computational Workflow System?
From frameworks to web based analysis platforms, hybrid cloud deployment
Graph of jobs for automatic parallelisation,
DIY package & containerisation installation,
auto-documentation
Online portals where users build
and reuse workflows
around publicly available or
user-uploaded data and
pre-wrapped, pre-installed
tools.
Communities cluster
around systems
Typically depends on:
• Support for specific data
types
• Support for specific codes
• Support for kinds of
workflow
• Skills level of workflow
developer
• Popularity
Why Computational Workflows?
prepare, analyze, share increasing volumes of complex data
CryoEM Image Analysis
Metagenomic Pipelines
Drug Discovery
Protein Ligand MD
Simulation
Genome Annotation
High Throughput Sequencing
[Fabrice Allain, JOBIM2021]
[Romain Dallet, JOBIM2021]
[Adam Hospital]
[Rob Finn]
[Carlos Oscar Sorzano Sanchez]
Why Computational Workflows?
data collection and model simulation
SERC: Data-driven computational
materials design, DCMD
Automatic workflow, data collection, and
development of open-data infrastructure
Why Computational Workflows?
SARS-CoV-2 allelic-variant surveillance
Automated repetitive monitoring of structured
data from the European COVID-19 Data Portal and
national SARS-CoV-2 sequencing datasets.
Scalable - global distributed PULSAR compute
network
• Improved data quality
• Uniformly analysed data for downstream
analysis & visualisation
• Submission of data to public archives
Ported tried and tested transparent methods
• EMERGEN, French SARS-CoV-2 genomic
surveillance
https://covid19.galaxyproject.org
Why Computational Workflow Systems?
Abstraction &
Composition
Automation
Scalability &
Infrastructure
Access
Reporting &
Accreditation
Portability
Sharing &
Adaptability
Reusable Research Objects
Computable Research Objects
Why Computational Workflow Systems?
Reproducibility
Regulation
Transparency
Documented Method
Labour saving
Productivity
Reliability
Sustained
Knowledge sharing
Scholarly Objects
Pool of know-how
Reuse, repurpose
Variant-based
Democratisation
of computational
analysis & methods
Upfront cost for downstream benefits
Benefits best when a community buys in and workflows are supported
Why FAIR Computational Workflows?
The FAIR Data Principles
RDA FAIR Data Maturity Model. Specification and Guidelines
https://zenodo.org/record/3909563#.YORYkUzTX19
Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The
FAIR Guiding Principles for scientific data management
and stewardship. Sci Data 3, 160018 (2016).
https://doi.org/10.1038/sdata.2016.18
https://www.go-fair.org/fair-principles/
Why FAIR Computational Workflows?
The FAIR Data Principles
Enable automation
• Persistent human readable and machine-actionable linked metadata
• Community standards
• Persistent identifiers
• Licensing and access rules
• Access protocols
• Register/index/search
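Machine-actionable metadata is what makes such checks automatable. As a rough sketch, the bullets above could be turned into a completeness check over a workflow's metadata record. The field names in `REQUIRED_FIELDS` and the example record are assumptions made for illustration; they are not taken from the FAIR principles or the RDA maturity model.

```python
# Hypothetical field names, loosely mirroring the bullets above:
# persistent identifier, licence, access protocol, descriptive metadata.
REQUIRED_FIELDS = ("identifier", "license", "authors", "description", "access_url")

def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]

# Illustrative record; the URLs are placeholders, not real identifiers.
record = {
    "identifier": "https://example.org/workflows/1",
    "license": "https://spdx.org/licenses/Apache-2.0",
    "authors": ["A. Researcher"],
    "description": "",
}

print(missing_fields(record))
```

A registry or CI job could run a check like this on every submission and report the gaps back to the workflow developer.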
Why FAIR Computational Workflows?
Developer/User viewpoints
How can I find already existing workflows?
Can I access them? Public or private? Git repository?
What language is it written in?
Can I rework it to use my tool?
Is it well enough described so I can understand it?
Can I use it?
Can I reuse it in our infrastructure?
Does it make FAIR data?
Will I get credit for it?
Can I track that credit?
How easy is it to be FAIR?
What are the FAIR Principles for Workflows?
Hybrid Processual Digital Objects
FAIR Method Objects
FAIR Software Objects
FAIR Data
In and Out
FAIR Enabling
Services
C. Goble, S. Cohen-Boulakia, S.
Soiland-Reyes, D. Garijo, Y. Gil, M.R.
Crusoe, K. Peters & D. Schober. FAIR
computational workflows. Data
Intelligence 2(2020), 108–121.
https://doi.org/10.1162/dint_a_00033
What are FAIR Principles for Workflows?
Hybrid Processual Digital Objects
Method “Data” Objects
Workflows as
FAIR Software
FAIR+R and FAIR++
Quality, maturity, maintainability
The principles revised
Workflows as
FAIR Digital Objects
Data-like method objects
Associated objects
The principles adapted
Workflows as
FAIR Data Instruments
FAIRification of the dataflow
The data principles supported
Workflow Objects
Software Objects
Data FAIRification
FAIR enabling services
Services
FAIR for Research Software
Processual Digital Objects
https://www.rd-alliance.org/groups/fair-4-researchsoftware-fair4rs-wg
FAIR for Research Software (FAIR4RS)
working group
Katz, et al PATTERNS 2, 2021
https://fairsharing.org/4100
FAIR Principles for Workflows
Hybrid Processual Digital Objects
Usable and Reusable
Living & reusable parts
versioned, forked, cloned
parts recycled
limited lifespans
citable credit
executability, reproducibility,
portability
testing, maturity
quality, maintainability
FAIR+R FAIR++
Composition & agency
Abstractions
specification
implementation
instantiation
run result
modularisation
FAIR parts & dependencies
propagation of FAIR properties
Findable
Search engine supported
Public, private & DOI support
Different workflow languages
and systems
Git integration for repos
Versioning & snapshots
Described by metadata
licensing
authors
& credit
analytics
access
search
versions & status
other
workflows
200+
Workflows
90+
Teams
10+
different
systems
What Workflow Metadata?
Metadata for machines & people
Common metadata
about the workflow,
tools & parameters
Canonical workflow
description of the
steps of the workflow
Type the input and
outputs of the steps
Run Provenance
RO-Crate format for
packaging a workflow,
its metadata and
companion objects
(links to containers,
data etc) for exchange,
archiving, reporting,
citing.
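To give a feel for what such a package contains, the sketch below builds a minimal `ro-crate-metadata.json` as a Python dictionary. It follows the general shape of RO-Crate 1.1 (a metadata descriptor, a root dataset and a main workflow entity) as an approximation only. The `make_workflow_crate` helper, the file name `workflow.cwl` and the example values are hypothetical, and the result has not been validated against the Workflow RO-Crate profile.

```python
import json

def make_workflow_crate(workflow_file, name, license_url, language_name):
    """Build a minimal RO-Crate-style JSON-LD document describing one workflow."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {   # Metadata descriptor: says which entity is the crate root.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {   # Root dataset: the package itself, pointing at the workflow.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "license": {"@id": license_url},
                "hasPart": [{"@id": workflow_file}],
                "mainEntity": {"@id": workflow_file},
            },
            {   # The workflow file, typed so machines can recognise it.
                "@id": workflow_file,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": name,
                "programmingLanguage": {"@id": "#language"},
            },
            {"@id": "#language", "@type": "ComputerLanguage", "name": language_name},
        ],
    }

# Example values are placeholders for illustration.
crate = make_workflow_crate(
    "workflow.cwl",
    "Example workflow",
    "https://spdx.org/licenses/MIT",
    "Common Workflow Language",
)
print(json.dumps(crate, indent=2))
```

Companion objects such as test data, container references or run provenance would be added as further entities in `@graph` and listed under `hasPart`.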
FAIR Digital
Object
Open Communities
https://youtu.be/Rsuxn0m4bIM
Accessible (1)
Tool Registry
Service API
Accessible (2)
A2. metadata are accessible, even when the workflow is no longer available
Enough metadata that a workflow is read-reproducible as a method description if it no longer runs
Metadata preservation belts
and braces
republish in another archive
Interoperable (1 & 2)
WfMS interoperability: describe
workflows independently of WfMS.
Platform independent pipeline
exchange and comparison.
Workflow Composability: Software interoperates
through APIs and metadata standards (FAIR4RS).
Workflow-ready tools.
Tested & validated canonical workflow blocks.
https://openwdl.org/
https://www.commonwl.org
Design for FAIR
Workflow Reuse
Licence
combinations
Access permissions
Clean interfaces
BioExcel Building Blocks
biomolecular simulation tools
https://workflowhub.eu/projects/11
Reusable and Usable
Composability + Associated Objects + Metadata + FAIR Services
Reusable – “can be understood, modified, built upon or incorporated into other workflows”
Usable – “can be executed”
Containers & Packaging
Testing & monitoring
checker workflows
test data
https://crs4.github.io/life_monitor/
https://openebench.bsc.es/dashboard
Is a workflow reusable if it’s
• resource greedy
• needs special resources
• needs unavailable data
• cannot be ported or run by
anyone other than the
developers?
Data FAIRification & FAIR Data by Design
Metadata generated for data products, Assisted by WfMS and tools
Reviewing
Curation
Certification
Governance
Best Practice
Golden
Examples
Canonical
workflows
Design for
FAIR Data
and Reuse
nf-core
FAIR Workflow Services
EOSC-Life Collaboratory
EOSC-Life https://www.eosc-life.eu/
How can we Workflow FAIR Assist?
Workflow
developers
Tool and
data set
providers
Workflow readiness
FAIR Unit Testing
Brack, et al (2021). 10 Simple Rules for
making a software tool workflow-ready.
https://doi.org/10.5281/zenodo.5636487
Descriptions
Register in WorkflowHub
Best Practice
WfMS
platforms
Programmatic access
Automate FAIRness
FAIR Software
FAIR enabling Service
Use well documented
FAIR enabling and
FAIR workflows
credit the makers!
Users
Building Communities of Practice
where do I start? finding your buddies
Summary: FAIR Computational Workflows
Hybrid Processual Digital Objects FAIR takes a village*
*Borgman, C. L., & Bourne, P. E. (2021). Why it takes a village to manage and share data. Harvard Data Science Review (under Review), arXiv:2109.01694v1.
Building Communities
of Practice
Acknowledgements
The WorkflowHub Club, Bioschemas Community, RO-Crate
Community, CWL Community, Galaxy Europe, EOSC-Life
and ELIXIR Tools Platform.
Special Thanks
Stian Soiland-Reyes (U Manchester / U Amsterdam)
Paul Brack, Stuart Owen, Finn Bacall, Alan Williams (U Manchester)
Björn Grüning (U Freiburg)
Frederik Coppens (VIB)
Sarah Jones (GEANT)
Herve Menager (Pasteur Institute)
Sarah Cohen-Boulakia (U Paris Saclay)
Dan Katz (U Illinois Urbana-Champaign)
Simone Leo (CRS4)
Laura Rodriguez-Navas (BSC)
José Mª Fernández (BSC)
Workflow Community Initiative https://workflows.community/about
EOSC-Life https://www.eosc-life.eu/
ELIXIR http://elixir-europe.org
RO-Crate https://www.researchobject.org/ro-crate/
WorkflowHub https://workflowhub.eu/ and workflowhub.org
Galaxy Europe https://galaxyproject.eu/
Bioschemas https://bioschemas.org
Common Workflow Language https://www.commonwl.org/
Life Monitor https://crs4.github.io/life_monitor/

More Related Content

What's hot

Approaching Data Quality
Approaching Data QualityApproaching Data Quality
Approaching Data QualityDATAVERSITY
 
Session 1 - Introduction to i4Trust Data Spaces, building blocks, and roles |...
Session 1 - Introduction to i4Trust Data Spaces, building blocks, and roles |...Session 1 - Introduction to i4Trust Data Spaces, building blocks, and roles |...
Session 1 - Introduction to i4Trust Data Spaces, building blocks, and roles |...FIWARE
 
Architecture as Linked Data
Architecture as Linked DataArchitecture as Linked Data
Architecture as Linked DataDanny Greefhorst
 
Internet of things (IOT)
Internet of things (IOT)Internet of things (IOT)
Internet of things (IOT)Oshin Kandpal
 
Data Management - a top Priority for Healthcare Practices
Data Management - a top Priority for Healthcare PracticesData Management - a top Priority for Healthcare Practices
Data Management - a top Priority for Healthcare PracticesData Dynamics Inc
 
IoT and Big Data
IoT and Big DataIoT and Big Data
IoT and Big Datasabnees
 
Information retrieval 14 fuzzy set models of ir
Information retrieval 14 fuzzy set models of irInformation retrieval 14 fuzzy set models of ir
Information retrieval 14 fuzzy set models of irVaibhav Khanna
 
IoT 2019 overview
IoT 2019 overviewIoT 2019 overview
IoT 2019 overviewengIT
 
Internet of things (IoT)
Internet of things (IoT)Internet of things (IoT)
Internet of things (IoT)Ameer Sameer
 
Iot presentation
Iot presentationIot presentation
Iot presentationhuma742446
 
IOT - internet of Things - August 2017
IOT - internet of Things - August 2017IOT - internet of Things - August 2017
IOT - internet of Things - August 2017paul young cpa, cga
 
FIWARE Wednesday Webinars - FIWARE Overview
FIWARE Wednesday Webinars - FIWARE OverviewFIWARE Wednesday Webinars - FIWARE Overview
FIWARE Wednesday Webinars - FIWARE OverviewFIWARE
 
What it means to be FAIR
What it means to be FAIRWhat it means to be FAIR
What it means to be FAIRSarah Jones
 
a shift in our research focus: from knowledge acquisition to knowledge augmen...
a shift in our research focus: from knowledge acquisition to knowledge augmen...a shift in our research focus: from knowledge acquisition to knowledge augmen...
a shift in our research focus: from knowledge acquisition to knowledge augmen...Fabien Gandon
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Edureka!
 

What's hot (20)

Approaching Data Quality
Approaching Data QualityApproaching Data Quality
Approaching Data Quality
 
Session 1 - Introduction to i4Trust Data Spaces, building blocks, and roles |...
Session 1 - Introduction to i4Trust Data Spaces, building blocks, and roles |...Session 1 - Introduction to i4Trust Data Spaces, building blocks, and roles |...
Session 1 - Introduction to i4Trust Data Spaces, building blocks, and roles |...
 
Architecture as Linked Data
Architecture as Linked DataArchitecture as Linked Data
Architecture as Linked Data
 
Internet of things (IOT)
Internet of things (IOT)Internet of things (IOT)
Internet of things (IOT)
 
Data Management - a top Priority for Healthcare Practices
Data Management - a top Priority for Healthcare PracticesData Management - a top Priority for Healthcare Practices
Data Management - a top Priority for Healthcare Practices
 
IOT in healthcare
IOT in healthcareIOT in healthcare
IOT in healthcare
 
IoT for Healthcare
IoT for HealthcareIoT for Healthcare
IoT for Healthcare
 
Iot in agriculture
Iot in agricultureIot in agriculture
Iot in agriculture
 
IoT and Big Data
IoT and Big DataIoT and Big Data
IoT and Big Data
 
Information retrieval 14 fuzzy set models of ir
Information retrieval 14 fuzzy set models of irInformation retrieval 14 fuzzy set models of ir
Information retrieval 14 fuzzy set models of ir
 
IoT 2019 overview
IoT 2019 overviewIoT 2019 overview
IoT 2019 overview
 
IoT ecosystem
IoT ecosystemIoT ecosystem
IoT ecosystem
 
Internet of things (IoT)
Internet of things (IoT)Internet of things (IoT)
Internet of things (IoT)
 
Iot presentation
Iot presentationIot presentation
Iot presentation
 
Iot audit
Iot auditIot audit
Iot audit
 
IOT - internet of Things - August 2017
IOT - internet of Things - August 2017IOT - internet of Things - August 2017
IOT - internet of Things - August 2017
 
FIWARE Wednesday Webinars - FIWARE Overview
FIWARE Wednesday Webinars - FIWARE OverviewFIWARE Wednesday Webinars - FIWARE Overview
FIWARE Wednesday Webinars - FIWARE Overview
 
What it means to be FAIR
What it means to be FAIRWhat it means to be FAIR
What it means to be FAIR
 
a shift in our research focus: from knowledge acquisition to knowledge augmen...
a shift in our research focus: from knowledge acquisition to knowledge augmen...a shift in our research focus: from knowledge acquisition to knowledge augmen...
a shift in our research focus: from knowledge acquisition to knowledge augmen...
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
 

Similar to FAIR Computational Workflows

FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryCarole Goble
 
Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesASIS&T
 
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, RomeWorkflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, RomeCarole Goble
 
Research Object Community Update
Research Object Community UpdateResearch Object Community Update
Research Object Community UpdateCarole Goble
 
FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout Carole Goble
 
Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformSanjay Padhi, Ph.D
 
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...Carole Goble
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Blue BRIDGE
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Anita de Waard
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Carole Goble
 
The NIH Data Commons - BD2K All Hands Meeting 2015
The NIH Data Commons -  BD2K All Hands Meeting 2015The NIH Data Commons -  BD2K All Hands Meeting 2015
The NIH Data Commons - BD2K All Hands Meeting 2015Vivien Bonazzi
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptxGetu Tadele
 
Advances in Scientific Workflow Environments
Advances in Scientific Workflow EnvironmentsAdvances in Scientific Workflow Environments
Advances in Scientific Workflow EnvironmentsCarole Goble
 
“Semantic Technologies for Smart Services”
“Semantic Technologies for Smart Services” “Semantic Technologies for Smart Services”
“Semantic Technologies for Smart Services” diannepatricia
 
Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Vivien Bonazzi
 

Similar to FAIR Computational Workflows (20)

FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow Collaboratory
 
Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication Repositories
 
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, RomeWorkflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
 
Research Object Community Update
Research Object Community UpdateResearch Object Community Update
Research Object Community Update
 
FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout
 
Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh Platform
 
Grid computing
Grid computingGrid computing
Grid computing
 
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!
 
The NIH Data Commons - BD2K All Hands Meeting 2015
The NIH Data Commons -  BD2K All Hands Meeting 2015The NIH Data Commons -  BD2K All Hands Meeting 2015
The NIH Data Commons - BD2K All Hands Meeting 2015
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptx
 
Advances in Scientific Workflow Environments
Advances in Scientific Workflow EnvironmentsAdvances in Scientific Workflow Environments
Advances in Scientific Workflow Environments
 
“Semantic Technologies for Smart Services”
“Semantic Technologies for Smart Services” “Semantic Technologies for Smart Services”
“Semantic Technologies for Smart Services”
 
Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2
 

More from Carole Goble

Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...Carole Goble
 
RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsCarole Goble
 
Research Software Sustainability takes a Village
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
Research Software Sustainability takes a VillageCarole Goble
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Carole Goble
 
Open Research: Manchester leading and learning
Open Research: Manchester leading and learningOpen Research: Manchester leading and learning
Open Research: Manchester leading and learningCarole Goble
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...Carole Goble
 
FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...Carole Goble
 
The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects Carole Goble
 
How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)Carole Goble
 
What is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can helpWhat is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can helpCarole Goble
 
ELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR BoardELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR BoardCarole Goble
 
FAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research CommonsFAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research CommonsCarole Goble
 
Reproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects helpReproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects helpCarole Goble
 
Reflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic careerReflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic careerCarole Goble
 
Better Software, Better Research
Better Software, Better ResearchBetter Software, Better Research
Better Software, Better ResearchCarole Goble
 
Reproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsReproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsCarole Goble
 
Introduction to FAIRDOM
Introduction to FAIRDOMIntroduction to FAIRDOM
Introduction to FAIRDOMCarole Goble
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceCarole Goble
 

More from Carole Goble (18)

Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
 
RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital Objects
 
Research Software Sustainability takes a Village
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
Research Software Sustainability takes a Village
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
 
Open Research: Manchester leading and learning
Open Research: Manchester leading and learningOpen Research: Manchester leading and learning
Open Research: Manchester leading and learning
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
 


FAIR Computational Workflows

  • 1. FAIR Computational Workflows Professor Carole Goble The University of Manchester UK EU Research Infrastructures ELIXIR, IBISBA, EOSC-Life BioExcel Centre of Excellence Software Sustainability Institute UK FAIRDOM Consortium carole.goble@manchester.ac.uk SERC Swedish e-Science Research Center Annual Meeting 13 May 2022
  • 2. What is a Computational Workflow? Multi-step processes for data analytics, data processing pipelines and simulation sweeps Linking computational steps • Data flow between steps • Control flow of steps Handle data and processing dependencies Operating over computational infrastructure Come in many different flavours Drug Discovery
  • 3. What is a Computational Workflow? Encoded method less dependent on implementations inputs outputs tools, CLI, containers, workflows Precise specification Software Execution WfMS Engine Workflow Abstraction Access to computational infrastructure and datasets, tool interoperability, processing portability and optimisation, data wrangling. Composition different codes, languages, third parties
  • 4. What is a Computational Workflow? Inter-twingled mix and matching Scripting environments Interactive Electronic Research Notebooks Workflow Management Systems & execution platforms https://s.apache.org/existing-workflow-systems 300+ Systems General and Specialised Interactive & exploratory analysis with Human in the Loop Production, automated, workflow-integrated software Tool chaining, Batch processing, Job Control
  • 5. What is a Computational Workflow System? From frameworks to web based analysis platforms, hybrid cloud deployment Graph of jobs for automatic parallelisation, DIY package & containerisation installation, auto-documentation Online portals users build and reuse workflows around publicly available or user-uploaded data and pre-wrapped, pre-installed tools. Communities cluster around systems Typically depends on: • Support for specific data types • Support for specific codes • Support for kinds of workflow • Skills level of workflow developer • Popularity
  • 6. Why Computational Workflows? prepare, analyze, share increasing volumes of complex data CryoEM Image Analysis Metagenomic Pipelines Drug Discovery Protein Ligand MD Simulation Genome Annotation High Throughput Sequencing [Fabrice Allain JOBIM2021] [Romain Dallet JOBIM2021] [Adam Hospital] [Rob Finn] [Carlos Oscar Sorzano Sanchez]
  • 7. Why Computational Workflows? data collection and model simulation SERC: Data-driven computational materials design, DCMD Automatic workflow, data collection, and development of open-data infrastructure
  • 8. Why Computational Workflows? SARS-CoV-2 allelic-variant surveillance Automated repetitive monitoring of structured data from the European COVID-19 Data Portal and national SARS-CoV-2 sequencing datasets. Scalable - global distributed PULSAR compute network • Improved data quality • Uniformly analysed data for downstream analysis & visualisation • Submission of data to public archives Ported tried and tested transparent methods • EMERGEN, French SARS-CoV-2 genomic surveillance https://covid19.galaxyproject.org
  • 9. Why Computational Workflow Systems? Abstraction & Composition Automation Scalability & Infrastructure Access Reporting & Accreditation Portability Sharing & Adaptability Reusable Research Objects Computable Research Objects
  • 10. Why Computational Workflow Systems? Reproducibility Regulation Transparency Documented Method Labour saving Productivity Reliability Sustained Knowledge sharing Scholarly Objects Pool of know-how Reuse, repurpose Variant-based Democratisation of computational analysis & methods Upfront cost for downstream benefits Benefits best when a community buys in and workflows are supported
  • 11. Why FAIR Computational Workflows? The FAIR Data Principles RDA FAIR Data Maturity Model. Specification and Guidelines https://zenodo.org/record/3909563#.YORYkUzTX19 Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18 https://www.go-fair.org/fair-principles/
  • 12. Why FAIR Computational Workflows? The FAIR Data Principles Enable automation • Persistent human readable and machine-actionable linked metadata • Community standards • Persistent identifiers • Licensing and access rules • Access protocols • Register/index/search RDA FAIR Data Maturity Model. Specification and Guidelines https://zenodo.org/record/3909563#.YORYkUzTX19 Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18 https://www.go-fair.org/fair-principles/
  • 13. Why FAIR Computational Workflows? Developer/User viewpoints How can I find already existing workflows? Can I access them? Public or private? Git repository? What language is it written in? Can I rework it to use my tool? Is it well enough described so I can understand it? Can I use it? Can I reuse it in our infrastructure? Does it make FAIR data? Will I get credit for it? Can I track that credit? How easy is it to be FAIR?
  • 14. What are the FAIR Principles for Workflows? Hybrid Processual Digital Objects FAIR Method Objects FAIR Software Objects FAIR Data In and Out FAIR Enabling Services C. Goble, S. Cohen-Boulakia, S. Soiland-Reyes, D. Garijo, Y. Gil, M.R. Crusoe, K. Peters & D. Schober. FAIR computational workflows. Data Intelligence 2(2020), 108–121. https://doi.org/10.1162/dint_a_00033
  • 15. What are FAIR Principles for Workflows? Hybrid Processual Digital Objects Method “Data” Objects Workflows as FAIR Software FAIR+R and FAIR++ Quality, maturity, maintainability The principles revised Workflows as FAIR Digital Objects Data-like method objects Associated objects The principles adapted Workflows as FAIR Data Instruments FAIRification of the dataflow The data principles supported C. Goble, S. Cohen-Boulakia, S. Soiland-Reyes, D. Garijo, Y. Gil, M.R. Crusoe, K. Peters & D. Schober. FAIR computational workflows. Data Intelligence 2(2020), 108–121. https://doi.org/10.1162/dint_a_00033 Workflow Objects Software Objects Data FAIRification FAIR enabling services Services
  • 16. FAIR for Research Software Processual Digital Objects https://www.rd-alliance.org/groups/fair-4-researchsoftware-fair4rs-wg FAIR for Research Software (FAIR4RS) working group Katz, et al PATTERNS 2, 2021 https://fairsharing.org/4100
  • 17. FAIR Principles for Workflows Hybrid Processual Digital Objects Usable and Reusable Living & reusable parts versioned, forked, cloned parts recycled limited lifespans citable credit executability reproducibility, portability testing, maturity quality, maintainability FAIR+R FAIR++ Composition & agency Abstractions specification implementation instantiation run result modularisation FAIR parts & dependencies propagation of FAIR properties
  • 18. Findable Search engine supported Public, private & DOI support Different workflow languages and systems Git integration for repos Versioning & snapshots Described by metadata licensing authors & credit analytics access search versions & status other workflows 200+ Workflows 90+ Teams 10+ different systems
  • 19. What Workflow Metadata? Metadata for machines & people Common metadata about the workflow, tools & parameters Canonical workflow description of the steps of the workflow Type the input and outputs of the steps Run Provenance RO-Crate format for packaging a workflow, its metadata and companion objects (links to containers, data etc) for exchange, archiving, reporting, citing. FAIR Digital Object Open Communities https://youtu.be/Rsuxn0m4bIM
  • 21. Accessible (2) A2. metadata are accessible, even when the workflow is no longer available Enough metadata that a workflow is read-reproducible as a method description if it no longer runs Metadata preservation belts and braces republish in another archive
  • 22. Interoperable (1 & 2) WfMS interoperability: describe workflows independently of WfMS. Platform independent pipeline exchange and comparison. Workflow Composability: Software interoperates through APIs and metadata standards (FAIR4RS). Workflow-ready tools. Tested & validated canonical workflow blocks. https://openwdl.org/ https://www.commonwl.org Design for FAIR Workflow Reuse Licence combinations Access permissions Clean interfaces BioExcel Building Blocks biomolecular simulation tools https://workflowhub.eu/projects/11
  • 23. Reusable and Usable Composability + Associated Objects + Metadata + FAIR Services Reusable – “can be understood, modified, built upon or incorporated into other workflows” Usable – “can be executed” Containers & Packaging Testing & monitoring checker workflows test data https://crs4.github.io/life_monitor/ https://openebench.bsc.es/dashboard Is a workflow reusable if it’s • resource greedy • needs special resources • needs unavailable data • cannot be ported or run by anyone other than the developers?
  • 24. Data FAIRification & FAIR Data by Design Metadata generated for data products, Assisted by WfMS and tools Reviewing Curation Certification Governance Best Practice Golden Examples Canonical workflows Design for FAIR Data and Reuse nf-core
  • 25. FAIR Workflow Services EOSC-Life Collaboratory EOSC-Life https://www.eosc-life.eu/
  • 26. How can we Workflow FAIR Assist? Workflow developers Tool and data set providers Workflow readiness FAIR Unit Testing Brack, et al (2021). 10 Simple Rules for making a software tool workflow-ready. https://doi.org/10.5281/zenodo.5636487 Descriptions Register in WorkflowHub Best Practice WfMS platforms Programmatic access Automate FAIRness FAIR Software FAIR enabling Service Use well documented FAIR enabling and FAIR workflows credit the makers! Users
  • 27. Building Communities of Practice where do I start? finding your buddies
  • 28. Summary: FAIR Computational Workflows Hybrid Processual Digital Objects FAIR takes a village* *Borgman, C. L., & Bourne, P. E. (2021). Why it takes a village to manage and share data. Harvard Data Science Review (under Review), arXiv:2109.01694v1. Building Communities of Practice
  • 29. Acknowledgements The WorkflowHub Club, Bioschemas Community, RO-Crate Community, CWL Community, Galaxy Europe, EOSC-Life and ELIXIR Tools Platform. Special Thanks Stian Soiland-Reyes (U Manchester / U Amsterdam) Paul Brack, Stuart Owen, Finn Bacall, Alan Williams (U Manchester) Björn Grüning (U Freiburg) Frederik Coppens (VIB) Sarah Jones (GEANT) Herve Menager (Pasteur Institute) Sarah Cohen-Boulakia (U Paris Saclay) Dan Katz (U Illinois Urbana-Champaign) Simone Leo (CRS4) Laura Rodriguez-Navas (BSC) José Mª Fernández (BSC) Workflow Community Initiative https://workflows.community/about EOSC-Life https://www.eosc-life.eu/ ELIXIR http://elixir-europe.org RO-Crate https://www.researchobject.org/ro-crate/ WorkflowHub https://workflowhub.eu/ and workflowhub.org Galaxy Europe https://galaxyproject.eu/ Bioschemas https://bioschemas.org Common Workflow Language https://www.commonwl.org/ Life Monitor https://crs4.github.io/life_monitor/
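
Slide 19 describes RO-Crate as a format for packaging a workflow together with its metadata and companion objects. As a rough illustration only, the sketch below builds the kind of `ro-crate-metadata.json` document such a package might carry, using just the Python standard library. The function name `build_workflow_crate`, the file name `main.cwl`, the author name and the licence URL are hypothetical placeholders, not taken from the talk. The entity layout follows the general shape of the RO-Crate 1.1 and Workflow RO-Crate conventions as commonly documented; the current specifications should be checked before relying on it.

```python
import json


def build_workflow_crate(workflow_path, name, license_url, authors):
    """Return a dict shaped like an ro-crate-metadata.json for one workflow.

    All arguments are illustrative inputs; nothing here is validated
    against the RO-Crate specification.
    """
    # One contextual entity per author, with a local (hypothetical) identifier.
    author_entities = [
        {"@id": "#" + a.lower().replace(" ", "-"), "@type": "Person", "name": a}
        for a in authors
    ]
    graph = [
        # Metadata descriptor: says which entity is the crate root.
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        # Root dataset: carries licence and points at the main workflow.
        {
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "license": {"@id": license_url},
            "hasPart": [{"@id": workflow_path}],
            "mainEntity": {"@id": workflow_path},
        },
        # The workflow file itself, typed as a computational workflow.
        {
            "@id": workflow_path,
            "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
            "name": name,
            "author": [{"@id": e["@id"]} for e in author_entities],
        },
    ]
    graph.extend(author_entities)
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


if __name__ == "__main__":
    crate = build_workflow_crate(
        "main.cwl",  # hypothetical workflow file
        "Example workflow",
        "https://spdx.org/licenses/Apache-2.0",
        ["Ada Example"],  # hypothetical author
    )
    print(json.dumps(crate, indent=2))
```

Keeping licence, authorship and the workflow pointer in one machine-readable file is what lets a registry index the package and lets the description outlive the runnable workflow, which is the point made on the Findable and Accessible slides.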