Creating Executable Research Compendia to Improve Reproducibility in the Geosciences
Daniel Nüst | University of Münster | @nordholmen
C4RR workshop, June 28 2017, Cambridge, UK
Contents
Creating ERCs
Technical background
A world with ERCs
Executable Research Compendium
Key features of ERCs
Nested containers (BagIt, Docker)
Librarian-ready
Reproducibility range of 5 to 10 years
(still worth integrating, target users are not science historians)
Desktop-size data and algorithms - closed and complete
“Geo-stuff” and R for the “last 10 %”
Remain understandable for scientists
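To make the outer, librarian-ready layer more concrete, here is a minimal Python sketch of BagIt-style packaging: payload files go under data/, a bagit.txt declaration sits at the bag root, and manifest-sha256.txt lists a checksum per payload file. It assumes only the general BagIt layout; the helper name make_bag and the declared version are illustrative assumptions, and the o2r tools may package ERCs differently.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(workspace: Path, bag_dir: Path) -> Path:
    """Copy a workspace into a BagIt-style bag and write its checksum manifest.

    Hypothetical helper for illustration; not part of the o2r tooling.
    """
    payload = bag_dir / "data"
    shutil.copytree(workspace, payload)  # payload files live under data/

    # Bag declaration expected at the bag root (version is an assumption).
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # One line per payload file: "<sha256>  <path relative to the bag root>".
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")

    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

An archive can later recompute the checksums and compare them with the manifest to detect changed or missing files, which is the property that makes the package suitable for preservation.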
Creating & Inspecting ERC
How far can we reduce
overhead for scientists?
ERC creation process
❏ Submit workspace (“Scripters”/“Coders”)
❏ Extract metadata
❏ Execute analysis
❏ Check syntax
❏ Capture runtime environment (manifest + image)
❏ Check metadata (user!)
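The “capture runtime environment” step can be pictured as generating a manifest (a Dockerfile) from the packages an analysis uses. The following Python sketch only illustrates that idea; containerit itself is an R package with its own interface. The base image, the install2.r helper, the working directory, the entry-point command, and all function names below are assumptions.

```python
from typing import Iterable


def render_manifest(r_version: str, packages: Iterable[str], main_file: str) -> str:
    """Render Dockerfile text for an R workspace (illustrative sketch only)."""
    lines = [
        # Assumed base image: a version-pinned R image from the rocker project.
        f"FROM rocker/r-ver:{r_version}",
    ]
    pkgs = sorted(set(packages))  # deterministic order, no duplicates
    if pkgs:
        # Assumes the base image ships the install2.r helper script.
        lines.append("RUN install2.r " + " ".join(pkgs))
    lines += [
        "WORKDIR /erc",  # hypothetical location of the workspace in the container
        f'CMD ["R", "-e", "rmarkdown::render(\'{main_file}\')"]',
    ]
    return "\n".join(lines) + "\n"


print(render_manifest("3.4.0", ["sp", "rmarkdown", "sp"], "main.Rmd"))
```

Keeping the manifest as plain text next to the built image means a reader can still see what the environment contains even if the image can no longer be run.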
containerit
https://github.com/o2r-project/containerit
containerit (cont.)
...
meta toolsuite
- extract
- map
- harvest
- validate
Highlights
Automatically extract metadata from the workspace, including spatial information
Facilitate metadata (MD) management with schema translation maps
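A schema translation map can be thought of as a lookup from extracted fields to the field names of a target schema. The Python sketch below shows that idea with made-up field names and values; it is not the meta toolsuite's actual map format or API.

```python
from typing import Any, Dict


def translate(metadata: Dict[str, Any], schema_map: Dict[str, str]) -> Dict[str, Any]:
    """Rename extracted metadata fields according to a translation map.

    Fields without a mapping are dropped, so the result only contains keys
    known to the target schema.
    """
    return {
        target: metadata[source]
        for source, target in schema_map.items()
        if source in metadata
    }


# Hypothetical extracted metadata and a hypothetical map to a repository schema.
extracted = {
    "title": "Example analysis",
    "author": "Jane Doe",
    "bbox": [7.5, 51.9, 7.7, 52.0],  # west, south, east, north
    "r_version": "3.4.0",
}
to_repository = {"title": "title", "author": "creators", "bbox": "spatial_coverage"}

print(translate(extracted, to_repository))
```

Keeping one map per target schema means the extracted metadata is written once and translated as often as needed.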
https://sandbox.zenodo.org/communities/o2r
Technical Background
ERC specification
GitHub dev
Development steps
version 0, practical evaluation
version 0.5, expert evaluation
version 0.6, architect evaluation
version 1 (mid 2017) > ref. impl.
Content
http://o2r.info/erc-spec
ERC specification - key features & structure
base directory
main document & display file
runtime image & runtime manifest
yml configuration file (control statements, metadata)
5 files + x
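As a rough illustration of “5 files + x”, the following Python sketch checks a base directory for the five parts named above. All file names are assumptions chosen for illustration (the specification at http://o2r.info/erc-spec is authoritative), and check_erc is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, List

# Assumed file names, one per part listed above; consult the ERC specification.
EXPECTED: Dict[str, str] = {
    "configuration file": "erc.yml",
    "main document": "main.Rmd",
    "display file": "display.html",
    "runtime manifest": "Dockerfile",
    "runtime image": "image.tar",
}


def check_erc(base_dir: Path) -> List[str]:
    """Return the names of expected parts that are missing from base_dir."""
    return [
        part
        for part, filename in EXPECTED.items()
        if not (base_dir / filename).is_file()
    ]
```

An empty result means all five parts are present; any additional data or code files (the “+ x”) are simply ignored by this check.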
http://o2r.info/architecture/
Architecture for ERC-based publication process
A world with ERCs
Manipulate, Validate, Interact, ...
Integration Hacks
Chrome Extension
Geocontainer labels study project
Badges API
Chrome Extension
Summary
Executable Research Compendia are fun and …
help us learn a lot about reproducibility
work including a domain-specific “last mile”
take into consideration requirements of libraries and preservation
re-use and integrate, are not “a platform”
don’t solve all problems (R, geo, 1/5 Vs, no HPC, comp. reproducibility)
Reproducibility service makes ERC work in geosciences for the current
publication workflow and services.
Outlook
“A lot of glue work around the edges” (M. Hartley)
ERCs are post-hoc glue for minimal reproducibility
Catching up with reference implementation and demo ERCs
Spin-out of tools
Follow-ups & collaborations
(production mode in cloud? special issues?)
Unconcealed ad:
Reproducible GEOBIA
doi: 10.3390/rs9030290
http://www.mdpi.com/2072-4292/9/3/290
Unconcealed ad II:
Docker for RR lesson
https://github.com/nuest/docker-reproducible-research
https://nuest.github.io/docker-reproducible-research
Thanks!
What are your questions?
@o2r_project
github.com/o2r-project
o2r.info

Editor's Notes

  1. The ERC provides a well-structured container for the needs of journals (ERC as the item under review), archives (suitable metadata and packaging formats), and researchers (literally everything needed to re-do an analysis is there). It relies on Docker to define and store the runtime environment. ERCs should be simple enough to be created manually and should absorb best practices for organizing digital workspaces. “bundle”
  2. Test platform - we are not a platform!
  3. Daniel
  4. Marc
  5. Daniel
  6. Should be able to create it manually; # researchers' workspaces = # researchers; data NEXT TO container; ENTRY POINTs; “Sophisticated Makefile”
  7. Publication workflow
  8. Test platform - we are not a platform!