Hackathon for RELIANCE research communities.
Note: The hackathon was conducted using the old version of ROHub (http://www.rohub.org). A new portal is to be released at the end of 2021 (http://reliance.rohub.org).
1. This project has received funding from the European research infrastructures
(including e-Infrastructures) under the European Union's Horizon 2020 research
and innovation programme under grant agreement No 101017501
Research Lifecycle Management technologies for
Earth Science Communities and Copernicus users in
EOSC
ROHub
Hackathon
29th April 2021
2. • Unique identifier (DOI)
• Aggregation of resources
• Hypotheses
• Data used and results produced
• Methods employed to produce and analyse
data
• Scientific workflows, scripts, code
implementing such methods
• (Web) services used
• People involved in the investigation
• …
• Annotations about these resources
• Descriptive metadata
• Provenance of executions
• Versioning information
Research Objects - overview
Goal: Account, describe and share everything
about your research, including how those things
are related
http://www.researchobject.org
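The structure listed on this slide (an identifier, aggregated resources, and annotations about them) can be sketched as a small data model. This is a minimal illustration only, not the RO model vocabulary or the ROHub API; the class names `Resource`, `Annotation` and `ResearchObject`, as well as the sample DOI and URL, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One aggregated item: data, a workflow, a script, a service, ..."""
    uri: str
    kind: str  # e.g. "dataset", "workflow", "hypothesis"


@dataclass
class Annotation:
    """A statement about a resource (descriptive metadata, provenance, ...)."""
    target_uri: str
    prop: str
    value: str


@dataclass
class ResearchObject:
    identifier: str  # ideally a DOI
    resources: List[Resource] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)

    def aggregate(self, resource: Resource) -> None:
        """Add a resource to the research object."""
        self.resources.append(resource)

    def annotate(self, target_uri: str, prop: str, value: str) -> None:
        """Attach an annotation to the RO itself or to an aggregated resource."""
        known = {r.uri for r in self.resources} | {self.identifier}
        if target_uri not in known:
            raise ValueError(f"{target_uri} is not part of this research object")
        self.annotations.append(Annotation(target_uri, prop, value))

    def annotations_for(self, target_uri: str) -> List[Annotation]:
        """Return all annotations whose target is the given URI."""
        return [a for a in self.annotations if a.target_uri == target_uri]


# Hypothetical usage: aggregate one dataset and describe it.
ro = ResearchObject("doi:10.0000/example")
ro.aggregate(Resource("https://example.org/data.csv", "dataset"))
ro.annotate("https://example.org/data.csv", "description", "Example dataset")
```

Keeping annotations separate from the resources they describe mirrors the slide's distinction between the aggregation and the annotations about it.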
3. i. To organize and describe the resources, materials, and methods of an
investigation
ii. To keep track & support the research lifecycle, via snapshots, releases and
forks including versioning and change information
iii. To share your research materials with other scientists at discrete milestones of
your investigation, and collaborate via a single information unit
uniquely identified by a URI, preferably a DOI (RO as a social object)
iv. To enhance the findability and accessibility of your scientific outcomes through
a single information unit associated with rich, machine-readable metadata
Why Research Objects (1/2)
4. v. To enable reproducibility and reuse of the scientific methods and results
via access to resources, context and metadata
vi. To be recognized and cited (even of constituent parts) encouraging the
release and publication of research materials
vii. To preserve results and prevent decay (curation of scientific methods,
e.g., workflows)
viii. To provide evidence and support validation of findings claimed in
scholarly articles
Why Research Objects (2/2)
5. RO model customizations to Earth Science
(pre-RELIANCE)
Geospatial information
Time-period coverage
Data access policies
Intellectual Property Rights
General ES metadata (DOI,
discipline, size, format,
date…)
RO evolution and versioning extended (fork,
release and DOI variants)
Eight new RO types:
Workflow-centric
Data-centric
Research product-centric
Bibliographic
…
and their associated checklists
Concepts and properties related to
executable resources extended to consider
other types of processes, not only
workflows (roterms)
ProcessValue, subsequentProcess,
previousProcess,…
6. • Data Cube centric Research Object
• Treats DC as first-class resources
• aggregated and described in detail, e.g., how was it generated or used.
• A DC is identified and aggregated in the RO through its URI.
• The RO holds a link to the DC that (will) open its “latest” scene in the ADAM GUI.
• The aggregated DC metadata may include (D4.1), e.g.,
• identification, description, resolution
• parameters used to generate the dataset or subset
• links to access it
• Such an RO MAY also include other related resources like:
• (Link to) a Jupyter Notebook using this DC via the ADAM API
• Intermediate and final results from the analysis of the data (link to the data in some
repository, e.g., B2Share or Zenodo)
• Related documentation
• Related publications
• Others
RO model customizations to Earth Science
(RELIANCE)
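As an illustration of the Data Cube metadata listed on this slide (identification, description, resolution, generation parameters, access links), the following sketch builds such a record as plain JSON. The field names, the helper `describe_data_cube` and all sample values are assumptions made for this example; they are not taken from D4.1 or from the ADAM API.

```python
import json


def describe_data_cube(uri, title, description, resolution, parameters, access_links):
    """Build a metadata record for a Data Cube that is aggregated by its URI."""
    if not uri:
        raise ValueError("a Data Cube is identified by its URI")
    return {
        "uri": uri,
        "identification": title,
        "description": description,
        "resolution": resolution,
        # Parameters used to generate the dataset or subset.
        "generation_parameters": dict(parameters),
        # Links to access the Data Cube, e.g. its "latest" scene.
        "access_links": list(access_links),
    }


# Hypothetical values, for illustration only.
record = describe_data_cube(
    uri="https://example.org/datacubes/sst-2020",
    title="Sea surface temperature 2020 (example)",
    description="Hypothetical subset used only for illustration",
    resolution="0.25 deg",
    parameters={"bbox": [-10.0, 35.0, 5.0, 45.0], "start": "2020-01-01", "end": "2020-12-31"},
    access_links=["https://example.org/datacubes/sst-2020/latest"],
)
print(json.dumps(record, indent=2))
```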
13. • Holistic solution for research object management
• implements natively the full RO model and paradigm
• support different stakeholders, with the primary focus on scientists,
researchers, students and enthusiasts
• Comprises
• backend service implementing and exposing
a set of APIs
• reference web client application, exposing all
RO functionalities to the end-users.
• Combines and leverages different technologies
• DL, long term-preservation & semantic technologies
http://www.rohub.org/
14. ROHUB enables:
• to create and manage high-quality ROs that can be interpreted and reproduced in the future
• to reference, share and preserve scientific studies, campaigns, and observations related resources, including
internal ones, links to external ones as well as other ROs (nested ROs)
• to collaborate with colleagues and to discover new knowledge through advanced exploratory search interfaces
that exploit RO metadata (both explicitly provided and automatically extracted from its content), as well as via
a standard search API (OpenSearch with Geo extensions)
• to manage the RO evolution including the ability to generate snapshots and releases and to allow others to
fork the RO to reuse it and extend it.
• to publish the associated work and assign it a DOI to allow its citation in scholarly communications
• to monitor and follow a particular work, getting notifications about its progress or changes in quality
• to allow researchers to build reputation by enabling users to rate and favorite ROs created by others
• to find related works or relevant researchers in a particular domain, e.g., for possible collaborations or reviews
High-level features
15. ROHub plus added value services
Semantic enrichment
readability, discoverability, reuse
Recommendation
content-based, concentric spheres
Research lifecycle & scholarly communication
collaboration, publication, citation, validation
Quality assessment
Monitoring & preservation of HQ investigations
Social Impact
Sharing, quality
16. • Enrich ROs with semantic metadata extracted from their aggregated human-
generated content, enhancing both human and machine readability, and thus
their discoverability
• Extracted annotations are structured as semantic markup based on a
and are included as annotations following the RO model.
Semantic enrichment
Ontology to represent identified
annotations
RO search levels
17. • Implements a content-based
recommender service
• Takes as input user interests (as a
collection of ROs) and matches them
against other ROs based on their
content, exploiting the
metadata generated by the RO
semantic enrichment process
• User interface follows a visual
metaphor based on concentric spheres
Recommender system
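The content-based matching described on this slide can be illustrated with a small sketch. The slides do not state which similarity measure the recommender uses; Jaccard similarity over sets of extracted terms is used here purely as a stand-in, and the function names and RO identifiers are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(interests, candidates, top_n=3):
    """Rank candidate ROs against a user profile.

    interests:  dict mapping RO id -> set of terms (the ROs the user likes)
    candidates: dict mapping RO id -> set of terms (all known ROs)
    Returns up to top_n (ro_id, score) pairs, best match first.
    """
    # The user profile is the union of the terms of all ROs of interest.
    profile = set().union(*interests.values())
    scored = [
        (jaccard(profile, terms), ro_id)
        for ro_id, terms in candidates.items()
        if ro_id not in interests  # do not recommend what the user already has
    ]
    scored = [(score, ro_id) for score, ro_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(ro_id, round(score, 3)) for score, ro_id in scored[:top_n]]
```

In a concentric-spheres interface, the score could determine how close to the centre a recommended RO is drawn; that mapping is an assumption, not something the slides specify.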
18. • Checklists are a well-established tool for
guiding practice
• ensure safety, quality, and consistency in
communities
• allow to specify the required metadata and
RO must include and have access to,
and purpose
• defined for RO types based on researchers’
• Contribute to automate DMPs
• Allow to calculate RO quality metrics,
completeness, stability and reliability
• Monitor the RO and assess its quality
through their lifecycle for its preservation
• Assess checklist systematically
• notify users when quality drops (decay)
• Focus on reuse
Quality assessment & preservation
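A checklist-based assessment of this kind can be sketched as follows. This is a simplified illustration: only a completeness score is computed (stability and reliability are not modelled), and the checklist format, the field names and the functions `assess` and `quality_dropped` are assumptions made for the example, not ROHub's implementation.

```python
def assess(ro_metadata, checklist):
    """Evaluate an RO's metadata against a checklist.

    Each checklist item is a dict {"field": name, "level": "MUST" or "SHOULD"}.
    A field counts as satisfied when it is present and non-empty.
    """
    missing = {"MUST": [], "SHOULD": []}
    satisfied = 0
    for item in checklist:
        if ro_metadata.get(item["field"]):
            satisfied += 1
        else:
            missing[item["level"]].append(item["field"])
    completeness = satisfied / len(checklist) if checklist else 1.0
    return {
        "completeness": completeness,
        "passes": not missing["MUST"],  # every MUST requirement is met
        "missing": missing,
    }


def quality_dropped(previous, current):
    """True when completeness decreased between two assessments (decay)."""
    return current["completeness"] < previous["completeness"]
```

Running `assess` periodically and comparing consecutive results with `quality_dropped` is one way the "notify users when quality drops" step could be triggered.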
20. • Researchers can build up their reputation
based on the rating of other colleagues
• Ratings may be used as additional
information for reusing
• Favorites may be used to retrieve quickly
the ROs researchers are interested in
• Comments and replies can be used for
discussions
Social Impact
21. Collaboration
Who made the (content of) research
object? Who maintains it?
Who wrote this document? Who
uploaded it?
Which CSV was this Excel file imported
from?
Who wrote this description? When?
How did we get it?
What is the state of this RO? (Live or
Published?)
What did the research object look like
before? (Revisions) – are there newer
versions?
Which research objects are derived
from this RO?
Answer to: who, where, which, what, why
• Attribution
• Derivation
• Activities
http://www.w3.org/TR/prov-primer/
PROV-O
Provenance information
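The questions on this slide map onto PROV-O relations such as attribution, derivation and generation by an activity. The sketch below emits a few such statements as N-Triples. The property IRIs come from the W3C PROV namespace; the helper functions and the example resource URLs are hypothetical and do not reflect how ROHub stores provenance.

```python
PROV = "http://www.w3.org/ns/prov#"


def prov_statements(entity, agent, derived_from=None, activity=None):
    """Return (subject, predicate, object) triples describing an entity."""
    # Attribution: who made or uploaded this resource?
    triples = [(entity, PROV + "wasAttributedTo", agent)]
    if derived_from:
        # Derivation: which resource was this one produced from?
        triples.append((entity, PROV + "wasDerivedFrom", derived_from))
    if activity:
        # Activities: which process generated it?
        triples.append((entity, PROV + "wasGeneratedBy", activity))
    return triples


def to_ntriples(triples):
    """Serialize IRI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical example: an Excel file imported from a CSV file.
statements = prov_statements(
    entity="https://example.org/ro/1/results.xlsx",
    agent="https://example.org/people/alice",
    derived_from="https://example.org/ro/1/results.csv",
)
print(to_ntriples(statements))
```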
23. • ROHub will be interconnected with the other RELIANCE services, namely ADAM
platform for Data Cubes services and text mining and semantic enrichment services
(initial connection made in EVER-EST).
• ROHub will be onboarded into the EOSC portal catalogue
• ROHub will leverage and integrate with other EOSC services, including EOSC AAI
and research data management services like B2DROP, B2SHARE or Zenodo.
• ROHub plans also to leverage EOSC notebooks (Jupyter notebooks) service
• ROHub version 2.0 is on the way
ROHub plans in RELIANCE
24. • Two main entry points:
• ROHub portal
• Jupyter notebooks (via ROHub python library)
• Additionally
• ROHub is planned to be connected at high level from ADAM GUI
• Load a DC RO, save changes to the RO, publish the DC RO (make snapshot/release), open ROHub
portal for further RO manipulation
• ROHub may be used within VRC applications (via its API, library)
ROHub use in RELIANCE
26. • Jupyter notebooks will be treated as
the default execution environment in
RELIANCE, where scientists will be
able to access their data,
create/manipulate data cubes,
execute their methods, and manage
their research objects
• Leverage EGI notebooks
Jupyter Notebooks (plans)
28. • Zenodo is a general-purpose open-access
repository used in EOSC as a catch all repository
• B2SHARE is an EOSC service to share and
publish your research data
• Both services generate DOIs
• The plan is to allow users to publish
snapshots/releases of ROs to those services
directly
• Users may be able to select the community where
to publish it
• Integration is straightforward: users will need to
provide their token to ROHub so that it can use those services
on their behalf
B2SHARE & Zenodo integration (under testing)
29. • B2DROP is a Personal Cloud
Storage Service
• May be used, if user wants it, as the
storage backend for ROHub to store
the “internal” RO resources of the
user, instead of using the default
ROHub storage system
• Integration path under discussion
(telco next week)
B2DROP integration (plans)
31. • Architecture:
• K. Page, R. Palma, P. Holubowicz, G. Klyne, S. Soiland-Reyes, D. Cruickshank, R. G. Cabero, E. G. Cuesta, D. D. Roure, and J. Zhao,
architecture for preserving the semantics of science," Proceedings of the 2nd International Workshop on Linked Science, ISWC, Boston,
• APIs:
• Palma, R., Hołubowic P., et al. A suite of API for the management of Research Objects. In Proceedings of the ISWC Developers
• ROHUB:
• Gómez-Pérez J.M., Palma R. Research Objects for Sharing and Exchanging Research Data and Methods in Earth Science. Poster at
Assembly, April 2016
• Palma R., Corcho O., Gómez-Pérez J.M., Mazurek, C., “ROHub – A Digital Library for Sharing and Preserving Research Objects”. Poster
• Palma R., Corcho O., Gómez-Pérez J.M., Mazurek, C., “ROHub A Digital Library of Research Objects Supporting Scientists Towards
Challenge of Proc. Extended Semantic Web Conference (ESWC), Crete, Greece, May 25-29, 2014.
• Page K., Palma R., Hołubowicz P., Klyne G., Soiland-Reyes S., Garijo D., Belhajjame K., Mayer R., Research Objects for Audio Processing:
In Proc. 53rd Audio Engineering Society International Conference on Semantic Audio, London, UK, January 27-29, 2014
• Palma R., Corcho O., Hołubowicz P., Pérez S., Page K., Mazurek C., Digital libraries for the preservation of research methods and
Workshop on the Digital Preservation of Research Methods and Artefacts (DPRMA 2013) at Joint Conference on Digital Libraries (JCDL
July 2013.
• APIs overview: https://github.com/wf4ever/apis/wiki/Wf4Ever-Services-and-APIs
• Source code: https://github.com/rohub
• Demo video: https://youtu.be/TxW2wvreyoQ
• Live Instance: http://www.rohub.org/
• New beta Portal: http://beta.rohub.org/
References
Editor's notes
Because research objects are a backbone technology of RELIANCE, it is important to give a bit more context about them to understand the reasoning behind them.
Research objects, as perhaps some of you already know, are rich information objects that aim to account, describe and share everything about your research, including how those things are related, in a way that is understandable by both users and machines.
Research objects have been used and demonstrated in different communities in previous projects, from bioinformatics to astronomy to Earth science, and as a result they are rapidly gaining attention as a promising research-enabling technology.
A research object can be regarded as a logical container that has a
unique identifier (e.g., a DOI) and that can encapsulate various research artefacts, such as:
Hypotheses and/or purpose of the experiment
Data used and results produced
Methods employed to produce and analyse data
Scientific workflows implementing such methods
Provenance of their executions
Versioning information
People involved in the investigation
Annotations about these resources
The Figure on the slide shows a high-level view of the RELIANCE services architecture, and their connection with EOSC and other existing services.
* As we can see in the middle, RELIANCE services will be interconnected and complementing each other, enabling scientists to use the provided functionalities and access their work from different user interfaces using ROs as the main connecting point.
* RELIANCE services will expose RESTful APIs and Python libraries, enabling communication between RELIANCE and other EOSC services, as well as their use from different user interfaces.
* DCs will be linked in the RO as first class entities, and described with a rich set of metadata (e.g., how it was generated or used) enabling an efficient access to large datasets like Copernicus data while facilitating reusability and reproducibility of the mechanisms to access such data
* Text mining (TM) and enrichment services will automatically enrich the RO with metadata extracted from the available annotations and aggregated resources, thus increasing their findability, interoperability and reuse, and enabling the recommendation of ROs or DCs.
* Some of these connections are already in place and will require the necessary adaptations to EOSC.
* Also, as we can see in the diagram, RELIANCE services will be integrated into EOSC, reusing some of the core cross-cutting services as well as some advanced added-value services.
* They may also reuse other available services such as scholarly communication services, notebooks, or QCG for HPC resource allocation, and other EU-wide AAI services such as eduGAIN (to which EGI Check-in is federated as a service provider).