Research Object Composer: A Tool for Publishing Complex Data Objects in the Cloud


  1. Research Object Composer: Publishing Complex Data Objects in the FAIRground. Presented by Anita de Waard, VP Research Collaborations, September 24, 2019.
  2. Biomedical data moving to the cloud.
     “Storing, managing, standardizing and publishing the vast amounts of data produced by biomedical research is a critical mission for the National Institutes of Health.”
     “One important purpose of the Commons Pilot is to collectively agree on a set of best practices … to eliminate barriers for accessing, sharing and analyzing biomedical data.”
     Findable, Accessible, Interoperable, Reusable Data: building blocks for the FAIRGround! Fairly AI-Ready?
  3. Building an open interoperable data ecosystem: user story. As a researcher studying genetic disease X, I want to:
     • access 1000s of DNA sequences of a population and run analysis Y
     • share the results of my findings, protocol and input data with my collaborators/community
     • publish an article about it
     in a way that data is FAIR along each step of the process, so that others can reproduce and build on this work. Open API!
  4. Cloud data is accessible if openly disseminated. Need open data & identifiers for workflow tools. Requirements:
     1. Landing page URL including GUID
     2. URL for page where file can be accessed (downloaded)
     3. Metadata for object
     4. Reference to the Task (zero or one) that this dataset was Derived From
     5. Reference to the Task(s) that this dataset is the Source Of
  5. Building an open interoperable data ecosystem:
     • Aggregates: link things together
     • Annotations: about things & their relationships
     • Container: packaging content & links (Zip files, BagIt, Docker images)
     • Identification: locate things regardless of where they are
  6. Building an open interoperable data ecosystem (diagram). Elements shown: database, open repository, workflow tool (Workflow Input, Task 1, Task 2, Task 3, Output), Open API, Research Object Composer (http://www.researchobject.org), Mendeley Data.
     • Research Object Profiler: add annotation and relationships (metadata) to the collection to describe a research object: URI, length, filename, checksums, etc.
     • Research Object Serializer: serialise the Research Object in a standard format based on BagIt (a manifest itemizing file names)
     • Mendeley Data: DOIs, metadata (Findability), open repo (Accessibility), versioning, RO standard (Interoperability, Reusability)
  7. Purpose of the Research Object Composer*:
     • The RO Composer is not a registry of research objects, but it can list research objects currently under construction.
     • The RO Composer is a microservice whose responsibility is to help other services create and deposit research objects.
     • The composer acts as a temporary construction site that can be completed by multiple services (e.g. a data management system, a workflow system, a user interface).
     • These clients will be jointly building a Research Object that can then be validated according to the schema, before the RO is downloaded or deposited into an archive (like Zenodo or Mendeley Data).
     • Clients of the RO Composer are applications (driven by a user interface) or agents (engaged automatically from other events, e.g. a workflow run).
     • The RO Composer is not a required component for this: any software may generate research objects by following Research Object specifications.
     * From: https://github.com/ResearchObject/research-object-composer/blob/master/introduction.ipynb
  8. You can drive it today!
     • API: https://researchobject.github.io/research-object-composer/api/
     • Source: https://github.com/ResearchObject/research-object-composer
     • Link to Jupyter Notebook tutorial (even I can do it!)
  9. Use case for the ROC: Earth Sciences! EVER-EST: RO in Earth Sciences. 12 EU partners, 4 research communities. Powered by ROHub.
  10. Other use case: Chemistry! NMReDATA (http://nmredata.org/)
     NMReDATA:
     • chemical shifts, scalar couplings, multiplet analysis and 2D cross peaks extracted from a set of NMR spectra
     • linked to the assigned chemical structure
     • data resulting from full analysis of organic compounds and natural products using various spectra
     NMR Record:
     • Database entry or folders including a .sdf file (containing the chemical structure and the NMReDATA)
     • Folders including the relevant NMR spectra (with FID, acquisition and processing parameters in the manufacturer’s format)
     • In order to facilitate transfers and exchanges of records, the folder can be compressed in the .zip format.
     • The NMR records (and the .sdf file) will be generated by computer-assisted structure elucidation software or web-based tools.
     Sounds like a ResearchObject to us…?
  11. Some questions to ponder:
     • How to enable interoperability between ROC and other repositories?
     • How do we get the word out there and get people to use ROs at scale?
     • What are the challenges for wide adoption by repositories? Authoring tools? Workflow tools?
     • How do ROs fit in with other initiatives: is an RO Data, Software, both?
       − Citations? Cf. software citation
       − Credit? Does it go along with new credit metrics, Make Data Count, etc.?
     • What role can publishers play in this?
       − Support standards (sit on panels, etc.)
       − What else?
  12. Acknowledgements: This work was funded by the National Institutes of Health, National Heart, Lung, and Blood Institute STAGE Project, with Seven Bridges Genomics Inc., Agreement No. 1 OT3 OD025463-01, and performed by: Marina Soares E Silva, Chris Wright, Wouter Haak, Carole Goble, Stian Soyland-Reyes, Finn Bacall.
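
The five dataset requirements listed on slide 4 can be read as a small record type. The following is a minimal sketch in Python, not the RO Composer's actual schema: the class name `DatasetRecord`, its field names, the `problems` check, and the example URLs, GUID and task names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical record mirroring the five requirements on slide 4."""
    guid: str
    landing_page_url: str               # 1. landing page URL including the GUID
    download_url: str                   # 2. URL where the file can be downloaded
    metadata: Dict[str, str]            # 3. metadata for the object
    derived_from: Optional[str] = None  # 4. zero or one task this dataset was derived from
    source_of: List[str] = field(default_factory=list)  # 5. tasks this dataset feeds

    def problems(self) -> List[str]:
        """Return the unmet requirements (empty list if the record is complete)."""
        found = []
        if self.guid not in self.landing_page_url:
            found.append("landing page URL does not include the GUID")
        if not self.download_url:
            found.append("missing download URL")
        if not self.metadata:
            found.append("missing metadata")
        return found


# Illustrative values only; none of these identifiers come from the slides.
record = DatasetRecord(
    guid="abc-123",
    landing_page_url="https://repo.example.org/datasets/abc-123",
    download_url="https://repo.example.org/datasets/abc-123/file",
    metadata={"title": "Example sequences"},
    derived_from="task-1",
    source_of=["task-2", "task-3"],
)
print(record.problems())
```

Modelling "derived from" as an optional single value and "source of" as a list follows the cardinalities stated in the requirements (zero or one task upstream, any number of tasks downstream).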

Editor's Notes

  • Big biomedical data has the potential to deliver more knowledge about diseases, faster.
  • Collaboration between data service providers, developers of research object standards, and others increases the chance of delivering an interoperable open research data ecosystem, which we aim to make sustainable and scalable.
  • Research Objects: a standards-based metadata framework for logically and physically bundling resources with context (http://researchobject.org).
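
As an illustration of bundling files with context, the sketch below profiles a set of files (filename, length, checksum, as named for the Research Object Profiler on slide 6) and writes manifest lines in the general shape of a BagIt payload manifest (checksum followed by a path under `data/`). It is an assumption-laden example in Python, not the RO Composer's serializer: the function names `profile` and `manifest_lines` and the sample file contents are hypothetical.

```python
import hashlib
from typing import Dict, List


def profile(files: Dict[str, bytes]) -> List[dict]:
    """Describe each file by filename, length in bytes, and SHA-256 checksum."""
    entries = []
    for filename in sorted(files):
        content = files[filename]
        entries.append({
            "filename": filename,
            "length": len(content),
            "sha256": hashlib.sha256(content).hexdigest(),
        })
    return entries


def manifest_lines(entries: List[dict]) -> List[str]:
    """Format entries as '<checksum>  data/<filename>' lines, BagIt-manifest style."""
    return [f"{e['sha256']}  data/{e['filename']}" for e in entries]


# Sample in-memory files; the names and contents are made up for this example.
files = {
    "results.csv": b"sample,value\nA,1\n",
    "protocol.txt": b"Run analysis Y.\n",
}
entries = profile(files)
for line in manifest_lines(entries):
    print(line)
```

A consumer can recompute each checksum after download and compare it with the manifest to confirm that the bundle arrived intact.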
