The Center for Expanded Data Annotation and Retrieval (CEDAR) aims to revolutionize the way that metadata describing scientific experiments are authored. The software we have developed, the CEDAR Workbench, is a suite of Web-based tools and REST APIs that allows users to construct metadata templates, to fill in templates to generate high-quality metadata, and to share and manage these resources. The CEDAR Workbench provides a versatile, REST-based environment for authoring metadata that are enriched with terms from ontologies. The metadata are available as JSON, JSON-LD, or RDF for easy integration in scientific applications and reusability on the Web. Users can leverage our APIs for validating and submitting metadata to external repositories. The CEDAR Workbench is freely available and open-source.
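To make "metadata enriched with terms from ontologies" concrete, here is a minimal Python sketch of what a JSON-LD-style metadata record could look like. It is an illustration only, not the CEDAR template model or API: the helper `make_instance`, the field names `organism` and `tissue`, and the `example.org` property IRIs are assumptions. The two term IRIs are standard OBO identifiers (NCBI Taxonomy for human, UBERON for liver).

```python
import json

# Hypothetical property IRIs; a real template would define its own.
PROPERTY_IRIS = {
    "organism": "https://example.org/properties/organism",
    "tissue": "https://example.org/properties/tissue",
}


def make_instance(values, property_iris):
    """Build a JSON-LD-style record in which every field value is an
    ontology term (IRI plus human-readable label) rather than free text."""
    context = dict(property_iris)
    context["rdfs"] = "http://www.w3.org/2000/01/rdf-schema#"
    instance = {"@context": context}
    for field, (term_iri, label) in values.items():
        instance[field] = {"@id": term_iri, "rdfs:label": label}
    return instance


record = make_instance(
    {
        "organism": ("http://purl.obolibrary.org/obo/NCBITaxon_9606", "Homo sapiens"),
        "tissue": ("http://purl.obolibrary.org/obo/UBERON_0002107", "liver"),
    },
    PROPERTY_IRIS,
)

# Serialize as plain JSON text; the "@context" block is what makes it JSON-LD.
print(json.dumps(record, indent=2))
```

Because each value carries an IRI, two records that say "Homo sapiens" and "human" in their labels can still be recognized as referring to the same term.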
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata that Describe Scientific Experiments (ISWC 2017 Conference)
1. The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata that Describe Scientific Experiments
Rafael Gonçalves, Martin O'Connor, Marcos Martínez Romero, Attila Egyedi, Debra Willrett, John Graybeal, and Mark Musen
Stanford University
CEDAR
CENTER FOR EXPANDED DATA
ANNOTATION AND RETRIEVAL
2. Metadata Are Essential in Science
• Metadata are crucial for finding, reproducing, and reusing the data that they describe
• The FAIR data principles specify desirable criteria that metadata and their datasets should meet to be Findable, Accessible, Interoperable, and Reusable
• For metadata to be interoperable, they should rely on controlled terms from ontologies
3. Metadata Lifecycle
• Metadata are typically authored in spreadsheets
• Metadata are uploaded to public repositories
– E.g., ImmPort, GEO, etc.
• Repositories potentially verify metadata
[Figure: scientists fill in spreadsheet templates with metadata for a sample study, then submit the data and metadata to a public repository]
7. Metadata in the BioSample online repository are impaired by numerous anomalies (SemSci 2017)
8. Metadata are not standardized
It is extremely hard to:
– find experimental datasets
– understand how experiments were performed
– replicate study findings
9. Generating standard metadata is hard
• Submission formats rarely support ontology terms
• No easy way of finding terms from ontologies and including them in metadata submissions
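The missing piece named on this slide, finding an ontology term while filling in a form, can be sketched as a small lookup over labels and synonyms. This is a hedged illustration in Python, not how the CEDAR Workbench or any ontology service implements search: the `Term` class, the `suggest_terms` function, and the in-memory `VOCABULARY` are hypothetical, and the synonyms are illustrative rather than copied from UBERON. The three IRIs are the UBERON identifiers for liver, lung, and heart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    iri: str
    label: str
    synonyms: tuple = ()


# Tiny stand-in for an ontology; a real tool would query a terminology service.
VOCABULARY = [
    Term("http://purl.obolibrary.org/obo/UBERON_0002107", "liver", ("hepar",)),
    Term("http://purl.obolibrary.org/obo/UBERON_0002048", "lung", ("pulmo",)),
    Term("http://purl.obolibrary.org/obo/UBERON_0000948", "heart"),
]


def suggest_terms(query, vocabulary, limit=5):
    """Return terms whose label or synonyms contain the query
    (case-insensitive). Label matches are listed before synonym matches."""
    q = query.strip().lower()
    if not q:
        return []
    label_hits = [t for t in vocabulary if q in t.label.lower()]
    synonym_hits = [
        t
        for t in vocabulary
        if t not in label_hits and any(q in s.lower() for s in t.synonyms)
    ]
    return (label_hits + synonym_hits)[:limit]


for term in suggest_terms("pulm", VOCABULARY):
    # The author typed a synonym fragment; the stored value is the term IRI.
    print(term.label, term.iri)
```

The design point is that the author types free text but the record stores the selected term's IRI, so the submission stays machine-interpretable.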
10. Suite of tools to enable the creation of high-quality metadata in biomedicine
11. The CEDAR Workbench
[Figure: CEDAR Workbench architecture. Template authors design templates in the Template Designer; metadata authors fill in templates with metadata in the Metadata Editor. Other components shown: Metadata Repository (templates, metadata), Biomedical Ontologies, and Public Databases (LINCS), to which metadata are submitted.]
30. Summary
• Authoring metadata is hard and time-consuming
• Authoring semantic metadata is even harder
– Lack of convenient tools for linking metadata to ontologies in a metadata authoring workflow
• The CEDAR Workbench facilitates metadata creation in a semantically rigorous way
– Add type and property assertions
– Constrain the values of fields to ontology terms
– Create classes and value sets
http://metadatacenter.org
http://cedar.metadatacenter.net
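The summary point "Constrain the values of fields to ontology terms" can be sketched as a check of a filled-in record against a template whose fields each carry a value set. The following Python is a minimal sketch under stated assumptions, not the CEDAR validation API: the `TEMPLATE` structure, its `required` and `allowed_terms` keys, and the `validate` function are all hypothetical. The IRIs are the NCBI Taxonomy identifiers for human and mouse and the UBERON identifier for liver.

```python
# Hypothetical template: each field lists the ontology term IRIs it accepts.
TEMPLATE = {
    "organism": {
        "required": True,
        "allowed_terms": {
            "http://purl.obolibrary.org/obo/NCBITaxon_9606",   # Homo sapiens
            "http://purl.obolibrary.org/obo/NCBITaxon_10090",  # Mus musculus
        },
    },
    "tissue": {
        "required": False,
        "allowed_terms": {
            "http://purl.obolibrary.org/obo/UBERON_0002107",   # liver
        },
    },
}


def validate(instance, template):
    """Return a list of error messages; an empty list means the instance
    satisfies the template's required-field and value-set constraints."""
    errors = []
    for field, spec in template.items():
        value = instance.get(field)
        if value is None:
            if spec["required"]:
                errors.append(f"{field}: required field is missing")
            continue
        if value not in spec["allowed_terms"]:
            errors.append(f"{field}: {value!r} is not in the field's value set")
    for field in instance:
        if field not in template:
            errors.append(f"{field}: not defined in the template")
    return errors


good = {"organism": "http://purl.obolibrary.org/obo/NCBITaxon_9606"}
bad = {"tissue": "liver"}  # free text instead of a term IRI, organism missing

print(validate(good, TEMPLATE))
print(validate(bad, TEMPLATE))
```

Rejecting the free-text value `"liver"` while accepting the UBERON IRI is the kind of constraint that spreadsheet-based submission cannot easily enforce.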
31. Thanks!