The CEDAR Workbench: An Ontology-Assisted
Environment for Authoring Metadata that
Describe Scientific Experiments
Rafael Gonçalves, Martin O'Connor, Marcos Martínez Romero,
Attila Egyedi, Debra Willrett, John Graybeal, and Mark Musen
Stanford University
CEDAR: Center for Expanded Data Annotation and Retrieval
Metadata Are Essential in Science
• Metadata are crucial for finding, reproducing, and reusing the data that they describe
• The FAIR data principles specify desirable criteria that metadata and their datasets should meet to be Findable, Accessible, Interoperable, and Reusable
• For metadata to be interoperable, they should rely on controlled terms from ontologies
Metadata Lifecycle
• Metadata are typically authored in spreadsheets
• Metadata are uploaded to public repositories
– E.g., ImmPort, GEO, etc.
• Repositories potentially verify metadata
[Figure: scientists fill in spreadsheet templates with metadata (e.g., "A sample study") and submit the metadata and data to a public repository]
Metadata are not standardized
Examples of field names that vary across submissions:
• Contributor
• Contributor(s)
• Author
• Authors
• Submitter
• Submitted By
• Creator
• PI
• Provider
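The naming drift listed on this slide can be sketched as a small normalization step. This is a minimal illustration only: it assumes, for the sake of the example, that all of the listed labels denote the same notion, and the canonical key "contributor" and both function names are hypothetical rather than anything defined by CEDAR or a repository.

```python
# Variant labels taken from the slide. Treating them as synonyms is an
# assumption made for this sketch; in practice "PI" and "Submitter" may differ.
CANONICAL_FIELD = "contributor"  # hypothetical canonical key
VARIANTS = {
    "contributor", "contributor(s)", "author", "authors", "submitter",
    "submitted by", "creator", "pi", "provider",
}


def normalize_field_name(label: str) -> str:
    """Map a free-text column label to the canonical key if it is a known variant."""
    # Lowercase and collapse whitespace so "Submitted  By" matches "submitted by".
    key = " ".join(label.strip().lower().split())
    return CANONICAL_FIELD if key in VARIANTS else key


def normalize_record(record: dict) -> dict:
    """Rename known variant keys; values are left untouched."""
    return {normalize_field_name(k): v for k, v in record.items()}
```

A lookup table like this only patches labels after the fact. Binding each field to a property IRI at authoring time, as the later slides describe, avoids the ambiguity in the first place.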
Metadata in the BioSample online repository are impaired by numerous anomalies (SemSci 2017)
Metadata are not standardized
It is extremely hard to:
– find experimental datasets
– understand how experiments were performed
– replicate study findings
Generating standard metadata is hard
• Submission formats rarely support ontology terms
• No easy way of finding terms from ontologies and including them in metadata submissions
A suite of tools to enable the creation of high-quality metadata in biomedicine
The CEDAR Workbench
[Figure: workbench architecture. Components: Template Designer, Metadata Editor, Metadata Repository, Biomedical Ontologies, Public Databases (LINCS). Flows: template authors design templates; metadata authors fill in templates with metadata; metadata are submitted to public databases.]
{
  "@context": {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "pav": "http://purl.org/pav/",
    //...
    "Title": "http://purl.obolibrary.org/obo/NGS_0000055",
    "Disorder": "http://purl.org/net/OCRe/OCRe.owl#OCRE900086",
    "Institution": "http://semantic-dicom.org/dcm#InstitutionName",
    "Principal Investigator": "http://purl.org/net/OCRe/OCRe.owl#OCRE901006",
    "Study Type": "http://purl.obolibrary.org/obo/NGS_0000056"
  },
  "@type": "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C63536",
  "Title": {
    "@value": "A sample study"
  },
  "Disorder": {
    "@id": "http://purl.obolibrary.org/obo/DOID_8986",
    "rdfs:label": "narcolepsy"
  },
  "Institution": {
    "@value": "Stanford University"
  },
  "Principal Investigator": {
    "@value": "John Doe"
  },
  "Study Type": {
    "@id": "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C15273",
    "rdfs:label": "Longitudinal Study"
  }
}
[Figure: the metadata instance above drawn as an RDF graph; each property is shown with its value]
• rdf:type: http://ncicb.nci.nih.gov/.../Thesaurus#C63536 (rdfs:label: Study)
• http://purl.obolibrary.org/obo/NGS_0000055 (rdfs:label: Title): "A sample study"
• http://purl.org/net/OCRe/OCRe.owl#OCRE900086 (rdfs:label: Disorder): http://purl.obolibrary.org/obo/DOID_8986 (rdfs:label: narcolepsy)
• http://purl.obolibrary.org/obo/NGS_0000056 (rdfs:label: Study Type): http://ncicb.nci.nih.gov/.../Thesaurus#C15273 (rdfs:label: Longitudinal Study)
• http://purl.org/net/OCRe/OCRe.owl#OCRE901006 (rdfs:label: Principal Investigator): "John Doe"
• http://semantic-dicom.org/dcm#InstitutionName (rdfs:label: Institution): "Stanford University"
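The step from the JSON-LD instance to the graph can be sketched by hand. The example below is a rough illustration, not a JSON-LD processor and not CEDAR's code: it uses a trimmed copy of the instance (the elided context entries are left out), the blank-node label "_:study" and the function name to_triples are hypothetical, and it only handles flat fields whose value carries either "@value" or "@id".

```python
import json

# Trimmed copy of the slide's instance; only two fields are kept for brevity.
INSTANCE = json.loads("""
{
  "@context": {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "Title": "http://purl.obolibrary.org/obo/NGS_0000055",
    "Disorder": "http://purl.org/net/OCRe/OCRe.owl#OCRE900086"
  },
  "@type": "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C63536",
  "Title": {"@value": "A sample study"},
  "Disorder": {
    "@id": "http://purl.obolibrary.org/obo/DOID_8986",
    "rdfs:label": "narcolepsy"
  }
}
""")

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def to_triples(doc: dict, subject: str = "_:study") -> list:
    """Expand a flat instance into (subject, predicate, object) tuples."""
    context = doc["@context"]
    # The "@type" entry becomes an rdf:type assertion on the instance.
    triples = [(subject, RDF_TYPE, doc["@type"])]
    for field, value in doc.items():
        if field.startswith("@"):
            continue  # skip JSON-LD keywords such as @context and @type
        predicate = context[field]  # field name resolved to its property IRI
        # Ontology-term fields carry "@id"; literal fields carry "@value".
        obj = value["@id"] if "@id" in value else value["@value"]
        triples.append((subject, predicate, obj))
    return triples
```

Calling to_triples(INSTANCE) yields one type assertion plus one triple per field, which mirrors the edges in the figure. A real pipeline would more likely rely on a conforming JSON-LD library than on this manual lookup.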
Who We Work With
Summary
• Authoring metadata is hard and time-consuming
• Authoring semantic metadata is even harder
– Lack of convenient tools for linking metadata to
ontologies in a metadata authoring workflow
• The CEDAR Workbench facilitates metadata
creation in a semantically rigorous way
– Add type and property assertions
– Constrain the values of fields to ontology terms
– Create classes and value sets
http://metadatacenter.org
http://cedar.metadatacenter.net
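The idea of constraining a field's values to ontology terms can be illustrated with a small check. This is a sketch under stated assumptions, not how the CEDAR Workbench implements constraints: the value set STUDY_TYPE_VALUES and the function check_field are hypothetical, and the single allowed term is simply the "Study Type" value from the example instance.

```python
# Hypothetical value set for a "Study Type" field: term IRI -> expected label.
STUDY_TYPE_VALUES = {
    "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C15273": "Longitudinal Study",
}


def check_field(value: dict, allowed: dict) -> list:
    """Return a list of problems; an empty list means the value passes."""
    problems = []
    iri = value.get("@id")
    if iri is None:
        # Free-text values ("@value") are rejected for a term-constrained field.
        problems.append("value is free text, expected an ontology term (@id)")
    elif iri not in allowed:
        problems.append(f"term {iri} is not in the value set")
    elif value.get("rdfs:label") != allowed[iri]:
        problems.append(f"label does not match term {iri}")
    return problems
```

With this value set, the "Study Type" entry from the example instance passes, while a free-text value such as {"@value": "longitudinal"} is reported as a problem.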
Thanks!

The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata that Describe Scientific Experiments (ISWC 2017 Conference)
