Use of CEDAR Technology for Ontology-based
Submission of Biomedical Data to the NCBI
Syed Ahmad Chan Bukhari Ph.D., Kei-Hoi Cheung Ph.D., Steven H Kleinstein Ph.D.
Yale University
NCBI is an important resource to archive biomedical data
● NCBI hosts a collection of biomedical databases:
○ BioProject, BioSample, SRA, GenBank, GEO etc.
● Provides infrastructure to submit experimental data and associated metadata
● Makes minimal use of standard terminologies to define the necessary metadata
○ Ontologies are recommended for some data elements (not implemented)
● NCBI metadata are often described using inconsistent terminologies
○ This limits our ability to find, access, interoperate with, and reuse the data sets
Goal: Leverage CEDAR to improve NCBI metadata submissions
The NCBI BioSample guideline suggests using Disease Ontology terms
How are metadata currently submitted to NCBI?
BioProject
BioSample
Sequence Read Archive
Combination of web-based forms and Excel templates
● No mechanism to enforce standardized vocabularies or ontology links
NCBI repositories need improved metadata
CEDAR maps components (e.g., entities, attributes, and value sets) to standard
ontologies that provide global definitions and machine-readable identifiers
Link to BRENDA Tissue and Enzyme Source Ontology (BTO)
Link to Cell Ontology
Example NCBI BioSample Record
“B cell”, “B-cell” and “Bcell”
CEDAR-to-NCBI Solution
Link to Cell Ontology
Link to Disease Ontology
(for real-time validation)
Wrong location for info
Link to NCBI Taxonomy Ontology
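
The “B cell”, “B-cell” and “Bcell” problem above can be illustrated with a minimal sketch. It is not CEDAR's implementation: the lookup table, the `normalize` and `link_cell_type` helpers, and the choice of two terms are assumptions made for illustration, and the Cell Ontology identifiers should be checked against the ontology before any real use.

```python
import re

# Hypothetical lookup table: normalized label -> (ontology term ID, preferred label).
# The CL identifiers are given for illustration; verify them against the Cell Ontology.
CELL_TYPE_TERMS = {
    "bcell": ("CL:0000236", "B cell"),
    "tcell": ("CL:0000084", "T cell"),
}


def normalize(label: str) -> str:
    """Lower-case the label and drop spaces, hyphens and underscores."""
    return re.sub(r"[\s\-_]+", "", label.strip().lower())


def link_cell_type(label: str):
    """Return (term_id, preferred_label), or None when no term matches."""
    return CELL_TYPE_TERMS.get(normalize(label))


if __name__ == "__main__":
    # The three spellings from the example record resolve to one term.
    for raw in ["B cell", "B-cell", "Bcell"]:
        print(raw, "->", link_cell_type(raw))
```

Storing the identifier rather than the free-text spelling is what lets the three variants be treated as the same concept downstream; a value that returns `None` could be flagged to the submitter at entry time.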
Adaptive Immune-Receptor Repertoire (AIRR) Community
Next-generation sequencing of B & T cell receptor repertoires (AIRR-seq)
Developing standard protocols for reporting and sharing AIRR-seq data to optimize their use in biomedical research and patient care
AIRR Working Groups
Minimal Standards
Tools and Resources
Common Repository
AIRR Community Formed
1. Study, Subject, Diagnosis
2. Sample Processing
3. Nucleic Acid Processing and Sequencing
4. Raw Data
5. Data Processing
6. Processed Sequences with Annotations
o Study title
o Study type
o Study inclusion/exclusion criteria
o Grant funding agency
o Lab name
o Contact information
o Contact of person uploading data
o Lab address
o Relevant publications (identifiers)
o Subject ID
o Animal, human or synthetic
o Sex
o Age
o Age event
o Ancestry population
o Ethnicity
o Race
o Species name
o Strain name
o Linked to other subject?
o Type of link
o Relevant Clinical History
o Study Group Description
o Disease(s)
o Disease stage
o Process type
o Immunogen/agent
o Biological sample ID
o Sample type
o Anatomic site/source
o Disease state of sample
o Sample collection time (relative to T0)
o Collection time event (T0)
o Source (from commercial)
o Experiment Sample
o Tissue processing
o Cell isolation/enrichment procedure
o Processing (sample)
o Cell subset
o Cell subset phenotype
o Single cell or bulk?
o How many cells in experiment?
o Number of cells per sequencing reaction
o Target substrate (DNA or RNA)
o Library generation method
o Library generation protocol
o Target locus for PCR
o Forward PCR primer location
o Reverse PCR primer location
o Forward primer sequences
o Reverse primer sequences
o Whole vs. partial sequences
o Heavy vs. Light vs. paired
o Amount of template (ng)
o Total reads
o Total reads passing QC
o Calibrator and other internal controls
o Total reads passing QC
o Protocol ID(s)
o Sequencing platform
o Read length(s)
o Sequencing facility
o Batch number
o Date of Sequencing run
o Sequencing kit
o File containing the raw sequences
o Names of software tools
o Version numbers
o Paired read assembly
o Quality thresholds
o Primer match cut-offs
o Collapsing method
o Data processing protocols (free text)
o V(D)J germline reference database
o V gene
o D gene
o J gene
o CDR3 nucleotide sequence
o CDR3 amino acid sequence
o Read count
AIRR Community Data Elements
Each of the 6 high-level principles has been expanded into a set of data elements
Standard implemented @ NCBI
BioProject
BioSample
SRA
GenBank
Deposited at FAIRsharing.org:
https://fairsharing.org/bsg-s000689
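
As a rough sketch of how the six sections and their data elements could be checked before submission, the snippet below lists a small subset of the elements and reports which ones a record leaves empty. The assignment of elements to sections is inferred from their order on the slide, treating every listed element as required is an assumption, and `AIRR_SECTIONS` and `missing_elements` are hypothetical names rather than part of CEDAR or the AIRR standard.

```python
# Subset of the data elements listed above, grouped by section.
# The grouping is an assumption inferred from the order on the slide.
AIRR_SECTIONS = {
    "Study, Subject, Diagnosis": ["Study title", "Subject ID", "Sex", "Species name"],
    "Sample Processing": ["Biological sample ID", "Sample type"],
    "Nucleic Acid Processing and Sequencing": [
        "Target substrate (DNA or RNA)",
        "Sequencing platform",
    ],
    "Raw Data": ["File containing the raw sequences"],
    "Data Processing": ["Names of software tools", "Version numbers"],
    "Processed Sequences with Annotations": ["V gene", "J gene", "Read count"],
}


def missing_elements(record: dict) -> dict:
    """Map each section to the elements that are absent or blank in the record."""
    missing = {}
    for section, elements in AIRR_SECTIONS.items():
        values = record.get(section, {})
        absent = [e for e in elements if not str(values.get(e, "")).strip()]
        if absent:
            missing[section] = absent
    return missing


if __name__ == "__main__":
    # Made-up record used only to exercise the check.
    record = {
        "Study, Subject, Diagnosis": {
            "Study title": "Example study",
            "Subject ID": "S1",
            "Sex": "",
            "Species name": "Homo sapiens",
        },
        "Raw Data": {"File containing the raw sequences": "reads.fastq"},
    }
    for section, elements in missing_elements(record).items():
        print(section, "is missing:", ", ".join(elements))
```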
CEDAR-AIRR-NCBI Templates
Created CEDAR templates to submit metadata to NCBI BioProject, BioSample and SRA
CEDAR-AIRR-NCBI Metadata Generation
Data Submitter
NCBI CEDAR
Controlled Vocabularies
Predictive Entry
Interactive Metadata Entry
Metadata Findability
Metadata Accessibility
Metadata Interoperability
Metadata Reusability
Legend: marked items have limited feature availability
Metadata submissions to NCBI BioProject, BioSample and SRA are ontologically controlled and relationally linked, which enables concept-based federated queries across repositories that would otherwise be silos.
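
A concept-based query over linked records might look like the following sketch. It uses small in-memory lists rather than any NCBI or CEDAR API; the record classes, the `runs_for_concept` function and all accessions are hypothetical placeholders, and the Cell Ontology identifiers are shown for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BioSample:
    accession: str
    bioproject: str      # link to the parent BioProject
    cell_type_id: str    # ontology term ID instead of free text


@dataclass(frozen=True)
class SraRun:
    accession: str
    biosample: str       # link to the BioSample the run was generated from


# Placeholder records; the accessions are invented for this sketch.
BIOSAMPLES = [
    BioSample("SAMN-EXAMPLE-1", "PRJNA-EXAMPLE-1", "CL:0000236"),
    BioSample("SAMN-EXAMPLE-2", "PRJNA-EXAMPLE-1", "CL:0000084"),
    BioSample("SAMN-EXAMPLE-3", "PRJNA-EXAMPLE-2", "CL:0000236"),
]
SRA_RUNS = [
    SraRun("SRR-EXAMPLE-1", "SAMN-EXAMPLE-1"),
    SraRun("SRR-EXAMPLE-2", "SAMN-EXAMPLE-2"),
    SraRun("SRR-EXAMPLE-3", "SAMN-EXAMPLE-3"),
]


def runs_for_concept(term_id: str):
    """Return (BioProject, BioSample, SRA run) triples for one ontology term."""
    samples = {s.accession: s for s in BIOSAMPLES if s.cell_type_id == term_id}
    return [
        (samples[r.biosample].bioproject, r.biosample, r.accession)
        for r in SRA_RUNS
        if r.biosample in samples
    ]


if __name__ == "__main__":
    for triple in runs_for_concept("CL:0000236"):
        print(triple)
```

Because each sample carries a term identifier and each run points back to its sample, one query by concept can return runs from different projects without any matching on spelling variants.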
CEDAR-AIRR-NCBI Submission Workflow
Demo
http://bit.ly/2uY7Lhk
Acknowledgment
● National Institutes of Health, through the NIH Big Data to Knowledge program, under grant U54AI117925
● Ben Busby, NCBI
● Leila Rassi, SRA
● Tanya Barrett, GEO
● Kleinstein Lab
● Team CEDAR
Thanks

