Ontologies for Semantic Normalization of Immunological Data
Yannick Pouliot1, Atul J. Butte1,2
1. Division of Systems Medicine, Stanford University School of Medicine
2. Center for Pediatric Bioinformatics, Lucile Packard Children’s Hospital, Palo Alto, California
The Problem
Data from experiments probing the immune system are inherently
complex because of the diversity of data types, assay types and the
number of biological agents involved. This complexity is further
increased by the multi-center nature of data generated by HIPC.
One of the goals of HIPC is to deliver a database able to support
broad community access to these complex data sets. Critical to the
success of this database will be its ability to provide conceptual
characterizations of experiments and their results (“data and
metadata encoding”). Such encodings identify data sets according to
experimental properties so that users can quickly narrow their
searches to the most pertinent results.
To this end, conceptual encodings that rely on “industry-standard”
ontologies are preferred. We
determined the extent to which existing ontologies can be used to
encode HIPC data, and ImmPort’s ability to support the application of
these concepts.
Methods
Since ImmPort will be the repository of HIPC data, we evaluated its
use of ontologies. Upon determining that ImmPort is not
ontology-compliant, we analyzed the universe of ontologies to determine the
extent to which existing ontologies can be used to encode HIPC data,
and ImmPort’s ability to support the application of these concepts.
Ontologies provided by BioPortal [1] were first selected
based on their domain of application:
Domain | Description | Example Ontology
--- | --- | ---
Analysis | Process of data analysis | Ontology of Data Mining
Anatomy | Anatomical structures at all levels of resolution except molecular | Cell Ontology
Disease | Disease states manifested by organisms at anatomical, spatial, temporal and functional levels | Infectious Disease Ontology (IDO)
Experimental conditions | Conditions/specifications associated with a scientific or clinical protocol | Ontology of Clinical Research (OCRe)
Modeling | Process/properties/data types of modeling, computational or otherwise | Interaction Network Ontology (INO)
Molecule | Aspects of biomolecules: structure, sequence, function | ONTIE - Ontology of Immune Epitopes
Pathways | Biochemical, signaling pathways used by organisms | Pathway Ontology
Phenotype | States manifested by organisms at anatomical, spatial, temporal and functional levels. “Anatomy” and “Disease” are components of “Phenotype” but treated distinctly | Phenotypic Quality
Fifty-seven ontologies within these domains were then
screened for preliminary suitability to HIPC data/metadata
according to four broad criteria:
Criterion | Parameter
--- | ---
Design | Must be integrative (ability to draw on similar/identical concepts from other ontologies)
 | Must minimize overlap with other ontologies, consistent with providing terms that can be inter-related across ontologies
 | Should be “relatable” to clinical applications
 | Must be applicable to humans, and perhaps one animal model
 | Must be an ontology, not just a controlled vocabulary (with a few exceptions)
Developmental state | Evidence of ongoing development and maintenance
 | Developed with, accepted by standards or professional organizations
 | Adheres to standards such as the Basic Formal Ontology
 | Must be released (post beta)
Usage | Reasonably widely adopted
Content | Must exhibit a good balance between expressiveness and usability/understandability: conceptual clarity (e.g., no ambiguous classifications for dual-use organs such as reproductive/urinary organs); limited redundancy of synonymous concepts; usable definitions of concepts describing HIPC data or metadata, including experiment design; completeness (frequency of missing concepts); correctness (how accurately a concept is expressed)
These ontologies were then analyzed for their ability to
recognize terms from text obtained from Build 1 datasets,
protocols, and metadata, as well as from Stanford’s DataMt
database [2] (which stores many Stanford HIPC datasets). An
automated pipeline was written that relies on the National Center for
Biomedical Ontology’s Annotator [3], accessed through
BioPortal’s Web services, to parse the text and attempt to
map it to the reference ontologies.
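
The annotation step described in the Methods can be illustrated with a minimal Python sketch. It assumes BioPortal's current REST Annotator endpoint (https://data.bioontology.org/annotator) and its `text`, `ontologies` and `apikey` parameters, which may differ from the Web service the original pipeline called. The function names, the ontology acronyms in the usage note, and the response fields read below (`annotatedClass`, `links`, `ontology`) are assumptions for illustration, not the authors' code.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the current BioPortal Annotator endpoint, not necessarily
# the service used by the original pipeline.
ANNOTATOR_URL = "https://data.bioontology.org/annotator"


def build_annotator_request(text, ontologies, api_key):
    """Build the GET URL asking the Annotator to map `text` to `ontologies`."""
    params = {
        "text": text,
        "ontologies": ",".join(ontologies),  # comma-separated acronyms
        "apikey": api_key,
    }
    return f"{ANNOTATOR_URL}?{urlencode(params)}"


def matched_ontologies(annotations):
    """Collect the acronyms of the ontologies that recognized something.

    Assumes each result carries annotatedClass.links.ontology, a URL whose
    last path segment is the ontology acronym.
    """
    found = set()
    for item in annotations:
        ontology_url = item["annotatedClass"]["links"]["ontology"]
        found.add(ontology_url.rstrip("/").rsplit("/", 1)[-1])
    return found


def annotate(text, ontologies, api_key):
    """Call the service (network access and a valid API key required)."""
    url = build_annotator_request(text, ontologies, api_key)
    with urlopen(url) as response:
        return matched_ontologies(json.load(response))
```

A call such as annotate("CD20 positive B cell", ["NCIT", "CL"], api_key) would return the subset of the listed ontologies that recognized at least one term in the text; an empty set corresponds to a mapping failure.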
Results & Discussion
Ontology: NCI Thesaurus; Medical Subject Headings; Molecule role; SNOMED Clinical Terms; PRotein Ontology (PRO); Cell Cycle Ontology; Ontology for Biomedical Investigations; Experimental Factor Ontology; SemanticScience Integrated Ontology; Units of measurement; Phenotypic quality; EDAM; Foundational Model of Anatomy; Vaccine Ontology; ICPC-2 PLUS; MGED Ontology; Measurement Method Ontology; Gene Ontology; Ontology of Clinical Research (OCRe); Protein-protein interaction; Mammalian phenotype
Build 1, % (n): 16.7% (7), 7.1% (3), 14.3% (6), 2.4% (1), 16.7% (7), 7.1% (3), 2.4% (1)
ImmPort CV, % (n): 33.3% (14), 21.4% (9), 0.0%, 11.9% (5), 0.0%, 0.0%, 14.3% (6), 7.1% (3), 2.4% (1), 2.4% (1), 7.1% (3), 2.4% (1), 2.4% (1), 2.4% (1), 2.4% (1)
DataMt, % (n): 52.6% (92), 22.9% (40), 24.6% (43), 17.7% (31), 10.3% (18), 12.6% (22), 5.7% (10), 7.4% (13), 2.9% (5), 2.3% (4), 2.3% (4), 0.6% (1), 2.3% (4), 1.1% (2), 1.1% (2), 0.6% (1), 0.6% (1), 0.6% (1)
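
The percentage and n columns above are per-ontology counts of recognized terms divided by the number of terms submitted. A minimal sketch of that tally follows, assuming the mapping results are held as a dictionary from each term to the set of ontologies that recognized it. The function name, the placeholder ontology names, and the denominator of 42 in the usage note are assumptions: 42 is inferred from the figures (7/42 ≈ 16.7%) and is not stated in the poster.

```python
from collections import Counter


def ontology_coverage(term_matches, total_terms=None):
    """Return {ontology: (n, percent)} for terms recognized per ontology.

    term_matches maps each submitted term to the set of ontologies that
    recognized it (an empty set means the term failed to map).
    total_terms defaults to the number of submitted terms.
    """
    total = len(term_matches) if total_terms is None else total_terms
    if total == 0:
        return {}
    counts = Counter(
        ontology
        for matched in term_matches.values()
        for ontology in set(matched)  # count each term once per ontology
    )
    return {
        ontology: (n, round(100.0 * n / total, 1))
        for ontology, n in counts.most_common()
    }
```

For example, with 42 submitted terms of which 7 are recognized by a hypothetical "Ontology A", the function returns (7, 16.7) for that ontology. Because one term can be recognized by several ontologies, the percentages need not sum to 100%.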
• Many mapping failures are attributable to the lack of definitions for commercial objects
within the reference ontologies (e.g., “Anti-CD27” antibody from BD)
Solution: Contact ontology owners to have them add commercial terms
to their ontologies
• Many mapping failures are easily correctable
Example: Add a pre-processor able to recognize instances of the “anti-”
problem (e.g., “anti-CD20” is not recognized even though “CD20” is known)
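
One way such an “anti-” pre-processor could work is sketched below. It is an illustration only, with hypothetical function names and a plain set standing in for the terms an ontology recognizes. It retries a failed term without its “anti-” prefix, which identifies the antibody's target (e.g., CD20) rather than the antibody reagent itself.

```python
import re

# Matches a leading "anti" followed by at least one hyphen or space, so
# "anti-CD20" and "Anti CD27" qualify but "antibody" does not.
ANTI_PREFIX = re.compile(r"^anti[-\s]+(?=\S)", re.IGNORECASE)


def candidate_terms(term):
    """Return lookup candidates: the term itself, then its "anti-"-less form."""
    term = term.strip()
    candidates = [term]
    stripped = ANTI_PREFIX.sub("", term, count=1)
    if stripped != term:
        candidates.append(stripped)
    return candidates


def resolve(term, known_terms):
    """Return the first candidate present in known_terms, or None."""
    for candidate in candidate_terms(term):
        if candidate in known_terms:
            return candidate
    return None
```

With known_terms = {"CD20", "CD27"}, resolve("anti-CD20", known_terms) returns "CD20", while a term with no match in either form still returns None and remains a mapping failure.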
We conclude that ImmPort should be able to migrate toward ontologically-based
encodings.
References
1. Noy et al. (2009) “BioPortal: Ontologies and Integrated Data Resources at the Click of a
Mouse”, Nucl. Acids Res., 37:W170-W173.
2. Siebert, J., Munsil, D. & Maecker, H. (2011) “A Novel Approach for Integrating and Exploring
Heterogeneous Translational Data”, manuscript in preparation.
3. Jonquet et al. (2009) “The Open Biomedical Annotator”, Summit on Transla. Bioinfo., 56-60.
Acknowledgments
NIAID, Hewlett Packard Foundation, Butte Lab