Building, Using, and Maintaining
Ontology-Enabled Resources in
Exposure Science and Health
Informatics Settings
Deborah L. McGuinness
Tetherless World Senior Constellation Chair
Professor of Computer, Cognitive, and Web Science
Director RPI Web Science Research Center
RPI Institute for Data Exploration and Application Health Informatics Lead
dlm@cs.rpi.edu , @dlmcguinness
OpenTox Durham, NC July 12, 2018
●Limited data integration without controlled
vocabulary
●Limited reproducibility without shared
definitions
●Difficulty in reuse without provenance
Ontologies can enhance findability, accessibility,
interoperability, reusability, and research impact
Integrated Understandable Data
McGuinness 7/18
What is an Ontology?
An ontology specifies a rich description of the
• Terminology, concepts, nomenclature
• Relationships among concepts and individuals
• Sentences distinguishing concepts, refining
definitions & relationships
relevant to a particular domain or area of interest.
* Based on the AAAI '99 Ontologies Panel: McGuinness, Welty, Uschold, Gruninger, Lehmann
• "Pull" for Ontologies. Invited
talk. Semantics for the Web.
Dagstuhl, Germany, 2000.
• Ontologies Come of Age.
Fensel, Hendler, Lieberman,
Wahlster, eds. Spinning the
Semantic Web: Bringing the
World Wide Web to Its Full
Potential. MIT Press, 2003.
Ontology Development Process
**
Use Cases
Existing Ontologies & Vocabularies
Expert Interviews
Labkey, Ontology Fragments
Ontology Curation (ongoing)
Reviewers & Curators
* Ontology Development Team
* Domain collaborators
* Invited experts
* "Consumers" (data analysts)
Knowledge Graph Integration
* Linking data and metadata content to domain terms
* Linking workflows based on semantic descriptions
Repository Integration
* Source Datasets
* Analytics source code
* Results
* Publications
Knowledge-Enhanced Search
Finding what is there that might be of use
Semantic Extract, Transform, Load (SETLr)
Expert Guidance
Sources
Data Reporting Templates
Data Dictionaries / Codebooks
Foundational Ontologies/Vocabularies
Human Aware Data Acquisition Framework
Ontology Browser
Generated Ontology
* domain concepts
* authoritative vocabularies
* vetted definitions
* supporting citations
Erickson, McGuinness, McCusker, Chastain, Pinheiro, Rashid, Liang, Liu, Stingone, …
Exemplified by CHEAR
CHEAR Ontology Effort
Goal: Encode the terminology currently needed by the CHEAR Data Center
Portal and publish an open-source, extensible ontology that integrates general
exposure science and health, leveraging best-in-class terminologies.
Enable Findable, Accessible, Interoperable, Reusable (FAIR) data and
services to support data analysis and interdisciplinary research.
Ontologies encode terms and their interrelationships, providing a foundation
for interoperability and reusability (the I and R in FAIR).
Ontology-enabled infrastructures, such as knowledge graphs and
ontology-enabled search services, also support finding and accessing
relevant content (the F and A in FAIR).
Child Health Exposure
Analysis Resource Ontology
Ontology Foundations
Imported Ontologies:
● Semanticscience Integrated Ontology (SIO)
● PROV-O
● Units Ontology
● Human-Aware Science Ontology (HAScO)
● Virtual Solar Terrestrial Observatory (Instruments) (VSTO-I)
● Environment Ontology (ENVO)
Ontologies referenced via Minimum Information to Reference an External
Ontology Term (MIREOT):
● Chemical Entities of Biological Interest (ChEBI)
● Statistics Ontology (STATO)
● PubChem
● UBERON (Anatomy)
● Disease Ontology (DO)
● UniProt (Proteins)
● Cogat (Cognitive Measures)
● Exposure Ontology (ExO)
● …
Annotations:
● Simple Knowledge Organization System (SKOS)
● Dublin Core (DC) Terms
CHEAR Ontology
Foundations and Reuse
Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
Mapping Data to Meaning:
Semantic Data Dictionaries
Rashid, Chastain, Stingone, McGuinness, McCusker. The Semantic Data Dictionary
Approach to Data Annotation and Integration. Enabling Open Semantic Science, in
Proceedings of the International Web Science Conference Knowledge Graphs Workshop Oct
21, 2017
• Ontology support for mapping and integration
(e.g., education level)
• Ontology informs decisions about variables that may be
combined, serve as a proxy, or be used to derive desired
information (e.g., birth outcomes)
• Ontology integrity constraints may help flag
errors (e.g., APGAR > 10)
• Ontology helps expose implicit information and find
links
Fenton Z-Score inputs: Sex, Birth weight, Gestational Age

Mother's Highest Education Level      Val
Did not attend school                 0
Elementary school                     1
Technical post-primary                2
Middle school                         3
Technical post-middle school          4
High school or junior college         5
Technical post-junior college         6
College                               7
Graduate                              8
Doesn't know                          9

Mother Education                      Val
Less than High School                 0
High School Graduate or More          1
Support Browsing,Searching, Pooling
Deriving Values, Verification, …
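The pooling and verification ideas on this slide can be illustrated with a minimal Python sketch. It is not the CHEAR implementation. The cut point between "Less than High School" and "High School Graduate or More" (codes 5 through 8 count as high school or more, code 9 is treated as unknown) is an assumption made for illustration, as are the function names. The only outside fact used is that APGAR scores range from 0 to 10.

```python
from typing import List, Optional, Tuple

# Codes from the detailed "Mother's Highest Education Level" codebook on the slide.
DETAILED_EDUCATION = {
    0: "Did not attend school",
    1: "Elementary school",
    2: "Technical post-primary",
    3: "Middle school",
    4: "Technical post-middle school",
    5: "High school or junior college",
    6: "Technical post-junior college",
    7: "College",
    8: "Graduate",
    9: "Doesn't know",
}

# ASSUMPTION: which detailed codes count as "High School Graduate or More".
HIGH_SCHOOL_OR_MORE = {5, 6, 7, 8}
UNKNOWN = {9}


def harmonize_education(code: int) -> Optional[int]:
    """Map a detailed code to the coarse 0/1 variable; None when unknown."""
    if code not in DETAILED_EDUCATION:
        raise ValueError(f"unrecognized education code: {code}")
    if code in UNKNOWN:
        return None
    return 1 if code in HIGH_SCHOOL_OR_MORE else 0


def apgar_violations(scores: List[int]) -> List[Tuple[int, int]]:
    """Return (row index, value) pairs for scores outside the valid 0-10 range."""
    return [(i, s) for i, s in enumerate(scores) if not 0 <= s <= 10]
```

With a mapping like this, a study coded at the detailed level can be pooled with one that only recorded the coarse variable, and out-of-range values are flagged rather than silently ingested.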
McGuinness, McCusker, Pinheiro, Stingone, et al.
Laboratory Information Management System (LIMS)-based
backend integrated with the ontology.
Includes automatic ingest, access control, data
governance, download, ...
Supports search over studies, data, samples, subjects, ...
Enables statisticians to ask for content to support
their studies, e.g., find
Child: Birth Weight, Gender, Gestational Age at Birth
Mother: Age, BMI “early in pregnancy based on
inclusion criterion for the particular study”, Parity,
Education
Metals: As, Cd, Mn, Mo, Pb
CHEAR Human Aware Data
Acquisition Framework
Pinheiro, Liang, Rashid, Liu, Chastain, Santos, McCusker, McGuinness
Sample Question: Gennings
Ontology-Enabled Study Search /
Precision Data Search and Download
Blood Biomarkers for Children’s Health (Study 1)
Institution:
Principal Investigator(s):
Number of Subjects:
Number of Samples:
Study Description:
Keywords:
Urine Biomarkers for Children’s Health (Study 2)
Institution:
Principal Investigator(s):
Number of Subjects:
Number of Samples:
Study Description:
Keywords:
Metabolomic Biomarkers for Children’s Health (Study 3)
Institution:
Principal Investigator(s):
Number of Subjects:
Number of Samples:
Study Description:
Keywords:
Ontology-enabled Framework:
AI Horizons Example **
Ontologies are an important piece,
but they are part of a larger integrated
framework
Semanalytics / SemNExt RPI team:
McGuinness and Bennett (PIs) with McCusker, Erickson, Seneviratne, and the
extended research groups, along with input from IBM HEALS collaborators and
motivation from many projects, particularly from NIEHS / Mount Sinai.
McGuinness, D.L. & Bennett, K. Integrating
Semantics and Numerics: Case Study on
Enhancing Genomic and Disease Data
Using Linked Data Technologies. In Proc.
Smart Data 2015, San Jose, CA.
Partially supported through IBM AI Horizons Network
Probability-Aware
Knowledge Exploration
• Knowledge imported from
drug, protein, and
disease interaction
databases.
• Each interaction given an
evidence-driven
probability.
• Find drugs that could
affect melanoma, filtered
by interaction probability.
• The best hypotheses
were generated using the
highest probabilities.
• Being expanded initially
for genomics-aware
cancer care applications
McCusker, Dumontier, Yan,
He, Dordick, McGuinness.
Finding Melanoma Drugs
through a Probabilistic
Knowledge Graph. PeerJ
4:e2007, 2016.
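Filtering drug candidates by interaction probability can be sketched in Python as a path search over a weighted graph. This is an illustrative sketch only, not the method of the cited paper: the toy graph and its probabilities are made up, and the joint probability of a path is taken as the product of its edge probabilities, which assumes the edges are independent. The names `best_path_probability`, `rank_drugs`, and `horizon` are hypothetical.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]

# Hypothetical toy graph: node -> [(neighbor, interaction probability)].
GRAPH: Graph = {
    "drugA": [("protein1", 0.9)],
    "drugB": [("protein2", 0.6)],
    "protein1": [("melanoma", 0.8)],
    "protein2": [("protein1", 0.5), ("melanoma", 0.3)],
}


def best_path_probability(graph: Graph, source: str, target: str,
                          horizon: float) -> float:
    """Highest joint probability over simple paths from source to target.

    Partial paths whose joint probability drops below `horizon` are pruned.
    Returns 0.0 when no path survives the pruning.
    """
    best = 0.0
    stack = [(source, 1.0, frozenset([source]))]
    while stack:
        node, prob, seen = stack.pop()
        for neighbor, edge_prob in graph.get(node, []):
            joint = prob * edge_prob  # ASSUMPTION: independent edges
            if joint < horizon or neighbor in seen:
                continue
            if neighbor == target:
                best = max(best, joint)
            else:
                stack.append((neighbor, joint, seen | {neighbor}))
    return best


def rank_drugs(graph: Graph, drugs: List[str], disease: str,
               horizon: float) -> List[Tuple[str, float]]:
    """Rank drugs by their best surviving path probability to the disease."""
    scored = [(d, best_path_probability(graph, d, disease, horizon)) for d in drugs]
    kept = [(d, p) for d, p in scored if p > 0.0]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Raising `horizon` trades recall for confidence: in the toy graph, a high horizon keeps only `drugA`, while a lower one also surfaces the weaker two-hop connection from `drugB`.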
RPI knowledge graph technology
curates from diverse sources
Text Mining
Linked Data
Biomedical Databases
Omics and Epidemiology Datasets
File uploads of any kind
Fully automated
Reproducible and traceable from diverse data types:
Tabular CSV & XLS, XML, JSON, HTML, etc.
Transformed into rigorously modeled knowledge:
Meaning as structure, global IDs, foundational ontologies
Tracked and versioned using provenance standards
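The step from tabular files to modeled knowledge with global IDs can be illustrated with a small Python sketch. It does not use SETLr or any RDF library. The `http://example.org/` namespace, the predicate names, and the function `rows_to_triples` are all hypothetical, chosen only to show each row becoming a globally identified subject with one statement per mapped column.

```python
import csv
import io
from typing import Dict, List, Tuple

BASE = "http://example.org/"  # hypothetical namespace for illustration

Triple = Tuple[str, str, str]


def rows_to_triples(csv_text: str, id_column: str,
                    column_predicates: Dict[str, str]) -> List[Triple]:
    """Turn each CSV row into (subject, predicate, value) statements.

    The subject is a global ID built from the row's identifier column;
    empty cells produce no statement.
    """
    triples: List[Triple] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + "subject/" + row[id_column]
        for column, predicate in column_predicates.items():
            value = row.get(column)
            if value:
                triples.append((subject, predicate, value))
    return triples
```

In a fuller pipeline the column-to-predicate mapping would come from a data dictionary that points at ontology terms, rather than being written by hand.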
RPI knowledge graph technology
tracks the provenance of everything
All knowledge is encoded with its
provenance and publication details
using a "nanopublications" approach.
Knowledge curation traces the source and method of
transformation:
Linked Data "was quoted from X";
Semantic ETL stores workflows as provenance.
Inference agents show their work:
What knowledge was used? What rules were applied?
Knowledge from user interactions traces back to the
user:
Who made the claim? What other knowledge have they added? Do
we trust this person?
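As a rough illustration of the nanopublication idea, the Python sketch below models a claim as three parts: the assertion, its provenance, and its publication details. In practice nanopublications are expressed as RDF named graphs; the dataclass, the string predicates such as `wasQuotedFrom` and `creator`, and the helper `claims_by` are hypothetical simplifications for this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Nanopublication:
    """One claim plus where it came from and who published it."""
    assertion: Tuple[Triple, ...]         # the knowledge itself
    provenance: Tuple[Triple, ...]        # source and method of transformation
    publication_info: Tuple[Triple, ...]  # who made the claim, and when


def claims_by(creator: str, pubs: List[Nanopublication]) -> List[Nanopublication]:
    """Answer "what other knowledge have they added?" for a given creator."""
    return [
        pub for pub in pubs
        if any(pred == "creator" and obj == creator
               for _, pred, obj in pub.publication_info)
    ]
```

Keeping provenance and publication details attached to every assertion is what lets a user or an inference agent trace any result back to its source.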
RPI knowledge graph technology
uses provenance-driven probabilities
Experimental Methods
Knowledge Sources
Expert opinions can evaluate
evidence based on experimental
method:
Compute probabilities (edge width)
from expert opinions about protein
interaction assays and knowledge
source reputation.
Probabilities focus on high-
quality results:
Graph exploration relies on a "joint
probability horizon" for searching
out hypothetical connections.
Current probabilities are system
wide, future work will be
RPI knowledge graph technology
enables probabilistic graph exploration
What drugs are likely to have an effect on a
disease?
What diseases is this drug likely to affect?
How are X and Y likely to be connected?
Which tumor mutations might be driving the
tumor?
What pathways are affected by a given drug?
Users can dig into the chain of evidence
supporting the results of graph
exploration:
Everything is linked, justified, and
explainable.
Powering Personalized Treatment Agent
New Staging Guidelines
Axiom Generation
Ontologies
Map Cancer Terminology
Patient Datasets
Synthetic Patient Data
Treatment, and Monitoring Guidelines
Semantic Data Dictionary
Conversion
Extraction
Reasoner
Visualization
New Data + Justifications
Seneviratne, Rashid, Chari, McCusker, Bennett, Hendler and McGuinness. Knowledge Integration for Disease Characterization: A
Breast Cancer Example. In Proceedings of the International Semantic Web Conference, Monterey, CA, 2018.
Personalized Treatment Agent - Patient “D”
Biomarkers dictate
the drug options
D’s triple-positive
combination leads to
downstaging
Overall treatment suggestions change based
on disease characterization using new data and
guidelines
Discussion
• Ontologies enable FAIR (Findable,
Accessible, Interoperable, Reusable) Data
Resources
• They can support movement across levels
of abstraction
• Use ontology-enabled architectures
• Do NOT build ontologies from scratch
• Selectively and thoughtfully reuse existing
best practice ontologies/vocabularies
• Engage experts in choosing ontology
(portions) and in designing the knowledge
architecture
• Ecosystems and diverse teams are critical
for success – community driven and
maintained ontology-based systems are the
future
• Let's enable the next generation of decision
support together! Please contact us for
collaboration
Conclusion
• Semantics / Ontologies
enable new
personalized and
population level health
• Contact: dlm@cs.rpi.edu
• Thanks to many: RPI Tetherless World team
particularly McCusker, Erickson, Hendler,
Pinheiro, Rashid, Liang, Liu, Chastain; RPI:
Bennett, Dyson, Seneviratne; Mount Sinai
particularly Teitelbaum, Stingone, Mervish,
Gennings, Kovatch; IBM, particularly Das, Chen,
Brown, ….
• Funding: NIEHS 0255-0236-4609 /
1U2CES026555-01, DARPA HR0011-16-2-0030,
IBM-RPI HEALS AI Horizons Network, NSF ACI-
1640840
• Forthcoming book: Ontology Engineering with
Kendall

Weitere ähnliche Inhalte

Kürzlich hochgeladen

Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024Timothy Spann
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 

Kürzlich hochgeladen (20)

Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 

Empfohlen

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Empfohlen (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Mc guinnessopentox20180712durhamnc final

  • 1. Building, Using, and Maintaining Ontology-Enabled Resources in Exposure Science and Health Informatics Settings Deborah L. McGuinness Tetherless World Senior Constellation Chair Professor of Computer, Cognitive, and Web Science Director RPI Web Science Research Center RPI Institute for Data Exploration and Application Health Informatics Lead dlm@cs.rpi.edu , @dlmcguinness OpenTox Durham, NC July 12, 2018
  • 2. ●Limited data integration without controlled vocabulary ●Limited reproducibility without shared definitions ●Difficulty in reuse without provenance Ontologies can enhance findabiity, accessibility, interoperability, reusability and research impact Integrated Understandable Data McGuinness 7/18
  • 3. What is an Ontology? An ontology specifies a rich description of the • Terminology, concepts, nomenclature • Relationships among concepts and individuals • Sentences distinguishing concepts, refining definitions & relationships relevant to a particular domain or area of interest. * Based on AAAI ‘99 Ontologies Panel ̶ McGuinness, Welty, Uschold, Gruninger, Lehmann McGuinness 6/7/2017 • "Pull" for Ontologies. Invited talk. Semantics for the Web. Dagstuhl, Germany, 2000. • Ontologies Come of Age. Fensel, Hendler, Lieberman, Wahlster, eds. Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press, 2003. McGuinness 7/18
  • 4. Ontology Development Process ** Use Cases Existing Ontologies & Vocabularies Expert Interviews Labkey, Ontology Fragments Ontology Curation (ongoing) Reviewers & Curators * Ontology Development Team * Domain collaborators * Invited experts * "Consumers" (data analysts) Knowledge Graph Integration * Linking data and metadata content to domain terms * Linking workflows based on semantic descriptions Repository Integration * Source Datasets * Analytics source code * Results * Publications Knowledge-Enhanced Search: finding what is there that might be of use. Semantic Extract, Transform, Load (SETLr) Expert Guidance Sources Data Reporting Templates Data Dictionaries / Codebooks Foundational Ontologies/Vocabularies Human Aware Data Acquisition Framework Ontology Browser Generated Ontology * domain concepts * authoritative vocabularies * vetted definitions * supporting citations. Erickson, McGuinness, McCusker, Chastain, Pinheiro, Rashid, Liang, Liu, Stingone, … Exemplified by CHEAR
  • 5. CHEAR Ontology Effort. Goal: Encode terminology currently needed by the CHEAR Data Center Portal, and publish an open source, extensible ontology integrating general exposure science and health, leveraging best-in-class terminologies. Enabling Findable, Accessible, Interoperable, Reusable data and services to support data analysis and interdisciplinary research. Ontologies encode terms and their interrelationships, providing a foundation for understanding interoperability and reusability (the I and R in FAIR). Ontology-enabled infrastructures (knowledge graphs and ontology-enabled search services) also provide support for finding and accessing relevant content (the F and A in FAIR). Child Health Exposure Analysis Resource Ontology
  • 6. Ontology Foundations. Imported Ontologies: ● Semanticscience Integrated Ontology (SIO) ● PROV-O ● Units Ontology ● Human-Aware Science Ontology (HAScO) ● Virtual Solar Terrestrial Observatory (Instruments) (VSTO-I) ● Environment Ontology (ENVO). Minimum Information to Reference an External Ontology Term (MIREOT)-ed Ontologies: ● Chemical Entities of Biological Interest (ChEBI) ● Statistics Ontology (STAT-O) ● PubChem ● UBERON (Anatomy) ● Disease Ontology (DO) ● UniProt (Proteins) ● Cogat (Cognitive Measures) ● ExO ● … Annotations: ● Simple Knowledge Organization System (SKOS) ● Dublin Core (DC) Terms. CHEAR Ontology Foundations and Reuse. Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
  • 7. Mapping Data to Meaning: Semantic Data Dictionaries. Rashid, Chastain, Stingone, McGuinness, McCusker. The Semantic Data Dictionary Approach to Data Annotation and Integration. Enabling Open Semantic Science, in Proceedings of the International Web Science Conference Knowledge Graphs Workshop, Oct 21, 2017.
  • 8. • Ontology support for mapping and integration (e.g., education level) • Ontology informs decisions about variables that may be combined, serve as proxies, or be used to derive desired information (e.g., birth outcomes) • Ontology integrity constraints may help flag errors (e.g., APGAR > 10) • Ontology helps expose implicit information and find links. Fenton Z-Score: Sex, Birth weight, Gest Age. Mother's Highest Education Level (Val): Did not attend school (0); Elementary school (1); Technical post-primary (2); Middle school (3); Technical post-middle school (4); Highschool or junior college (5); Technical post-junior college (6); College (7); Graduate (8); Doesn't know (9). Mother Education (Val): Less than High School (0); High School Graduate or More (1). Support Browsing, Searching, Pooling, Deriving Values, Verification, … McGuinness, McCusker, Pinheiro, Stingone, et al.
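The mapping and integrity-check ideas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the CHEAR implementation: the function names, the record layout, and in particular the cutoff that sends codes 0 to 4 to "Less than High School" and codes 5 to 8 to "High School Graduate or More" are assumptions, since the slide lists the two code tables but does not state how they align. "Doesn't know" (9) is treated here as missing.

```python
from typing import Dict, List, Optional

# Detailed codebook from the slide: Mother's Highest Education Level.
DETAILED_EDUCATION: Dict[int, str] = {
    0: "Did not attend school",
    1: "Elementary school",
    2: "Technical post-primary",
    3: "Middle school",
    4: "Technical post-middle school",
    5: "Highschool or junior college",
    6: "Technical post-junior college",
    7: "College",
    8: "Graduate",
    9: "Doesn't know",
}

# ASSUMPTION (not stated on the slide): codes >= 5 count as
# "High School Graduate or More"; code 9 carries no usable information.
HIGH_SCHOOL_CUTOFF = 5
UNKNOWN_CODE = 9


def harmonize_education(code: int) -> Optional[int]:
    """Map a detailed code to the pooled coding: 0, 1, or None if unknown."""
    if code not in DETAILED_EDUCATION:
        raise ValueError(f"unrecognized education code: {code}")
    if code == UNKNOWN_CODE:
        return None
    return 1 if code >= HIGH_SCHOOL_CUTOFF else 0


def flag_apgar_errors(records: List[dict]) -> List[str]:
    """Return subject IDs whose APGAR score falls outside the valid 0-10 range."""
    return [r["subject_id"] for r in records if not 0 <= r["apgar"] <= 10]
```

In an ontology-backed system the cutoff and the valid range would come from the ontology's definitions and constraints rather than from constants in code; the constants here only stand in for that lookup.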
  • 9. CHEAR Human Aware Data Acquisition Framework. Laboratory Information Management System (LIMS)-based backend integrated with the ontology. Includes automatic ingest, access control, data governance, download, ... Supports search by study, data sample, subject, ... Enables statisticians to ask for content to support their studies, e.g., find Child: Birth Weight, Gender, Gestational Age at Birth; Mother: Age, BMI "early in pregnancy based on inclusion criterion for the particular study", Parity, Education; Metals: As, Cd, Mn, Mo, Pb. Pinheiro, Liang, Rashid, Liu, Chastain, Santos, McCusker, McGuinness. Sample Question: Gennings
  • 10. Ontology-Enabled Study Search / Precision Data Search and Download. Listed studies: Blood Biomarkers for Children's Health (Study 1); Urine Biomarkers for Children's Health (Study 2); Metabolomic Biomarkers for Children's Health (Study 3). Fields shown for each study: Institution, Principal Investigator(s), Number of Subjects, Number of Samples, Study Description, Keywords.
  • 11. Ontology-enabled Framework: AI Horizons Example ** Ontologies are an important piece, but are part of a larger integrated framework. Semanalytics / SemNExt. RPI team: McGuinness, Bennett (PIs) with McCusker, Erickson, Seneviratne and the extended research groups, along with input from IBM HEALS collaborators and motivation from MANY projects, particularly from NIEHS/Mount Sinai. McGuinness, D.L. & Bennett, K. Integrating Semantics and Numerics: Case Study on Enhancing Genomic and Disease Data Using Linked Data Technologies. In Proc. Smart Data 2015, San Jose, CA. Partially supported through IBM AI Horizons Network
  • 12. Probability-Aware Knowledge Exploration • Knowledge imported from drug, protein, and disease interaction databases. • Each interaction given an evidence-driven probability. • Find drugs that could affect melanoma, filtered by interaction probability. • The best hypotheses were generated using the highest probabilities. • Being expanded initially for genomics-aware cancer care applications. McCusker, Dumontier, Yan, He, Dordick, McGuinness. Finding Melanoma Drugs through a Probabilistic Knowledge Graph. PeerJ 4L 32007, 2016.
  • 13. RPI knowledge graph technology curates from diverse sources. Sources: Text Mining; Linked Data; Biomedical Databases; Omics and Epidemiology Datasets; File uploads of any kind. Fully automated, reproducible and traceable from diverse data types: tabular CSV & XLS, XML, JSON, HTML, etc. Transformed into rigorously modeled knowledge: meaning as structure, global IDs, foundational ontologies. Tracked and versioned using provenance standards.
  • 14. RPI knowledge graph technology tracks the provenance of everything. All knowledge is encoded with its provenance and publication details using the "nanopublications" approach. Knowledge curation traces the source and method of transformation: Linked Data "was quoted from X"; Semantic ETL stores workflows as provenance. Inference agents show their work: What knowledge was used? What rules were applied? Knowledge from user interactions traces back to the user: Who made the claim? What other knowledge have they added? Do we trust this person?
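The nanopublication idea on this slide can be illustrated with a small data structure. A nanopublication bundles three parts: the assertion itself, provenance describing where the assertion came from, and publication information about the nanopublication. The sketch below is an assumption-laden stand-in, not the RPI system: the class, the helper function, and the example identifiers are hypothetical, and plain Python tuples and dictionaries replace the RDF named graphs a real implementation would use. The property names `prov:wasQuotedFrom` and `prov:wasAttributedTo` are from PROV-O, and `dcterms:created` is from Dublin Core Terms.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Nanopublication:
    """Hypothetical in-memory stand-in for the three nanopublication parts."""
    assertion: List[Triple]            # the claim being made
    provenance: Dict[str, str]         # where the claim came from
    publication_info: Dict[str, str]   # who recorded it, and when


def quote_from_source(triple: Triple, source: str,
                      curator: str, created: str) -> Nanopublication:
    """Wrap one triple quoted from an external source with its provenance."""
    return Nanopublication(
        assertion=[triple],
        provenance={"prov:wasQuotedFrom": source},
        publication_info={
            "prov:wasAttributedTo": curator,
            "dcterms:created": created,
        },
    )


# Usage with made-up identifiers:
example = quote_from_source(
    ("ex:drugA", "ex:inhibits", "ex:proteinX"),
    source="ex:someInteractionDatabase",
    curator="ex:curator1",
    created="2018-07-12",
)
```

Keeping provenance attached to every assertion is what lets a user later ask who made a claim and which source it was quoted from.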
  • 15. RPI knowledge graph technology uses provenance-driven probabilities. Expert opinions can evaluate evidence based on experimental method: compute probabilities (edge width) from expert opinions about protein interaction assays and knowledge source reputation. Probabilities focus on high-quality results: graph exploration relies on a "joint probability horizon" for searching out hypothetical connections. Current probabilities are system wide, future work will be [Diagram labels: Experimental Methods; Knowledge Sources; Binding]
  • 16. RPI knowledge graph technology enables probabilistic graph exploration. What drugs are likely to have an effect on a disease? What diseases is this drug likely to affect? How are X and Y likely to be connected? Which tumor mutations might be driving the tumor? What pathways are affected by a given drug? Users can dig into the chain of evidence supporting the results of graph exploration: everything is linked, justified, and explainable.
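One way to read the "joint probability horizon" used in this kind of exploration is as a pruning threshold: follow edges outward from a starting node, multiply edge probabilities along each path, and stop extending a path once the product drops below the horizon. The sketch below makes that reading concrete. It is an assumption, not the published algorithm: treating edge probabilities as independent, keeping only the best path per node, the function name, and the example graph are all illustrative.

```python
from typing import Dict, List, Tuple

# node -> list of (neighbor, probability of that interaction)
Graph = Dict[str, List[Tuple[str, float]]]

# Made-up example graph; names and probabilities are illustrative only.
EXAMPLE_GRAPH: Graph = {
    "drug_A": [("protein_X", 0.9), ("protein_Y", 0.5)],
    "protein_X": [("disease_M", 0.8)],
    "protein_Y": [("disease_M", 0.4)],
}


def explore_within_horizon(graph: Graph, start: str,
                           horizon: float) -> Dict[str, float]:
    """Return the best joint probability for every node reachable from
    `start` by a path whose joint probability is at least `horizon`."""
    best: Dict[str, float] = {start: 1.0}
    stack: List[Tuple[str, float]] = [(start, 1.0)]
    while stack:
        node, prob = stack.pop()
        if prob < best.get(node, 0.0):
            continue  # a better path to this node was found in the meantime
        for neighbor, edge_prob in graph.get(node, []):
            joint = prob * edge_prob  # ASSUMPTION: edges are independent
            if joint < horizon:
                continue  # beyond the horizon: prune this path
            if joint > best.get(neighbor, 0.0):
                best[neighbor] = joint
                stack.append((neighbor, joint))
    del best[start]
    return best


if __name__ == "__main__":
    print(explore_within_horizon(EXAMPLE_GRAPH, "drug_A", horizon=0.5))
```

Raising the horizon narrows the results to well-supported connections; lowering it surfaces more speculative hypotheses at the cost of more paths to review.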
  • 17. Powering Personalized Treatment Agent. New Staging Guidelines Axiom Generation Ontologies Map Cancer Terminology Patient Datasets Synthetic Patient Data Treatment, and Monitoring Guidelines Semantic Data Dictionary Conversion Extraction Reasoner Visualization New Data + Justifications. Seneviratne, Rashid, Chari, McCusker, Bennett, Hendler and McGuinness. Knowledge Integration for Disease Characterization: A Breast Cancer Example. In Proceedings of the International Semantic Web Conference, Monterey, CA, 2018.
  • 18. Personalized Treatment Agent - Patient "D". Biomarkers dictate the drug options. D's triple-positive combination leads to downstaging. Overall treatment suggestions change based on disease characterization using new data and guidelines. Seneviratne, Rashid, Chari, McCusker, Bennett, Hendler and McGuinness. Knowledge Integration for Disease Characterization: A Breast Cancer Example. In Proceedings of the International Semantic Web Conference, Monterey, CA, 2018.
  • 19. Discussion • Ontologies enable FAIR (Findable, Accessible, Interoperable, Reusable) data resources • They can support movement across levels of abstraction • Use ontology-enabled architectures • Do NOT build ontologies from scratch • Selectively and thoughtfully reuse existing best-practice ontologies/vocabularies • Engage experts in choosing ontology (portions) and in designing the knowledge architecture • Ecosystems and diverse teams are critical for success: community-driven and community-maintained ontology-based systems are the future • Let's enable the next generation of decision support together! Please contact us for collaboration
  • 20. Conclusion • Semantics / Ontologies enable new personalized and population-level health • Contact: dlm@cs.rpi.edu • Thanks to many: RPI Tetherless World team, particularly McCusker, Erickson, Hendler, Pinheiro, Rashid, Liang, Liu, Chastain; RPI: Bennett, Dyson, Seneviratne; Mount Sinai, particularly Teitelbaum, Stingone, Mervish, Gennings, Kovatch; IBM, particularly Das, Chen, Brown, … • Funding: NIEHS 0255-0236-4609 / 1U2CES026555-01, DARPA HR0011-16-2-0030, IBM-RPI HEALS AI Horizons Network, NSF ACI-1640840 • Forthcoming book: Ontology Engineering, with Kendall