Patient registries, Electronic Health Record (EHR) systems, and clinical research repositories of the future share several needs: semantic interoperability, adoption of standardized clinical terminology, ad hoc and distributed querying interfaces, and integration with existing databases and web-based systems. A suite of standards has recently emerged from the World Wide Web Consortium (W3C), the body responsible for developing and overseeing the protocols of the World Wide Web (WWW). These standards were conceived to address data integration challenges in internet and intranet applications, and many of them can also address the challenges common to health information systems. This talk gives an introductory overview of these technologies, explains how they address these challenges, and briefly discusses projects where they have been used.
Semantic Web Technologies: A Paradigm for Medical Informatics
1. Semantic Web Technologies: A Paradigm for Medical Informatics
Chimezie Ogbuji (Owner, Metacognition LLC.)
http://metacognition.info/presentations/SWTMedicalInformatics.pdf
http://metacognition.info/presentations/SWTMedicalInformatics.ppt
2. Who I am
• Circa 2001: Introduced to web standards and Semantic Web technologies
• 2003-2011: Lead architect of CCF in-house clinical repository project
• 2006-2011: Member representative of CCF in the World Wide Web Consortium (W3C)
  - Editor of various standards and chair of the Semantic Web Health Care and Life Sciences Interest Group
• 2011-2012: Senior Research Associate at CWRU Center for Clinical Investigations
• 2012-current: Started business providing resource and data management software for home healthcare agencies (Metacognition LLC)
3. Medical Informatics Challenges
• Semantic interoperability
  - Exchange of data with common meaning between sender and receiver
• Most of the intended benefits of HIT depend on interoperability between systems
• Difficulties integrating patient record systems with other information resources are among the major issues hampering their effectiveness
  - Interoperability is a major goal for meaningful use of Electronic Health Records (EHR)
Rodrigues et al. 2013; Kadry et al. 2010; Shortliffe and Cimino, 2006
4. Requirements and Solutions
• Semantic interoperability requires:
  - Structured data
  - A common controlled vocabulary
• Solutions emphasize the meaning of data rather than how they are structured
  - "Semantic" paradigms
5. Registries and Research DBs
• Patient registries and clinical research repositories capture data elements in a uniform manner
• The structure of the underlying data needs to be able to evolve along with the investigations they support
• Thus, schema extensibility is important
6. Querying Interfaces
• Standardized interfaces for querying facilitate:
  - Accessibility to clinical information systems
  - Distributed querying of data from where they reside
• Requires:
  - Semantically-equivalent data structures
• Alternatively, data are centralized in data warehouses
Austin et al. 2007, "Implementation of a query interface for a generic record server"
7. Biomedical Ontologies
• Ontologies are artifacts that conceptualize a domain as a taxonomy of classes and constraints on relationships between their members
• Represented in a particular formalism
• Increasingly adopted as a foundation for the next generation of biomedical vocabularies
• Construction involves representing a domain of interest independently of the behavior of the applications that use the ontology
• Important means towards achieving semantic interoperability
8. Biomedical Ontology Communities
• Prominent examples of adoption by life science and healthcare terminology communities:
  - The Open Biological and Biomedical Ontologies (OBO) Foundry
  - Gene Ontology (GO)
  - National Center for Biomedical Ontology (NCBO) BioPortal
  - International Health Terminology Standards Development Organisation (IHTSDO)
9. Semantic Web and Technologies
• The Semantic Web is a vision of how the existing infrastructure of the World Wide Web (WWW) can be extended such that machines can interpret the meaning of data on it
• Semantic Web technologies are the standards and technologies that have been developed to achieve the vision
10. An Analogy
• (Technological) singularity is a theoretical moment when artificial intelligence (AI) will have progressed to a greater-than-human intelligence
• Despite remaining in the realm of science fiction, it has motivated many useful developments along the way
  - The use of ontologies for knowledge representation and IBM Watson capabilities, for example
11. Background: Graphs
• Graphs are data structures comprising nodes and edges that connect them
• The edges can be directional
• Either the nodes, the edges, or both can be labeled
• The labels provide meaning to the graphs (edge labels in particular)
[Diagram: two nodes connected by an edge]
12. Resource Description Framework
• The Resource Description Framework (RDF) is a graph-based knowledge representation language for describing resources
• Its edges are directional and both nodes and edges are labeled
• It uses Uniform Resource Identifiers (URIs) for labeling
• Foundation for Semantic Web technologies
13. RDF: Continued
• The edges are statements (triples) that go from a subject to an object
• Some objects are text values (literals)
• Some subjects and objects can be left unlabeled (blank nodes)
  - Anonymous resources: not important to label them uniquely
• The URI of the edge is the predicate
• Predicates used together for a common purpose are a vocabulary
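A minimal sketch of the triple model described above, using plain Python tuples rather than an RDF library. The `http://example.org/` namespace, the helper names, and the blank-node label `_:record1` are hypothetical and chosen only for illustration; the statements loosely follow the Dr. X / Chime example used in this deck.

```python
# Sketch: RDF statements as (subject, predicate, object) tuples.
# The namespace and all names below are illustrative assumptions.
EX = "http://example.org/"


def uri(local_name):
    """Build a URI label in the hypothetical example namespace."""
    return EX + local_name


graph = {
    (uri("DrX"), uri("treats"), uri("Chime")),
    # "_:record1" stands in for a blank node: a resource with no unique label.
    (uri("DrX"), uri("author"), "_:record1"),
    ("_:record1", uri("subjectOfRecord"), uri("Chime")),
    # An object can also be a text value instead of a URI.
    (uri("Chime"), uri("fullName"), "Chimezie Ogbuji"),
}


def vocabulary(triples):
    """Return the sorted set of predicates used in the graph."""
    return sorted({predicate for _, predicate, _ in triples})


def objects(triples, subject, predicate):
    """Return every object reached from `subject` along `predicate`."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


if __name__ == "__main__":
    print(vocabulary(graph))
    print(objects(graph, uri("DrX"), uri("treats")))
```

Storing the graph as a set of tuples means adding a statement with a new predicate needs no structural change, which is the property the deck later calls schema extensibility.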
14.
• Subject: Dr. X (a URI)
• Object: Chime
• Predicate: treats
• Vocabulary:
  - treats, subject of record, author, and full name
[Diagram: RDF graph with nodes Dr. X, Chime, and the text value "Chimezie Ogbuji"; edges labeled treats, subject of record, author, and full name]
15. RDF vocabularies
• How meaning is interpreted from an RDF graph
• There are vocabularies that constrain how predicates are used
  - Want a sense of treats where the subject is a clinician and the object is a patient
• There is a predicate relating resources to the classes they are a member of (type)
• There are vocabularies that define constraints on class hierarchies
• These comprise a basic RDF Schema (RDFS) language
• Represented as an RDF graph
16. [Diagram: RDF graph with nodes Chime, Dr. X, Patient, Physician, Person, Hypertension DX, and Clinical Diagnosis; edges labeled treats, subject of record, author, type (three times), and is a (three times)]
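The kind of deduction that `type` and `is a` edges support can be sketched in a few lines of plain Python. This is only an illustration: the edge endpoints are one plausible reading of the diagram labels, `Dx1` is a hypothetical name for the diagnosis resource, and a real system would use an RDFS-aware reasoner rather than this hand-rolled traversal.

```python
# Sketch of RDFS-style type inference over a tiny graph.
# "type" relates a resource to a class; "is a" relates a class to a superclass.
# "Dx1" and the edge endpoints are assumptions made for illustration.
graph = {
    ("Chime", "type", "Patient"),
    ("Dr. X", "type", "Physician"),
    ("Dx1", "type", "Hypertension DX"),
    ("Patient", "is a", "Person"),
    ("Physician", "is a", "Person"),
    ("Hypertension DX", "is a", "Clinical Diagnosis"),
}


def superclasses(triples, cls):
    """Collect every class reachable from `cls` by following "is a" edges."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "is a" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def infer_types(triples):
    """Return the "type" statements implied by the class hierarchy."""
    inferred = set()
    for s, p, o in triples:
        if p == "type":
            for parent in superclasses(triples, o):
                inferred.add((s, "type", parent))
    return inferred - set(triples)


if __name__ == "__main__":
    for statement in sorted(infer_types(graph)):
        print(statement)
```

Under these assumptions the function derives, for example, that Chime is a Person even though the graph never states it directly.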
17. Ontologies for RDF
• The Web Ontology Language (OWL) is used to describe ontologies for RDF graphs
• More sophisticated constraints than RDFS
• Commonly expressed as an RDF graph
• Defines the meaning of RDF statements through constraints:
  - On their predicates
  - On the classes of the resources they relate
18. [Diagram: the same RDF graph, with nodes Chime, Dr. X, Patient, Physician, Person, Hypertension DX, and Clinical Diagnosis and edges labeled treats, subject of record, author, type, and is a. Caption: Governed by OWL / RDFS for domain]
19. OWL Formats
• Most common format for describing ontologies
• Distribution format of ontologies in the NCBO BioPortal
• SNOMED CT distributions include an OWL representation
  - RDF graphs can describe medical content in a SNOMED CT-compliant way through the use of this vocabulary
20. Validation and Deduction
• OWL is based on a formal, mathematical logic that can be used for validating the structure of an ontology and RDF data that conform to it (consistency checking)
• The same logic is used to deduce additional RDF statements implied by the meaning of a given RDF graph (logical inference)
• Logical reasoners are used for both tasks
21. Inference
• Can infer anatomical location from SNOMED CT definitions
[Diagram: nodes Hypertension DX and Systemic circulatory system structure; edges labeled type, finding site, and type]
Hypertension DX <-> 1201005 / "Benign essential hypertension (disorder)"
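A rough sketch of this kind of deduction, written in plain Python rather than with an OWL reasoner. The `CLASS_DEFINITIONS` table is a hand-written stand-in for a class definition (it is not SNOMED CT's distribution format), `dx1` is a hypothetical diagnosis resource, and the reading that the finding site is a member of "Systemic circulatory system structure" is an assumption based on the labels on this slide.

```python
import itertools

# Hand-written stand-in for a class definition: every member of the class
# has some "finding site" that belongs to the named anatomical class.
CLASS_DEFINITIONS = {
    "Benign essential hypertension (disorder)": [
        ("finding site", "Systemic circulatory system structure"),
    ],
}


def infer_from_definitions(triples, definitions):
    """Materialize the statements implied by class definitions.

    A new blank node is created for each implied relationship; this sketch
    does not check whether a matching statement was already asserted.
    """
    fresh = itertools.count(1)
    inferred = set()
    for s, p, o in sorted(triples):
        if p != "type":
            continue
        for predicate, filler_class in definitions.get(o, []):
            node = f"_:b{next(fresh)}"  # anonymous resource (blank node)
            inferred.add((s, predicate, node))
            inferred.add((node, "type", filler_class))
    return inferred


# Hypothetical instance data: one diagnosis typed with the class above.
graph = {("dx1", "type", "Benign essential hypertension (disorder)")}

if __name__ == "__main__":
    for statement in sorted(infer_from_definitions(graph, CLASS_DEFINITIONS)):
        print(statement)
```

The point of the sketch is only that the anatomical location never has to be recorded with the diagnosis; it follows from the definition of the class the diagnosis belongs to.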
22. Querying RDF Graphs
• SPARQL is the official query language for RDF graphs
• Comparable to relational query languages
  - Primary difference: it queries RDF triples, whereas SQL queries tables of arbitrary dimensions
• Includes various web protocols for querying RDF graphs
• Foundation of SPARQL is the triple pattern
• (?clinician, treats, ?patient)
  - ?clinician and ?patient are variables (like wildcards)
23. [Diagram: query graph with variable nodes ?physician, ?patient, and ?dx and the class node Hypertension DX; edges labeled treats, subject of record, author, and type]
Which physicians have given essential hypertension diagnoses, and to whom?
(?physician, author, ?dx)
(?physician, treats, ?patient)
(?dx, subject of record, ?patient)
(?dx, type, Hypertension DX)
Result:
?physician | ?patient | ?dx
Dr. X      | Chime    | …
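How a set of triple patterns is matched against a graph can be sketched with a naive matcher in plain Python. This is not how a SPARQL engine is implemented and it is not SPARQL syntax; the sample data (including `dx1`, `dx2`, `Dr. Y`, and `Pat`) is hypothetical, and the function names are made up for the sketch.

```python
# Naive evaluation of a list of triple patterns over a list of triples.
# Terms starting with "?" are variables; everything else must match exactly.
graph = [
    ("Dr. X", "author", "dx1"),
    ("Dr. X", "treats", "Chime"),
    ("dx1", "subject of record", "Chime"),
    ("dx1", "type", "Hypertension DX"),
    # Hypothetical distractor data that should not appear in the answer.
    ("Dr. Y", "author", "dx2"),
    ("Dr. Y", "treats", "Pat"),
    ("dx2", "subject of record", "Pat"),
    ("dx2", "type", "Diabetes DX"),
]


def is_variable(term):
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if is_variable(term):
            # Bind the variable, or check it against its earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(triples, patterns):
    """Return every variable binding that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


patterns = [
    ("?physician", "author", "?dx"),
    ("?physician", "treats", "?patient"),
    ("?dx", "subject of record", "?patient"),
    ("?dx", "type", "Hypertension DX"),
]

if __name__ == "__main__":
    for row in query(graph, patterns):
        print(row["?physician"], row["?patient"], row["?dx"])
```

Because each variable keeps its binding across patterns, the four patterns act as a join: only physician, patient, and diagnosis combinations that satisfy all of them survive.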
24. SPARQL over Relational Data
• Most common implementations convert SPARQL to SQL and evaluate over:
  - a relational database designed for RDF storage
  - an existing relational database
• There are products for both approaches
• Former requires native storage of RDF
  - Relational structure doesn't change even as RDF vocabulary does (schema extensibility)
Elliott et al. 2009, "A Complete Translation from SPARQL into Efficient SQL"
25. SPARQL over Existing Relational Data
• "Virtual RDF view"
  - Translation to SQL follows a given mapping from existing relational structures to an RDF vocabulary
  - Allows non-disruptive evolution of existing systems
  - Well-suited as a standard querying interface over clinical data repositories
  - They can be queried with SPARQL, securely over encrypted HTTP
26. [Diagram: virtual RDF view architecture. Labels: Relational RDF (SNOMED CT perhaps); Mapping and Translation layer; Secure HTTP; SPARQL; SQL; Legacy / existing applications; Patient registry or data repository; 3rd party applications]
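The mapping and translation layer can be illustrated with a small sketch that turns a single triple pattern into SQL over existing tables, using Python's built-in `sqlite3` module. Everything specific here is an assumption for illustration: the `treatment` and `person` tables, their columns, the `MAPPING` dictionary, and the restriction to patterns whose predicate is a constant. Production systems use a declarative mapping language and translate whole SPARQL queries.

```python
import sqlite3

# Hypothetical mapping: predicate -> (table, subject column, object column).
MAPPING = {
    "treats": ("treatment", "physician", "patient"),
    "full name": ("person", "person_id", "full_name"),
}


def is_variable(term):
    return term.startswith("?")


def pattern_to_sql(pattern):
    """Translate one triple pattern with a constant predicate into SQL."""
    subject, predicate, obj = pattern
    table, subject_col, object_col = MAPPING[predicate]
    sql = f"SELECT {subject_col}, {object_col} FROM {table}"
    conditions, params = [], []
    for term, column in ((subject, subject_col), (obj, object_col)):
        if not is_variable(term):
            # Constants become parameterized WHERE conditions.
            conditions.append(f"{column} = ?")
            params.append(term)
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


def evaluate(conn, pattern):
    """Run the translated SQL and present each row as a triple."""
    sql, params = pattern_to_sql(pattern)
    return [(s, pattern[1], o) for s, o in conn.execute(sql, params)]


def demo_database():
    """Build an in-memory stand-in for an existing relational database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE treatment (physician TEXT, patient TEXT)")
    conn.execute("CREATE TABLE person (person_id TEXT, full_name TEXT)")
    conn.execute("INSERT INTO treatment VALUES ('Dr. X', 'Chime')")
    conn.execute("INSERT INTO person VALUES ('Chime', 'Chimezie Ogbuji')")
    return conn


if __name__ == "__main__":
    connection = demo_database()
    print(pattern_to_sql(("?clinician", "treats", "Chime")))
    print(evaluate(connection, ("?clinician", "treats", "Chime")))
```

The relational tables are never modified or copied; the triples exist only as a view computed at query time, which is what makes this style of adoption non-disruptive for existing applications.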
27. Example: Cleveland Clinic (SemanticDB)
• Content repository and data production system released in Jan. 2008
• 80 million (native) RDF statements
  - Uses vocabulary from a patient record OWL ontology for the registry
• Based on
  - Existing registry of heart surgery and CV interventions
  - 200,000 patient records
  - Generating over 100 publications per year
Pierce et al. 2012, "SemanticDB: A Semantic Web Infrastructure for Clinical Research and Quality Reporting"
28. Cohort Identification
• Interface developed in conjunction with Cycorp
• Leverages their logical reasoning system (Cyc)
  - Identifies cohorts using natural language (NL) sentence fragments
  - Converts fragments to SPARQL
  - SPARQL is evaluated against the RDF store
29. Example: Mayo Clinic (MCLSS)
• Mayo Clinic Life Sciences System (MCLSS)
  - Effort to represent Mayo Clinic EHR data as RDF graphs
  - Patient demographics, diagnoses, procedures, lab results, and free-text notes
  - Goal was to wrap MCLSS relational database and expose as read-only, queryable RDF graphs that conform to standard ontologies
Pathak et al. 2012, "Using Semantic Web Technologies for Cohort Identification from Electronic Health Records for Clinical Research"
30. Example: Mayo Clinic (CEM)
• Clinical Element Model (CEM)
  - Represents logical structure of data in EHR
  - Goal: translate CEM definitions into OWL and patient (instance) data into conformant RDF
  - Use tools (logical reasoners) to check the semantic consistency of the ontology and the instance data, and to extract new knowledge via deduction
  - Instance data validation:
    - correct number of linked components, value within data range, existence of units, etc.
Tao et al. 2012, "A semantic-web oriented representation of the clinical element model for secondary use of electronic health records data"
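The instance-data checks listed above (number of linked components, value within range, presence of units) can be illustrated with a plain Python sketch. This is not the CEM-to-OWL approach, where a logical reasoner performs the validation; the `CONSTRAINTS` table, the element name `SystolicBloodPressure`, the limits, and the dictionary layout of an instance are all hypothetical.

```python
# Hypothetical constraints for one kind of clinical element.
CONSTRAINTS = {
    "SystolicBloodPressure": {
        "component_count": 1,
        "value_range": (0, 400),
        "allowed_units": {"mmHg"},
    },
}


def validate(instance, constraints=CONSTRAINTS):
    """Return a list of problems found in one instance (empty if valid)."""
    problems = []
    rule = constraints[instance["type"]]

    # Check 1: correct number of linked components.
    if len(instance.get("components", [])) != rule["component_count"]:
        problems.append("wrong number of linked components")

    # Check 2: value present and within the permitted range.
    low, high = rule["value_range"]
    value = instance.get("value")
    if value is None or not low <= value <= high:
        problems.append("value outside data range")

    # Check 3: units present and among the allowed units.
    if instance.get("units") not in rule["allowed_units"]:
        problems.append("missing or unexpected units")

    return problems


if __name__ == "__main__":
    reading = {
        "type": "SystolicBloodPressure",
        "components": ["measurement site"],
        "value": 150,
        "units": "mmHg",
    }
    print(validate(reading))
```

Expressing such constraints in an ontology instead of application code lets the same definitions drive both validation and deduction, which is the motivation given on this slide.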
31. Summary
• Schema extensibility
  - Use of RDF
• Semantic interoperability
  - Domain modeling using OWL and RDFS
• Standardized query interfaces
  - Querying over SPARQL
• Incremental, non-disruptive adoption
  - Virtual RDF views
• Main challenge: highly disruptive innovation