The problem
We've seen 959 ways to refer
to Proceedings of the National
Academy of Sciences.
Google Scholar Development Team
http://bit.ly/K6xRf0
“Oh, my stomach!”
The main intent of the Semantic
Web is to give machines much
better access to information
resources so they can be
information intermediaries in
support of humans.
Michael Uschold
http://bit.ly/JuWSUg
Let’s Define Our Terms
                         Explicit   Hierarchy   Associative     Grammar
                         list                   relationships
controlled vocabulary       ✓
taxonomy                    ✓          ✓
thesaurus                   ✓          ✓             ✓
ontology                    ✓          ✓             ✓             ✓
Warning
Pursuit of controlled vocabulary
tends to expose source systems for
the quagmires they are.
“Desiderata” for Controlled
Medical Vocabularies
http://bit.ly/desider
1. Content – formal editorial policy and
methodology; provide breadth and depth;
don’t just add terms
2. Concept orientation – exactly one meaning per
concept and exactly one concept per meaning
3. Concept permanence – old concepts can't be
deleted; names can be changed as long as
meaning doesn't change
4. Nonsemantic identifiers – use a meaningless
integer
5. Polyhierarchy – employ multiple hierarchies to
support the need for tree walking and inferencing
6. Formal Definitions – structured descriptions
that invoke relationships within the
terminology
7. Reject “not elsewhere classified” –
terminology changes induce semantic drift
8. Graceful evolution – fix mistakes; account for
changes in medical knowledge
9. Recognize redundancy – redundant
expressions are inevitable, but redundant
concepts are bad
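A minimal sketch of how three of these desiderata (concept permanence, nonsemantic identifiers, and polyhierarchy) might look in a data structure. The class names, method names, and sample concepts are illustrative assumptions, not part of any existing terminology system.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Concept:
    concept_id: int          # meaningless integer, carries no semantics
    name: str                # may change as long as the meaning does not
    parents: set = field(default_factory=set)  # more than one parent is allowed
    retired: bool = False    # concepts are retired, never deleted


class Vocabulary:
    """Hypothetical in-memory vocabulary used only for illustration."""

    def __init__(self):
        self._next_id = count(1)
        self.concepts = {}

    def add(self, name, parents=()):
        cid = next(self._next_id)
        self.concepts[cid] = Concept(cid, name, set(parents))
        return cid

    def rename(self, cid, new_name):
        # Identifier stays the same; only the label changes.
        self.concepts[cid].name = new_name

    def retire(self, cid):
        # Concept permanence: flag the concept instead of removing it.
        self.concepts[cid].retired = True

    def ancestors(self, cid):
        """Walk upward through every hierarchy the concept belongs to."""
        seen = set()
        stack = list(self.concepts[cid].parents)
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.add(parent)
                stack.extend(self.concepts[parent].parents)
        return seen


vocab = Vocabulary()
disease = vocab.add("Disease")
infection = vocab.add("Infectious disease", parents=[disease])
lung = vocab.add("Lung disease", parents=[disease])
pneumonia = vocab.add("Bacterial pneumonia", parents=[infection, lung])
vocab.retire(lung)
```

Here "Bacterial pneumonia" sits in two hierarchies at once, and retiring "Lung disease" leaves its identifier resolvable for any data that already points to it.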
What is the license of the
controlled vocabulary?
• Are the ontology codes copyrighted and
can they be used in an open source
application?
• Need to account for the possibility that the
data is reused for a commercial interest
The Ontology Team is
considering serving vocabularies
for select domains
“The VIVO community might be able to build
services to serve controlled vocabularies for
organizations and journals.”
http://bit.ly/J7Vd8w
Food and Agriculture Organization
(FAO) geopolitical ontology
• master reference for geopolitical
information in multiple languages
• provides relations among territories (land
borders, group membership, etc.)
• tracks historical changes
Ships with VIVO application
As of version 1.4, VIVO allows
users to look up terms from
UMLS and GEMET
GEMET: controlled vocabulary
for environmental topics
administration; agriculture; air; animal husbandry; biology;
building; chemistry; climate; disasters, accidents, risk;
economics; energy; environmental policy; fishery;
food, drinking water; forestry; general; geography;
human health; industry; information; legislation; materials;
military aspects; natural areas, landscape, ecosystems;
natural dynamics; noise, vibrations; physics; pollution;
radiations; research; resources; social aspects, population;
soil; space; tourism; trade, services; transport;
urban environment, urban stress; waste; water
Vocabularies actively being
considered for VIVO
• colleges and universities
• journals
- open source status (VIVOONT-433)
• languages (VIVOONT-250)
- model write, speak, proficiency
• others?
Types of Specialty
All Specialties
Board-Certified Specialties
Board-Certified Subspecialties
Types of Medical Expertise
Feigned < Clinical < Research
Examples, from left to right:
• GLG-20s masquerading as doctors for comic effect
• Performed 100+ ECGs
• Board-certified in Cardiology
• Invented a better ECG
We use Intelligent Medical
Objects (IMO)’s interface
terminology
• Maps medical expertise terms to SNOMED CT
• Useful for returning relevant results to
patients searching for a doctor
• Enables the physician to enter more arcane
areas of expertise (e.g., Asian American
Community Health)
• A commercial application
Board Certifications
Problem #1: No indication
of certifying board.
At least 13 certifications
including geriatric medicine,
pain medicine, and urology are
given by at least one ABMS
board.
Board Certifications
Problem #2: Names of
certifications are ambiguous.
Colon and rectal surgery is listed
in the following alternate ways:
Surgery, Colon and Rectal
Colon-Rectal Surgery
Colorectal Surgery
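One way to tame this kind of ambiguity is an alias table that maps every observed variant to a single preferred label. The sketch below assumes "Colon and Rectal Surgery" as the preferred label and uses a hypothetical `normalize` helper; neither comes from an existing system.

```python
import re

# Assumed preferred label; the variants are the ones listed on the slide.
PREFERRED = "Colon and Rectal Surgery"
ALIASES = {
    "colon and rectal surgery": PREFERRED,
    "surgery, colon and rectal": PREFERRED,
    "colon-rectal surgery": PREFERRED,
    "colorectal surgery": PREFERRED,
}


def normalize(name):
    """Return the preferred label for a certification name, or None if unknown."""
    # Collapse whitespace and ignore case before looking the name up.
    key = re.sub(r"\s+", " ", name.strip().lower())
    return ALIASES.get(key)
```

Returning `None` for unknown names, rather than guessing, keeps unmapped certifications visible so that someone can review them.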
The National Uniform Claim Committee
(NUCC) maintains a list of health care
provider taxonomy codes, but this list
seems to be exclusively for non-MDs.
Change in number of ABMS
Subspecialties/Specialties
[Bar chart: number of ABMS specialties and subspecialties by period, from before 1970 through 2012. Bar values shown: 10, 20, 66, 74, 84, 145.]
The following 135 board certifications in our system
are not recognized by ABMS.
Cosmetic Dentistry; Cosmetic Dermatology; Cosmetic Surgery;
Critical Care Neurology; Dermatology, General;
Ear, Nose, and Throat, Pediatric; Echocardiography;
Electrodiagnostic Medicine; Emergency Neurology; Endocrinology;
Facial Plastic and Reconstructive Surgery; Facial Plastic Surgery;
Family Psychology; Fetal Cardiology; Foot and Ankle Surgery;
Foot Surgery; Gastroenterology Pathology; Gastrointestinal Pathology;
Gastrointestinal Surgery; General Anesthesiology; General Cardiology;
General Dentistry; General Dermatology; General Internal Medicine;
General Neurology; General Neurosurgery;
General Obstetrics and Gynecology; General Ophthalmology;
General Pediatrics; General Psychiatry; General Surgery;
General Urology; Genetics, Medical; Geriatric Cardiology;
Geriatric Dermatology; Geriatric Psychotherapy;
Gynecologic Endocrinology; Gynecologic Pathology; Gynecology;
Hand Surgery; Heart Surgery; Hematology/Oncology;
Hepatobiliary Surgery; Hepatology; High Risk Obstetrics;
Hospitalist; Immunopathology; Infant Psychiatry; Intensive Care;
Internal Medicine, General; International Medicine;
International Travel Medicine; Interventional Neuroradiology;
Interventional Oncology; Interventional Pain Management;
Interventional Radiology; Invasive Cardiology; Laboratory Medicine;
Laryngology; Liver Pathology; Maternal-Fetal Medicine;
Medical Genetics; Molecular Genetics; Molecular Hematopathology;
Molecular Infectious Disease; Molecular Pathology;
Musculoskeletal Oncology; Musculoskeletal Radiology;
Neonatal Neurology; Neonatal Surgery; Neonatal Thoracic Surgery;
Neonatology; Neuro Critical Care; Neuro Radiology;
Neuro-Ophthalmology; Neuro-Pathology; Nutrition;
Oral and Maxillofacial Pathology; Oral and Maxillofacial Surgery;
Orthodontics; Orthopedic Surgery; Orthopedics;
Pain Medicine/Pain Management; Pathology;
Pediatric Allergy and Immunology;
Pediatric Behavior and Development; Pediatric Dentistry;
Pediatric Neurological Surgery; Pediatric Neurology;
Pediatric Neurosurgery; Pediatric Orthopedic Surgery;
Pediatric Orthopedics; Periodontics;
Plastic and Reconstructive Surgery; Psychology;
Pulmonary Disease Medicine; Radiology;
Radiology, Vascular/Interventional; Reproductive Endocrinology;
Surgery, Critical Care; Surgery, Hand;
Surgery, Oral and Maxillofacial; Thoracic Surgery;
Vascular and Interventional
Weill Game Plan for Board
Certifications
• Explore ingest from Intellicred (fewer
certifications, less variability, may include
certifying agency?)
• Explore external vocabularies
• Failing that, create our own
Expertise terms from Weill
Cornell Physician Profile
System (n = 2578)
How does a term of local clinical expertise map
to UMLS using Stony Brook's API?

Identical
53% of terms from the source system
correspond exactly to some representation
in UMLS
– Polycystic Ovary Syndrome
– Anaphylaxis
– Aortic Dissection
– Chemoembolization
– Dental Implant
– Echocardiogram

Equivalent preserving original meaning
34% of terms from the source system have
some equivalent in UMLS that is lexically
different but semantically identical
Weill → UMLS
– Biopsy of Skin → Skin biopsy
– Aneurysm of Popliteal Artery → Aneurysm Popliteal
– Charcot-Marie-Tooth Disease → Charcot-Marie-Tooth
– Cirrhosis of Liver → Cirrhosis
– Coarctation of the Aorta → Coarctation

Compound term
5% of terms from the source system can only
be represented as a combination of terms
from UMLS
Weill → UMLS
– Asian American Community Health → Asian American | Community Health
– Endoscopic Ultrasound of Esophagus → Endoscopic Ultrasound | Esophagus
– Chronic Pelvic Pain In Female → Chronic Pelvic Pain | Female
– Bronchoscopy With Biopsy →

Union of two concepts
3% of terms from the source system can be
represented by the joining (not intersection)
of two concepts in UMLS
Weill → UMLS
– Billing and Coding → Billing | Coding
– Bone and Mineral Metabolism → Bone Metabolism | Mineral Metabolism
– Bladder and Prostate Cancer → Bladder Cancer | Prostate Cancer

Subtype
2% of terms from the source system can only
be represented as a subtype of a concept
in UMLS
Weill → UMLS
– Bipolar 1 Disorder → Bipolar Disorder
– FAA Medical Exam → Medical Exam

Unclear
3% of terms from the source system lack or
have an unclear equivalent in UMLS
Weill → UMLS
– In Vitro Fertilization Counseling → V… Fertilization | Counseling
– Adjustable Band → Band
– Bowel-Sparing Strictureplasty → No…
Pre-coordination vs. Post-coordination

Definition
• Pre-coordination: Terms combined by a developer to denote a
  specific concept and its attributes more precisely.
• Post-coordination: Terms combined at the time of search and
  retrieval using Boolean or other operators.

Benefits
• Pre-coordination: Users who are not totally familiar with a
  controlled vocabulary and its structure.
• Post-coordination: Lazy or “busy” developers.

Examples
• Pre-coordination: avian hypersensitivity pneumonitis;
  carrier sense multiple access
• Post-coordination: avian AND hypersensitivity AND pneumonitis;
  carrier sense AND multiple access
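The difference between the two approaches can be sketched in a few lines. The document collection, the word-level index, and the function names below are all assumptions made for illustration. Note that the index splits text into single words, so a phrase such as "carrier sense" would itself need to be treated as one indexed term in a fuller version.

```python
from collections import defaultdict

# Hypothetical mini-corpus used only to illustrate the two lookup styles.
DOCS = {
    "doc1": "avian hypersensitivity pneumonitis in pigeon breeders",
    "doc2": "hypersensitivity reactions to avian vaccines",
    "doc3": "carrier sense multiple access networks",
}


def build_index(docs):
    """Map each lowercase word to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_and(index, *terms):
    """Post-coordination: combine single terms with AND at search time."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search_precoordinated(docs, phrase):
    """Pre-coordination: look up one term that already names the concept."""
    return {doc_id for doc_id, text in docs.items() if phrase.lower() in text.lower()}


INDEX = build_index(DOCS)
```

With this data, `search_and(INDEX, "avian", "hypersensitivity")` also matches the vaccine document, which shows how post-coordination trades precision for flexibility.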
How do we semantically model
post-coordinated terms?
1. Do not mess with post-coordination. User adds
term from lookup service. That's it. (Existing
method.)
2. User adds term from lookup service. Machine
makes basic inferences based on similarity.
(Everything is "related term.")
3. User adds term from lookup service. Administrator
models terms.
4. User adds term from lookup service. User interface
enables and guides end user.
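As a rough illustration of option 2, a machine could flag two terms as "related term" when their words overlap enough. The Jaccard measure, the 0.5 threshold, and the function names are assumptions chosen for this sketch; the slides do not specify how similarity would be computed.

```python
from itertools import combinations


def tokens(term):
    """Lowercase word set for a term."""
    return set(term.lower().split())


def jaccard(a, b):
    """Share of words two terms have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


def related_pairs(terms, threshold=0.5):
    """Return pairs of terms whose word overlap meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(terms, 2)
        if jaccard(a, b) >= threshold
    ]


TERMS = [
    "Bladder and Prostate Cancer",
    "Bladder Cancer",
    "Prostate Cancer",
    "Skin biopsy",
]
PAIRS = related_pairs(TERMS)
```

Word overlap is a blunt instrument: it links "Bladder and Prostate Cancer" to each of its parts, but it says nothing about whether the relationship is a union, a subtype, or something else, which is why everything ends up as "related term."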
Option #3: User adds term from
lookup service. Administrator
models terms.
Can we build on others' work?
• The International Health Terminology
Standards Development Organization
(IHTSDO) in Denmark is working to develop
and promote SNOMED to support sharing of
modelling.
• IMO, our terminology service, may help
model coordinated terms.
Why SNOMED CT may be better than
UMLS at representing medical terms
UMLS has:
• No formal conceptual model (near-synonymy)
• No hierarchy
• Lots of redundancy
• Lots of ambiguity
UMLS is good for helping you find terms
in a specific terminology because all
many-to-one term-to-concept mappings
expand the synonyms you can match
against. I recommend you use UMLS to
find terms from a very limited set of
terminologies - maybe SNOMED plus
LOINC plus RxNorm, for example.
Jim Cimino
Proposed Role of SKOS
Classes
skos:Concept
snomedct:Procedure
snomedct:Disorder
rxnorm:Drug
...
Properties
skos:related
snomedct:equivalentTo
...
skos:broader
skos:narrower
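A small sketch of how these classes and properties might fit together, using plain tuples instead of an RDF library. The `ex:` prefix and the individual concept names are hypothetical, and typing "Bipolar Disorder" as `snomedct:Disorder` is only an illustration. The one fact relied on from SKOS itself is that `skos:narrower` is the inverse of `skos:broader` and that `skos:related` is symmetric.

```python
# Prefixed names stand in for full IRIs; "ex:" is a hypothetical local namespace.
TRIPLES = {
    ("ex:BipolarDisorder", "rdf:type", "snomedct:Disorder"),
    ("ex:Bipolar1Disorder", "rdf:type", "skos:Concept"),
    ("ex:Bipolar1Disorder", "skos:broader", "ex:BipolarDisorder"),
    ("ex:BladderAndProstateCancer", "rdf:type", "skos:Concept"),
    ("ex:BladderAndProstateCancer", "skos:related", "ex:BladderCancer"),
    ("ex:BladderAndProstateCancer", "skos:related", "ex:ProstateCancer"),
}


def with_skos_inverses(triples):
    """Add the statements implied by SKOS inverse and symmetric properties."""
    closed = set(triples)
    for s, p, o in triples:
        if p == "skos:broader":
            closed.add((o, "skos:narrower", s))
        elif p == "skos:narrower":
            closed.add((o, "skos:broader", s))
        elif p == "skos:related":
            closed.add((o, "skos:related", s))
    return closed


def objects(triples, subject, predicate):
    """Sorted objects for a given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


GRAPH = with_skos_inverses(TRIPLES)
```

Asking `objects(GRAPH, "ex:BipolarDisorder", "skos:narrower")` then returns the local subtype even though only the `skos:broader` direction was asserted.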
Read More
Guidelines for the Construction, Format, and
Management of Monolingual Controlled
Vocabularies
http://bit.ly/niso-standard
Desiderata for Controlled Medical
Vocabularies
http://bit.ly/desider
Practice Robot Courtesy with
Local Extensions
Use classes/properties that are
subclasses/subproperties of existing
classes/properties in VIVO’s core
ontology.
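A check like the following could flag local classes that are not anchored to VIVO's core ontology. The local namespace and both class names are hypothetical, and `SomeCoreClass` is a placeholder rather than a verified VIVO class. The sketch only inspects direct superclasses, so a fuller version would follow the subclass chain.

```python
VIVO_CORE = "http://vivoweb.org/ontology/core#"
LOCAL = "http://example.org/ontology/local#"  # hypothetical local namespace
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Illustrative statements only; "SomeCoreClass" is a placeholder name.
LOCAL_TRIPLES = [
    (LOCAL + "BoardCertification", RDFS_SUBCLASS_OF, VIVO_CORE + "SomeCoreClass"),
    (LOCAL + "OrphanClass", RDFS_SUBCLASS_OF, LOCAL + "AnotherLocalClass"),
]


def unanchored_classes(triples):
    """Return local classes with no direct superclass in the VIVO core namespace."""
    local_classes = set()
    anchored = set()
    for s, p, o in triples:
        if p == RDFS_SUBCLASS_OF and s.startswith(LOCAL):
            local_classes.add(s)
            if o.startswith(VIVO_CORE):
                anchored.add(s)
    return sorted(local_classes - anchored)
```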