Exploiting Semantic Structure for Mapping
        User-specified Form Terms
         to SNOMED CT Concepts
 Ritu Khare1,2, Yuan An1, Jiexun Li1, Il-Yeol Song1, Xiaohua Hu1
                      The iSchool at Drexel1
                       College of Medicine2
              Drexel University, Philadelphia, PA, USA
Presentation Order
    1.   Motivation
    2.   Problems
    3.   Solutions
    4.   Evaluation
    5.   Final Remarks



General Motivation
     Database Integration and Interoperability
       Semantic Heterogeneity across clinical data sources
        (Halevy, 2005, Henry et al. 1993, Hernandez et al. 2005, Wright et al., 1999)


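       Example: the same clinical notion under different labels
       across sources (are these equivalent?)
         MRN  /  Med Rec #  /  Medical Record Number
         Blood Pressure  /  Diastolic, Systolic  /  BP
         Physical Status  /  Constitutional  /  Vital Signs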


     Recommendation: Controlled Medical Vocabularies should
     be involved in the design artifacts of the healthcare systems.
     (Jean et al., 2007, Sugumaran and Storey, 2002)
Specific Motivation




        Clinical Encounter Form     Electronic Health Records (EHR)

     The terms on clinical forms are mapped to, or annotated by, a
      standard terminology.
     Domain experts may perform the annotation manually, but this is
      costly and tedious.

      Research Objective: Design an automatic tool for mapping form
      terms to standard terminologies.
The Mapping Problem
    Clinical Encounter Form -> SNOMED CT

    SNOMED CT
      The Systematized Nomenclature of Medicine - Clinical Terms
       (Intl. Health Terminology Stds. Dev. Org)
      Most comprehensive clinical vocabulary (SNOMED CT User Guide,
       2009).
      >360,000 logically-defined clinical concepts (Hina et al., 2010,
       Stenzhorn et al., 2009).

    Form Term     SNOMED CT Concept
    Patient       11615400: Patient (person)
    MRN           398225001: Medical record number (observable entity)
SNOMED CT Concepts
    concept id: 0231832
    Fully-specified-name: Respiratory Rate (Observable Entity)
    Preferred Term: Respiratory Rate
    Synonym: Respiration Frequency

    concept id: 362508001
    Fully-specified-name: Both eyes, entire (Body Structure)
    Preferred Term: Both eyes, entire
    Synonym: OU- Both eyes

    SNOMED CT Semantic Categories
      •Attribute
      •Body Structure
      •Disorder
      •Finding
      •Observable Entity
      •Occupation
      •Person
      •Physical Object
      •Procedure
      •Racial Group
      •Situation
      •…
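The concept records above can be sketched as a small data structure. This is an illustrative sketch only: the `Concept` class and its field names are assumptions, not part of SNOMED CT or of the presented tool. It relies on the SNOMED CT convention that a fully specified name ends with its semantic category in parentheses.

```python
import re
from dataclasses import dataclass


@dataclass
class Concept:
    """Hypothetical container for the fields shown on the slide."""
    concept_id: str
    fully_specified_name: str
    preferred_term: str
    synonyms: tuple = ()

    @property
    def semantic_category(self):
        # The category is the trailing parenthesised tag of the
        # fully specified name, e.g. "... (Body Structure)".
        match = re.search(r"\(([^()]*)\)\s*$", self.fully_specified_name)
        return match.group(1).strip().lower() if match else None


# Values copied from the slide as they appear there.
both_eyes = Concept(
    "362508001",
    "Both eyes, entire (Body Structure)",
    "Both eyes, entire",
    ("OU- Both eyes",),
)
respiratory_rate = Concept(
    "0231832",
    "Respiratory Rate (Observable Entity)",
    "Respiratory Rate",
    ("Respiration Frequency",),
)
```

Reading the category from the name keeps the example self-contained; a real terminology server would normally expose the category through its own interface.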
Existing Mapping Services
    SNOMED CT Browsers (Rogers and Bodenreider, 2008)
      General Mapping
      Category Specific Mapping
Challenges: Mapping Form Terms to SNOMED CT Concepts
     Diversity Challenge
         Different clinicians - different terms
         MRN, Med. Rec.#
         Vital signs, Constitutional, Physical status

     Context Challenge
         Same Form Term - Different Concepts.
Premises
      The key is to identify the SNOMED CT semantic category
       appropriate for a given term.
      The first, i.e., the most string-similar, result retrieved by the
       category-specific mapping is usually the desired concept.

      ?  How to automatically determine the SNOMED CT semantic
         category appropriate for a given form term?
1    The term context can be derived from the SEMANTIC STRUCTURE of
     the form.

          The FORM TREE accurately captures the semantic intentions of
           the designer.
          Inspired by hierarchical modeling of forms (Dragut et al. 2009,
           Wu et al. 2009)
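A minimal sketch of a form tree and of reading a term's context from it. The `FormNode` class, the node type labels ("root", "group", "field", "value"), and the `context` helper are hypothetical names chosen for illustration; the slides do not specify the tool's actual data structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class FormNode:
    term: str
    node_type: str  # hypothetical labels: "root", "group", "field", "value"
    children: List["FormNode"] = field(default_factory=list)
    parent: Optional["FormNode"] = field(default=None, repr=False)

    def add(self, child: "FormNode") -> "FormNode":
        """Attach child under this node and return the child."""
        child.parent = self
        self.children.append(child)
        return child


def context(node: FormNode) -> List[str]:
    """Terms of the node's ancestors, outermost first, excluding the root."""
    path = []
    current = node.parent
    while current is not None and current.parent is not None:
        path.append(current.term)
        current = current.parent
    return list(reversed(path))


# A small tree using terms that appear on the slides.
root = FormNode("root", "root")
patient = root.add(FormNode("Patient", "group"))
gender = patient.add(FormNode("Gender", "field"))
male = gender.add(FormNode("M", "value"))
```

Here `context(male)` walks up through Gender and Patient, which is the kind of surrounding information a term-only lookup never sees.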
2    The implicit relationship between the term context (i.e., the
     semantic structure) and the desired semantic category can be
     formally captured into a STATISTICAL MODEL.

       Naïve Bayes Classifier
         Based on the Bayes theorem (Han and Kamber 2006).
         Class Labels (SNOMED CT semantic categories)
           attribute, body structure, disorder, …
         Data Attributes (local structure)
           Node type
           Parent node type
           Child node type
           Parent Semantic Category
           Grandparent Semantic Category

       [Figure: example form tree annotated with semantic categories]
         root
           Patient (Person)
             Name (Observable Entity)
             Gender (Observable Entity)
               M, F
           Examination (Procedure)
             Respiratory (Observable Entity)
               nl, perc.
         Leaf-level labels in the figure: Qualifier Value, Finding
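The idea of classifying a node's semantic category from its local structure can be sketched with a small categorical Naive Bayes. Everything here is an assumption made for illustration: the attribute names, the node type values, the add-one smoothing, and the six training rows are invented, and only three of the five attributes are used. It is not the authors' Java implementation or their training data.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over categorical attributes with add-one smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[(attribute, label)][value] -> number of rows
        self.value_counts = defaultdict(Counter)
        self.values = defaultdict(set)  # distinct values seen per attribute
        for row, label in zip(rows, labels):
            for attr, value in row.items():
                self.value_counts[(attr, label)][value] += 1
                self.values[attr].add(value)
        return self

    def predict_proba(self, row):
        log_scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / self.total)  # class prior
            for attr, value in row.items():
                count = self.value_counts[(attr, label)][value]
                # One extra slot in the denominator covers unseen values.
                score += math.log((count + 1) / (n + len(self.values[attr]) + 1))
            log_scores[label] = score
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        norm = sum(weights.values())
        return {k: w / norm for k, w in weights.items()}


# Invented toy training rows, loosely following the example form tree.
TRAIN_ROWS = [
    {"node_type": "group", "parent_node_type": "root", "parent_category": "none"},
    {"node_type": "group", "parent_node_type": "root", "parent_category": "none"},
    {"node_type": "field", "parent_node_type": "group", "parent_category": "person"},
    {"node_type": "field", "parent_node_type": "group", "parent_category": "procedure"},
    {"node_type": "value", "parent_node_type": "field", "parent_category": "observable entity"},
    {"node_type": "value", "parent_node_type": "field", "parent_category": "observable entity"},
]
TRAIN_LABELS = [
    "person", "procedure", "observable entity",
    "observable entity", "qualifier value", "finding",
]

model = CategoricalNaiveBayes().fit(TRAIN_ROWS, TRAIN_LABELS)
probs = model.predict_proba(
    {"node_type": "field", "parent_node_type": "group", "parent_category": "person"}
)
best = max(probs, key=probs.get)
```

The per-class probabilities in `probs` play the role of the category membership probabilities, and `best` is the category a simple picker would choose.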
Overall Mapping Approach
    Form Term
      -> Structure Analyzer (Form Tree)
      -> Node Attributes
      -> Classification Model (Training Data)
      -> Category Membership Probabilities
      -> Category Picker
      -> Semantic Category
      -> SNOMED CT Category Specific Mapping
      -> SNOMED CT Concept

    [Figure: form tree annotated with semantic categories]
      root
        Patient (Person)
          Name (Observable Entity)
          Gender (Observable Entity)
        Examination (Procedure)
          Respiratory (Observable Entity)

    Novelty: Hybrid Approach
    (leverages semantic structure as well as term linguistics)
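The last two stages of the pipeline (category picker, then category-specific mapping) can be sketched as below. This is a stand-in under stated assumptions: the tiny in-memory `CONCEPTS` table replaces a real SNOMED CT service, `difflib.SequenceMatcher` is swapped in as the string-similarity measure, and all function names are hypothetical. Concept ids are copied from the slides as they appear there.

```python
from difflib import SequenceMatcher

# (concept id, term, semantic category): a toy stand-in for SNOMED CT.
CONCEPTS = [
    ("11615400", "Patient", "person"),
    ("398225001", "Medical record number", "observable entity"),
    ("0231832", "Respiratory Rate", "observable entity"),
    ("362508001", "Both eyes, entire", "body structure"),
]


def pick_category(probabilities):
    """Category picker: choose the most probable semantic category."""
    return max(probabilities, key=probabilities.get)


def category_specific_mapping(term, category, concepts=CONCEPTS):
    """Return the most string-similar concept within one category."""
    candidates = [c for c in concepts if c[2] == category]
    if not candidates:
        return None

    def similarity(concept):
        return SequenceMatcher(None, term.lower(), concept[1].lower()).ratio()

    return max(candidates, key=similarity)


def map_term(term, probabilities):
    """Pick a category, then search only inside that category."""
    return category_specific_mapping(term, pick_category(probabilities))


result = map_term("Respiratory", {"observable entity": 0.8, "person": 0.2})
```

Restricting the search to one category is what lets the context decide between concepts that a general, term-only lookup would mix together.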
Data
         Dataset Forms                                      Total Terms
     1   Walk in clinic encounter forms (3 forms)           161
     2   Nursing patient admission forms (6 forms)          261
     3   Labor & delivery DB data-entry forms (7 forms)     294
     4   Adult visit encounter forms (5 forms)              388
     5   Child visit encounter forms (5 forms)              397
         26 Forms                                           1501

     Manual (Gold) Annotations: 954 (63.55%) terms
         Term      Concept ID
         Patient   11615400: Patient (person)
         MRN       398225001: Medical record number (observable entity)
         …         ……………….

     Some Unmapped Terms
         no scleral icterus
         chronic back pain
         Follow up with PCP
         Sent to ER
Implementation (JAVA) and Settings
    Form Design Interface -> Form Tree
    Gold Annotations -> Training Data

    Form Term
      -> Structure Analyzer
      -> Node Attributes
      -> Classification Model
      -> Category Membership Probabilities
      -> Category Picker
      -> Semantic Category
      -> SNOMED CT Category Specific Mapping
      -> SNOMED CT Concept

    SNOMED CT mapping: API, provided by the Dataline Software Limited
    Cross Validation (leave 1 out) for each dataset
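A generic sketch of leave-one-out cross validation. The slides do not say which unit is held out in each round (a form, a term, or something else), so the sketch works on an arbitrary list of labeled items. The function names and the `train` and `predict` callables are hypothetical.

```python
def leave_one_out(items):
    """Yield (training_items, held_out_item) once for every item."""
    for i, held_out in enumerate(items):
        yield items[:i] + items[i + 1:], held_out


def loo_accuracy(items, train, predict):
    """Fraction of held-out items whose label is predicted correctly.

    items:   list of (features, label) pairs
    train:   callable taking a list of such pairs and returning a model
    predict: callable taking (model, features) and returning a label
    """
    hits = 0
    for training, (features, label) in leave_one_out(items):
        model = train(training)
        hits += predict(model, features) == label
    return hits / len(items)
```

Each item is scored by a model that never saw it, which matters when the training data is as small as a single dataset of forms.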
Experiment Design
    Goal: To study whether semantic structure can improve mapping
    performance.

    Measures
      Precision   # correct annotations / # annotations
      Recall      # correct annotations / # gold annotations

    Baseline (linguistics only)
      Form Term -> SNOMED CT General Mapping -> SNOMED CT Concept

    Hybrid (linguistics + semantic structure)
      Form Term -> Structure Analyzer -> Node Attributes
        -> Classification Model -> Category Membership Probabilities
        -> Category Picker -> Semantic Category
        -> SNOMED CT Category Specific Mapping -> SNOMED CT Concept

    Hybrid++
      Form Term -> Structure Analyzer -> Node Attributes
        -> Classification Model -> Category Membership Probabilities
        -> Category Picker + candidate set expansion -> Semantic Category
        -> SNOMED CT Category Specific Mapping -> SNOMED CT Concept
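The two measures can be computed as in the following sketch. The dictionary representation (term mapped to concept id, with unannotated terms simply absent from the predictions) and the function name are assumptions made for illustration. The example ids are copied from the slides as they appear there.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted annotations against gold ones.

    predicted: dict mapping form term -> concept id produced by the tool
    gold:      dict mapping form term -> concept id from manual annotation
    """
    correct = sum(
        1 for term, concept in predicted.items() if gold.get(term) == concept
    )
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


gold = {
    "Patient": "11615400",
    "MRN": "398225001",
    "Respiratory": "0231832",
    "OU": "362508001",
}
predicted = {
    "Patient": "11615400",     # correct
    "MRN": "0231832",          # wrong concept
    "Respiratory": "0231832",  # correct
}                              # "OU" was left unannotated
precision, recall = precision_recall(predicted, gold)
```

In this toy case two of three produced annotations are correct (precision 2/3) and two of four gold annotations are recovered (recall 0.5), which shows how a cautious tool can score well on precision while recall stays low.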
Results
      Mapping Duration / form = 1 - 11 s

      Baseline
        Precision: 0.63, Recall: 0.45
      Baseline to Hybrid
        Precision improved by 18%.
      Hybrid to Hybrid++
        Precision improved by 16%, Recall by 23%
      Hybrid++
        Precision: 0.86, Recall: 0.55

      Recall low:
        SNOMED CT API uses exact string matching
        Couldn't handle the variation of terms, i.e., diversity
         challenge.
More Results
  Term processing component
       remove special characters
        -, #, /, etc.
       acronym expansion dictionary
        T (Temperature)
        BTL (Bilateral Tubal Ligation)
        VTE (Venous Thromboembolism)

  Precision only slightly improved: 3-5%
  Recall improved substantially: 25%
  Final Precision = 0.89, Recall = 0.76
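A minimal sketch of the two term-processing steps, special-character removal and acronym expansion. The three dictionary entries come from the slide; the function name, the regular expression, and the case-sensitive whole-token lookup are assumptions, not the tool's actual behavior.

```python
import re

# Acronym entries listed on the slide; a real dictionary would be larger.
ACRONYMS = {
    "T": "Temperature",
    "BTL": "Bilateral Tubal Ligation",
    "VTE": "Venous Thromboembolism",
}


def preprocess_term(term, acronyms=ACRONYMS):
    """Strip special characters, then expand tokens that are known acronyms."""
    # Replace anything that is not a letter, digit, or whitespace by a space.
    cleaned = re.sub(r"[^A-Za-z0-9\s]", " ", term)
    tokens = cleaned.split()
    return " ".join(acronyms.get(token, token) for token in tokens)
```

For instance, `preprocess_term("Med. Rec.#")` gives `"Med Rec"` and `preprocess_term("VTE")` gives `"Venous Thromboembolism"`, so an exact-matching lookup gets a form of the term it has a better chance of recognising.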
Implications
     Impact of Semantic Structure
       Improves overall mapping performance
       More correct predictions (context challenge)

     Impact of Linguistics
       Mainly on recall
       Reaches more relevant terms (diversity challenge)

     Overall
       Promising performance, even with limited training data
       Recall is low because of the simplicity of the linguistic
        techniques; it can be further improved using more
        sophisticated techniques.
Contributions
      PROBLEM: NEW problem of standardizing the terms on clinical
       encounter forms using SNOMED CT.
         Existing works (Henry et al., 1993, Barrows Jr. et al. 1994,
          Patrick et al. 2007)
            standardization of clinical notes: diagnosis, medication
             information, patient complaints, etc.
      SOLUTION: Context-based method that leverages SEMANTIC
       STRUCTURE of forms along with term linguistics.
         Existing works
           linguistic techniques (synonyms, morphemes, lexical
            variants)




Contributions
      EVALUATION: 26 healthcare forms containing 950+ mappable
       terms specified by multiple clinicians.
           Improvement over existing services
           23% precision, 38% recall
           Promising Performance
           precision: 0.89, recall: 0.76


      FINDINGS:
           Linguistics helps overcome diversity challenge and improve
            recall
           Semantic structure helps overcome context challenge and
            improves precision and recall.
           Design synergistic hybrid approaches to address all
            mapping challenges and achieve superior performance
Limitations
       TECHNIQUE
        Post coordinated mapping
        Handle Missing and Inapplicable Values in Training data

       STUDY
        Domain Expert Annotator

       TECHNICAL EVALUATION
        Compare with other models:
          Bayesian networks, k Neural Networks, Classification
          Association Rules
        Test the validity of assumptions
          Class conditional independence
          Correctness of most linguistic matching concept
          Classification Attributes
        Compare/Combine with other UMLS terminology
Future Directions
      Fully explore SNOMED CT
        Defining relationships
      Customize for Form Categories
        Encounter, Regular Visit,…
      Larger Knowledge Base for Training Datasets

      In larger frameworks, does annotation help improve
        Data/Database Integration ?
        Data Quality ?
        Patient Diagnosis ?
        User Interventions ?

      Work In Progress:
        Integrate with flexible Electronic Health Record system
         (IHI 2010)
        Integration of new forms in EHR
        improve database integration process
Thank you




Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 

Kürzlich hochgeladen (20)

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 

Exploiting Semantic Structure for Mapping User-specified Form Terms to SNOMED CT Concepts

  • 1. Exploiting Semantic Structure for Mapping User-specified Form Terms to SNOMED CT Concepts. Ritu Khare (1,2), Yuan An (1), Jiexun Li (1), Il-Yeol Song (1), Xiaohua Hu (1). (1) The iSchool at Drexel; (2) College of Medicine; Drexel University, Philadelphia, PA, USA
  • 2. Presentation Order: 1. Motivation 2. Problems 3. Solutions 4. Evaluation 5. Final Remarks
  • 3. General Motivation: Database integration and interoperability. Semantic heterogeneity across clinical data sources (Halevy, 2005; Henry et al., 1993; Hernandez et al., 2005; Wright et al., 1999). Examples of heterogeneous form terms: "MRN", "Med Rec #", "Medical Record Number"; "Blood Pressure" (Diastolic, Systolic), "BP"; "Physical Status", "Constitutional", "Vital Signs". Recommendation: Controlled medical vocabularies should be involved in the design artifacts of healthcare systems (Jean et al., 2007; Sugumaran and Storey, 2002).
  • 4. Specific Motivation: [Figure: Clinical Encounter Form; Electronic Health Records (EHR)]. The terms on clinical forms are mapped to, or annotated by, a standard terminology. Domain experts may perform the annotation manually, which is costly and tedious. Research Objective: Design an automatic tool for mapping form terms to standard terminologies.
  • 5. Outline: 1. Motivation 2. Problem 3. Solutions 4. Evaluation 5. Final Remarks
  • 6. The Mapping Problem: map terms on a Clinical Encounter Form to SNOMED CT. SNOMED CT is the Systematized Nomenclature of Medicine - Clinical Terms (Intl. Health Terminology Stds. Dev. Org), the most comprehensive clinical vocabulary (SNOMED CT User Guide, 2009), with >360,000 logically-defined clinical concepts (Hina et al., 2010; Stenzhorn et al., 2009). Example mappings (form term: SNOMED CT concept): Patient: 116154003, Patient (person); MRN: 398225001, Medical record number (observable entity).
  • 7. SNOMED CT Concepts. Example 1: concept id: 0231832; Fully-specified-name: Respiratory Rate (Observable Entity); Preferred Term: Respiratory Rate; Synonym: Respiration Frequency. Example 2: concept id: 362508001; Fully-specified-name: Both eyes, entire (Body Structure); Preferred Term: Both eyes, entire; Synonym: OU - Both eyes. Semantic Categories: Attribute, Body Structure, Disorder, Finding, Observable Entity, Occupation, Person, Physical Object, Procedure, Racial Group, Situation, …
  • 8. Existing Mapping Services: SNOMED CT browsers (Rogers and Bodenreider, 2008). Two modes: General Mapping and Category-Specific Mapping.
  • 9. Challenges: Mapping Form Terms to SNOMED CT Concepts. Diversity Challenge: different clinicians use different terms (e.g., MRN vs. Med. Rec. #; Vital signs vs. Constitutional vs. Physical status). Context Challenge: the same form term can correspond to different concepts.
  • 10. Outline: 1. Motivation 2. Problem 3. Solution 4. Evaluation 5. Final Remarks
  • 11. Premises: (a) The first, i.e., the most string-similar, result retrieved by the category-specific mapping is usually the desired concept. (b) The key is to identify the SNOMED CT semantic category appropriate for a given term. Question: How to automatically determine the SNOMED CT semantic category appropriate for a given form term?
  • 12. Idea 1: The term context can be derived from the SEMANTIC STRUCTURE of the form. The FORM TREE accurately captures the semantic intentions of the designer. Inspired by hierarchical modeling of forms (Dragut et al., 2009; Wu et al., 2009).
  • 13. Idea 2: The implicit relationship between the term context (i.e., the semantic structure) and the desired semantic category can be formally captured in a STATISTICAL MODEL. Naïve Bayes Classifier: based on the Bayes theorem (Han and Kamber, 2006). Class labels (SNOMED CT semantic categories): attribute, body structure, disorder, … Data attributes (local structure): node type, parent node type, child node type, parent semantic category, grandparent semantic category. [Figure: example form tree with nodes root, Patient, Examination, Name, Gender, Respiratory, M, F, nl, perc., each annotated with a semantic category (Person, Procedure, Observable Entity, Finding, Qualifier Value)]
  • 14. Overall Mapping Approach: Form Term → Structure Analyzer (Form Tree) → Node Attributes → Category Classification Model (built from Training Data) → Category Membership Probabilities → Semantic Category Picker → SNOMED CT Category → Category-Specific Mapping → SNOMED CT Concept. [Figure: example form tree with nodes root, Patient, Examination, Name, Gender, Respiratory, annotated with the categories Person, Procedure, Observable Entity]. Novelty: Hybrid approach (leverages semantic structure as well as term linguistics).
  • 15. Outline: 1. Motivation 2. Problem 3. Solution 4. Evaluation 5. Final Remarks
  • 16. Data. Datasets (forms; total terms): 1. Walk-in clinic encounter forms (3 forms; 161 terms); 2. Nursing patient admission forms (6 forms; 261 terms); 3. Labor & delivery DB data-entry forms (7 forms; 294 terms); 4. Adult visit encounter forms (5 forms; 388 terms); 5. Child visit encounter forms (5 forms; 397 terms). Total: 26 forms, 1501 terms. Manual (gold) annotations: 954 (63.55%) terms, e.g., Patient: 116154003, Patient (person); MRN: 398225001, Medical record number (observable entity). Some unmapped terms: "no scleral icterus", "chronic back pain", "Follow up with PCP", "Sent to ER".
  • 17. Implementation (Java) and Settings. Inputs: Form Design Interface and Gold Annotations. SNOMED CT API provided by Dataline Software Limited. Pipeline: Form Term → Structure Analyzer (Form Tree) → Node Attributes → Category Classification Model (built from Training Data) → Category Membership Probabilities → Semantic Category Picker → SNOMED CT Category → Category-Specific Mapping → SNOMED CT Concept. Cross-validation (leave-one-out) for each dataset.
  • 18. Experiment Design. Goal: To study whether semantic structure can improve mapping performance. Measures: Precision = # correct annotations / # annotations; Recall = # correct annotations / # gold annotations. Configurations: Baseline (linguistics only): Form Term → SNOMED CT General Mapping → SNOMED CT Concept. Hybrid (linguistics + semantic structure): Form Term → Structure Analyzer → Node Attributes → Category Classification Model → Category Membership Probabilities → Semantic Category Picker → SNOMED CT Category → Category-Specific Mapping → SNOMED CT Concept. Hybrid++: the Hybrid pipeline with candidate set expansion in the Semantic Category Picker.
  • 19. Results. Mapping duration per form = 1-11 s. Baseline: Precision 0.63, Recall 0.45. Baseline recall is low because the SNOMED CT API uses exact string matching and could not handle the variation of terms, i.e., the diversity challenge. Baseline to Hybrid: precision up by 18%. Hybrid to Hybrid++: precision up by 16%, recall up by 23%. Hybrid++: Precision 0.86, Recall 0.55.
  • 20. More Results. Term processing component: remove special characters (-, #, /, etc.); acronym expansion dictionary, e.g., T (Temperature), BTL (Bilateral Tubal Ligation), VTE (Venous Thromboembolism). Precision improved only slightly (3-5%); recall improved substantially (25%). Final Precision = 0.89, Recall = 0.76.
  • 21. Implications. Impact of semantic structure: improves overall mapping performance; yields more correct predictions (context challenge). Impact of linguistics: mainly on recall; reaches more of the relevant terms (diversity challenge). Overall: promising performance, even with limited training data. Recall is low because of the simplicity of the linguistic techniques; it can be further improved using more sophisticated techniques.
  • 22. Outline: 1. Motivation 2. Problem 3. Solution 4. Evaluation 5. Final Remarks
  • 23. Contributions. PROBLEM: A NEW problem of standardizing the terms on clinical encounter forms using SNOMED CT. Existing works (Henry et al., 1993; Barrows Jr. et al., 1994; Patrick et al., 2007) address the standardization of clinical notes: diagnosis, medication information, patient complaints, etc. SOLUTION: A context-based method that leverages the SEMANTIC STRUCTURE of forms along with term linguistics. Existing works rely on linguistic techniques (synonyms, morphemes, lexical variants).
  • 24. Contributions (continued). EVALUATION: 26 healthcare forms containing 950+ mappable terms specified by multiple clinicians. Improvement over existing services: 23% precision, 38% recall. Promising performance: precision 0.89, recall 0.76. FINDINGS: Linguistics helps overcome the diversity challenge and improves recall. Semantic structure helps overcome the context challenge and improves precision and recall. Design synergistic hybrid approaches to address all mapping challenges and achieve superior performance.
  • 25. Limitations. TECHNIQUE: Post-coordinated mapping; handle missing and inapplicable values in training data. STUDY: Domain expert annotator. TECHNICAL EVALUATION: Compare with other models: Bayesian networks, k Neural Networks, Classification Association Rules. Test the validity of assumptions: class conditional independence; correctness of the most linguistically matching concept. Classification attributes. Compare/combine with other UMLS terminology.
  • 26. Future Directions. Fully explore SNOMED CT: defining relationships. Customize for form categories: Encounter, Regular Visit, … Larger knowledge base for training datasets. In larger frameworks, does annotation help improve: Data/Database Integration? Data Quality? Patient Diagnosis? User Interventions? Work in Progress: Integrate with a flexible Electronic Health Record system (IHI 2010); integration of new forms in the EHR; improve the database integration process.
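
The category classification step on slide 13 (a Naïve Bayes classifier over local structural attributes of a form-tree node) can be illustrated with a small sketch. This is not the authors' Java implementation: the class name `CategoricalNaiveBayes`, the attribute values (`section`, `field`, `option`), the toy training rows, and the use of add-one smoothing are all assumptions made for illustration. It assumes every training row supplies every attribute.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over categorical attributes with add-one smoothing (illustrative)."""

    def fit(self, rows, labels):
        self.n = len(labels)
        self.class_counts = Counter(labels)
        # value_counts[(attribute, label)][value] = number of training nodes
        self.value_counts = defaultdict(Counter)
        # values[attribute] = distinct values seen in training
        self.values = defaultdict(set)
        for row, label in zip(rows, labels):
            for attr, value in row.items():
                self.value_counts[(attr, label)][value] += 1
                self.values[attr].add(value)
        return self

    def predict_proba(self, row):
        """Return category membership probabilities for one node."""
        log_scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / self.n)  # class prior
            for attr, value in row.items():
                seen = self.value_counts[(attr, label)][value]
                # One extra slot reserves probability mass for unseen values.
                vocab = len(self.values[attr]) + 1
                score += math.log((seen + 1) / (count + vocab))
            log_scores[label] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        exp = {label: math.exp(s - top) for label, s in log_scores.items()}
        total = sum(exp.values())
        return {label: v / total for label, v in exp.items()}


# Hypothetical training nodes: two of the five attributes named on slide 13.
rows = [
    {"node_type": "section", "parent_category": "root"},
    {"node_type": "section", "parent_category": "root"},
    {"node_type": "field", "parent_category": "person"},
    {"node_type": "field", "parent_category": "person"},
    {"node_type": "field", "parent_category": "procedure"},
    {"node_type": "option", "parent_category": "observable entity"},
    {"node_type": "option", "parent_category": "observable entity"},
]
labels = [
    "person", "procedure",
    "observable entity", "observable entity", "observable entity",
    "qualifier value", "qualifier value",
]

model = CategoricalNaiveBayes().fit(rows, labels)
probs = model.predict_proba({"node_type": "field", "parent_category": "person"})
best_category = max(probs, key=probs.get)
```

In this toy setting, `best_category` plays the role of the Semantic Category Picker's output, which would then restrict the category-specific SNOMED CT lookup for the term.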

Editor's Notes

  1. 25 min presentation – 5 min question answer. Make 20 slides only. Read reviewers comments. Breakdown – 2, 4, 5, 5, 4
  2. (In other words, we could say that existing systems are certainly not designed with future integration in mind.)
  3. Who designed the forms? Why not other domains – which other domains? Possible. Have some idea. Mark the concepts – post coordinated or partial mapping.
  4. Draw all the figures properly in MS 2010 ppt.
  5. Why does recall decrease when the number of correct predictions decreases on applying the hybrid method? Sometimes the linguistic approach returns a more accurate result. A larger improvement in recall and precision means the forms contained terms that have multiple senses in SNOMED CT.
  6. Our experience of tagging 52 data-entry forms suggests that the training samples can be constructed quickly and easily, as compared to the construction of an exhaustive set of rules or heuristics. To further test the performance of the mapping framework in a heterogeneous environment,