NLP (Natural Language Processing) shows a great deal of potential for many applications in the healthcare industry. This document shares 6 promising use cases for NLP to manage Epilepsy treatment effectively.
6 Epilepsy Use Cases for NLP
March 2020
Authors: Sameer Gokhale, Sr. Healthcare Consultant, and Mamata Khirade, Healthcare Business Analyst
CitiusTech Thought Leadership
2. Agenda
▪ NLP for Life Sciences
▪ Hidden Insights in Unstructured Data for Epilepsy
▪ Use Cases - NLP for Epilepsy
• Epilepsy Triggers
• Identify Surgical Candidates
• Identify PNES Patients
• Detection of SUDEP Patients
• Build Epilepsy Subgroup Cohort
• Avoid Epilepsy Misdiagnosis
▪ References
3. NLP for Life Sciences
▪ Natural Language Processing (NLP) is a form of Artificial Intelligence (AI) that focuses on
interpretation and manipulation of human-generated spoken and/or written data
▪ NLP enables computer programs to process and analyze unstructured data, such as free-text
physician notes written in Electronic Health Records (EHRs)
▪ Specific tasks for an NLP system may include:
• Summarize Text: Identify key concepts from clinical notes/discharge summaries; understand academic journal articles
• Data Mapping: Unstructured text to structured data from EHRs; improve clinical data integrity
• Data Conversion: Machine-readable formats into natural language; reporting and educational purposes
• Analysis Ready Data: Synthesis of multiple data sources; OCR to convert images into text; radiology scans to text
• Synthesis of Insights: Learn/relearn over time; incorporate results of previous interactions as feedback
4. Hidden Insights in Unstructured Data for Epilepsy (1/2)
▪ Epilepsy research leverages the data from Electronic Health Records (EHRs) and hospital
discharge summaries
▪ These sources do not contain detailed epilepsy information such as epilepsy subtype/syndrome,
epilepsy cause, seizure type or investigation results in a structured form, limiting the quality and
depth of the research data
▪ Unstructured or semi-structured data from clinical notes offer in-depth information on epilepsy,
but it is difficult to extract that information and derive meaningful insights from it
▪ NLP can be utilized to mine insights from such unstructured and semi-structured data
5. Hidden Insights in Unstructured Data for Epilepsy (2/2)
▪ Clinical Notes/Discharge Summaries (input to NLP): Admission notes, progress notes, discharge summaries, operative notes, outpatient clinic notes, pathology reports and radiology reports
▪ General Medicine (NLP output):
• Aid Clinical Decision Support
• Identify Patient Cohorts
• Identify Risk Factors
▪ Epilepsy Specific (NLP output):
• Phenotype Extraction
• Patient Assessment and Stratification
• Prognostication and Treatment
▪ Insights from unstructured data are of immense help to physicians and researchers
6. Use Case 1: Epilepsy Triggers
Data Sources: Clinical and nursing notes from the electronic medical records of a provider hospital
Input Parameters: Triggers affecting the well-being of epilepsy patients, in 3 categories
▪ Issues: Headaches, dizziness, vitality, insomnia, bowel symptoms, skin problems, weight fluctuation and hormonal imbalance
▪ Cognitive Symptoms: Visual and mental symptoms
▪ QoL: Quality of social life and socioeconomic problems
Output Parameters: Patient's well-being expressed as a risk score based on the frequency of triggers and the overlap of triggers
Methodology: Use NLP to analyze unstructured data and identify triggers affecting epilepsy patients' well-being across the 3 categories
Dependencies:
▪ Electronic Medical Records database
▪ Analysis tool (SAS Miner)
▪ Validation of epilepsy patient well-being triggers
Assumptions: Triggers to be validated by the client's clinical team
Business Outcomes:
▪ Predictive tool (risk score) to assess epilepsy patients' well-being
▪ Tool to help prevent unnecessary hospitalizations, medical/diagnostic testing and treatment interventions
▪ Combined with the necessary clinical interventions, the use case will help reduce the clinical, humanistic and economic burden on epilepsy patients
Reference - Kivekas E et al. 2016
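A risk score built from trigger frequency and trigger overlap could be prototyped along the following lines. This is a minimal sketch, not the method of Kivekas et al.: the keyword lists, the function names and the scoring formula (total mentions plus one point per additional category present) are all illustrative assumptions that the clinical team would need to replace and validate.

```python
import re
from collections import Counter

# Hypothetical keyword lists for the three trigger categories on this slide.
# The real vocabulary would have to be supplied and validated clinically.
TRIGGERS = {
    "issues": ["headache", "dizziness", "insomnia", "weight"],
    "cognitive": ["visual", "memory", "confusion"],
    "qol": ["social", "unemployed", "financial"],
}


def count_triggers(note):
    """Count keyword mentions per category in one free-text note."""
    text = note.lower()
    counts = Counter()
    for category, keywords in TRIGGERS.items():
        for keyword in keywords:
            # \w* also matches simple inflections such as "headaches".
            counts[category] += len(re.findall(rf"\b{keyword}\w*", text))
    return counts


def risk_score(note):
    """Assumed formula: total mentions (frequency) plus one point for each
    additional category with at least one mention (overlap)."""
    counts = count_triggers(note)
    frequency = sum(counts.values())
    categories_hit = sum(1 for c in counts.values() if c > 0)
    overlap = max(categories_hit - 1, 0)
    return frequency + overlap


note = "Patient reports headaches and insomnia. Financial worries noted."
print(dict(count_triggers(note)), risk_score(note))
```

Simple keyword matching ignores negation ("denies headaches"), so a production version would need negation handling before the counts feed a score.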
7. Use Case 2: Identify Surgical Candidates
Data Sources: Electronic Health Record (EHR) data of epilepsy patients
Input Parameters:
▪ Progress notes stored in EHR data
▪ Feature set extracted from progress notes to train the model, e.g. "under suboptimal control", "no seizures"
▪ Training and test data sets
Output Parameters: Based on a surgical candidacy score, classify patients as:
▪ Patients with greater likelihood of surgery candidacy
▪ Patients with greater likelihood of seizure freedom
Methodology:
▪ Develop an NLP model with the extracted features to classify patients based on a surgical candidacy score
▪ Measure the model's performance using statistics such as AUC, F1 score, sensitivity, specificity, PPV and NPV
▪ Have expert epileptologists verify the model's results
Dependencies: Patient dataset to train the model
Assumptions: Manual verification of model results by experts
Business Outcomes:
▪ Surgical Candidacy Score: Accurately identify patients who could benefit from surgery at earlier stages of the epilepsy journey using NLP-integrated EHR data
▪ Generate electronic alerts for potential surgical candidates to undergo presurgical evaluation
▪ Improved provider performance and patient health outcomes
Reference - Wissel BD et al. 2019
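The evaluation statistics named in the methodology can be computed directly from labels and model scores. The sketch below uses made-up labels and scores and an assumed decision threshold of 0.5; it illustrates the standard definitions only and does not reproduce the results of Wissel et al.

```python
def metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, NPV and F1 from binary labels.
    Assumes every denominator is non-zero (both classes present and
    at least one positive and one negative prediction)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ppv": ppv, "npv": npv, "f1": f1}


def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = surgical candidate according to expert review.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

print(metrics(y_true, y_pred))
print(round(auc(y_true, scores), 3))
```

AUC is computed on the raw scores, while the other statistics depend on the chosen threshold, so the threshold should be agreed with the reviewing epileptologists.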
8. Use Case 3: Identify PNES Patients
Data Sources: Clinical notes from the EHR for epilepsy patients with video-electroencephalogram (VEEG) monitoring. Assess sections of clinical notes such as “history of present illness/subjective”, “past medical history”, “impression”, “assessment” and “plan”
Input Parameters: NLP rules and a classifier based on vocabulary related to PNES (Psychogenic Nonepileptic Seizures), such as “psychogenic non-epileptic”, “psychogenic seizures”, “non-epileptic”, “pseudo seizures”, “NES” and “PNES”
Output Parameters:
▪ Classify patients as Definite PNES, Probable PNES or No PNES
▪ Significantly reduce false positive diagnoses of epilepsy by excluding notes of patients with PNES
Methodology:
▪ NLP tool (Yale cTAKES Extension, YTEX) uses UIMA & MI classifier to analyze clinical notes of patients with an epilepsy diagnosis (ICD-9 345.X) and identify definite or probable PNES patients
▪ Positive predictive value (PPV), sensitivity and F-score are also calculated
Dependencies: Clinical notes from EHR; analysis tool such as SAS Miner; Yale cTAKES Extension (YTEX)
Assumptions: Correct diagnosis of PNES patients will avoid misdiagnosis and reduce resource utilization
Business Outcomes:
▪ Avoid false positive epilepsy diagnoses. About 7.7% of epilepsy patients (~6,160 veterans) across the USA have PNES and are misdiagnosed as epileptic
▪ Misdiagnosis has significant economic consequences; NICE guidelines estimate total direct national medical costs of £164-188 million
Reference - Hamid H et al. 2013
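A vocabulary-driven rule of the kind described above might look like the following sketch. It is not YTEX and not the validated tool from Hamid et al.: the vocabulary comes from this slide, but the hedging cues and the mapping to Definite/Probable/No PNES are hypothetical assumptions made for illustration.

```python
import re

# PNES vocabulary from the slide; matching is case-insensitive.
PNES_TERMS = re.compile(
    r"psychogenic\s+non-?epileptic|psychogenic\s+seizures?|non-?epileptic"
    r"|pseudo\s?seizures?|\bNES\b|\bPNES\b",
    re.IGNORECASE,
)

# Hypothetical hedging cues; a hedged mention is treated as "Probable".
HEDGES = re.compile(
    r"\b(possible|probable|suspected|rule out|cannot exclude)\b",
    re.IGNORECASE,
)


def classify_note(sections):
    """Classify one note given as {section name: text}.

    Assumed rule: any unhedged PNES mention -> Definite PNES;
    only hedged mentions -> Probable PNES; no mention -> No PNES.
    Negation ("no evidence of PNES") is NOT handled in this sketch.
    """
    found_hedged = False
    for text in sections.values():
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if not PNES_TERMS.search(sentence):
                continue
            if HEDGES.search(sentence):
                found_hedged = True
            else:
                return "Definite PNES"
    return "Probable PNES" if found_hedged else "No PNES"


print(classify_note({"assessment": "Suspected PNES. Continue monitoring."}))
```

Keeping the note split by section makes it possible to later weight sections differently, for example trusting “impression” more than “past medical history”.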
9. Use Case 4: Detection of SUDEP Patients
Data Sources: Physician notes from the EMR, including notes with the patient's clinical presentation, clinical summary, medical history, surgical history, medications, allergies, family history and treatment plan
Input Parameters:
▪ Risk factors for SUDEP (sudden unexpected death in epilepsy)
▪ Synonyms for generalized tonic–clonic seizures (GTCS)
▪ NLP algorithms created for each risk factor
▪ Refractory and surgery candidacy
Output Parameters: Potential SUDEP patients based on identifiable risk factors from their respective clinical notes
Methodology: Regular expressions, a rule-based NLP technique, can be used to identify SUDEP risk factors in EMRs (physician notes)
Dependencies:
▪ Clinical notes from EHR
▪ Analysis tool such as SAS Miner
▪ Validation of risk factors for SUDEP
Assumptions: Risk factors for SUDEP to be validated by the clinical team
Business Outcomes:
▪ Detect SUDEP risk factors in epilepsy patients and offer ongoing surveillance
▪ Electronically prompt clinicians to provide counseling to potential SUDEP patients
▪ Cohort builder of homogeneous SUDEP patients
Reference - Barbour K et al. 2019
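As an illustration of the regular-expression approach, the sketch below searches physician notes for synonyms of generalized tonic–clonic seizures. The synonym list and function names are assumptions chosen for the example, not the validated patterns from Barbour et al., and each further risk factor would need its own pattern.

```python
import re

# Illustrative synonym patterns for GTCS; \u2013 is the en dash that
# appears in "tonic–clonic". Real patterns need clinical validation.
GTCS_PATTERN = re.compile(
    r"\bGTCS?\b"
    r"|\bgeneralized\s+tonic[-\u2013 ]clonic\b"
    r"|\bgrand\s+mal\b"
    r"|\btonic[-\u2013 ]clonic\s+seizures?\b",
    re.IGNORECASE,
)


def find_gtcs_mentions(note):
    """Return every GTCS synonym found in one note, in order."""
    return [m.group(0) for m in GTCS_PATTERN.finditer(note)]


def flag_patients(notes_by_patient):
    """Return sorted IDs of patients with at least one GTCS mention.
    Negated mentions ("denies GTCS") are not filtered in this sketch."""
    return sorted(
        patient_id
        for patient_id, notes in notes_by_patient.items()
        if any(GTCS_PATTERN.search(note) for note in notes)
    )


print(find_gtcs_mentions("Two GTCs last month; history of grand mal seizures."))
print(flag_patients({
    "p1": ["No events reported."],
    "p2": ["Generalized tonic-clonic seizure in sleep."],
}))
```

The flagged patient list is the kind of output that could feed the electronic prompt to clinicians described under Business Outcomes.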
10. Use Case 5: Build Epilepsy Subgroup Cohort
Data Sources: Patient discharge summaries
Input Parameters:
▪ De-identified plain text files from discharge summaries
▪ 5 categories: Epileptogenic zone, seizure semiology, lateralizing sign, interictal EEG pattern and ictal EEG pattern
▪ 5 modules of PEEP: Splitting sections and extracting segments; generating correlation candidates using EpSO (Epilepsy and Seizure Ontology); identifying correlation candidates to link anatomical locations to epilepsy phenotypes; classifying phenotype categories and storing the resulting structured data in a database; and performing cohort identification queries
Output Parameters: Extracted complex epilepsy phenotypes, used to build a cohort of epilepsy patients with a specific phenotype
Methodology:
▪ Rule-based system (PEEP) for automatic extraction of complex epilepsy phenotypes and correlated anatomical locations from discharge summaries
▪ PEEP pipeline: Section splitting and segment extraction; phenotype and anatomical location correlation candidate generation (tool: MetaMapRENER); correlation algorithm; phenotype category classifier; and cohort identification
Dependencies: Defined rule-based system
Assumptions:
▪ Clinical validation of rules
▪ PEEP will work at other hospitals with minor changes
Business Outcomes:
▪ Phenotype Extraction in Epilepsy (PEEP)
▪ Cohort builder of homogeneous epilepsy patient subgroups
Reference - Cui L et al. 2014
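The pipeline stages listed above (section splitting, correlation of phenotypes with anatomical locations, cohort query) can be imitated in a few lines. This is a heavily simplified sketch and not PEEP itself: the section headers, term lists and same-sentence correlation rule are hypothetical stand-ins for the EpSO-driven components described in Cui et al.

```python
import re

# Hypothetical section headers and term lists (PEEP derives its terms
# from the EpSO ontology; these short lists are for illustration only).
SECTION_HEADER = re.compile(
    r"^(SEIZURE SEMIOLOGY|INTERICTAL EEG|ICTAL EEG):\s*", re.MULTILINE
)
PHENOTYPES = ["aura", "automatisms", "sharp waves", "spikes"]
LOCATIONS = ["left temporal", "right temporal", "left frontal", "right frontal"]


def split_sections(summary):
    """Split a discharge summary into {header: body}."""
    parts = SECTION_HEADER.split(summary)
    # parts = [preamble, header1, body1, header2, body2, ...]
    return {parts[i]: parts[i + 1].strip() for i in range(1, len(parts) - 1, 2)}


def correlate(section_text):
    """Assumed rule: a phenotype and a location are correlated when they
    occur in the same sentence."""
    pairs = []
    for sentence in re.split(r"(?<=[.;])\s+", section_text.lower()):
        for phenotype in PHENOTYPES:
            if phenotype in sentence:
                pairs.extend((phenotype, loc) for loc in LOCATIONS if loc in sentence)
    return pairs


def extract(summary):
    """Produce structured records, one per correlated pair."""
    return [
        {"category": header, "phenotype": phenotype, "location": location}
        for header, body in split_sections(summary).items()
        for phenotype, location in correlate(body)
    ]


def cohort(records_by_patient, phenotype, location):
    """Cohort query: patients with the given phenotype at the given location."""
    return sorted(
        pid for pid, records in records_by_patient.items()
        if any(r["phenotype"] == phenotype and r["location"] == location
               for r in records)
    )


summary = (
    "SEIZURE SEMIOLOGY: Epigastric aura followed by oral automatisms.\n"
    "INTERICTAL EEG: Sharp waves over the left temporal region.\n"
    "ICTAL EEG: Rhythmic theta over the left temporal region."
)
print(extract(summary))
```

Storing one record per phenotype-location pair is what makes the final cohort query a simple filter.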
11. Use Case 6: Avoid Epilepsy Misdiagnosis
Data Sources: Free text from discharge summaries and EEG reports from the healthcare provider
Input Parameters: Diagnosis-related information present in narrative form in the medical records across multiple visits by a patient
Output Parameters:
▪ Identify misdiagnosed epilepsy
▪ Improve the ability of medical doctors to identify an epilepsy syndrome that was not previously diagnosed
Methodology:
▪ Text classification methods to correlate narrative text features with the diagnosis of epilepsy
▪ First step: construct a term vector representation of each document
▪ Second step: apply Latent Dirichlet Allocation (LDA), an unsupervised technique used for topic discovery and text classification
Dependencies:
▪ Clinical notes from EHR
▪ Analysis tool such as SAS Miner
▪ Validation of clinical definitions for epilepsy diagnoses
Assumptions: Clinical definitions for epilepsy diagnoses to be validated by the client
Business Outcomes:
▪ Clinical decision support tool using an NLP algorithm
▪ Misdiagnosis has significant economic consequences; NICE guidelines estimate total direct national medical costs of £164-188 million
Reference - Sullivan E et al. 2014
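The first methodology step, building a term vector for each document, can be sketched with the Python standard library alone. The stop-word list, tokenizer and example sentences below are assumptions made for illustration; the resulting count vectors are the kind of input a topic model such as LDA would then consume, and that second step is not implemented here.

```python
import re
from collections import Counter

# Small illustrative stop-word list. "no" is deliberately kept because
# negation carries clinical meaning.
STOPWORDS = {"the", "a", "an", "of", "and", "with", "in", "on", "was", "is"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def build_vocabulary(documents):
    """Sorted list of all distinct terms across the corpus."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def term_vector(document, vocabulary):
    """Count vector: one entry per vocabulary term."""
    counts = Counter(tokenize(document))
    return [counts[term] for term in vocabulary]


# Made-up report snippets standing in for discharge summaries/EEG reports.
docs = [
    "EEG showed generalized spike and wave discharges.",
    "Normal EEG. Episodes of staring with no EEG correlate.",
]
vocab = build_vocabulary(docs)
matrix = [term_vector(d, vocab) for d in docs]

for term, column in zip(vocab, zip(*matrix)):
    print(term, column)
```

Each row of `matrix` represents one document and each column one term, so the same vocabulary must be reused when vectorizing new notes.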
12. References
▪ McKinsey & Company. Natural language processing in healthcare. Dec 2018.
https://www.mckinsey.com/industries/healthcare-systems-and-services/our-insights/natural-language-processing-in-healthcare
▪ Bresnik J. August 2016. What Is the Role of Natural Language Processing in Healthcare?
https://healthitanalytics.com/features/what-is-the-role-of-natural-language-processing-in-healthcare
▪ Kivekas E et al. 2016. Functionality of Triggers for Epilepsy Patients Assessed by Text and Data Mining of
Medical and Nursing Records. Nursing Informatics 2016. W. Sermeus et al. (Eds.). doi:10.3233/978-1-61499-658-3-128
▪ Wissel BD et al. 2019. Prospective validation of a machine learning model that uses provider notes to identify
candidates for resective epilepsy surgery. Epilepsia. 2019;00:1–10. DOI: 10.1111/epi.16398
▪ Hamid H et al. Validating a natural language processing tool to exclude psychogenic nonepileptic seizures in
electronic medical record-based epilepsy research. Epilepsy & Behavior 29 (2013) 578–580.
http://dx.doi.org/10.1016/j.yebeh.2013.09.025
▪ Barbour K et al. 2019. Automated detection of sudden unexpected death in epilepsy risk factors in electronic
medical records using natural language processing. Epilepsia. 2019;60:1209–1220. DOI: 10.1111/epi.15966
▪ Cui L et al. 2014. Complex epilepsy phenotype extraction from narrative clinical discharge summaries. J Biomed
Inform. 2014 October ; 51: 272–279. doi:10.1016/j.jbi.2014.06.006.
▪ Sullivan E et al. 2014. Text Classification towards Detecting Misdiagnosis of an Epilepsy Syndrome in a Pediatric
Population. AMIA Annu Symp Proc. 2014 Nov 14;2014:1082-7. eCollection 2014.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4419916/
▪ Baldassano SN et al., Big data in status epilepticus, Epilepsy & Behavior,
https://doi.org/10.1016/j.ebeh.2019.106457
13. About CitiusTech
▪ 4,000+ healthcare IT professionals worldwide
▪ 1,500+ healthcare software engineering
▪ 400+ FHIR / HL7 certified professionals
▪ 25%+ CAGR over last 5 years
▪ 110+ healthcare customers
• Healthcare technology companies
• Hospitals, IDNs & medical groups
• Payers and health plans
• ACO, MCO, HIE, HIX, NHIN and RHIO
• Pharma & Life Sciences companies
Thank You
Authors:
Sameer Gokhale
Sr. Healthcare Consultant
Mamata Khirade
Healthcare Business Analyst
thoughtleaders@citiustech.com