Dr. David Talby
NATURAL LANGUAGE UNDERSTANDING IN HEALTHCARE:
STATE OF THE ART NLP, MACHINE LEARNING AND DEEP
LEARNING WITH OPEN SOURCE SOFTWARE
CONTENTS
1. THE IMPORTANCE OF NLP IN HEALTHCARE AI
2. NLP IS ULTRA DOMAIN SPECIFIC
3. STATE OF THE ART RESEARCH RESULTS
4. SPARK NLP: PRODUCTION GRADE & AT SCALE
AI VS. DOCTORS
[Diagram: Deep Learning; Computer Vision; Access to Care; Diagnostic Accuracy]
NLP IN HEALTHCARE
[Diagram: Deep Learning; NLP; Efficiency; Accuracy]
[Diagram labels: Radiology; Diagnostic; Mental Health; Safety Events; Inpatient; Pre-Auth; Key Opinion Leaders; Research; Meta Analysis; Clinical Coding; Financial; Anti-Fraud; Adverse Events; Drug Development; Recruit for Trials]
RISK PREDICTION CASE STUDY: DETECTING SEPSIS
“Compared to previous work that only
used structured data such as vital signs
and demographic information, utilizing
free text drastically improves the
discriminatory ability (increase in AUC
from 0.67 to 0.86) of identifying
infection.”
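To make the AUC figures in the quote concrete: AUC can be read as the probability that a randomly chosen positive case gets a higher score than a randomly chosen negative one. The sketch below computes it by pairwise comparison. The labels and scores are made-up toy values for illustration, not data from the quoted study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = infection, 0 = no infection.
labels = [1, 1, 1, 0, 0, 0]
structured_only = [0.6, 0.4, 0.3, 0.5, 0.35, 0.2]   # assumed model scores
with_free_text = [0.9, 0.7, 0.4, 0.5, 0.3, 0.1]     # assumed model scores

print(round(auc(labels, structured_only), 2))
print(round(auc(labels, with_free_text), 2))
```

A model whose scores separate the two groups better wins more of the pairwise comparisons, which is what a higher AUC expresses.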
COHORT SELECTION CASE STUDY: ONCOLOGY
“Using the combination of structured and
unstructured data, 8324 patients were
identified as having advanced NSCLC.
Of these patients, only 2472 were also in the
cohort generated using structured data only.
Further, 1090 patients would be included in the
structured data only cohort who should have
been excluded based on additional data.”
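The numbers in the quote can be turned into rough recall and precision figures for the structured-data-only cohort. This is a back-of-the-envelope sketch under two assumptions that are not stated in the quote: the combined cohort is treated as the reference, and the structured-only cohort is assumed to consist of exactly the two groups mentioned.

```python
combined_cohort = 8324      # identified using structured + unstructured data
also_in_structured = 2472   # of those, also found using structured data only
wrongly_included = 1090     # in the structured-only cohort, should be excluded

# Assumption: the structured-only cohort is exactly these two groups.
structured_cohort = also_in_structured + wrongly_included

recall = also_in_structured / combined_cohort
precision = also_in_structured / structured_cohort
missed = combined_cohort - also_in_structured

print(f"structured-only recall:    {recall:.1%}")
print(f"structured-only precision: {precision:.1%}")
print(f"patients missed:           {missed}")
```

Under these assumptions, structured data alone finds fewer than a third of the patients in the combined cohort.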
WHAT MAKES LANGUAGE HARD
• Nuanced
– Sure / I agree / Absolutely! / Whatever / Yes sir / Just to see you smile ❤️
• Fuzzy
– Blue, New, Tall, Child, Tell, Do
• Contextual
– “Patient denies alcohol abuse”
• Medium specific
– “SGTM c u in 15”
• Domain specific
– All forward-looking statements included in this document are based on
information available to us on the date hereof, and we assume no obligation to
revise or publicly release any revision to any such forward-looking statement,
except as may otherwise be required by law.
We all speak many languages.
THE FIRST RULE OF NLP:
ED Triage Notes
states started last night, upper abd, took alka seltzer approx
0500, no relief. nausea no vomiting
Since yeatreday 10/10 "constant Tylenol 1 hr ago. +nausea.
diaphoretic. Mid abd radiates to back
Generalized abd radiating to lower x 3 days accompanied
by dark stools. Now with bloody stool this am. Denies dizzy,
sob, fatigue. Visiting from Japan on business.”
Features
Type of Pain
Intensity of Pain
Body part or region
Symptoms
Onset of symptoms
Attempted home remedy
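A few of the features listed above can be pulled from triage notes with hand-written patterns. This is a minimal sketch only: the regular expressions and the tiny body-region lexicon are illustrative assumptions, and they would miss most real-world phrasing. That gap is the reason domain-specific models are needed.

```python
import re

# Hypothetical abbreviation lexicon; a real one would be far larger.
REGIONS = {"abd": "abdomen", "back": "back", "chest": "chest"}


def extract_features(note):
    """Pull a few features out of a triage note with hand-written patterns."""
    text = note.lower()
    features = {}
    # Pain intensity written as "N/10".
    m = re.search(r"\b(\d{1,2})/10\b", text)
    features["pain_intensity"] = int(m.group(1)) if m else None
    # Onset written as "x N days", "x N hrs", ...
    m = re.search(r"\bx\s*(\d+)\s*(day|hr|hour|week)s?\b", text)
    features["onset"] = f"{m.group(1)} {m.group(2)}(s)" if m else None
    # Body regions mentioned by (abbreviated) name.
    features["body_regions"] = [full for abbr, full in REGIONS.items()
                                if re.search(rf"\b{abbr}\b", text)]
    # Symptoms the patient denies: everything after "denies" up to the period.
    m = re.search(r"\bdenies\s+([^.]*)", text)
    features["denied"] = [s.strip() for s in m.group(1).split(",")] if m else []
    return features


note_a = ("Generalized abd radiating to lower x 3 days accompanied by dark "
          "stools. Now with bloody stool this am. Denies dizzy, sob, fatigue.")
note_b = ('Since yeatreday 10/10 "constant Tylenol 1 hr ago. +nausea. '
          "diaphoretic. Mid abd radiates to back")

print(extract_features(note_a))
print(extract_features(note_b))
```

Note that the misspelled "yeatreday" in the second note is silently ignored by these patterns, so its onset is lost.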
EMERGENCY ROOM LANGUAGE
Different Vocabulary
Different Grammar
Different Context
Different Meaning
Different Language Models
Tokenizer
Lemmatizer
Part of Speech Tagger
Coreference Resolution
Sentence Splitting
Named Entity Recognition
Intent Classification
Word Embeddings
Question Answering
Best Next Action
Normalizer
Fact Extraction
Spell Checker
Dependency Parser
Negation Detection
Sentiment Analysis
Summarization
Emotion Detection
Relevance Ranking
Translation
NAMED ENTITY RECOGNITION
“DEEP LEARNED” NER
“Entity Recognition from Clinical Texts via Recurrent Neural Network”.
Liu et al., BMC Medical Informatics & Decision Making, July 2017.
F-Score   Dataset     Task
85.81%    2010 i2b2   Medical concept extraction
92.29%    2012 i2b2   Clinical event detection
94.37%    2014 i2b2   De-identification
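For readers unfamiliar with the metric, an F-score for entity recognition is usually computed over predicted versus gold entity spans. The sketch below uses exact span-and-label matching, which is an assumption; the cited tasks may score matches differently. The annotations are hypothetical.

```python
def prf1(gold, predicted):
    """Exact-match precision, recall and F1 over (start, end, label) spans."""
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical annotations: (token_start, token_end, label).
gold = [(0, 2, "PROBLEM"), (5, 6, "TREATMENT"), (9, 11, "TEST"), (14, 15, "PROBLEM")]
pred = [(0, 2, "PROBLEM"), (5, 6, "TREATMENT"), (9, 10, "TEST")]  # last span has a wrong boundary

precision, recall, f1 = prf1(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Under exact matching, a span with the right label but a wrong boundary counts as both a false positive and a false negative.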
ENTITY RESOLUTION
“DEEP LEARNED” ENTITY RESOLUTION
“CNN-based ranking for biomedical entity normalization”.
Li et al., BMC Bioinformatics, October 2017.
F-Score   Dataset      Task
90.30%    ShARe/CLEF   Disease & problem normalization
92.29%    NCBI         Disease normalization in literature
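Entity resolution maps a mention in text to a concept in a terminology, typically by generating candidates and ranking them. The sketch below ranks candidates by plain string similarity from Python's standard library. This is a deliberately simple stand-in, not the CNN-based ranking of the cited paper, and the concept IDs and names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical mini-terminology: concept ID -> known names.
TERMINOLOGY = {
    "C001": ["myocardial infarction", "heart attack"],
    "C002": ["hypertension", "high blood pressure"],
    "C003": ["diabetes mellitus", "diabetes"],
}


def rank_candidates(mention, terminology=TERMINOLOGY):
    """Return (concept_id, score) pairs, best string similarity first."""
    mention = mention.lower().strip()
    scored = []
    for concept_id, names in terminology.items():
        best = max(SequenceMatcher(None, mention, name).ratio() for name in names)
        scored.append((concept_id, best))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(rank_candidates("high blood presure")[0])
print(rank_candidates("Heart attack")[0])
```

String similarity handles misspellings but not synonyms it has never seen, which is one motivation for learned ranking models.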
ASSERTION STATUS DETECTION
Sentence                                               Assertion status
Prescribing sick days due to diagnosis of influenza.   Positive
Jane complains about flu-like symptoms.                Speculative
Jane’s RIDT came back clean.                           Negative
Jane is at risk for flu if she’s not vaccinated.       Conditional
Jane’s older brother had the flu last month.           Family history
Jane had a severe case of flu last year.               Patient history
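To show what the task involves, here is a toy rule-based classifier that assigns the six labels above using cue phrases checked in priority order. The cue lists and their ordering are illustrative assumptions tuned to these example sentences. This is not the learned approach discussed in this section and would not generalize to real clinical notes.

```python
import re

# (label, cue pattern) pairs, checked in order; the first match wins.
# The cues are hypothetical and chosen only to cover the examples.
RULES = [
    ("Family history", r"\b(brother|sister|mother|father|family)\b"),
    ("Conditional", r"\b(at risk|if|unless)\b"),
    ("Patient history", r"\b(last (year|month|week)|history of|previously)\b"),
    ("Negative", r"\b(no|not|denies|negative|came back clean|without)\b"),
    ("Speculative", r"(-like\b|\b(possible|suspected|may have|complains)\b)"),
]


def assertion_status(sentence):
    """Return the first matching label, defaulting to 'Positive'."""
    text = sentence.lower()
    for label, pattern in RULES:
        if re.search(pattern, text):
            return label
    return "Positive"


examples = [
    "Prescribing sick days due to diagnosis of influenza.",
    "Jane complains about flu-like symptoms.",
    "Jane’s RIDT came back clean.",
    "Jane is at risk for flu if she’s not vaccinated.",
    "Jane’s older brother had the flu last month.",
    "Jane had a severe case of flu last year.",
]
for sentence in examples:
    print(f"{assertion_status(sentence):16} {sentence}")
```

The ordering matters: the conditional sentence also contains "not", so checking the negative cues first would mislabel it.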
“DEEP LEARNED” ASSERTION STATUS DETECTION
“Improving Classification of Medical Assertions in Clinical Notes”
Kim et al., In Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics: Human Language Technologies, 2011.
Score     Dataset       Metric
94.17%    4th i2b2/VA   Micro-averaged F1
79.76%    4th i2b2/VA   Macro-averaged F1
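The gap between the two scores is typical when classes are imbalanced: micro-averaging pools all decisions, so frequent classes dominate, while macro-averaging weights every class equally. The sketch below shows the difference with hypothetical per-class counts; the counts are not taken from the cited paper.

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts):
    """counts maps class name -> (tp, fp, fn); returns (micro, macro)."""
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn), macro


# Hypothetical counts: one frequent, easy class and two rare, hard ones.
counts = {
    "present": (900, 30, 30),
    "conditional": (10, 10, 10),
    "hypothetical": (20, 10, 10),
}

micro, macro = micro_macro_f1(counts)
print(f"micro-F1={micro:.3f}  macro-F1={macro:.3f}")
```

A high micro-averaged score can therefore hide weak performance on rare assertion types.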
WORD EMBEDDINGS
BIOMEDICAL WORD EMBEDDINGS
“How to Train Good Word Embeddings for Biomedical NLP”.
Chiu et al., In Proceedings of the 15th Workshop on Biomedical Natural Language Processing
(BioNLP), 2016.
• Intrinsic measures (similarity between concepts)
• Extrinsic measures (improving NER)
• Hyperparameter optimization
– Sub-sampling
– Minimum-count
– Learning rate
– Vector dimension
– Context window size
• Available under an open license here
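Two of the hyperparameters listed above, context window size and minimum-count, directly change the training pairs an embedding model sees. The sketch below generates skip-gram style (target, context) pairs in plain Python. It illustrates the general idea only and is not the training code used in the cited work; the two-sentence corpus is made up.

```python
from collections import Counter


def skipgram_pairs(sentences, window=2, min_count=1):
    """Build (target, context) pairs after dropping words rarer than min_count."""
    freq = Counter(w for sent in sentences for w in sent)
    pairs = []
    for sent in sentences:
        kept = [w for w in sent if freq[w] >= min_count]
        for i, target in enumerate(kept):
            lo, hi = max(0, i - window), min(len(kept), i + window + 1)
            pairs.extend((target, kept[j]) for j in range(lo, hi) if j != i)
    return pairs


# Hypothetical toy corpus.
corpus = [
    "patient denies chest pain".split(),
    "patient reports abdominal pain".split(),
]

print(len(skipgram_pairs(corpus, window=1)))
print(len(skipgram_pairs(corpus, window=2)))
print(skipgram_pairs(corpus, window=1, min_count=2))
```

A wider window adds more distant context words, while a higher minimum-count removes rare words entirely, which in clinical text can include exactly the misspellings and abbreviations that matter.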
NLP FOR APACHE SPARK
Data Sources API
Spark Core API (RDDs, Project Tungsten)
Spark SQL API (DataFrame, Catalyst Optimizer)
Spark ML API (Pipeline, Transformer, Estimator)
Part of Speech Tagger
Named Entity Recognition
Sentiment Analysis
Spell Checker
Tokenizer
Stemmer
Lemmatizer
Entity Extraction
Topic Modeling
Word2Vec
TF-IDF
String distance calculation
N-grams calculation
Stop word removal
Train/Test & Cross-Validate
Ensembles
High Performance Natural Language Understanding at Scale
Design Goals
• State of the art Performance & Scale
• Frictionless Reuse
• Enterprise Grade
Built on the Spark ML APIs
Apache 2.0 Licensed
Active development & support
NLP FOR APACHE SPARK: COMBINED NLP & ML PIPELINES
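import pyspark.ml

# The stage objects (document_assembler ... lda) and the DataFrame df
# are assumed to be defined beforehand.
pipeline = pyspark.ml.Pipeline(stages=[
    # Spark NLP annotators
    document_assembler,
    tokenizer,
    stemmer,
    normalizer,
    # Spark ML featurizers
    stopword_remover,
    tf,
    idf,
    # Spark ML LDA implementation
    lda])
# Single execution plan for the whole pipeline
topic_model = pipeline.fit(df)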
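The chaining idea behind the pipeline above can be illustrated without Spark. The sketch below is plain Python and does not use the Spark ML or Spark NLP APIs. It strings together a tokenizer, normalizer, stop-word remover and TF-IDF weighting, with a "fit" step that learns IDF weights and a "transform" step that applies them. The stop-word list, documents and function names are all hypothetical.

```python
import math
import re
from collections import Counter

STOPWORDS = {"the", "of", "to", "and", "with", "no"}  # tiny illustrative list


def tokenize(doc):
    return re.findall(r"[A-Za-z]+", doc)


def normalize(tokens):
    return [t.lower() for t in tokens]


def remove_stopwords(tokens):
    return [t for t in tokens if t not in STOPWORDS]


def fit_idf(token_lists):
    """Estimator-like step: learn smoothed IDF weights from the corpus."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log((n + 1) / (d + 1)) for t, d in df.items()}


def tfidf(tokens, idf):
    """Transformer-like step: weight term counts by the learned IDF."""
    return {t: c * idf.get(t, 0.0) for t, c in Counter(tokens).items()}


def run_pipeline(docs):
    token_lists = [remove_stopwords(normalize(tokenize(d))) for d in docs]
    idf = fit_idf(token_lists)
    return [tfidf(tokens, idf) for tokens in token_lists]


docs = [
    "Nausea, no vomiting",
    "Nausea with abdominal pain",
    "Abdominal pain radiating to the back",
]
vectors = run_pipeline(docs)
print(vectors[0])
```

In the first document, "vomiting" receives a higher weight than "nausea" because it appears in fewer documents. The removal of "no" as a stop word also shows how a generic step can discard clinically meaningful negation.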
PERFORMANCE BENCHMARKS
• Training was 80x faster on 2.6MB
• Training was 38x faster on 100k
• Training on 100k & 2.6MB took roughly the same time
• Additional near-linear speedup on a cluster
• Prediction was 1.6x faster on 75MB
• Prediction was 1.4x faster on 15MB
• Adding NLP stages takes roughly the same time
• Additional near-linear speedup on a cluster
HEALTHCARE EXTENSIONS
com.johnsnowlabs.nlp.clinical.*
Healthcare-specific NLP annotators for Spark in Scala, Java or Python:
• Entity Recognition
• Value Extraction
• Word Embeddings
• Assertion Status
• Sentiment Analysis
• Spell Checking, …
data.johnsnowlabs.com/health
1,800+ expert-curated, clean, linked, enriched & always up-to-date data:
• Terminology
• Providers
• Demographics
• Clinical Guidelines
• Genes
• Measures, …
CASE STUDY: DEMAND FORECASTING OF ADMISSIONS FROM ED
Features from Structured Data
• How many patients will be admitted today?
• Data Source: EPIC Clarity data
Reason for visit
Age
Gender
Vital signs
Current wait time
Number of orders
Admit in past 30 days
Type of insurance
CASE STUDY: DEMAND FORECASTING OF ADMISSIONS FROM ED
Features from Natural Language Text
• A majority of the rich relevant content lies in unstructured notes that
are contributed by doctors and nurses from patient interactions.
• Data Source: Emergency Department Triage notes and other ED notes
Type of Pain
Intensity of Pain
Body part or region
Symptoms
Onset of symptoms
Attempted home remedy
[Chart: Accuracy of the baseline (human manual prediction) vs. ML with structured data vs. ML with NLP]
USING SPARK NLP
• Homepage: https://nlp.johnsnowlabs.com
– Getting Started, Documentation, Examples, Videos, Blogs
– Join the Slack Community
• GitHub: https://github.com/johnsnowlabs/spark-nlp
– Open Issues & Feature Requests
– Contribute!
• The library has Scala and Python 2 & 3 APIs
• Get directly from maven-central or spark-packages
• Tested on all Spark 2.x versions
david@pacific.ai
@davidtalby
in/davidtalby
THANK YOU!

Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 

Natural Language Understanding in Healthcare

  • 1. Dr. David Talby NATURAL LANGUAGE UNDERSTANDING IN HEALTHCARE: STATE OF THE ART NLP, MACHINE LEARNING AND DEEP LEARNING WITH OPEN SOURCE SOFTWARE
  • 2. CONTENTS 1. THE IMPORTANCE OF NLP IN HEALTHCARE AI 2. NLP IS ULTRA DOMAIN SPECIFIC 3. STATE OF THE ART RESEARCH RESULTS 4. SPARK NLP: PRODUCTION GRADE & AT SCALE
  • 3. AI VS. DOCTORS Deep Learning Computer Vision Access to Care Diagnostic Accuracy
  • 4. NLP IN HEALTHCARE Deep Learning NLP Efficiency Accuracy Radiology Diagnostic Mental Health Safety Events Inpatient Pre- Auth Key Opinion Leaders Research Meta Analysis Clinical Coding Financial Anti- Fraud Adverse Events Drug Development Recruit for Trials
  • 5. RISK PREDICTION CASE STUDY: DETECTING SEPSIS “Compared to previous work that only used structured data such as vital signs and demographic information, utilizing free text drastically improves the discriminatory ability (increase in AUC from 0.67 to 0.86) of identifying infection.”
  • 6. COHORT SELECTION CASE STUDY: ONCOLOGY “Using the combination of structured and unstructured data, 8324 patients were identified as having advanced NSCLC. Of these patients, only 2472 were also in the cohort generated using structured data only. Further, 1090 patients would be included in the structured data only cohort who should have been excluded based on additional data.”
  • 7. CONTENTS 1. THE IMPORTANCE OF NLP IN HEALTHCARE AI 2. NLP IS ULTRA DOMAIN SPECIFIC 3. STATE OF THE ART RESEARCH RESULTS 4. SPARK NLP: PRODUCTION GRADE & AT SCALE
  • 8. WHAT MAKES LANGUAGE HARD • Nuanced – Sure / I agree / Absolutely! / Whatever / Yes sir / Just to see you smile ❤️ • Fuzzy – Blue, New, Tall, Child, Tell, Do • Contextual – “Patient denies alcohol abuse” • Medium specific – “SGTM c u in 15” • Domain specific – All forward-looking statements included in this document are based on information available to us on the date hereof, and we assume no obligation to revise or publicly release any revision to any such forward-looking statement, except as may otherwise be required by law.
  • 9. THE FIRST RULE OF NLP: We all speak many languages.
  • 10. EMERGENCY ROOM LANGUAGE ED Triage Notes: “states started last night, upper abd, took alka seltzer approx 0500, no relief. nausea no vomiting Since yeatreday 10/10 "constant Tylenol 1 hr ago. +nausea. diaphoretic. Mid abd radiates to back Generalized abd radiating to lower x 3 days accompanied by dark stools. Now with bloody stool this am. Denies dizzy, sob, fatigue. Visiting from Japan on business.” Features: Type of Pain, Intensity of Pain, Body part or region, Symptoms, Onset of symptoms, Attempted home remedy
  • 11. Different Vocabulary, Different Grammar, Different Context, Different Meaning: Different Language Models. Components: Tokenizer, Normalizer, Lemmatizer, Fact Extraction, Part of Speech Tagger, Spell Checker, Coreference Resolution, Dependency Parser, Sentence Splitting, Negation Detection, Named Entity Recognition, Sentiment Analysis, Intent Classification, Summarization, Word Embeddings, Emotion Detection, Question Answering, Relevance Ranking, Best Next Action, Translation
  • 12. CONTENTS 1. THE IMPORTANCE OF NLP IN HEALTHCARE AI 2. NLP IS ULTRA DOMAIN SPECIFIC 3. STATE OF THE ART RESEARCH RESULTS 4. SPARK NLP: PRODUCTION GRADE & AT SCALE
  • 14. “DEEP LEARNED” NER “Entity Recognition from Clinical Texts via Recurrent Neural Network”. Liu et al., BMC Medical Informatics & Decision Making, July 2017. F-score by dataset and task: 85.81% (2010 i2b2, medical concept extraction); 92.29% (2012 i2b2, clinical event detection); 94.37% (2014 i2b2, de-identification)
  • 16. “DEEP LEARNED” ENTITY RESOLUTION “CNN-based ranking for biomedical entity normalization”. Li et al., BMC Bioinformatics, October 2017. F-score by dataset and task: 90.30% (ShARe / CLEF, disease & problem normalization); 92.29% (NCBI, disease normalization in literature)
  • 17. ASSERTION STATUS DETECTION Prescribing sick days due to diagnosis of influenza. [Positive] Jane complains about flu-like symptoms. [Speculative] Jane’s RIDT came back clean. [Negative] Jane is at risk for flu if she’s not vaccinated. [Conditional] Jane’s older brother had the flu last month. [Family history] Jane had a severe case of flu last year. [Patient history]
  • 18. “DEEP LEARNED” ASSERTION STATUS DETECTION “Improving Classification of Medical Assertions in Clinical Notes”. Kim et al., In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011. Results on the 4th i2b2/VA dataset: 94.17% micro-averaged F1; 79.76% macro-averaged F1
  • 20. BIOMEDICAL WORD EMBEDDINGS “How to Train Good Word Embeddings for Biomedical NLP”. Chiu et al., In Proceedings of the 15th Workshop on Biomedical Natural Language Processing (BioNLP), 2016. • Intrinsic measures (similarity between concepts) • Extrinsic measures (improving NER) • Hyperparameter optimization: sub-sampling, minimum-count, learning rate, vector dimension, context window size • Available under an open license
  • 21. CONTENTS 1. THE IMPORTANCE OF NLP IN HEALTHCARE AI 2. NLP IS ULTRA DOMAIN SPECIFIC 3. STATE OF THE ART RESEARCH RESULTS 4. SPARK NLP: PRODUCTION GRADE & AT SCALE
  • 22. NLP FOR APACHE SPARK: High Performance Natural Language Understanding at Scale. Stack: Data Sources API; Spark Core API (RDDs, Project Tungsten); Spark SQL API (DataFrame, Catalyst Optimizer); Spark ML API (Pipeline, Transformer, Estimator). Components: Part of Speech Tagger, Named Entity Recognition, Sentiment Analysis, Spell Checker, Tokenizer, Stemmer, Lemmatizer, Entity Extraction, Topic Modeling, Word2Vec, TF-IDF, String distance calculation, N-grams calculation, Stop word removal, Train/Test & Cross-Validate, Ensembles. Design Goals: • State of the art Performance & Scale • Frictionless Reuse • Enterprise Grade. Built on the Spark ML APIs; Apache 2.0 Licensed; Active development & support
  • 23. NLP FOR APACHE SPARK: COMBINED NLP & ML PIPELINES pipeline = pyspark.ml.Pipeline(stages=[document_assembler, tokenizer, stemmer, normalizer, stopword_remover, tf, idf, lda]); topic_model = pipeline.fit(df). Stage types: Spark NLP annotators, Spark ML featurizers, Spark ML LDA implementation. Single execution plan for whole pipeline
  • 24. PERFORMANCE BENCHMARKS Training: • 80x faster on 2.6MB • 38x faster on 100k • Training on 100k & 2.6MB took roughly the same time • Additional near-linear speedup on a cluster. Prediction: • 1.6x faster on 75MB • 1.4x faster on 15MB • Adding NLP stages takes roughly the same time • Additional near-linear speedup on a cluster
  • 25. HEALTHCARE EXTENSIONS: High Performance Natural Language Understanding at Scale. Stack: Data Sources API; Spark Core API (RDDs, Project Tungsten); Spark SQL API (DataFrame, Catalyst Optimizer); Spark ML API (Pipeline, Transformer, Estimator). Components: Part of Speech Tagger, Named Entity Recognition, Sentiment Analysis, Spell Checker, Tokenizer, Stemmer, Lemmatizer, Entity Extraction, Topic Modeling, Word2Vec, TF-IDF, String distance calculation, N-grams calculation, Stop word removal, Train/Test & Cross-Validate, Ensembles. com.johnsnowlabs.nlp.clinical.*: Healthcare specific NLP annotators for Spark in Scala, Java or Python: • Entity Recognition • Value Extraction • Word Embeddings • Assertion Status • Sentiment Analysis • Spell Checking, … data.johnsnowlabs.com/health: 1,800+ Expert curated, clean, linked, enriched & always up to date data: • Terminology • Providers • Demographics • Clinical Guidelines • Genes • Measures, …
  • 26. CASE STUDY: DEMAND FORECASTING OF ADMISSIONS FROM ED Features from Structured Data • How many patients will be admitted today? • Data Source: EPIC Clarity data • Features: Reason for visit, Age, Gender, Vital signs, Current wait time, Number of orders, Admit in past 30 days, Type of insurance
  • 27. CASE STUDY: DEMAND FORECASTING OF ADMISSIONS FROM ED Features from Natural Language Text • Most of the rich, relevant content lies in unstructured notes written by doctors and nurses during patient interactions. • Data Source: Emergency Department Triage notes and other ED notes • Features: Type of Pain, Intensity of Pain, Body part or region, Symptoms, Onset of symptoms, Attempted home remedy • Accuracy chart compares: baseline (human manual prediction), ML with structured data, ML with NLP
  • 28. USING SPARK NLP • Homepage: https://nlp.johnsnowlabs.com – Getting Started, Documentation, Examples, Videos, Blogs – Join the Slack Community • GitHub: https://github.com/johnsnowlabs/spark-nlp – Open Issues & Feature Requests – Contribute! • The library has Scala and Python 2 & 3 APIs • Get it directly from maven-central or spark-packages • Tested on all Spark 2.x versions
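
Slide 18 reports two different F1 numbers for the same assertion-status model: a micro-averaged score (94.17%) and a much lower macro-averaged score (79.76%). The sketch below shows how such a gap can arise when labels are imbalanced. It is a minimal illustration in plain Python, not the evaluation code used in the cited paper; the label names and the toy gold/predicted lists are assumptions made up for the example.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred):
    """Return (micro_f1, macro_f1) for single-label classification."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label gets a false positive
            fn[g] += 1  # gold label gets a false negative
    # Micro: pool the counts over all labels, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# Hypothetical toy data: one frequent label and two rare ones.
gold = ["present"] * 8 + ["absent", "conditional"]
pred = ["present"] * 8 + ["absent", "present"]  # the rare "conditional" case is missed

micro, macro = micro_macro_f1(gold, pred)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

Because the macro average weights every label equally, a single missed example of a rare label pulls it down sharply, while the micro average is dominated by the frequent label. That is one plausible reading of the gap on slide 18, where rare assertion classes are harder to get right.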

Editor's Notes

  1. There is not one “language” – every vertical and communication channel has its own jargon that includes vocabulary, grammar, assumptions and semantics. For example – in these ED triage notes, none of the sentences is in valid English, and the words “patient” and “pain” do not appear.
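
To make the point about ED triage jargon concrete, here is a small rule-based sketch that pulls a few of the features listed on slide 10 (pain intensity, body part, symptoms with negation, duration) out of the slide's own note fragments. It is a toy illustration in plain Python, not the Spark NLP API or the method used in the case study. The lexicons, the negation triggers, the function name `extract_features`, and the way the slide text is split into three notes are all assumptions made for the example.

```python
import re

# Hypothetical mini-lexicons; a real system would use curated clinical vocabularies.
SYMPTOMS = {"nausea", "vomiting", "dizzy", "sob", "fatigue", "diaphoretic"}
BODY_PARTS = {"abd": "abdomen", "back": "back"}
NEGATION_TRIGGERS = {"no", "denies"}


def extract_features(note):
    """Extract a few triage features with simple rules.

    A negation trigger ("no", "denies") negates every symptom that follows
    it until the next period, a crude version of scope-based negation.
    """
    text = note.lower()
    tokens = re.findall(r"[a-z]+|\.", text)
    present, negated, body_parts = [], [], []
    in_negation = False
    for tok in tokens:
        if tok == ".":
            in_negation = False
        elif tok in NEGATION_TRIGGERS:
            in_negation = True
        elif tok in SYMPTOMS:
            (negated if in_negation else present).append(tok)
        elif tok in BODY_PARTS and BODY_PARTS[tok] not in body_parts:
            body_parts.append(BODY_PARTS[tok])
    intensity = re.search(r"\b(\d{1,2})/10\b", text)            # e.g. "10/10"
    duration = re.search(r"\bx\s*(\d+)\s*(days?|hrs?|wks?)\b", text)  # e.g. "x 3 days"
    return {
        "intensity": int(intensity.group(1)) if intensity else None,
        "duration": f"{duration.group(1)} {duration.group(2)}" if duration else None,
        "body_parts": body_parts,
        "symptoms_present": present,
        "symptoms_negated": negated,
    }


# Fragments copied from slide 10; the split into three notes is an assumption.
NOTES = [
    "states started last night, upper abd, took alka seltzer approx 0500, "
    "no relief. nausea no vomiting",
    'Since yeatreday 10/10 "constant Tylenol 1 hr ago. +nausea. diaphoretic. '
    "Mid abd radiates to back",
    "Generalized abd radiating to lower x 3 days accompanied by dark stools. "
    "Now with bloody stool this am. Denies dizzy, sob, fatigue.",
]

results = [extract_features(n) for n in NOTES]
for r in results:
    print(r)
```

Even this toy version has to know that "abd" means abdomen, that "sob" is shortness of breath, and that "denies" flips the meaning of everything after it, none of which a general-English model would pick up. Rules like these break quickly on unseen phrasing, which is the motivation for the domain-specific trained models the deck describes.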