3. Real World Problem
• What action to take?
• Where to do it?
• Who should do it?
• How quickly does it need to be done?
• Why was it done?
• Decisions!
paramedicfoundation.org
twitter.com/paramedicfound
facebook.com/ParamedicFoundation
4. Data Sources
• Caller phone number: call routing information, mobile/fixed,
single/multiple user (like an IP address), GPS/tower, eCall/Automatic
Crash Notification
• Resources/system status: which people, vehicles, equipment, etc. are available
• Environment: Weather, crowding & traffic (granular to the device),
street corner/high rise/wilderness, ferry/train/plane schedules
• Records: call center, paramedic, hospital, police, fire, public health
• Social media: Twitter, Facebook, Instagram, etc.
5. Existing research
• 50 years of Operations Research / Management
• 25 years of decision tool/tree validation
• 10 years of clinical registry prediction tool validation
• 15 years of decision support in emergency calling “appropriateness”
• 6 months of deep data mining exploratory work
6. Why is it so complex?
• Chinese city with 9 million residents
• 2.5 calls per resident over 5 years (0.5/person/year)
• Repeat callers average 2.09 calls per year
• USA with 320 million residents
• 240 million 911 calls per year (0.75/person/year)
• 41,000 calls per Public Safety Answering Point
• $4.51 per call, just to maintain the ICT & dispatching system
• 10,000+ ICD10 diagnosis codes
• 19,000 EMS services across 50 states & 6 territories
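The US figures above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and the implied number of answering points assumes that 41,000 is an annual average per Public Safety Answering Point, which the slide does not state.

```python
# Figures quoted on the slide (USA).
residents = 320_000_000
calls_per_year = 240_000_000
calls_per_psap = 41_000   # assumed to be an annual average per answering point
cost_per_call = 4.51      # USD, ICT & dispatching system only

calls_per_person = calls_per_year / residents     # matches the 0.75 on the slide
implied_psaps = calls_per_year / calls_per_psap   # derived, not stated on the slide
system_cost = calls_per_year * cost_per_call      # derived yearly system cost in USD

print(f"{calls_per_person:.2f} calls per person per year")
print(f"about {implied_psaps:,.0f} answering points implied")
print(f"about ${system_cost / 1e9:.2f} billion per year")
```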
7. Categorization
• Started in 1978…
• 36 Families of problem types
• Level of Urgency: Hot or Not
• Omega, Alpha, Bravo, Charlie, Delta, Echo
• Nuanced descriptors help determine what
kind of first-aid instructions are to be given
9. Decision Tree – Manual Deductive Reasoning
• Dispatching priority relies on standardized keywords compared to
a known list of static scenarios
• IF
  • Shooting THEN
    • Urgently send police, apply tourniquet, stop bleeding
  • Not breathing/pulseless THEN
    • Start CPR, urgently send paramedics
  • Cardiac history THEN
    • Urgently send paramedics, take aspirin, stay calm
• In computer science terms this is closer to rule-based classification than
to clustering, since the scenarios are predefined
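A minimal sketch of the keyword matching described on this slide. The rule table and the function name triage are hypothetical and only mirror the three examples above; real dispatch protocols are far more detailed.

```python
# Hypothetical rule table: keyword -> responder and caller instructions.
RULES = [
    ("shooting", "police", ["apply tourniquet", "stop bleeding"]),
    ("not breathing", "paramedics", ["start CPR"]),
    ("pulseless", "paramedics", ["start CPR"]),
    ("cardiac history", "paramedics", ["take aspirin", "stay calm"]),
]

def triage(call_text):
    """Return (responder, instructions) for the first matching keyword, else None."""
    text = call_text.lower()
    for keyword, responder, instructions in RULES:
        if keyword in text:
            return responder, instructions
    return None  # no static scenario matched

print(triage("My father is not breathing"))
```

Anything the rule table does not anticipate falls through to None, which is the limitation of a static scenario list.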
10. Questions / Prioritization / Instructions
• Priorities are designed to over-triage on purpose, as a risk management
tool, rather than to increase specificity
• Lots of vehicles / fewer vehicles
• Lights & Sirens / no L&S
• Queuing theory using probabilistic expected delays for paramedics,
police, or fire department responders
• Targeting the longest acceptable delay, because time = money
• Knowledge discovery opportunities are overlooked!
• Crowdsource trained people for faster response
• Electronic medical records describe historical risk
• Caller behavior, word choice, history, location, etc. are untapped indicators
11. Queuing Theory – Planning to Disappoint
• Operations Research, Management Science, & Computer Science
disciplines rely on probabilistic calculations
• A model is constructed so that queue lengths and waiting time
can be predicted
• Interarrival times & service times are assumed to be independent random variables
• A queue discipline selects the next task to perform
• The most commonly used disciplines are:
• FIFO - First In First Out: who comes earlier leaves earlier
• LIFO - Last In First Out: who comes later leaves earlier
• RS - Random Service: the customer is selected randomly
• Priority: the most urgent customer is served first
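The four disciplines listed above can be sketched in a few lines. The function serve_order and the convention that a lower priority number means a more urgent call are assumptions made for this illustration.

```python
import heapq
import random
from collections import deque

def serve_order(calls, discipline, seed=0):
    """Return call ids in the order they would be served.

    calls: list of (call_id, priority) in arrival order; a lower priority
    number means more urgent (an assumption of this sketch).
    """
    if discipline == "FIFO":
        queue = deque(calls)
        return [queue.popleft()[0] for _ in range(len(calls))]
    if discipline == "LIFO":
        stack = list(calls)
        return [stack.pop()[0] for _ in range(len(calls))]
    if discipline == "RS":
        pool = list(calls)
        random.Random(seed).shuffle(pool)  # seeded so runs are repeatable
        return [call_id for call_id, _ in pool]
    if discipline == "PRIORITY":
        # The arrival index breaks ties, so equal priorities stay first in, first out.
        heap = [(prio, i, call_id) for i, (call_id, prio) in enumerate(calls)]
        heapq.heapify(heap)
        return [heapq.heappop(heap)[2] for _ in range(len(heap))]
    raise ValueError(f"unknown discipline: {discipline}")

calls = [("a", 2), ("b", 1), ("c", 2)]
for name in ("FIFO", "LIFO", "RS", "PRIORITY"):
    print(name, serve_order(calls, name))
```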
12. Erlang Call Center Algorithm
Source: http://www.erlang.com/calculator/call/
Estimate how many agents you
need in your call center for
each hour during an eight hour
day…
How many taxis for a particular
time of day?
How many hospital beds? Fire
trucks? Paramedics? Police?
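Staffing questions like these are commonly answered with the standard Erlang C formula. The sketch below assumes that formula; it is not taken from the calculator linked above, and the function names and parameters are illustrative.

```python
import math

def erlang_c_wait_probability(traffic, agents):
    """Probability that an arriving call has to wait (Erlang C).

    traffic: offered load in erlangs; agents: number of servers.
    """
    if agents <= traffic:
        return 1.0  # unstable queue: every call waits
    top = traffic ** agents / math.factorial(agents) * agents / (agents - traffic)
    bottom = sum(traffic ** k / math.factorial(k) for k in range(agents)) + top
    return top / bottom

def agents_needed(calls_per_hour, handle_time_s, target_level, answer_within_s):
    """Smallest number of agents that meets the target service level."""
    traffic = calls_per_hour * handle_time_s / 3600.0  # offered load in erlangs
    agents = max(1, math.ceil(traffic))
    while True:
        p_wait = erlang_c_wait_probability(traffic, agents)
        # Share of calls answered within the threshold.
        level = 1.0 - p_wait * math.exp(
            -(agents - traffic) * answer_within_s / handle_time_s
        )
        if agents > traffic and level >= target_level:
            return agents
        agents += 1

# Example: 100 calls per hour, 3 minute handling time, 80% answered within 20 s.
print(agents_needed(100, 180, 0.80, 20))
```

The same calculation applies to any pool of interchangeable servers, which is why it transfers from call center agents to taxis, beds or ambulances, provided arrivals and service times behave roughly as the model assumes.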
13. Natural Language Processing
• Machine learning to determine semantic meaning
• Based on ontologies and probabilistic decisions
• “Understanding” of words, meanings, intents
• Better suited for structured, grouped or otherwise trained text such as
physician narratives or same language categorization
• Excels at spelling, grammar, and Named Entity Recognition, which are relatively
structured tasks
• Well suited for classifying/parsing simple or common statements
• Generally “trained” by humans (expensive)
• Techniques for handling unstructured data: stemming, bag of words, TF-IDF, topic modeling
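A minimal sketch of bag of words and TF-IDF weighting, two of the techniques named above. It uses whitespace tokenization and one common TF-IDF variant (term frequency times log of inverse document frequency); the example call notes are invented.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (no stemming in this sketch)."""
    return text.lower().split()

def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    bags = [Counter(tokenize(doc)) for doc in documents]  # bag of words per document
    n_docs = len(bags)
    # Number of documents each term appears in.
    doc_freq = Counter(term for bag in bags for term in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bag.items()
        })
    return weights

notes = ["chest pain", "chest injury bleeding", "not breathing"]
for note, weight in zip(notes, tf_idf(notes)):
    print(note, weight)
```

Terms that appear in fewer documents receive higher weights, so "pain" outweighs "chest" in the first note.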
14. Machine Learning - Inductive
• Learns from the information itself
• Classifier accuracy can be similar to that of human experts
• Common Algorithm Types
• K-nearest neighbors (KNN)
• Linear regression
• Logistic regression
• Naive Bayes
• Decision trees, bagged trees, boosted trees, boosted stumps
• Random Forests
• AdaBoost
• Neural networks
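As one concrete instance from the list above, K-nearest neighbors can be sketched in a few lines of standard-library Python. The toy data and labels are invented for illustration; real features would usually need scaling first.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k closest training points.

    train: list of (feature_vector, label); query: feature_vector.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature examples with an urgency label.
train = [
    ((1, 1), "low"), ((1, 2), "low"), ((2, 1), "low"),
    ((8, 8), "high"), ((9, 8), "high"), ((8, 9), "high"),
]
print(knn_predict(train, (2, 2)))
print(knn_predict(train, (7, 9)))
```

Nothing is fitted in advance: the model is the training data itself, which is why prediction speed depends on the number of stored examples.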
15. Comparing Supervised Learning Algorithms
| Algorithm | Problem type | Results interpretable by you? | Easy to explain algorithm to others? | Average predictive accuracy | Training speed | Prediction speed | Amount of parameter tuning needed (excluding feature selection) | Performs well with small number of observations? | Handles lots of irrelevant features well (separates signal from noise)? | Automatically learns feature interactions? | Gives calibrated probabilities of class membership? | Parametric? | Features might need scaling? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KNN | Either | Yes | Yes | Lower | Fast | Depends on n | Minimal | No | No | No | Yes | No | Yes |
| Linear regression | Regression | Yes | Yes | Lower | Fast | Fast | None (excluding regularization) | Yes | No | No | N/A | Yes | No (unless regularized) |
| Logistic regression | Classification | Somewhat | Somewhat | Lower | Fast | Fast | None (excluding regularization) | Yes | No | No | Yes | Yes | No (unless regularized) |
| Naive Bayes | Classification | Somewhat | Somewhat | Lower | Fast (excluding feature extraction) | Fast | Some for feature extraction | Yes | Yes | No | No | Yes | No |
| Decision trees | Either | Somewhat | Somewhat | Lower | Fast | Fast | Some | No | No | Yes | Possibly | No | No |
| Random Forests | Either | A little | No | Higher | Slow | Moderate | Some | No | Yes (unless noise ratio is very high) | Yes | Possibly | No | No |
| AdaBoost | Either | A little | No | Higher | Slow | Fast | Some | No | Yes | Yes | Possibly | No | No |
| Neural networks | Either | No | No | Higher | Slow | Fast | Lots | No | Yes | Yes | Possibly | No | Yes |
Source: https://docs.google.com/spreadsheets/d/16i47Wmjpj8k-mFRk-NnXXU5tmSQz8h37YxluDV8Zy9U/edit#gid=0
16. Support Vector Machine (SVM)
Source: Nadkarni, P. M., Ohno-Machado, L., & Chapman, W. W. (2011). Natural language processing: an introduction.
Journal of the American Medical Informatics Association, 18(5), 544–551. http://doi.org/10.1136/amiajnl-2011-000464
17. Algorithm Quality
• Accuracy is often very similar between algorithms
• Algorithms tend to use similar attributes for scoring
• Results may vary between categorical and continuous data
• The primary difference is in efficiency
• Big-O notation is a relative representation of the complexity of an algorithm
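To make the Big-O point concrete, the sketch below counts the steps two search strategies need on the same sorted list. It is a generic illustration of O(n) versus O(log n) growth, not a comparison of the learning algorithms discussed in this deck.

```python
def linear_search_steps(sorted_items, target):
    """Count the items inspected by a left-to-right scan: O(n)."""
    steps = 0
    for item in sorted_items:
        steps += 1
        if item == target:
            break
    return steps

def binary_search_steps(sorted_items, target):
    """Count the items inspected by repeated halving: O(log n)."""
    steps = 0
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

items = list(range(1024))
print(linear_search_steps(items, 1023), binary_search_steps(items, 1023))
```

Both functions find the same answer; only the amount of work differs, and the gap widens as the list grows.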
18. Random Forest
• Advantages
• It has been widely shown that random forests
are one of the most accurate existing
classification methods
• It can deal with a huge number of features
• It runs efficiently on large datasets
• It can help estimate which variables are
important in classification
• It can be extended to an unsupervised version
to work with unlabeled data.
• It is relatively robust to noise
• Disadvantages
• They can overfit on some noisy datasets
• Not as intuitive as some other classification
methods
• Might take a while to build the forest (but once
it's built classification is very fast)
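The core idea (bootstrap samples, random feature choice, majority vote) can be sketched with standard-library Python. This is a heavily simplified illustration: each tree here is a one-split stump on a single random feature, whereas real random forests grow deeper trees and pick random feature subsets at every split. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature):
    """Best single-threshold split on one feature (fewest training errors)."""
    best = None
    values = sorted({row[feature] for row in rows})
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2  # midpoint between neighboring values
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    if best is None:  # the feature is constant in this sample
        label = majority(labels)
        return (feature, float("inf"), label, label)
    return best[1:]

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, rng.randrange(n_features)))
    return forest

def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[feature] <= threshold else right
             for feature, threshold, left, right in forest]
    return majority(votes)

rows = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8), (8, 8), (9, 9)]
labels = ["low"] * 4 + ["high"] * 4
forest = fit_forest(rows, labels)
print(predict(forest, (1.5, 1.5)), predict(forest, (8.5, 8.5)))
```

Building the forest is the slow part; a prediction is only one comparison per tree followed by a vote.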
19. The Turing Test
• In 1950 Alan Turing asked ‘Can machines think?’
• Proposed The Imitation Game
• Interrogator and two players, one human and one computer
• Based on typewritten responses the interrogator was to guess which
player was the computer
• He believed having adequate storage was the primary limiting factor
with speed being next
• A learning machine is like a child being taught
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433-460.
20. Research Questions
• Can an a priori algorithmic approach based on inductive reasoning be
developed to:
• improve the speed of the decision making process during emergency call
taking and dispatching?
• improve the accuracy of the resource assignment for emergency call
dispatching?
21. Discussion – Present Considerations
• The flowchart/tree decision model does not factor in the veracity of the
reporting party, the socio-economic and demographic factors of the
patient/victim, the capability of the responding unit, the quality of
services provided by the responding individual, or the specificity of the
dispatching algorithm itself.
22. Discussion – Future Considerations & Research
• Future research: develop an AI/ML-based approach.
• Obtain detailed 911 call and electronic Patient Care Records for approximately
five million patients where an outcome is identified.
• Outcomes: unfounded/no merit, patient treated but not transported, patient treated and
transported, and patient transferred to another responder.
• The clinical condition at the time of the outcome will be determined based on standard
paramedic coding practices.
• Data are split by randomization into a training dataset and a test dataset.
• A Random Forest model is built from the training dataset, then applied to the test
dataset.
• Comparative statistics evaluate the resource assignments, reduced
demand, and potential savings of the new model.
• The new knowledge model is a dynamic, real-time application.
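The split-and-evaluate step of this plan can be sketched as follows. The function names, the 80/20 split and the record layout (features, outcome) are assumptions for illustration, and a majority-class baseline stands in for the Random Forest purely as a placeholder.

```python
import random
from collections import Counter

def split_records(records, test_fraction=0.2, seed=0):
    """Randomly split records into (training set, test set)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def majority_baseline(train_set):
    """Placeholder model: always predicts the most common training outcome."""
    most_common = Counter(outcome for _, outcome in train_set).most_common(1)[0][0]
    return lambda features: most_common

def accuracy(model, test_set):
    """Share of test records whose outcome the model predicts correctly."""
    hits = sum(model(features) == outcome for features, outcome in test_set)
    return hits / len(test_set)

# Invented records: (features, outcome) pairs.
records = [((i,), "transported" if i % 4 == 0 else "unfounded") for i in range(100)]
train_set, test_set = split_records(records)
model = majority_baseline(train_set)
print(len(train_set), len(test_set), accuracy(model, test_set))
```

Any model trained on the training set should beat this baseline on the test set before its resource assignments are worth comparing.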