Presentation from Measurecamp Manchester 2019 on chatbot analytics. It covered: a specific (scripted) chatbot service and how this was evaluated; a discussion on other techniques and tools. Enjoy!
How to measure conversations - evaluation and analytics for a healthcare chatbot service
1. How to measure conversations
Evaluation and analytics for a healthcare chatbot service
Measurecamp Manchester, 2019
2. This session is about chatbots and possible ways to evaluate them
What we’ll cover:
• The challenge
• Evaluation approaches for:
  • Usage
  • Commercial value
  • Clinical value
• Future opportunities
3. We developed a chatbot support app for people
starting a new weekly T2D treatment
Challenge:
People with type 2 diabetes (T2D) starting a new
treatment needed the most support over the first
12 weeks
Solution:
We developed a scripted chatbot app for iOS and
Android smartphones to provide support around
starting, titrating and monitoring their treatment
4. The potential to learn more through the app
was huge…
If done accurately and compliantly, the app could allow us to:
• Spot and understand patterns in how people start and stay on
treatment*
• Feed these insights back into the organisation to improve the
service and future products
• Improve the experience and outcomes of patients on Drug Y
• Combine with real world studies on adherence and patient
health outcomes
* based on opted-in, aggregated, anonymised usage data
5. …but first we had to work out what to track and how to evaluate a chatbot
Measuring and optimising a chatbot presents several challenges:
• How do you track when different things are discussed each time?
• How do you define a “successful” conversation?
• How do we demonstrate the value of the service?
6. To tackle these we evaluated the service at three levels
1) Usage
2) Commercial value
3) Clinical support
• Understand user preferences and behaviours
• Leading indicator for treatment uptake
• Assess completion of treatment initiation
• Support patient adherence / outcomes
8. We evaluated app usage by focusing on the four core service areas
i) Acquire: Finding, installing and setting up the app
ii) Record: Tracking a patient’s progress with the app
iii) Inform: Providing treatment and safety information in a variety of formats (text, photo, video)
iv) Remind: Providing reminders to the patient about treatment and reinforcing adherence
9. To monitor promotion, installs and on-boarding,
we set up a conversion funnel within the app
App promotion through patient /
physician support materials
App store views
App store downloads
App installs
Registration starts
Registration completions
Weekly active users
How well are we informing people
about the app?
How many go on to set up the app?
How many become active app
users?
i) Acquire
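A funnel like this can be summarised as step-to-step conversion rates, which shows where people drop out. The following is a minimal sketch, not the team's actual implementation: the stage counts are invented purely for illustration, and the function names `step_conversion` and `biggest_drop` are hypothetical.

```python
# Hypothetical stage counts for illustration only; these are NOT figures from the talk.
FUNNEL = [
    ("App store views", 1000),
    ("App store downloads", 400),
    ("App installs", 380),
    ("Registration starts", 300),
    ("Registration completions", 240),
    ("Weekly active users", 120),
]


def step_conversion(stages):
    """Return (from_stage, to_stage, rate) for each consecutive pair of stages."""
    rates = []
    for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
        # Guard against an empty stage so the division cannot fail.
        rate = count_b / count_a if count_a else 0.0
        rates.append((name_a, name_b, rate))
    return rates


def biggest_drop(stages):
    """Return the step with the lowest conversion rate (the largest drop-off)."""
    return min(step_conversion(stages), key=lambda step: step[2])


if __name__ == "__main__":
    for src, dst, rate in step_conversion(FUNNEL):
        print(f"{src} -> {dst}: {rate:.0%}")
```

Reading the rates per step, rather than only the overall view-to-active ratio, is what lets the "where do people drop out?" question be answered stage by stage.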
10. This would help us identify how to better
promote and help people register on the app
Types of question this would answer:
• Which promotional activities / materials drive the most app
store views / downloads?
• Where do people drop out of the set up process?
• How many people start and then continue to use the app?
i) Acquire
11. To help users manage their health, we gave
them optional tools to track their progress
Initial setup
• Injection reminder preferences
• Dosing settings
• Weight tracking & starting weight (optional)
Treatment initiation process
• Adherence and dosing information
• Weekly weight loss (optional)
On-going treatment management
• Weekly weight loss (optional)
• Satisfaction level with app
ii) Record
12. The chatbot also provided support content on
T2D and managing the treatment
Diagram labels: Dialogue; User feedback; App opens; Notification; Conversation tree followed; Educational content interactions; Alerts; Frequency & recency
iii) Inform
13. We wrote a feedback loop into conversations…
Diagram labels: First time a user accesses each support content element; User taken to start of conversation; Conversation ended
iii) Inform
14. …and tracked net promoter score among users after 15 days of registering with the app
We used the Net Promoter Score (NPS) framework to evaluate the overall satisfaction level of users
Rules
• The NPS question would only be available to users who had unlocked the app by providing:
  • Unique product ID (to confirm they were a patient-on-treatment)
  • Login credentials
  • Acceptance of terms and conditions
• 15 days after successful app unlock, the NPS survey would be activated
• The NPS survey would be included within the app exit dialogue:
“How likely are you to recommend this app to a friend? 1 being not at all likely, 10 being extremely likely”
iii) Inform
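The NPS rules above can be sketched in a few lines. This is an illustrative sketch only: the names `survey_due` and `net_promoter_score` are hypothetical, and the banding (promoters score 9 to 10, detractors score 6 or below) is the standard NPS convention, which is normally defined on a 0 to 10 scale, whereas the question here uses 1 to 10. How the team actually banded the responses is not stated.

```python
from datetime import date, timedelta

# From the slide: the survey activates 15 days after a successful app unlock.
SURVEY_DELAY = timedelta(days=15)


def survey_due(unlock_date, today):
    """True once the user has unlocked the app and 15 days have passed."""
    return unlock_date is not None and today - unlock_date >= SURVEY_DELAY


def net_promoter_score(ratings):
    """NPS = % promoters (9-10) minus % detractors (6 or below).

    Assumes the standard NPS bands; ratings of 7-8 count as passives.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)
```

For example, five responses of 10, 10, 9, 7 and 2 give three promoters and one detractor, so the score is 100 × (3 − 1) / 5 = 40.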
15. This helped us get users’ perspectives on the
chatbot service and the information provided
Types of question this would answer:
• What are the top 3 topics or conversations for patients during the first 12
weeks of treatment?
• How engaged and satisfied are patients with the programmed content?
• How do users navigate and interact with the app and the conversations?
iii) Inform
16. Key to the chatbot was reminding users when it
was time to take their treatment
Schedule → Reminder → Injection → Outcome
iv) Remind
17. We were able to learn how well the reminder
functionality worked and for which user groups
Types of question this would answer:
• What patient / behavioural characteristics correlate with adherence?
• How effective are reminders at keeping patients on track with their treatments?
• What can be learned about their likelihood to adhere to treatment week by
week?
• At which point in the treatment are users most engaged with the
app? When does that trail off? How could we proactively engage users with
relevant information?
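The week-by-week adherence question can be approached by counting, for each treatment week, how many reminders were followed by a confirmed injection. The sketch below assumes a hypothetical record shape of `(user_id, treatment_week, injection_confirmed)`; the actual data model used by the app is not described in the talk.

```python
from collections import defaultdict


def adherence_by_week(records):
    """Share of reminders followed by a confirmed injection, per treatment week.

    records: iterable of (user_id, treatment_week, injection_confirmed) tuples.
    This record shape is an assumption made for illustration.
    """
    sent = defaultdict(int)
    confirmed = defaultdict(int)
    for _user_id, week, done in records:
        sent[week] += 1
        if done:
            confirmed[week] += 1
    # Only weeks with at least one reminder appear, so the division is safe.
    return {week: confirmed[week] / sent[week] for week in sorted(sent)}
```

Plotting this rate across the 12 weeks would be one way to see when engagement starts to trail off.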
19. The chatbot supported the business strategy of
delivering a rewarding treatment experience
Strategic action (3)
Deliver a rewarding patient experience
• Set patients and physicians up for success through treatment initiation
• Help build patient encouragement and motivation to stay on treatment
Strategic focus
Establish Drug Y as the unparalleled treatment for T2D
20. To assess the commercial value of the chatbot
we needed to demonstrate four things
1. What was the level of uptake?
2. What was the level of completion?
3. How satisfied were users with the service?
4. What was the cost saving/potential return for healthcare
providers?
Inspired by the GDS service standard and mandatory KPIs. See:
- https://www.gov.uk/service-manual/service-standard
- https://www.gov.uk/service-manual/measuring-success/sharing-your-data-with-the-performance-platform
21. Three of these were addressed through our
measurement of chatbot usage
1. Uptake - # new app registrations; # weekly active users
2. Completion - % users completing the 12 week initiation period
3. Satisfaction - net promoter score
4. Financial saving/return
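As a rough sketch of how the uptake and completion KPIs could be derived from usage data: the event shape `(user_id, date)`, the `weeks_completed` mapping and both function names are assumptions for illustration, and "completion" is taken here to mean reaching 12 weeks, which may differ from the definition the team used.

```python
from datetime import date


def weekly_active_users(events):
    """Count distinct users per ISO week.

    events: iterable of (user_id, date) pairs (an assumed shape).
    Returns {(iso_year, iso_week): number_of_distinct_users}.
    """
    users_by_week = {}
    for user_id, day in events:
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # (ISO year, ISO week number)
        users_by_week.setdefault(key, set()).add(user_id)
    return {week: len(ids) for week, ids in sorted(users_by_week.items())}


def completion_rate(weeks_completed, target_weeks=12):
    """Share of users who reached the end of the initiation period.

    weeks_completed: {user_id: weeks of treatment completed} (assumed shape).
    """
    if not weeks_completed:
        return 0.0
    done = sum(1 for weeks in weeks_completed.values() if weeks >= target_weeks)
    return done / len(weeks_completed)
```

Counting distinct users per week, rather than raw app opens, keeps one very active user from inflating the uptake figure.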
22. Measuring the financial value of the chatbot
required wider collaboration
We needed to answer the following questions:
• What percentage of patients complete titration on the app vs. those not using the app?
• How many additional injections are taken by patients on the app vs. those not using the
app?
• What is the patient lifetime value of a Drug Y app user vs. a Drug Y non-app user?
To answer these we worked with:
• Medical affairs team – set up a separate study arm / integrate within the planned real-world (RW) study
• Forecasting team – understand:
  • The actual cost of an injection
  • Completion rate of patients over first 12 weeks; average number of injections taken per patient (non-app users)
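One way to frame the arithmetic behind these questions, offered as a rough sketch rather than the method actually used: compare the average number of injections per patient with and without the app, then multiply the difference by a per-injection value and the number of app users. The function name and all parameters are hypothetical, and treating the per-injection figure from the forecasting team as the unit value is an assumption.

```python
def incremental_value(avg_injections_app, avg_injections_non_app,
                      value_per_injection, app_users):
    """Estimated value of the additional injections taken by app users.

    All inputs are hypothetical placeholders:
    - avg_injections_app / avg_injections_non_app: mean injections per patient
      over the same period for app users and non-app users
    - value_per_injection: unit value, assumed to come from forecasting
    - app_users: number of patients using the app
    """
    extra_per_patient = avg_injections_app - avg_injections_non_app
    return extra_per_patient * value_per_injection * app_users
```

A comparison like this only describes a difference between the two groups; attributing the difference to the app would need the kind of study design the medical affairs team was asked to set up.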
24. To understand clinical outcomes, we needed to go beyond what we could track through the app
Via the app
What was tracked to deliver the app’s functionality:
• Dose achieved
• Reminders fulfilled
• Satisfaction level with treatment
• Completion of initiation period
• Change in patient reported weight (optional)
Outside the app
What could only be tracked outside of the app:
• Behaviour of people on treatment who are not using the app
• Clinical outcomes beyond weight reduction (HbA1c, clinically recorded weight loss, cardiovascular health)
• Quality of life
25. The app was integrated into an observational study to learn its impact on adherence
Integration with real-world study
• Created a separate arm of the real world study to evaluate app usage and health outcomes of patients who start using the app:
  • How do app user health outcomes differ from those not using the app?
  • How does successful completion of the titration period differ between app users and non-app users?
  • How does the previous T2D medication affect outcomes among app and non-app users?
  • How does number of weeks of app usage correlate with health outcomes?
  • Why do patients who start using the app stop using it?
• Advantages: ensures that a minimum sample size of app users is recruited
• Disadvantages: may involve additional resource, delay of >6 months for initial study results
26. So, what do you think?
• What else could we have evaluated to assess the service?
• How have you evaluated chatbot / voice assistant tools in
other industries?
• What tools have you used to do this?