1345 track 1 chen_using our laptop

Measuring Success at the Edge of Innovation
October 2017

2
Good Afternoon!
Sarah Schmalbach
Guardian Mobile Innovation Lab: A small multidisciplinary innovation team of editors, reporters, a product manager and engineers in the Guardian US newsroom testing new mobile storytelling formats for two years with a grant from the John S. and James L. Knight Foundation with the goal of accelerating innovation in newsrooms around the country.
Lynette Chen
MaassMedia: MaassMedia is an independent, specialty analytics consultancy based in Philadelphia, PA. We provide guidance and leadership to major global brands seeking to optimize their investments in digital multi-channel content, marketing and customer service initiatives.

3
Agenda Overview
Introduction: Slide 5
The Experiments: Slides 6-7
Measurement Framework: Slides 8-16
NLP: Slides 17-25
Putting it to Use: Slides 26-29
Closing / Q&As: Slide 31

The Guardian Mobile Innovation Lab & Maass Media team
Sarah Schmalbach, Senior Product Manager
Sasha Koren, Editor
Alastair Coote, Developer
Dylan Greif, Product Designer
Mazin Sidahmed, Reporter
Connor Jennings, Developer
Greg Kaminski, Analytics Director
Lynette Chen, Senior Consultant
Brian Hood, Analyst

The team
Roles: SME, Data Designer, Engineer
Knows which questions to ask: Sarah Schmalbach, Sasha Koren, Mazin Sidahmed, Dylan Greif
Knows how to get the data: Alastair Coote, Connor Jennings, Brian Hood
Knows how to present the data: Lynette Chen, Greg Kaminski

6
What do our experiments look like?
Live video notifications
Lock screen access to live streaming video news coverage without having to open an app, or wait

7
What do our experiments look like?
Live UK election results notifications
Lock screen access to three types of live UK election results: overall, latest and by constituency, plus tap-through access to more coverage and results

8
The need for a new way to measure success
Research Study conducted by: &
Overheard in the industry...
“The bigger thing is, how do we measure successful alerts? Because we’re not sitting next to everyone as their phone goes off saying, ‘Now did you open that? Did you get everything you need to get without opening that? Did you appreciate getting it?’ It’s just sort of an impossible thing to measure without doing a really wide ranging survey, which we’re not going to do.”
-Mobile Editor, News Agency

9
Success framework
Determine which formats for delivering content are the most successful
Quantitative Data: Engagement Behavior
Qualitative Data: User Opinion

10
Quantitative: What did users do?
Total Interactions

11
Positive vs. Negative Engagements

12
Net Interaction Rate

13
Net Interaction Rate
Net interaction rate = (number of positive interactions - number of negative interactions) / number of notifications shown
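The ratio on this slide can be written as a small helper. This is a minimal sketch, not the team's actual tooling: the function name `net_interaction_rate` and the counts in the usage line are hypothetical and only illustrate the arithmetic.

```python
def net_interaction_rate(positive: int, negative: int, notifications_shown: int) -> float:
    """Return (positive - negative) / notifications_shown as a fraction."""
    if notifications_shown <= 0:
        raise ValueError("notifications_shown must be greater than zero")
    return (positive - negative) / notifications_shown


# Hypothetical counts, invented for illustration only.
rate = net_interaction_rate(positive=50, negative=10, notifications_shown=1000)
print(f"Net interaction rate: {rate:.1%}")
```

A negative result means negative interactions (for example unsubscribes) outnumbered positive ones for that notification.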

14
Qualitative: What did users think?
User surveys
• 24 user surveys sent
• Range of 5 to 6,219 responses

15
Need to automate qualitative analysis
Longform response feedback
Q: “Anything else you’d like to tell us?”
May 2016: 5 responses, analysis in 75 seconds
Nov. 2016: 1,116 responses, analysis in 24,165 seconds

16
Opportunity with sentiment NLP
• Smart manual analysis of longform feedback was no longer possible
• Needed a way to speed up the process while keeping the scoring consistent
• Wanted to understand overall sentiment by evaluating the sentiment of each response through identifiable keywords
• Solution: Natural Language Processing (NLP) for sentiment analysis!

17
Process of Building
• Create the algorithm
• Obtain a dataset
• Manually score sentiment of data
• Feed some of the data for the algorithm to be trained on
• Use the trained algorithm to predict results of remaining data
• Review results, identify areas of concern, iterate
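The steps above can be sketched as a small evaluation loop: split the manually scored responses, train on one part, predict the rest, and compare against the manual scores. Everything here is an assumption made for illustration. The word-vote "model", the function names and the sample responses are hypothetical and are not the algorithm the team built.

```python
import random
from collections import Counter
from typing import List, Tuple

Example = Tuple[str, str]  # (response text, manually assigned sentiment label)


def split(data: List[Example], train_fraction: float = 0.5, seed: int = 0):
    """Shuffle manually scored data and split it into training and held-out sets."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def train(examples: List[Example]) -> dict:
    """Toy training step: remember the label each word was most often seen with."""
    counts: dict = {}
    for text, label in examples:
        for word in text.lower().split():
            counts.setdefault(word, Counter())[label] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def predict(model: dict, text: str, default: str = "neutral") -> str:
    """Predict by majority vote over the words the model has seen before."""
    votes = Counter(model[w] for w in text.lower().split() if w in model)
    return votes.most_common(1)[0][0] if votes else default


def accuracy(model: dict, examples: List[Example]) -> float:
    """Share of held-out responses whose prediction matches the manual score."""
    if not examples:
        return 0.0
    hits = sum(predict(model, text) == label for text, label in examples)
    return hits / len(examples)


# Invented responses standing in for manually scored survey data.
scored = [
    ("loved the alerts", "positive"),
    ("great and easy", "positive"),
    ("kept crashing", "negative"),
    ("too many alerts", "negative"),
]
training, held_out = split(scored, train_fraction=0.5)
model = train(training)
print(f"held-out accuracy: {accuracy(model, held_out):.0%}")
```

Reviewing the held-out responses that were predicted wrongly is the "identify areas of concern" step; each fix is followed by another pass through the same loop.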

18
Our original algorithm
• Inspired by various existing NLP python packages
• Negation words
• Exclude “noise” words
• Trained on 500 US Election experiment survey long form responses
• Accuracy on US Election data: 81%
• Issue: limited scope
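A keyword scorer with negation handling and noise-word exclusion, the two ingredients listed above, might look roughly like the following. This is a hedged sketch only: the word lists, the rule that a negation flips just the next meaningful word, and the function names are all assumptions, since the slides do not show the original code.

```python
import re

# Hypothetical word lists; the real lists are not given in the slides.
POSITIVE = {"love", "loved", "great", "useful", "easy"}
NEGATIVE = {"crash", "annoying", "hate", "useless"}
NEGATIONS = {"not", "no", "never", "don't", "didn't"}
NOISE = {"the", "a", "an", "it", "to", "was", "is", "i"}


def tokenize(text: str) -> list:
    """Lowercase, normalize curly apostrophes, and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower().replace("’", "'"))


def score(text: str) -> int:
    """Sum +1/-1 per sentiment word; a negation flips the next non-noise word."""
    total = 0
    negate = False
    for word in tokenize(text):
        if word in NOISE:
            continue  # noise words neither score nor consume a pending negation
        if word in NEGATIONS:
            negate = True
            continue
        value = (word in POSITIVE) - (word in NEGATIVE)
        total += -value if negate else value
        negate = False
    return total


def label(text: str) -> str:
    """Map the numeric score to a coarse sentiment label."""
    s = score(text)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"


print(label("It was not useful"))
```

A scorer like this only knows the words it was given, which is one way to read the "limited scope" issue noted on the slide.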

19
VADER sentiment algorithm
• Developed by Georgia Tech
• Rule-based, built on a lexicon of 7,000+ scored words
• Takes into consideration punctuation, emojis, and amplifier words

20
VADER sentiment algorithm - valence
• Each word is assigned a valence on a scale of -4 to 4
Word Valence
abusive -3.2
idk -0.4
zebra 0.0
good 1.9
☺ 2.0
great 3.1
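To make the idea of valence concrete, the sketch below sums the valences from the table for the words in a response and squashes the sum into the range -1 to 1. It is a deliberate simplification and not the VADER implementation: it ignores punctuation, emojis, amplifiers and negation. The squashing formula and its constant `alpha=15` are an assumption about how the reference implementation normalizes scores, not something stated on the slide.

```python
import math

# Valences copied from the table above; the emoji row is left out for simplicity.
VALENCE = {"abusive": -3.2, "idk": -0.4, "zebra": 0.0, "good": 1.9, "great": 3.1}


def raw_valence(text: str) -> float:
    """Sum the valence of every known word; unknown words count as 0."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return sum(VALENCE.get(w, 0.0) for w in words)


def normalize(total: float, alpha: float = 15.0) -> float:
    """Squash an unbounded valence sum into the open interval (-1, 1)."""
    return total / math.sqrt(total * total + alpha)


total = raw_valence("Good, great!")
print(f"raw: {total:.1f}, normalized: {normalize(total):.2f}")
```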

21
Adapting the VADER sentiment algorithm
• Lexicon covers a wide range but isn’t complete
  • Updated, updating, up-to-date
  • Vanish, vanished, vanishing
• Updated with Guardian-specific words and permutations
  • convenient
  • “become a contributor”
  • “will purchase”
• Accuracy on US Election data: 80%
• Accuracy on EU Referendum data: 88%
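Extending a lexicon with domain words and multi-word phrases can be sketched as below. The words and phrases are taken from the slide, but every valence value here is invented for illustration, and the helper names `build_lexicon` and `score` are hypothetical. The slides do not say how the team wired its additions into VADER.

```python
# A tiny stand-in for the base lexicon.
BASE_LEXICON = {"good": 1.9, "great": 3.1, "abusive": -3.2}

# Words and phrases from the slide; the valences are assumptions.
CUSTOM_WORDS = {"convenient": 2.0, "updated": 1.0, "vanished": -1.5}
CUSTOM_PHRASES = {"become a contributor": 2.5, "will purchase": 2.0}


def build_lexicon(base: dict, extra: dict) -> dict:
    """Return a new lexicon with the extra entries added; the base is untouched."""
    return {**base, **extra}


def score(text: str, lexicon: dict, phrases: dict) -> float:
    """Score phrases first, remove them, then score the remaining single words."""
    remaining = text.lower()
    total = 0.0
    for phrase, valence in phrases.items():
        total += valence * remaining.count(phrase)
        remaining = remaining.replace(phrase, " ")
    for word in remaining.split():
        total += lexicon.get(word.strip(".,!?"), 0.0)
    return total


lexicon = build_lexicon(BASE_LEXICON, CUSTOM_WORDS)
print(score("I will purchase, very convenient!", lexicon, CUSTOM_PHRASES))
```

Scoring phrases before single words keeps a phrase such as “will purchase” from being missed just because neither word carries sentiment on its own.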

22
VADER – positive sentiment example
“Jason is smart, handsome, and funny.”
Sentiment Score
negative 0.00
neutral 0.25
positive 0.75
overall positive

23
VADER – negative sentiment example
“Jason is not smart, handsome, nor funny.”
Sentiment Score
negative 0.65
neutral 0.35
positive 0.00
overall negative

24
VADER – nuanced sentiment example
“Jason is not smart, handsome, but he is funny.”
Sentiment Score
negative 0.26
neutral 0.45
positive 0.29
overall positive

25
A new KPI: net sentiment score
What do users think?
Net sentiment score = (number of positive responses - number of negative responses) / number of responses
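As a worked example of this KPI, the sketch below applies the formula to the UK Snap Election survey counts reported later in the deck (269 positive and 105 negative responses). It assumes every response was scored either positive or negative, so that the total is their sum. The function name `net_sentiment_score` is hypothetical.

```python
from typing import Optional


def net_sentiment_score(positive: int, negative: int, total: Optional[int] = None) -> float:
    """Return (positive - negative) / total as a fraction.

    If total is omitted, it is assumed to be positive + negative.
    """
    if total is None:
        total = positive + negative
    if total <= 0:
        raise ValueError("total number of responses must be greater than zero")
    return (positive - negative) / total


# Counts from the UK Snap Election survey slide.
score = net_sentiment_score(positive=269, negative=105)
print(f"Net sentiment score: {score:.0%}")  # prints "Net sentiment score: 44%"
```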

26
UK Snap Election: How did users engage?
• Positive interactions: taps and switch views
• Negative interactions: unsubscribe and stop alerts
• Net interaction rate: 3.4%

27
UK Snap Election: What did users think?
Net sentiment score: 44%
Survey Responses: 269 positive, 105 negative

28
UK Snap Elections: identifying key responses
“I loved the live general elections updates, found it easy to use and loved the expandable visualisation. It was great … I used the alert to check on the progress of the election throughout the night. Would definitely use it again”
Sentiment score: 97%
“I live in Australia and work as a chef. I cannot take breaks and don't have time to Google results. It was excellent to have the live updates on my alert screen…all I had to do was tap the screen to see updated results. It was fantastic. Thanks Guardian. Great work.”
Sentiment score: 97%
“The alerts kept causing my Android Chrome browser to crash.”
Sentiment score: -40%

29
UK Snap Elections: evaluating success
• Overall success with the alerts
• High net sentiment score
• Positive net interaction rate
• Signals that alerts can provide utility on their own and also give convenient access to deeper coverage
44% Net Sentiment Score
3% Net Interaction Rate

30
Learnings/Takeaways
1. Have a toolkit with both quantitative and qualitative data
2. Be forward thinking and flexible
3. Building/adapting an NLP algorithm is an investment but can provide great utility
4. Need to adapt and improve NLP algorithms over time

31
Questions?
Sarah Schmalbach | Guardian Mobile Innovation Lab Senior Product Manager
sarah.schmalbach@theguardian.com
@schmalie
Lynette Chen | MaassMedia Senior Consultant
lchen@maassmedia.com
@huirastic
