The document discusses various issues and misconceptions around online research methods. It addresses eight "misinformation" claims:
1) That sentiment analysis systems are 99% accurate, when human agreement on sentiment is only 82-90%.
2) That a one-size-fits-all sentiment scoring approach is best, when custom or category-specific scoring may be more accurate.
3) That social media like Twitter provide demographic data, when available demographics are actually very limited.
4) That brand awareness can be directly measured online, when awareness levels vary significantly between brands.
5) That detailed demographic data is readily available from online samples, when demographics are often limited.
6
5. Real-life Data Analysis: How Many Licks to the Tootsie Roll Center of a Tootsie Pop?
by Cory Heid, guest blogger
Almost all of us have tried a Tootsie Pop at some point. I’m willing to bet that most of us also thought, “I wonder
how many licks it does take to get to the center of the Tootsie Pop?” If you haven’t wondered about this, here’s
the classic commercial that may get you more curious.
Personally, I was not very satisfied with the owl's answer of “3,” so I decided to continue the little boy’s quest to
find the number of licks required to reach the center of a Tootsie Pop.
Research
Looking around the ‘net, I found that other studies done by student researchers at various universities have
reached very different answers. These students, who represented some significant research institutions and
engineering schools, used different licking methodologies and/or licking machines and licking experiments. As I
mentioned, the results varied greatly. See for yourself:
Since I lacked both the equipment and desire to build a licking machine, a simpler licking experiment seemed
appropriate. I wanted to assess which of these factors would have the most impact on the effectiveness of the
Tootsie-Pop licking:
Force of the lick
Temperature of the licker's mouth
pH Level of the licker's saliva
Solubility of the licker's saliva
Lab Work
Once I selected my factors, I incorporated them into an experiment. I chose to simulate the different levels of each
factor in the lab. For my force simulation, I placed my Tootsie Pop in a beaker with 150 mL of water and a
magnetic stirrer. I placed the beaker on a hot plate that could spin the stirrer and create a circular motion in the
water. I then pulled the Tootsie Pop out of the water every minute and measured the height and width of the
pop on the thicker band around the pop. I did this each minute until the pop revealed a noticeable amount of
chocolate (the elusive ‘center’).
This method had some issues. The chocolate-flavored Tootsie Pops made the water very murky, so I omitted
them from the lab testing. I also could not track the speed of the circulating water. Instead, I used the speed
dial’s indicator on the magnetic stirrer (1 being slowest speed, 10 being fastest speed). After several tests at
speeds 1, 2, 4 and 6, I was able to gather an ample amount of data to perform some analysis.
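The per-minute measurements described above lend themselves to a simple rate estimate. The sketch below is not the author's analysis, which the post does not spell out; it only shows one way to turn a series of width readings into a shrink rate using an ordinary least-squares slope. The function name `shrink_rate`, the millimetre unit, and the sample readings are all assumptions made for illustration, not data from the experiment.

```python
def shrink_rate(minutes, widths_mm):
    """Least-squares slope of width against time (mm per minute).

    A negative value means the pop is getting narrower.
    """
    n = len(minutes)
    if n < 2 or n != len(widths_mm):
        raise ValueError("need two or more paired readings")
    mean_t = sum(minutes) / n
    mean_w = sum(widths_mm) / n
    # Sum of squared deviations in time, and cross-deviations with width.
    sxx = sum((t - mean_t) ** 2 for t in minutes)
    sxy = sum((t - mean_t) * (w - mean_w) for t, w in zip(minutes, widths_mm))
    if sxx == 0:
        raise ValueError("readings must be taken at different times")
    return sxy / sxx


# Placeholder readings, invented for illustration only (not measured data).
example_minutes = [0, 1, 2, 3]
example_widths_mm = [28.0, 27.0, 26.0, 25.0]
example_rate = shrink_rate(example_minutes, example_widths_mm)
```

Computing one slope per stirrer setting (1, 2, 4 and 6) would give a number that can be compared across settings, which is one way to judge how much the force factor matters.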
I then moved on to the temperature tests. Like the force tests, these used a beaker, 150 mL of water, and a hot
plate. I heated the water to different temperatures and measured the height and width of the Tootsie Pop every
Long Verbatims have poor validity
6. • I want Cracker Jacks.
• Cracker Jacks are the shit
• i want Cracker Jacks soo bad
• I am addicted to Cracker Jacks...
Short verbatims have high validity
7. Which sentiment trend is more valid?
[Chart: sentiment score (1 = Negative, 5 = Positive) for Short vs. Long verbatims; axis ticks from 3.1 to 3.4]
8. “The best academic study I've seen showed only 82% agreement between two human sentiment annotators, rising to 90% if uncertain cases are removed.”
Seth Grimes, June 2011, http://www.b-eye-network.com/view/15276
The Truth
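The agreement figures in the quote can be made concrete with a small calculation. The sketch below computes simple percent agreement between two annotators, once over all items and once with uncertain cases dropped. It is only an illustration of the idea: the function name `percent_agreement`, the `"uncertain"` label, and the six sample labels are assumptions, and the cited study may have measured agreement differently.

```python
def percent_agreement(labels_a, labels_b, skip=None):
    """Share of items on which two annotators gave the same label.

    Items where either annotator used the `skip` label (for example
    "uncertain") are dropped before the share is computed.
    """
    pairs = [
        (a, b)
        for a, b in zip(labels_a, labels_b)
        if skip is None or (a != skip and b != skip)
    ]
    if not pairs:
        raise ValueError("no comparable items")
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical labels for six verbatims, invented for illustration only.
annotator_1 = ["pos", "pos", "neg", "uncertain", "neg", "pos"]
annotator_2 = ["pos", "neg", "neg", "pos", "neg", "pos"]

overall = percent_agreement(annotator_1, annotator_2)
confident_only = percent_agreement(annotator_1, annotator_2, skip="uncertain")
```

With these made-up labels, agreement is 4 of 6 items overall and 4 of 5 once the uncertain item is removed, the same direction of change the quote describes. An automated system scored against human labels is being compared with a reference that humans themselves do not fully agree on.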
15. Pfizer has greater awareness than Unilever
[Bar chart: Pfizer vs. Unilever; vertical axis from 0 to 30000]
16. P90X has even greater awareness than Pfizer
[Bar chart: P90X, Pfizer, Unilever, Red Rose Tea; vertical axis from 0 to 80000. Annotation: "Cracker Jacks and Tootsie Rolls are here"]
18. Care to Crosstab Demos on Starbucks Data?
Segment        Sample Size
Total          18557
Male           4182
Female         3706
US             2118
Europe         358
Asia           247
Canada         220
Middle East    47
Pacific        35
Africa         20
25-34          12
18-24          9
35-44          6
45-54          6
55+            2
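The sample sizes on this slide show how thin the demographic coverage is once the counts are set against the total. The sketch below uses the figures from the slide; grouping the columns into gender, region, and age is an assumption based on the column labels, and the names `groups` and `coverage` are illustrative.

```python
# Sample sizes copied from the Starbucks crosstab slide.
total = 18557

# Assumed grouping of the slide's columns into three demographic dimensions.
groups = {
    "gender": {"Male": 4182, "Female": 3706},
    "region": {
        "US": 2118, "Europe": 358, "Asia": 247, "Canada": 220,
        "Middle East": 47, "Pacific": 35, "Africa": 20,
    },
    "age": {"25-34": 12, "18-24": 9, "35-44": 6, "45-54": 6, "55+": 2},
}


def coverage(groups, total):
    """Fraction of the total sample with a known value in each dimension."""
    return {name: sum(cells.values()) / total for name, cells in groups.items()}


shares = coverage(groups, total)
for name, share in shares.items():
    print(f"{name}: {share:.2%} of the sample")
```

Under that grouping, gender is known for 7888 of 18557 records (about 42.5%), region for 3045 (about 16.4%), and age for only 35 (about 0.19%), so a crosstab by age rests on a handful of records.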
36. Confessions
• Strengths and weaknesses
• Sentiment analysis is nowhere near perfect
• Speed will not result in high quality
• Demographics are weak
• Stance on privacy
What to look for