SOCIAL MEDIA ANALYTICS TO QUANTIFY FAN 
ENGAGEMENT 
DR. ROBERT BAKER 
TED KWARTLER 
Get a more complete profile of your fans to inform business decisions and improve ROI 
calculations.
AGENDA 
Basics 
Where are the fans? 
Who are the fans? 
What are fans talking about? 
How do the fans feel towards the team? 
What is the point of all this?
A FAN’S EXPERIENCE 
If only there had been social media, the Yankees could have profiled my experience.
BASICS 
WHAT IS TEXT MINING?
SOCIAL MEDIA ANALYTICS REQUIRES TEXT MINING 
Before text mining. After text mining. 
Text mining lets you “drink from a fire hose” of information and distill useful meaning.
WHAT IS TEXT MINING? 
Text mining is an emerging technology that can be used to augment existing data by making 
unstructured text available for analysis and decision making. 
1. Unstructured natural language texts: surveys, tweets, articles, emails, blogs, reviews, texts 
2. Apply standard and domain specific rules 
3. Organized into Document Term Matrix (DTM) or Term Document Matrix (TDM) 
4. Insight & Recommendation
EXAMPLE UNSTRUCTURED TEXT SOURCES 
Many sources including emails, forum posts, 
tweets, books, pdfs, reviews, transcripts etc. 
Unstructured natural 
language texts 
杜兰特和詹姆斯谁才是当今联盟的头牌?这是最近很火热的话题。一方 
面杜兰特高居得分榜首位,在MVP权力榜上也雄踞第一;另一方面詹姆 
斯带领热火一切为了三连冠,比赛沉稳... 
Had my first experience at TD Garden when my Bulls came to play the Celtics. Being someone with an out of 
state license living in Boston, I usually carry my passport anyway, but I had a friend in town and wanted to clear up this ID 
controversy I read so much about in the rules.
EXAMPLE PRE-PROCESSING STEPS 
(or other software 
e.g. Python NLTK) 
1. Make all text lower case 
2. For Twitter, remove “RT” for retweet 
3. Remove symbols like “@” 
4. Remove punctuation 
5. Remove numbers 
6. Remove URLs e.g. http://www.espn.com 
7. Remove extra whitespace 
8. Remove “stopwords” 
9. Others as needed depending on 
In a “bag of words” text mining methodology the corpus must 
be cleaned. Cleaning often means making items lower case, removing 
punctuation, numbers and extra whitespace. In unique instances 
domain specific rules are applied (e.g. removing “RT” for retweet). 
Apply standard and domain specific 
rules 
Cleaned Version: 
no doubt derek jeter makes 
my top all time with babe lou 
yankee clipper mick 
杜兰特和詹姆斯谁才是当今联盟的头 
牌?这是最近很火热的话题。一方面杜兰特 
高居得分榜首位,在MVP权力榜上也雄踞第 
一;另一方面詹姆斯带领热火一切为了三连 
冠,比赛沉稳... 
Translated Version: 
Durant and James, who is the league's first 
card today? This is a very hot topic recently. 
On the one hand Durant highest scoring top 
position in the standings MVP authority also 
ranked first; on the other hand, James led the 
Heat everything for three consecutive years, 
the race calm ... 
Cleaned Version: 
durant james who league first card today very 
hot topic recently on one hand durant highest 
scoring top position standings MVP authority 
ranked first other hand, james led heat 
everything three consecutive years race calm
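The pre-processing steps above can be sketched in Python. This is a minimal illustration, not the script behind the slides: the function name clean_text and the tiny STOPWORDS set are assumptions made for the example, and a real stopword list (for instance from NLTK) would be much longer.

```python
import re
import string

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "in", "of", "to", "and"}


def clean_text(text, stopwords=STOPWORDS):
    """Apply the bag-of-words cleaning steps to one document."""
    text = text.lower()                         # 1. lower case
    text = re.sub(r"\brt\b", " ", text)         # 2. drop the retweet marker
    text = re.sub(r"https?://\S+", " ", text)   # 6. drop URLs before punctuation is stripped
    # 3-4. remove symbols such as "@" and other ASCII punctuation
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\d+", " ", text)            # 5. remove numbers
    # 7-8. split() collapses extra whitespace; then drop stopwords
    tokens = [t for t in text.split() if t not in stopwords]
    return " ".join(tokens)


print(clean_text("RT @espn: The Yankees WIN 5-3! http://www.espn.com"))
```

URLs are removed before punctuation so that the address is dropped as a whole rather than being left behind as stray words.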
DATA ORGANIZATION 
Once cleaned the documents and terms are organized into large matrices. 
Often they are very sparse and may contain tens of thousands of data points. 
Attributes may be single words or word tokens of 2 or more words. 
Organized into a Document Term Matrix or a Term Document Matrix. 
Tweet_1 (cleaned): no doubt derek jeter makes my top all time with babe lou yankee clipper mick 
Sina_1 (cleaned): durant james who league first card today very hot topic recently on one hand durant highest scoring top position standings MVP authority ranked first other hand, james led heat everything three consecutive years race calm ... 

Document Term Matrix 
Document  no  doubt  derek  jeter  top  durant  james  termN 
Tweet_1   1   1      1      1      1    0       0      0 
Sina_1    0   0      0      0      1    2       2      1 
docN      …   …      …      …      …    …       …      … 

Term Document Matrix 
Term   Tweet_1  Sina_1  docN 
no     1        0       … 
doubt  1        0       … 
jeter  1        0       … 
top    1        1       … 
termN  0        1       … 
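As a rough sketch of how the two matrices relate, the snippet below counts terms per document and then transposes the result. The function names and the shortened example documents are assumptions for illustration; production work would normally use a sparse matrix from a text mining library rather than plain lists.

```python
from collections import Counter


def document_term_matrix(docs):
    """docs maps a document name to its cleaned text.

    Returns (terms, rows), where rows[name] is a list of counts
    aligned with the sorted list of terms.
    """
    counts = {name: Counter(text.split()) for name, text in docs.items()}
    terms = sorted(set().union(*counts.values()))
    rows = {name: [c[t] for t in terms] for name, c in counts.items()}
    return terms, rows


def term_document_matrix(terms, rows):
    """Transpose a document term matrix: one row per term, one column per document."""
    names = list(rows)
    return names, {t: [rows[n][i] for n in names] for i, t in enumerate(terms)}


# Shortened versions of the two cleaned documents, used only as an example.
docs = {
    "Tweet_1": "no doubt derek jeter makes my top all time",
    "Sina_1": "durant james top durant james led heat",
}
terms, dtm = document_term_matrix(docs)
names, tdm = term_document_matrix(terms, dtm)
print(tdm["top"])     # counts of "top" in Tweet_1 and Sina_1
print(tdm["durant"])  # counts of "durant" in Tweet_1 and Sina_1
```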
WHERE ARE THE FANS? 
LOCATION BASED ATTRIBUTES
DODGERS TWITTER FOLLOWERS -10K SAMPLE
INDIANS TWITTER FOLLOWERS -10K SAMPLE
NYY TWITTER FOLLOWERS -10K SAMPLE
Team     Total Followers  Sample     Bing API Geo-Located  Median Distance to Stadium 
Dodgers  ~540K            First 10K  2,854                 1,372 miles 
Indians  ~225K            First 10K  3,774                 319 miles 
Yankees  ~1.18K           First 10K  1,335                 713 miles
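Once follower locations have been geocoded, a median distance to the stadium can be computed with the haversine great-circle formula. The sketch below assumes the geocoding step has already produced latitude and longitude pairs; the coordinates are approximate city centres chosen for illustration and are not data from the slides.

```python
from math import asin, cos, radians, sin, sqrt
from statistics import median

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def median_distance(stadium, followers):
    """Median distance from a stadium (lat, lon) to a list of follower (lat, lon) pairs."""
    return median(haversine_miles(*stadium, *f) for f in followers)


# Hypothetical example: a Los Angeles venue and three geocoded followers.
los_angeles = (34.0522, -118.2437)
followers = [(34.0522, -118.2437), (40.7128, -74.0060), (41.8781, -87.6298)]
print(round(median_distance(los_angeles, followers)), "miles")
```

The median is less sensitive than the mean to a handful of very distant followers, which matters when fans are spread nationally.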
WHO ARE THE FANS? 
COMMON DEMOGRAPHIC EXTRACTION
From Twitter locations to zip code then demographic data. 
Sample of 3262 of 10k Followers Geo-located IDs 
Zip    City                 Population  Avg house value  Income below poverty  Total businesses  Total households 
91766  Pomona, CA           71,599      $142,800         15.4%                 803 
93301  Bakersfield, CA      12,248      $109,600         20.4%                 1,438 
91606  North Hollywood, CA  44,958      $170,100         15.4%                 622               14,903 
WE CAN GET MORE GRANULAR.
Sample of 3775 of 10k Followers Geo-located IDs 
Zip    City          Population  Avg house value  Income below poverty  Total businesses  Total households 
44107  Lakewood, OH  52,244      $117,900         16.4%                 945               25,333 
44139  Solon, OH     24,356      $215,700         16.4%                 1,155             8,693 
44304  Akron, OH     5,916       $56,300          13.0%                 172               1,637 
WE CAN GET MORE GRANULAR. 
From Twitter locations to zip code then demographic data.
Sample of 1335 of 10k Followers Geo-located IDs 
Zip    City         Population  Avg house value  Income below poverty  Total businesses  Total households 
10462  Bronx, NY    75,784      $192,600         27.9%                 1,002             29,855 
14223  Buffalo, NY  22,665      $85,700          13.9%                 328               9,832 
75060  Irving, TX   45,980      $83,300          17.2%                 503 
WE CAN GET MORE GRANULAR. 
From Twitter locations to zip code then demographic data.
FURTHER INSIGHTS OF ZIP 91766, POMONA CA 
At the zip code and metropolitan area there are 
countless dimensions that may aid in fan 
segmentation and marketing. 
• Ranked #1 Drought Riskiest Cities 
• Ranked #15 Riskiest for Identity Theft 
• Ranked #5 Most Irritation Prone City 
Sources: 
http://www.census.gov 
http://emergency.cdc.gov/snaps/data/39/39153.htm 
http://www.bestplaces.net/rankings/zip-code/ohio/akron/44304 
• Ranked #8 Healthiest 
• Ranked #13 Best City for Teleworking 
• Ranked #6 Most Single City 
Charts: 
Population: White, Black, Hispanic, Asian, Hawaiian, Indian, Other 
Gender: male, female 
Households: total.households, house w/child 
Immigration: Mexico, El Salvador, Philippines, Guatemala, Korea, China, Vietnam, Iran
FURTHER INSIGHTS OF ZIP 44304, AKRON OH 
Population chart: White, Black, Asian, Hawaiian, Indian, Other 
At the zip code and metropolitan area there are 
countless dimensions that may aid in fan 
segmentation and marketing. 
Gender chart: male, female 
Households chart: total.households, house w/child 
Immigration chart: India, Germany, Yugoslavia, UK, Italy, Canada, China, other 
• Ranked #1 Best City for Thanksgiving 
• Ranked #4 Best Cities for Teleworking 
• Ranked #25 America’s Best Cities for Dating 
Sources: 
http://www.census.gov 
http://emergency.cdc.gov/snaps/data/39/39153.htm 
http://www.bestplaces.net/rankings/zip-code/ohio/akron/44304 
• Ranked #64 Most Popular City for the Holidays 
• Ranked #73 America’s Most Stressful Cities 
• Ranked #140 2005 Best Places to Live
FURTHER INSIGHTS OF ZIP 10462, BRONX NY 
At the zip code and metropolitan area there are 
countless dimensions that may aid in fan 
segmentation and marketing. 
• Ranked #2 Least Crime for Large Metro Area 
• Ranked #2 Sleepless Cities 2011 
• Ranked #3 Most Single Cities 
Sources: 
http://www.census.gov 
http://emergency.cdc.gov/snaps/data/39/39153.htm 
http://www.bestplaces.net/rankings/zip-code/ohio/akron/44304 
• Ranked #9 Most Irritation Prone Cities 
• Ranked #14 Healthiest Cities 
• Ranked #28 Most Playful Cities 
Charts: 
Population: White, Black, Hispanic, Asian, Hawaiian, Indian, Other 
Gender: male, female 
Households: total.households, house w/child 
Immigration: Dominican, Jamaica, Mexico, Guyana, Ecuador, Caribbean, Honduras, Ghana
WHAT ARE THE FANS TALKING ABOUT? 
INTERESTING TOPICS AND NAMED ENTITY RECOGNITION
• Free Twitter API 
1.1K Tweets 
• Tweets mentioning “Indians” 
• 7/31 & 8/1 
• “Tokenize” single words into unique two 
word groups 
• Trade mentions 
• Masterson to Cardinals for Ramsey 
• Cabrera to Nationals for Walters 
• Throwback jerseys for KC Royals game 
• Mariners game attendees 7/31
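The two-word tokenization mentioned above can be illustrated with a short sketch. The function names and the sample tweets are assumptions for the example; the slides do not show the actual tokenizer used.

```python
from collections import Counter


def bigrams(text):
    """Split cleaned text into adjacent two-word groups."""
    tokens = text.split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def top_bigrams(docs, n=10):
    """Count bigrams across all documents and return the n most frequent."""
    counts = Counter()
    for doc in docs:
        counts.update(bigrams(doc))
    return counts.most_common(n)


# Hypothetical cleaned tweets, for illustration only.
tweets = [
    "indians wear throwback jerseys tonight",
    "love the throwback jerseys",
    "masterson traded to cardinals",
]
print(top_bigrams(tweets, n=3))
```

Counting two-word groups rather than single words is what lets phrases such as a trade or a jersey promotion surface as topics instead of being split into unrelated terms.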
DIFFERENCES OF WORD CLOUDS: SIMPLE WORD CLOUD, COMPARISON CLOUD, 
COMMON CLOUD AND POLARIZED CLOUD 
[Diagram comparing two texts, text1 and text2, under three views: Simple Word Cloud, Commonality & Polarized Cloud, Comparison Cloud]
12K Tweets 
• Includes a mix of free API access and 
full fire hose paid API access over 48 distinct 
hours 
• Sampling occurred August 1 and 
August 13 
• Tweets mentioning “Dodgers” most 
often discussed 
• Clayton Kershaw’s appearance 
on Jimmy Kimmel Live 
• FCC Chairman’s letter to Time 
Warner CEO about the Dodgers’ 
TV channel
2K Spanish 
Tweets 
• Free Twitter API Spanish language 
search over 48 distinct hours 
• Sampling occurred July 29 and 
August 12 
• Tweets mentioning “Dodgers” and 
used Spanish most often discussed 
• The AP story of Dan Haren 
beating the Braves 
• Vin Scully retiring was a smaller 
topic although present 
Example: 
Dodgers vencen a Bravos con 2 jonrones de Kemp http://t.co/9U7xiIPOdo 
#noticias 
Dodgers beat Braves with 2 homers by Kemp http://t.co/9U7xiIPOdo #news
235 Blogs 
Treemap 
Sentiment 
• July 29-July 31 
• Group is Correlated Topic Modeling 
• Color is sentiment 
•Area is blog length 
• Takeaways: 
• Babe Ruth’s birthday is shared with 
Laurence Fishburne, born in Augusta 
Georgia – picked up blogs mentioning 
“birthdays on this date” 
• Eli Manning wants to remember advice of 
Derek Jeter 
• Pending trade deadline 
• ESPNNewYork writer Wallace Matthews 
• Game recaps
Dissimilar 
Words 
• Full FB Firehose of public posts 
• Sampling occurred 
• Dodgers: 
July 29 – July 31 
• Yankees: 
July 28 – July 31 
• FB mentions of Dodgers and 
Yankees tagged as English 
• Marketing posts about the red New 
York Yankees World Series 
edition fitted cap that Spike 
Lee requested
Words in Common 
• Full FB Firehose of public posts 
• Sampling occurred 
• Dodgers: 
July 29 – July 31 
• Yankees: 
July 28 – July 31 
• FB mentions of Dodgers and Yankees 
tagged as English 
• As expected, trades to improve the season 
were mentioned for both teams as the 
trade deadline approached
COMPARATIVE ANALYSIS – BIGRAMS IN COMMON 
• Full FB Firehose of 
public posts 
• Sampling occurred 
• Dodgers: 
Jul 29 – Jul 31 
• Yankees: 
Jul 28 – Jul 31 
• FB mentions of 
Dodgers and 
Yankees tagged as 
English 
red sox 
Equal Mentions
FEELINGS TOWARDS THE TEAM 
SIMPLE SENTIMENT ANALYSIS
EXAMPLE POLARITY SCORING IN TWITTER 
There are many words in natural language, but there is a steep decline in everyday usage. 
Word usage follows a predictable distribution: Zipf’s law. 
[Chart: Top 100 Word Usage from 3M Tweets; x-axis: word rank, 1 to 100; y-axis: usage count, 0 to 900,000] 
The top two words in spoken English are “the” and “be”. The top two words on 
Twitter are “RT” and “I”. However, the power-law distribution is similar and 
follows Zipf’s law.
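A rank-frequency table like the one behind this chart can be built by counting tokens and sorting by count. Under Zipf's law, frequency is roughly proportional to 1/rank, so the product of rank and frequency stays roughly constant. The sketch below uses a toy token list as an assumption; it is not the 3M-tweet data.

```python
from collections import Counter


def rank_frequency(tokens, top=100):
    """Return (rank, word, frequency) tuples for the most frequent words."""
    counts = Counter(tokens).most_common(top)
    return [(rank, word, freq) for rank, (word, freq) in enumerate(counts, start=1)]


# Toy data constructed so that frequency is exactly proportional to 1/rank.
tokens = ["rt"] * 6 + ["i"] * 3 + ["the"] * 2
for rank, word, freq in rank_frequency(tokens):
    print(rank, word, freq, rank * freq)
```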
SENTIMENT POLARITY ANALYSIS 
Surprise is a sentiment. 
Hit by a bus! – Negative 
polarity but surprising. 
Won the lottery! – Positive 
polarity but still surprising. 
Use the University of Pittsburgh’s MPQA Lexicon 
& Illocution Inc’s 10K top Twitter words. 
Keyword Scanning for 
polarity 
R script scans for 3546 positive 
words, and 5701 negative 
words. It adds positive words 
and subtracts negative ones. 
The final score represents the 
polarity of the social 
interaction. 
•I loathe the Tigers. -1 
•I love Lou Whitaker. He was the 
best. +2 
•I like the Tigers but dislike going to 
the stadium. 0
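The keyword-scanning approach can be sketched in a few lines of Python. The two small word sets below are stand-ins chosen to reproduce the three examples; they are assumptions, not the MPQA lexicon or the 3546 positive and 5701 negative words the R script scans for.

```python
import re

# Tiny stand-in lexicons (assumptions); the real lists are far larger.
POSITIVE = {"love", "like", "best"}
NEGATIVE = {"loathe", "dislike"}


def polarity(text, positive=POSITIVE, negative=NEGATIVE):
    """Add one for each positive word and subtract one for each negative word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(t in positive for t in tokens) - sum(t in negative for t in tokens)


for tweet in [
    "I loathe the Tigers.",
    "I love Lou Whitaker. He was the best.",
    "I like the Tigers but dislike going to the stadium.",
]:
    print(polarity(tweet), tweet)
```

The third example shows the main limitation of simple scanning: a mixed statement nets out to zero even though it carries both a positive and a negative opinion.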
DODGER SENTIMENT ON TWITTER 9/5 
Median: -1 
Mean: -0.47
INDIANS SENTIMENT ON TWITTER 9/5 
Median: 0 
Mean: -0.1198
YANKEE SENTIMENT ON TWITTER 9/5 
Median: 0 
Mean: -0.118
IN COMPARISON… 
hey..yankees....can ya score some runs?! 
indians activate murphy from disabled list http://t.co/bqliintwsf 
dodgers rhp josh beckett won't return this season 
Team     Tweets>=1  Tweets<=-1  Total w/o 0  % positive 
Yankees  280        406         686          41% 
Indians  290        456         746          39% 
Dodgers  448        1,226       1,674        27%
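The "% positive" column is the share of positive tweets among tweets with a non-zero score. A minimal sketch of that arithmetic, together with the median and mean summaries, is below; the function names and the short score list are assumptions for illustration.

```python
from statistics import mean, median


def pct_positive(positive, negative):
    """Share of positive tweets among all non-neutral tweets, in percent."""
    return 100 * positive / (positive + negative)


def summarize_scores(scores):
    """Summarize a list of per-tweet polarity scores."""
    positive = sum(1 for s in scores if s >= 1)
    negative = sum(1 for s in scores if s <= -1)
    non_zero = positive + negative
    return {
        "median": median(scores),
        "mean": mean(scores),
        "positive": positive,
        "negative": negative,
        "pct_positive": pct_positive(positive, negative) if non_zero else None,
    }


# Counts taken from the table above.
for team, pos, neg in [("Yankees", 280, 406), ("Indians", 290, 456), ("Dodgers", 448, 1226)]:
    print(team, round(pct_positive(pos, neg)))

# Hypothetical scores, only to show the summary function.
print(summarize_scores([2, 1, 0, -1, -1, -3]))
```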
WHAT IS THE POINT OF ALL THIS? 
TARGETED MARKETING EFFORTS, EVANGELISTS, REFINED SEGMENTATION, MEDIA MIX MODELING 
LEADING TO ROI
EXAMPLE IDENTIFY EVANGELISTS, INFLUENCERS & DETRACTORS 
• When engaging on social media it 
is important to note the clout of 
followers in terms of status updates, 
and followers 
• Running sentiment analysis on 
updates/posts adds context to the 
voice of the customer 
• Appending other data allows for 
additional segmentation, and 
differentiated customer 
experiences e.g. my Yankee story 
10K Indians Followers less 138 outliers
MEDIA MIX MODELING FOR SOCIAL MEDIA ROI 
• In lieu of actual 
merchandise sales data and 
marketing spend, Amazon 
Sales Rank was tracked 
hourly from 4/1 to 
8/31 
• Relative measure of 
sales against other 
“Sports and Outdoors” 
category items 
•Lower number is better
DODGER CAP AVERAGE HOURLY SALES RANK PER DAY 
[Chart: Dodger cap average hourly sales rank per day, 1-Apr to 26-Aug; y-axis: sales rank, 0 to 4,500] 
Amazon sales rank, when seen as a time series, is not stationary. Overall the Dodgers cap sales rank 
shows an increasing trend despite the team being successful on the field, and it has some periodicity 
based on day of week.
Time Series Decomposition 
• Econometric forecasting TSD 
was used in an attempt to 
isolate social media impact 
and understand sales rank 
patterns 
• Trend is likely the impact of 
baseball season excitement 
then waning to other sports 
• Seasonal may be the impact 
of retail day of the week 
cycles 
• Leaving random as the 
dependent variable in the 
media mix GLM
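The split into trend, seasonal and random components can be illustrated with a classical additive decomposition. The slides do not state which method or settings were used, so the sketch below makes its own assumptions: a weekly period of 7, a centered moving average for the trend, and synthetic data in place of the sales rank series.

```python
from statistics import mean


def decompose_additive(series, period=7):
    """Classical additive decomposition for an odd period.

    Returns (trend, seasonal, random_part); trend and random_part are None
    at the edges where the centered moving average is undefined.
    """
    if period % 2 == 0:
        raise ValueError("this sketch handles odd periods only")
    half = period // 2
    n = len(series)
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = mean(series[i - half:i + half + 1])
    # Average the detrended values for each position in the cycle.
    detrended = [(i, series[i] - trend[i]) for i in range(n) if trend[i] is not None]
    raw = [mean(v for i, v in detrended if i % period == k) for k in range(period)]
    centre = mean(raw)
    cycle = [s - centre for s in raw]  # seasonal effects sum to zero
    seasonal = [cycle[i % period] for i in range(n)]
    random_part = [
        None if trend[i] is None else series[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, random_part


# Synthetic daily series: rising trend plus a weekly pattern (illustrative only).
weekly = [3, -1, -2, 0, 1, -3, 2]
series = [100 + 2 * i + weekly[i % 7] for i in range(28)]
trend, seasonal, random_part = decompose_additive(series)
print(trend[3], seasonal[:7], random_part[3])
```

The series needs to span at least a few full periods so that every position in the cycle has detrended values to average.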
Tweets to Decomposed Amazon Sales Rank 
• Correlation is only -0.08. 
• Given the tweets are 
examined against ‘random’ 
or unexplained data the 
relationship may still be 
relevant. 
•As this is proxy data for sales 
of a single item, results not 
conclusive 
[Scatter plot: tweets against decomposed Amazon sales rank; axis ranges 0 to 100 and -1000 to 1000] 
*removed dates with missing data
Tweets to Average Daily Amazon Sales Rank 
• Much stronger correlation: 
-0.24 
• Leads one to believe the 
more a team tweets the 
lower the sales rank 
•As this is proxy data for sales 
of a single item, results not 
conclusive 
*removed dates with missing data 
[Scatter plot: tweets against average daily Amazon sales rank; axis ranges 0 to 100 and 0 to 4,500]
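The correlations quoted on these two slides are of the kind produced by a Pearson correlation coefficient, which can be computed as sketched below. The slides do not say which coefficient was used, so Pearson is an assumption here, and the daily tweet counts and sales ranks are made-up values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical daily tweet counts and average daily sales ranks.
tweets_per_day = [12, 30, 25, 40, 18, 55]
sales_rank = [3100, 2800, 2950, 2400, 3300, 2500]
print(round(pearson(tweets_per_day, sales_rank), 2))
```

Because a lower sales rank is better, a negative coefficient is the direction that would suggest more tweets coincide with stronger sales.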
Media mix modeling 
*removed dates with missing data 
• Given the likely relationship: 
• Set up a GLM using marketing 
efforts and media spend as inputs, with the 
dependent variable being 
revenue, ticket sales, 
merchandise sales etc. 
• The coefficients of the inputs 
illustrate the impact of the 
channel marketing spends 
leading you to ROI 
Example: 
$$f(\text{sales}) = \beta_0 + \beta_1\,\text{social.media.spend} + \beta_2\,\text{traditional.mktg.spend} + \beta_3\,\text{team.performance} \ldots \beta_n + \epsilon$$ 
The goal is increased model lift, and accuracy by 
incorporating social media spend. The coefficient of the 
variable demonstrates the impact. This will allow you to 
calculate a ROI of social spend.
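With a normal error term and an identity link, a GLM of this form reduces to ordinary least squares, which can be fitted by solving the normal equations. The sketch below takes that simplest case as an assumption; the function name fit_ols and all of the spend, performance and sales figures are hypothetical.

```python
def fit_ols(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    X is a list of rows without an intercept column. Returns [b0, b1, ..., bk].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y, stored as an augmented matrix.
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for j in range(col, k + 1):
                M[r][j] -= factor * M[col][j]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (M[i][k] - sum(M[i][j] * beta[j] for j in range(i + 1, k))) / M[i][i]
    return beta


# Hypothetical weekly data: social spend, traditional spend, team performance.
X = [[1, 5, 0], [2, 3, 1], [3, 8, 0], [4, 1, 1], [5, 9, 1], [6, 2, 0]]
sales = [10 + 2 * a + 0.5 * b - 1 * c for a, b, c in X]
b0, b_social, b_traditional, b_team = fit_ols(X, sales)
print(round(b_social, 3))  # estimated change in sales per unit of social spend
```

In this setup the coefficient on social spend is the estimated change in sales per unit of spend, which is the quantity an ROI calculation would build on.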
FURTHER INFO 
Want example R scripts for the visuals? 
www.sportsanalytics.org starting 9/15

Weitere ähnliche Inhalte

Andere mochten auch

Sport Analytics Innovation Summit
Sport Analytics Innovation SummitSport Analytics Innovation Summit
Sport Analytics Innovation SummitEric Lewallen
 
How Big Data Is Revolutionizing Sports?
How Big Data Is Revolutionizing Sports?How Big Data Is Revolutionizing Sports?
How Big Data Is Revolutionizing Sports?Aditi Singh
 
Big data analytics in sports industry
Big data analytics in sports industryBig data analytics in sports industry
Big data analytics in sports industryPromptCloud
 
Sports Analytics in the Era of Big Data and Data Science
Sports Analytics in the Era of Big Data and Data ScienceSports Analytics in the Era of Big Data and Data Science
Sports Analytics in the Era of Big Data and Data ScienceKonstantinos Pelechrinis
 
7 Ways Sports Teams Win With Sports Analytics
7 Ways Sports Teams Win With Sports Analytics7 Ways Sports Teams Win With Sports Analytics
7 Ways Sports Teams Win With Sports AnalyticsTableau Software
 
Sports Analytics
Sports AnalyticsSports Analytics
Sports AnalyticsMark Conway
 
[Challenge:Future] Project Panorama !
[Challenge:Future] Project Panorama ![Challenge:Future] Project Panorama !
[Challenge:Future] Project Panorama !Challenge:Future
 
SAS/MIT/Sloan Data Analytics
SAS/MIT/Sloan Data AnalyticsSAS/MIT/Sloan Data Analytics
SAS/MIT/Sloan Data AnalyticsSteven Kimber
 
Episode 34 of the DSMSports Podcast w/ Jacob Rosen on Sports Analytics
Episode 34 of the DSMSports Podcast w/ Jacob Rosen on Sports AnalyticsEpisode 34 of the DSMSports Podcast w/ Jacob Rosen on Sports Analytics
Episode 34 of the DSMSports Podcast w/ Jacob Rosen on Sports AnalyticsNeil Horowitz
 
K.i.s.s with presentation
K.i.s.s with presentation K.i.s.s with presentation
K.i.s.s with presentation Sylvia Liang
 
Disrupting in the digital era: key traits of an evolution of disruptive innov...
Disrupting in the digital era: key traits of an evolution of disruptive innov...Disrupting in the digital era: key traits of an evolution of disruptive innov...
Disrupting in the digital era: key traits of an evolution of disruptive innov...Andrea Paraboschi
 
Imagine the Possibilities: An All-in-One Treasury and Risk Management Solution
Imagine the Possibilities: An All-in-One Treasury and Risk Management SolutionImagine the Possibilities: An All-in-One Treasury and Risk Management Solution
Imagine the Possibilities: An All-in-One Treasury and Risk Management SolutionReval
 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsAjay Ohri
 
MYagonism basketball analytics innovation -public-
MYagonism basketball analytics innovation -public-MYagonism basketball analytics innovation -public-
MYagonism basketball analytics innovation -public-Paolo Raineri
 
Sports Analytics Innovation Summit - Data Powered Storytelling
Sports Analytics Innovation Summit - Data Powered StorytellingSports Analytics Innovation Summit - Data Powered Storytelling
Sports Analytics Innovation Summit - Data Powered StorytellingNuno Santos
 
Batter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and StormBatter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and StormRevolution Analytics
 
Big Data BizViz Sports Analytics
Big Data BizViz Sports AnalyticsBig Data BizViz Sports Analytics
Big Data BizViz Sports AnalyticsBig Data BizViz LLC
 

Andere mochten auch (20)

Sport Analytics Innovation Summit
Sport Analytics Innovation SummitSport Analytics Innovation Summit
Sport Analytics Innovation Summit
 
Big Data for Big Sports
Big Data for Big SportsBig Data for Big Sports
Big Data for Big Sports
 
How Big Data Is Revolutionizing Sports?
How Big Data Is Revolutionizing Sports?How Big Data Is Revolutionizing Sports?
How Big Data Is Revolutionizing Sports?
 
Big data analytics in sports industry
Big data analytics in sports industryBig data analytics in sports industry
Big data analytics in sports industry
 
Sports Analytics in the Era of Big Data and Data Science
Sports Analytics in the Era of Big Data and Data ScienceSports Analytics in the Era of Big Data and Data Science
Sports Analytics in the Era of Big Data and Data Science
 
7 Ways Sports Teams Win With Sports Analytics
7 Ways Sports Teams Win With Sports Analytics7 Ways Sports Teams Win With Sports Analytics
7 Ways Sports Teams Win With Sports Analytics
 
Sports Analytics
Sports AnalyticsSports Analytics
Sports Analytics
 
Sports Social Media Analytics
Sports Social Media AnalyticsSports Social Media Analytics
Sports Social Media Analytics
 
[Challenge:Future] Project Panorama !
[Challenge:Future] Project Panorama ![Challenge:Future] Project Panorama !
[Challenge:Future] Project Panorama !
 
SAS/MIT/Sloan Data Analytics
SAS/MIT/Sloan Data AnalyticsSAS/MIT/Sloan Data Analytics
SAS/MIT/Sloan Data Analytics
 
Episode 34 of the DSMSports Podcast w/ Jacob Rosen on Sports Analytics
Episode 34 of the DSMSports Podcast w/ Jacob Rosen on Sports AnalyticsEpisode 34 of the DSMSports Podcast w/ Jacob Rosen on Sports Analytics
Episode 34 of the DSMSports Podcast w/ Jacob Rosen on Sports Analytics
 
K.i.s.s with presentation
K.i.s.s with presentation K.i.s.s with presentation
K.i.s.s with presentation
 
Analytics - Sports Style, ESPN
Analytics - Sports Style, ESPNAnalytics - Sports Style, ESPN
Analytics - Sports Style, ESPN
 
Disrupting in the digital era: key traits of an evolution of disruptive innov...
Disrupting in the digital era: key traits of an evolution of disruptive innov...Disrupting in the digital era: key traits of an evolution of disruptive innov...
Disrupting in the digital era: key traits of an evolution of disruptive innov...
 
Imagine the Possibilities: An All-in-One Treasury and Risk Management Solution
Imagine the Possibilities: An All-in-One Treasury and Risk Management SolutionImagine the Possibilities: An All-in-One Treasury and Risk Management Solution
Imagine the Possibilities: An All-in-One Treasury and Risk Management Solution
 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports Analytics
 
MYagonism basketball analytics innovation -public-
MYagonism basketball analytics innovation -public-MYagonism basketball analytics innovation -public-
MYagonism basketball analytics innovation -public-
 
Sports Analytics Innovation Summit - Data Powered Storytelling
Sports Analytics Innovation Summit - Data Powered StorytellingSports Analytics Innovation Summit - Data Powered Storytelling
Sports Analytics Innovation Summit - Data Powered Storytelling
 
Batter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and StormBatter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and Storm
 
Big Data BizViz Sports Analytics
Big Data BizViz Sports AnalyticsBig Data BizViz Sports Analytics
Big Data BizViz Sports Analytics
 

Ähnlich wie Quantifying Fan Engagement using Social Media

6 things to expect when you are visualizing
6 things to expect when you are visualizing6 things to expect when you are visualizing
6 things to expect when you are visualizingKrist Wongsuphasawat
 
Social Media Training :: Market Research Assoc. 2010
Social Media Training :: Market Research Assoc. 2010Social Media Training :: Market Research Assoc. 2010
Social Media Training :: Market Research Assoc. 2010Eric Schwartzman
 
Social Media Training :: Market Research Association :: First Outlook Conference
Social Media Training :: Market Research Association :: First Outlook ConferenceSocial Media Training :: Market Research Association :: First Outlook Conference
Social Media Training :: Market Research Association :: First Outlook ConferenceEric Schwartzman
 
Social media-training-market-research-assoc-2010
Social media-training-market-research-assoc-2010Social media-training-market-research-assoc-2010
Social media-training-market-research-assoc-2010Eric Schwartzman
 
Twitterface: A viral marketing concept
Twitterface: A viral marketing conceptTwitterface: A viral marketing concept
Twitterface: A viral marketing conceptAra Pehlivanian
 
Top 5 Social Media Networks. 5 Best Practices.
Top 5 Social Media Networks. 5 Best Practices.Top 5 Social Media Networks. 5 Best Practices.
Top 5 Social Media Networks. 5 Best Practices.RezStream
 
Trumping the Polls: Event Analysis During the 2016 Presidential Election
Trumping the Polls: Event Analysis During the 2016 Presidential ElectionTrumping the Polls: Event Analysis During the 2016 Presidential Election
Trumping the Polls: Event Analysis During the 2016 Presidential ElectionJinho Choi
 
Open Data: Analysis and Visualisation
Open Data: Analysis and VisualisationOpen Data: Analysis and Visualisation
Open Data: Analysis and VisualisationDr Muhammad Adnan
 
Harness the Potential of Local Search for Your Business
Harness the Potential of Local Search for Your BusinessHarness the Potential of Local Search for Your Business
Harness the Potential of Local Search for Your BusinessHubSpot
 
Building a Knowledge Graph
Building a Knowledge GraphBuilding a Knowledge Graph
Building a Knowledge GraphDanBennett47
 
Digital Culture Industry: Writing a Digital History with Digital Documents (P...
Digital Culture Industry: Writing a Digital History with Digital Documents (P...Digital Culture Industry: Writing a Digital History with Digital Documents (P...
Digital Culture Industry: Writing a Digital History with Digital Documents (P...James Allen-Robertson
 
Fundamentals for the New Era PR Pro with Sarah Evans
Fundamentals for the New Era PR Pro with Sarah EvansFundamentals for the New Era PR Pro with Sarah Evans
Fundamentals for the New Era PR Pro with Sarah EvansCision
 
6 things to expect when you are visualizing (2020 Edition)
6 things to expect when you are visualizing (2020 Edition)6 things to expect when you are visualizing (2020 Edition)
6 things to expect when you are visualizing (2020 Edition)Krist Wongsuphasawat
 
Thriving in a digital world
Thriving in a digital worldThriving in a digital world
Thriving in a digital worldSports Geek
 
From Chirps to Whistles - Discovering Event-specific Informative Content from...
From Chirps to Whistles - Discovering Event-specific Informative Content from...From Chirps to Whistles - Discovering Event-specific Informative Content from...
From Chirps to Whistles - Discovering Event-specific Informative Content from...Debanjan Mahata
 
Annotated Bibliography DUE 0319. Please upload to Canvas. This.docx
Annotated Bibliography DUE 0319. Please upload to Canvas. This.docxAnnotated Bibliography DUE 0319. Please upload to Canvas. This.docx
Annotated Bibliography DUE 0319. Please upload to Canvas. This.docxdaniahendric
 
Immersive Recommendation
Immersive RecommendationImmersive Recommendation
Immersive Recommendation承剛 謝
 
What I tell myself before visualizing
What I tell myself before visualizingWhat I tell myself before visualizing
What I tell myself before visualizingKrist Wongsuphasawat
 

Ähnlich wie Quantifying Fan Engagement using Social Media (20)

6 things to expect when you are visualizing
6 things to expect when you are visualizing6 things to expect when you are visualizing
6 things to expect when you are visualizing
 
Social Media Training :: Market Research Assoc. 2010
Social Media Training :: Market Research Assoc. 2010Social Media Training :: Market Research Assoc. 2010
Social Media Training :: Market Research Assoc. 2010
 
Social Media Training :: Market Research Association :: First Outlook Conference
Social Media Training :: Market Research Association :: First Outlook ConferenceSocial Media Training :: Market Research Association :: First Outlook Conference
Social Media Training :: Market Research Association :: First Outlook Conference
 
Social media-training-market-research-assoc-2010
Social media-training-market-research-assoc-2010Social media-training-market-research-assoc-2010
Social media-training-market-research-assoc-2010
 
Twitterface: A viral marketing concept
Twitterface: A viral marketing conceptTwitterface: A viral marketing concept
Twitterface: A viral marketing concept
 
Top 5 Social Media Networks. 5 Best Practices.
Top 5 Social Media Networks. 5 Best Practices.Top 5 Social Media Networks. 5 Best Practices.
Top 5 Social Media Networks. 5 Best Practices.
 
Trumping the Polls: Event Analysis During the 2016 Presidential Election
Trumping the Polls: Event Analysis During the 2016 Presidential ElectionTrumping the Polls: Event Analysis During the 2016 Presidential Election
Trumping the Polls: Event Analysis During the 2016 Presidential Election
 
Open Data: Analysis and Visualisation
Open Data: Analysis and VisualisationOpen Data: Analysis and Visualisation
Open Data: Analysis and Visualisation
 
Harness the Potential of Local Search for Your Business
Harness the Potential of Local Search for Your BusinessHarness the Potential of Local Search for Your Business
Harness the Potential of Local Search for Your Business
 
Building a Knowledge Graph
Building a Knowledge GraphBuilding a Knowledge Graph
Building a Knowledge Graph
 
Digital Culture Industry: Writing a Digital History with Digital Documents (P...
Digital Culture Industry: Writing a Digital History with Digital Documents (P...Digital Culture Industry: Writing a Digital History with Digital Documents (P...
Digital Culture Industry: Writing a Digital History with Digital Documents (P...
 
Fundamentals for the New Era PR Pro with Sarah Evans
Fundamentals for the New Era PR Pro with Sarah EvansFundamentals for the New Era PR Pro with Sarah Evans
Fundamentals for the New Era PR Pro with Sarah Evans
 
6 things to expect when you are visualizing (2020 Edition)
6 things to expect when you are visualizing (2020 Edition)6 things to expect when you are visualizing (2020 Edition)
6 things to expect when you are visualizing (2020 Edition)
 
Thriving in a digital world
Thriving in a digital worldThriving in a digital world
Thriving in a digital world
 
Final presentation
Final presentationFinal presentation
Final presentation
 
Chick-fil-a Leadercast 2011: Social Media Usage
Chick-fil-a Leadercast 2011: Social Media UsageChick-fil-a Leadercast 2011: Social Media Usage
Chick-fil-a Leadercast 2011: Social Media Usage
 
From Chirps to Whistles - Discovering Event-specific Informative Content from...
From Chirps to Whistles - Discovering Event-specific Informative Content from...From Chirps to Whistles - Discovering Event-specific Informative Content from...
From Chirps to Whistles - Discovering Event-specific Informative Content from...
 

Quantifying Fan Engagement using Social Media

  • 1. SOCIAL MEDIA ANALYTICS TO QUANTIFY FAN ENGAGEMENT DR. ROBERT BAKER TED KWARTLER Get a more complete profile of your fans to inform business decisions and improve ROI calculations.
  • 2. AGENDA Basics Where are the fans? Who are the fans? What are fans talking about? How do the fans feel towards the team? What is the point of all this?
  • 3. A FAN’S EXPERIENCE If only there had been social media, the Yankees could have profiled my experience.
  • 4. BASICS WHAT IS TEXT MINING?
  • 5. SOCIAL MEDIA ANALYTICS REQUIRES TEXT MINING Before text mining. After text mining. Text mining lets you “drink from a fire hose” of information and distill useful meaning.
  • 6. WHAT IS TEXT MINING? Text mining is an emerging technology that can be used to augment existing data by making unstructured text available for analysis and decision making. Process flow: unstructured natural language texts (surveys, tweets, articles, emails, blogs, reviews, texts) → apply standard and domain specific rules → organized into a Document Term Matrix (DTM) or Term Document Matrix (TDM) → insight & recommendation.
  • 7. EXAMPLE UNSTRUCTURED TEXT SOURCES Many sources including emails, forum posts, tweets, books, pdfs, reviews, transcripts etc. Unstructured natural language texts 杜兰特和詹姆斯谁才是当今联盟的头牌?这是最近很火热的话题。一方 面杜兰特高居得分榜首位,在MVP权力榜上也雄踞第一;另一方面詹姆 斯带领热火一切为了三连冠,比赛沉稳... Had my first experience at TD Garden when my Bulls came to play the Celtics. Being someone with an out of state license living in Boston, I usually carry my passport anyway, but I had a friend in town and wanted to clear up this ID controversy I read so much about in the rules.
  • 8. EXAMPLE PRE-PROCESSING STEPS In a “bag of words” text mining methodology the corpus must be cleaned. Cleaning often means making items lower case and removing punctuation, numbers and extra whitespace. In specific instances domain specific rules are applied (e.g. removing “RT” for retweet). The steps below can also be run in other software (e.g. Python NLTK):
      1. Make all text lower case
      2. For Twitter, remove “RT” for retweet
      3. Remove symbols like “@”
      4. Remove punctuation
      5. Remove numbers
      6. Remove URLs e.g. http://www.espn.com
      7. Remove extra whitespace
      8. Remove “stopwords”
      9. Others as needed depending on
      Apply standard and domain specific rules:
      Cleaned Version: no doubt derek jeter makes my top all time with babe lou yankee clipper mick
      Original: 杜兰特和詹姆斯谁才是当今联盟的头牌?这是最近很火热的话题。一方面杜兰特高居得分榜首位,在MVP权力榜上也雄踞第一;另一方面詹姆斯带领热火一切为了三连冠,比赛沉稳...
      Translated Version: Durant and James, who is the league's first card today? This is a very hot topic recently. On the one hand Durant highest scoring top position in the standings MVP authority also ranked first; on the other hand, James led the Heat everything for three consecutive years, the race calm ...
      Cleaned Version: durant james who league first card today very hot topic recently on one hand durant highest scoring top position standings MVP authority ranked first other hand, james led heat everything three consecutive years race calm
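The cleaning steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' script: the function name `clean_text` and the tiny `STOPWORDS` set are assumptions made for the example, and only ASCII punctuation is handled.

```python
import re
import string

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "of", "to", "and", "for", "in", "on", "at", "this"}


def clean_text(text, stopwords=STOPWORDS):
    """Apply the slide's bag-of-words cleaning steps to one document."""
    text = text.lower()                                    # step 1: lower case
    text = re.sub(r"https?://\S+", " ", text)              # step 6: URLs, removed before punctuation
    text = re.sub(r"\brt\b", " ", text)                    # step 2: retweet marker
    text = text.translate(
        str.maketrans("", "", string.punctuation))         # steps 3-4: symbols and ASCII punctuation
    text = re.sub(r"\d+", " ", text)                       # step 5: numbers
    tokens = [t for t in text.split() if t not in stopwords]  # steps 7-8: whitespace, stopwords
    return " ".join(tokens)


print(clean_text("RT @espn: The Yankees WIN 5-3! http://www.espn.com"))
```

URLs are stripped before punctuation so that the address is removed as one unit instead of being broken into leftover word fragments.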
  • 9. DATA ORGANIZATION Once cleaned, the documents and terms are organized into large matrices. Often they are very sparse and may contain tens of thousands of data points. Attributes may be single words or word tokens of 2 or more words.
      Example documents:
      Tweet_1: no doubt derek jeter makes my top all time with babe lou yankee clipper mick
      Sina_1: durant james who league first card today very hot topic recently on one hand durant highest scoring top position standings MVP authority ranked first other hand, james led heat everything three consecutive years race calm ...
      Document Term Matrix:
      Document   no   doubt   derek   jeter   top   durant   james   termN
      Tweet_1    1    1       1       1       1     0        0       0
      Sina_1     0    0       0       0       1     2        2       1
      docN       …    …       …       …       …     …        …       …
      Term Document Matrix:
      Term    Tweet_1   Sina_1   docN
      no      1         0        …
      doubt   1         0        …
      jeter   1         0        …
      top     1         1        …
      termN   0         1        …
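As a rough sketch of how such matrices might be built, the Python below counts terms per document and then transposes the result. The helper names `document_term_matrix` and `transpose` are hypothetical, and the two short documents are abbreviated stand-ins for the slide's examples.

```python
from collections import Counter


def document_term_matrix(docs):
    """docs maps doc_id -> cleaned text. Returns (terms, rows keyed by doc_id)."""
    counts = {doc_id: Counter(text.split()) for doc_id, text in docs.items()}
    terms = sorted(set().union(*counts.values()))          # shared vocabulary, sorted
    dtm = {doc_id: [c[t] for t in terms] for doc_id, c in counts.items()}
    return terms, dtm


def transpose(terms, dtm):
    """Turn a document-term matrix into a term-document matrix."""
    doc_ids = list(dtm)
    tdm = {t: [dtm[d][i] for d in doc_ids] for i, t in enumerate(terms)}
    return doc_ids, tdm


docs = {
    "Tweet_1": "no doubt derek jeter top",
    "Sina_1": "durant james durant james top",
}
terms, dtm = document_term_matrix(docs)
doc_ids, tdm = transpose(terms, dtm)
print(terms)
print(dtm)
print(tdm["top"], tdm["durant"])
```

Dense lists are used here only for readability; with tens of thousands of mostly zero cells, a sparse representation would usually be preferred.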
  • 10. WHERE ARE THE FANS? LOCATION BASED ATTRIBUTES
  • 13. NYY TWITTER FOLLOWERS -10K SAMPLE
  • 14.
      Team      Total Followers   Sample      Bing API Geo-Located   Median Distance to Stadium
      Dodgers   ~540K             First 10K   2,854                  1,372 miles
      Indians   ~225K             First 10K   3,774                  319 miles
      Yankees   ~1.18K            First 10K   1,335                  713 miles
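A median distance like the one in this table could be computed from geo-located coordinates roughly as follows. This is a sketch under stated assumptions: it uses the haversine great-circle formula (the slides do not say which distance measure was used), and the function names and coordinates are placeholders, not real stadium or follower locations.

```python
import math
from statistics import median

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def median_distance(stadium, followers):
    """Median distance from a stadium (lat, lon) to a list of follower (lat, lon) pairs."""
    return median(haversine_miles(*stadium, lat, lon) for lat, lon in followers)


# Placeholder coordinates chosen only to make the arithmetic easy to check.
stadium = (0.0, 0.0)
followers = [(0.0, 0.0), (0.0, 90.0), (0.0, 180.0)]
print(round(median_distance(stadium, followers), 1))
```

The median is less sensitive than the mean to a handful of followers located on another continent, which may be why the slide reports it.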
  • 15. WHO ARE THE FANS? COMMON DEMOGRAPHIC EXTRACTION
  • 16. From Twitter locations to zip code then demographic data. Sample of 3262 of 10k Followers Geo-located IDs. WE CAN GET MORE GRANULAR.
      Zip     City                  Population   Avg house value   Income below poverty   Total businesses   Total households
      91766   Pomona, CA            71,599       $142,800          15.4%                  803
      93301   Bakersfield, CA       12,248       $109,600          20.4%                  1,438
      91606   North Hollywood, CA   44,958       $170,100          15.4%                  622                14,903
  • 17. From Twitter locations to zip code then demographic data. Sample of 3775 of 10k Followers Geo-located IDs. WE CAN GET MORE GRANULAR.
      Zip     City           Population   Avg house value   Income below poverty   Total businesses   Total households
      44107   Lakewood, OH   52,244       $117,900          16.4%                  945                25,333
      44139   Solon, OH      24,356       $215,700          16.4%                  1,155              8,693
      44304   Akron, OH      5,916        $56,300           13.0%                  172                1,637
  • 18. From Twitter locations to zip code then demographic data. Sample of 1335 of 10k Followers Geo-located IDs. WE CAN GET MORE GRANULAR.
      Zip     City          Population   Avg house value   Income below poverty   Total businesses   Total households
      10462   Bronx, NY     75,784       $192,600          27.9%                  1002               29855
      14223   Buffalo, NY   22,665       $85,700           13.9%                  328                9832
      75060   Irving, TX    45,980       $83,300           17.2%                  503
  • 19. FURTHER INSIGHTS OF ZIP 91766, POMONA CA At the zip code and metropolitan area level there are countless dimensions that may aid in fan segmentation and marketing.
      • Ranked #1 Drought Riskiest Cities
      • Ranked #15 Riskiest for Identity Theft
      • Ranked #5 Most Irritation Prone City
      • Ranked #8 Healthiest
      • Ranked #13 Best City for Teleworking
      • Ranked #6 Most Single City
      Chart categories: Population (White, Black, Hispanic, Asian, Hawaiian, Indian, Other); Gender (male, female); Households (total.households, house w/child); Immigration (Mexico, El Salvador, Philippines, Guatemala, Korea, China, Vietnam, Iran)
      Sources: http://www.census.gov http://emergency.cdc.gov/snaps/data/39/39153.htm http://www.bestplaces.net/rankings/zip-code/ohio/akron/44304
  • 20. FURTHER INSIGHTS OF ZIP 44304, AKRON OH At the zip code and metropolitan area level there are countless dimensions that may aid in fan segmentation and marketing.
      • Ranked #1 Best City for Thanksgiving
      • Ranked #4 Best Cities for Teleworking
      • Ranked #25 America’s Best Cities for Dating
      • Ranked #64 Most Popular City for the Holidays
      • Ranked #73 America’s Most Stressful Cities
      • Ranked #140 2005 Best Places to Live
      Chart categories: Population (White, Black, Asian, Hawaiian, Indian, Other); Gender (male, female); Households (total.households, house w/child); Immigration (India, Germany, Yugoslavia, UK, Italy, Canada, China, other)
      Sources: http://www.census.gov http://emergency.cdc.gov/snaps/data/39/39153.htm http://www.bestplaces.net/rankings/zip-code/ohio/akron/44304
  • 21. FURTHER INSIGHTS OF ZIP 10462, BRONX NY At the zip code and metropolitan area level there are countless dimensions that may aid in fan segmentation and marketing.
      • Ranked #2 Least Crime for Large Metro Area
      • Ranked #2 Sleepless Cities 2011
      • Ranked #3 Most Single Cities
      • Ranked #9 Most Irritation Prone Cities
      • Ranked #14 Healthiest Cities
      • Ranked #28 Most Playful Cities
      Chart categories: Population (White, Black, Hispanic, Asian, Hawaiian, Indian, Other); Gender (male, female); Households (total.households, house w/child); Immigration (Dominican, Jamaica, Mexico, Guyana, Ecuador, Caribbean, Honduras, Ghana)
      Sources: http://www.census.gov http://emergency.cdc.gov/snaps/data/39/39153.htm http://www.bestplaces.net/rankings/zip-code/ohio/akron/44304
  • 22. WHAT ARE THE FANS TALKING ABOUT? INTERESTING TOPICS AND NAMED ENTITY RECOGNITION
  • 23. 1.1K Tweets
      • Free Twitter API
      • Tweets mentioning “Indians”
      • 7/31 & 8/1
      • “Tokenize” single words into unique two word groups
      • Trade mentions: Masterson to Cardinals for Ramsey; Cabrera to Nationals for Walters
      • Throwback jerseys for KC Royals game
      • Mariners game attendees 7/31
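Tokenizing single words into two-word groups (bigrams) can be sketched as below. The function names `bigrams` and `top_bigrams` are illustrative, and the sample texts are invented, not tweets from the deck's data.

```python
from collections import Counter


def bigrams(text):
    """Return adjacent two-word groups from whitespace-separated, cleaned text."""
    tokens = text.split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def top_bigrams(texts, n=10):
    """Count bigrams across many texts and return the n most frequent."""
    counts = Counter()
    for t in texts:
        counts.update(bigrams(t))
    return counts.most_common(n)


print(bigrams("masterson traded to cardinals"))
print(top_bigrams(["red sox win", "red sox lose"], n=1))
```

Bigrams keep phrases such as player names or "red sox" together, which single-word counts would split apart.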
  • 24. DIFFERENCES OF WORD CLOUDS: SIMPLE WORD CLOUD, COMPARISON CLOUD, COMMON CLOUD AND POLARIZED CLOUD [Figure panels: Simple Word Cloud; Commonality & Polarized Cloud; Comparison Cloud]
  • 25. 12K Tweets
      • Includes a mix of free API access and full fire hose paid API over 48 distinct hours
      • Sampling occurred August 1 and August 13
      • Tweets mentioning “Dodgers” most often discussed:
          • Clayton Kershaw’s appearance on Jimmy Kimmel Live
          • FCC Chairman’s letter to Time Warner CEO about the Dodgers’ TV channel
  • 26. 2K Spanish Tweets
      • Free Twitter API Spanish language search over 48 distinct hours
      • Sampling occurred July 29 and August 12
      • Tweets that mentioned “Dodgers” and used Spanish most often discussed:
          • The AP story of Dan Haren beating the Braves
          • Vin Scully retiring was a smaller topic although present
      Example: Dodgers vencen a Bravos con 2 jonrones de Kemp http://t.co/9U7xiIPOdo #noticias
      Translation: Dodgers beat Braves with 2 homers by Kemp http://t.co/9U7xiIPOdo #news
  • 27. 235 Blogs Treemap Sentiment
      • July 29-July 31
      • Group is Correlated Topic Modeling
      • Color is sentiment
      • Area is blog length
      • Takeaways:
          • Babe Ruth’s birthday is shared with Laurence Fishburne, born in Augusta, Georgia – picked up blogs mentioning “birthdays on this date”
          • Eli Manning wants to remember advice of Derek Jeter
          • Pending trade deadline
          • ESPNNewYork writer Wallace Matthews
          • Game recaps
  • 28. Dissimilar Words
      • Full FB Firehose of public posts
      • Sampling occurred: Dodgers: July 29 – July 31; Yankees: July 28 – July 31
      • FB mentions of Dodgers and Yankees tagged as English
      • Marketing posts about the red New York Yankees World Series edition fitted cap that Spike Lee requested
  • 29. Words in Common
      • Full FB Firehose of public posts
      • Sampling occurred: Dodgers: July 29 – July 31; Yankees: July 28 – July 31
      • FB mentions of Dodgers and Yankees tagged as English
      • As expected, trades to improve the season as the deadline approached were mentioned for both teams
  • 30. COMPARATIVE ANALYSIS – BIGRAMS IN COMMON
      • Full FB Firehose of public posts
      • Sampling occurred: Dodgers: Jul 29 – Jul 31; Yankees: Jul 28 – Jul 31
      • FB mentions of Dodgers and Yankees tagged as English
      [Chart labels: red sox; Equal Mentions]
  • 31. FEELINGS TOWARDS THE TEAM SIMPLE SENTIMENT ANALYSIS
  • 32. EXAMPLE POLARITY SCORING IN TWITTER There are many words in natural language, but there is a steep decline in everyday usage. Word usage follows a predictable distribution: Zipf’s law. The top two words in spoken English are “the” and “be”. The top two words in Twitter are “RT” and “I”. However, the power distribution is similar and follows Zipf’s law. [Chart: Top 100 Word Usage from 3M Tweets; horizontal axis 1 to 100, vertical axis 0 to 900,000]
  • 33. SENTIMENT POLARITY ANALYSIS Surprise is a sentiment.
      • Hit by a bus! – Negative polarity but surprising.
      • Won the lottery! – Positive polarity but still surprising.
      Use the University of Pittsburgh’s MPQA Lexicon & Illocution Inc’s 10K top Twitter words.
      Keyword scanning for polarity: an R script scans for 3546 positive words and 5701 negative words. It adds positive words and subtracts negative ones. The final score represents the polarity of the social interaction.
      • I loathe the Tigers. -1
      • I love Lou Whitaker. He was the best. +2
      • I like the Tigers but dislike going to the stadium. 0
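The keyword scan described on this slide can be illustrated with a toy lexicon. The sketch below is in Python, not the R script the slide mentions, and the `POSITIVE` and `NEGATIVE` sets are tiny assumed stand-ins for the 3546-word and 5701-word lists.

```python
import re

# Toy lexicons (assumptions for illustration only).
POSITIVE = {"love", "like", "best"}
NEGATIVE = {"loathe", "dislike"}


def polarity(text, positive=POSITIVE, negative=NEGATIVE):
    """Add 1 for each positive word and subtract 1 for each negative word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(t in positive for t in tokens) - sum(t in negative for t in tokens)


for sentence in [
    "I loathe the Tigers.",
    "I love Lou Whitaker. He was the best.",
    "I like the Tigers but dislike going to the stadium.",
]:
    print(polarity(sentence), sentence)
```

With these toy lists the three sentences score -1, +2 and 0, matching the slide. A plain keyword scan like this ignores negation and intensifiers, so "not good" would still count as positive.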
  • 34. DODGER SENTIMENT ON TWITTER 9/5 Median: -1 Mean: -0.47
  • 35. INDIANS SENTIMENT ON TWITTER 9/5 Median: 0 Mean: -0.1198
  • 36. YANKEE SENTIMENT ON TWITTER 9/5 Median: 0 Mean: -0.118
  • 37. IN COMPARISON… Example tweets:
      • hey..yankees....can ya score some runs?!
      • indians activate murphy from disabled list http://t.co/bqliintwsf
      • dodgers rhp josh beckett won't return this season
      Team      Tweets>=1   Tweets<=-1   Total w/o 0   % positive
      Yankees   280         406          686           41%
      Indians   290         456          746           39%
      Dodgers   448         1,226        1,674         27%
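The "% positive" column is the share of positive tweets among non-neutral tweets. A short sketch of that arithmetic follows; the function name `summarize` is an assumption, and the score list is synthetic, built only to reproduce the Yankees row.

```python
def summarize(scores):
    """Return (positive count, negative count, total without zeros, % positive)."""
    pos = sum(s >= 1 for s in scores)
    neg = sum(s <= -1 for s in scores)
    total = pos + neg                                   # neutral (0) tweets are excluded
    pct = round(100 * pos / total) if total else None
    return pos, neg, total, pct


# Synthetic scores matching the Yankees counts: 280 positive, 406 negative, plus some neutral.
yankees = [1] * 280 + [-1] * 406 + [0] * 50
print(summarize(yankees))
```

The same calculation gives 290 / 746 ≈ 39% for the Indians and 448 / 1,674 ≈ 27% for the Dodgers.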
  • 38. WHAT IS THE POINT OF ALL THIS? TARGETED MARKETING EFFORTS, EVANGELISTS, REFINED SEGMENTATION, MEDIA MIX MODELING LEADING TO ROI
  • 39. EXAMPLE IDENTIFY EVANGELISTS, INFLUENCERS & DETRACTORS • When engaging on social media it is important to note the clout of followers in terms of status updates, and followers • Running sentiment analysis on updates/posts adds context to the voice of the customer • Appending other data allows for additional segmentation, and differentiated customer experiences e.g. my Yankee story 10K Indians Followers less 138 outliers
  • 40. MEDIA MIX MODELING FOR SOCIAL MEDIA ROI
      • In lieu of actual merchandise sales data and marketing spend, tracked Amazon Sales Rank hourly from 4/1 to 8/31
      • Relative measure of sales against other “Sports and Outdoors” category items
      • Lower number is better
  • 41. DODGER CAP AVERAGE HOURLY SALES RANK PER DAY Amazon sales rank, when seen as a time series, is not stationary. Overall the Dodgers cap’s rank has an increasing trend despite the team being successful on the field, and it has some periodicity based on day of week. [Chart: average hourly sales rank per day, 1-Apr to 26-Aug, vertical axis 0 to 4500]
  • 42. Time Series Decomposition
      • Time series decomposition (TSD), an econometric forecasting technique, was used in an attempt to isolate social media impact and understand sales rank patterns
      • Trend is likely the impact of baseball season excitement, then waning in favor of other sports
      • Seasonal may be the impact of retail day of the week cycles
      • Leaving random as the dependent variable in the media mix GLM
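For readers unfamiliar with the technique, here is a minimal sketch of a classical additive decomposition into trend, seasonal and random parts. It is an illustration under assumptions, not the presenters' method: it requires an odd period (7 for day of week), uses a centered moving average for the trend, needs at least about two full periods of data, and runs on a synthetic series.

```python
from statistics import mean


def decompose(series, period=7):
    """Additive decomposition: series = trend + seasonal + remainder (odd period only)."""
    n, half = len(series), period // 2
    # Trend: centered moving average; undefined (None) at both ends.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = mean(series[i - half:i + half + 1])
    # Seasonal: average detrended value for each position in the cycle, centered on zero.
    detrended = [(i, series[i] - trend[i]) for i in range(n) if trend[i] is not None]
    raw = [mean(d for i, d in detrended if i % period == p) for p in range(period)]
    offset = mean(raw)
    seasonal = [raw[i % period] - offset for i in range(n)]
    # Remainder ("random" on the slide): what trend and seasonal do not explain.
    remainder = [series[i] - trend[i] - seasonal[i] if trend[i] is not None else None
                 for i in range(n)]
    return trend, seasonal, remainder


# Synthetic daily series: linear trend plus a weekly pattern that sums to zero.
pattern = [3, -1, -2, 0, 1, -3, 2]
series = [100 + i + pattern[i % 7] for i in range(28)]
trend, seasonal, remainder = decompose(series, period=7)
print(trend[3], seasonal[:7])
```

On this noise-free series the remainder is zero wherever the trend is defined; on real sales-rank data the remainder is the part that would be carried into the media mix model.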
  • 43. Tweets to Decomposed Amazon Sales Rank
      • Correlation is only -0.08.
      • Given the tweets are examined against ‘random’ or unexplained data, the relationship may still be relevant.
      • As this is proxy data for sales of a single item, results are not conclusive
      *removed dates with missing data
      [Scatter plot: horizontal axis 0 to 100, vertical axis -1000 to 1000]
  • 44. Tweets to Average Daily Amazon Sales Rank
      • Much stronger correlation: -0.24
      • Leads one to believe the more a team tweets, the lower the sales rank
      • As this is proxy data for sales of a single item, results are not conclusive
      *removed dates with missing data
      [Scatter plot: horizontal axis 0 to 100, vertical axis 0 to 4500]
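The correlations quoted on these two slides are presumably Pearson correlation coefficients (an assumption; the slides do not name the statistic). A minimal sketch of that calculation, with made-up numbers in place of the tweet counts and sales ranks:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical daily tweet counts and average sales ranks (lower rank is better).
tweets_per_day = [10, 25, 40, 55, 70]
avg_sales_rank = [3000, 2800, 2900, 2500, 2400]
print(round(pearson(tweets_per_day, avg_sales_rank), 2))
```

A negative value means days with more tweets tend to have a lower (better) sales rank; it does not by itself show that tweeting causes the change.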
  • 45. Media mix modeling
      • Given the likely relationship:
          • Set up a GLM using marketing efforts media spend with the dependent variable being revenue, ticket sales, merchandise sales etc.
          • The coefficients of the inputs illustrate the impact of the channel marketing spends, leading you to ROI
      Example:
      $$f(\text{sales}) = \beta_0 + \beta_1\,\text{social.media.spend} + \beta_2\,\text{traditional.mktg.spend} + \beta_3\,\text{team.performance} + \dots + \beta_n + \epsilon$$
      The goal is increased model lift and accuracy by incorporating social media spend. The coefficient of the variable demonstrates the impact. This will allow you to calculate an ROI of social spend.
      *removed dates with missing data
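To make the role of the coefficients concrete, the sketch below fits the simplest case of such a model: ordinary least squares, i.e. a GLM with a Gaussian family and identity link. That choice is an assumption, as are the function name `fit_linear` and the invented spend and sales figures; a real analysis would normally use a statistics package.

```python
def fit_linear(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one tuple of input values per observation; y: the observed outcome.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    X = [[1.0] + list(r) for r in rows]                 # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Invented data: (social spend, traditional spend) and sales built from known coefficients.
rows = [(1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
sales = [10 + 2 * social + 3 * traditional for social, traditional in rows]
b0, b_social, b_traditional = fit_linear(rows, sales)
print(round(b0, 6), round(b_social, 6), round(b_traditional, 6))
```

In this toy data each unit of social spend adds 2 units of sales, so the fitted social coefficient is the quantity one would compare against the cost of that spend when estimating ROI.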
  • 46. FURTHER INFO Want example R scripts for the visuals? www.sportsanalytics.org starting 9/15

Editor's notes

  1. Misses amplifiers, negations and emoticons. Missed examples: “Wicked good.” “Don’t have cancer.” Twitter is unique in frequency, e.g. “I” is the #2 most frequent word yet ranks #10 in all other English usage; on Twitter “RT” is #1, while elsewhere “the” is #1. Twitter = not English! “smh”, “jk”, “lmao” & “gr8” are words in Twitter. Data: ~3M tweets, April 2013 to July 2013; ~600K unigrams. Example done in R using Illocution Inc Twitter data to create a unigram lexicon appended to the University of Illinois at Chicago positive/negative lexicon. Lexicon from top 10K words scored as positive, neutral, negative. Multi-Perspective Question Answering (MPQA) subjectivity lexicon from the University of Pittsburgh. Added Twitter peculiarities from Illocution Inc’s Twitter frequency analysis. Total is 5701 negative words and 3546 positive words.