ICWSM 2014 modeling user attitude v.7
1. Modeling User Attitude toward Controversial Topics in Online Social Media
Huiji Gao*, Jalal Mahmud+, Jilin Chen+, Jeffrey Nichols+, Michelle Zhou+
*Arizona State University
+IBM Research - Almaden
2014.06
2. Motivation
• 400+ million tweets daily
• 3.2 billion Facebook likes and comments daily
Hundreds of millions of people express themselves on social media daily.
Many social media campaigns have emerged, in which people express strong opinions and provide support for social causes of public interest (e.g., fracking, vaccination).
Two users may share the same negative sentiment toward a topic while holding different opinions about it.
Because their opinions differ, they may take different actions.
3. Motivating Example
Joe may support the opinion that fracking causes damage to the environment, believing that fracking should be stopped immediately.
Bill may believe that fracking harms the environment but oppose stopping fracking completely, believing that better regulation of fracking is needed.
Due to their different opinions, Joe and Bill may have different tendencies to spread a petition that calls for stopping fracking, despite their shared negative sentiment.
[Figure: two users; labels “Fracking Damages Environment” / “Fracking Harms Environment”, “Completely Agree” / “Not Completely Agree”, “Spread” / “Not Spread”]
4. Prior Work
The nuanced relationships among sentiment, opinion, and action have not been captured well by traditional sentiment or opinion analysis work (Jiang et al. 2011; Tan et al. 2011; Somasundaran and Wiebe 2010).
Prior behavior prediction work on social media (Yang et al. 2010; Feng and Wang 2013) is agnostic to the opinions underlying behaviors, and thus misses the potential effect of opinions on prediction.
Motivated by this gap, we present a unified computational model that captures people’s sentiment toward a topic, their specific opinion, and their likelihood of taking an action.
5. Attitude Background
Tri-component Attitude Model
Our model is inspired by an established theoretical framework in marketing research on attitudes and attitude models, in which attitude is defined as a unified concept containing three aspects: “feelings”, “beliefs”, and “actions” (McGuire 1968; Schiffman and Kanuk 2010).
- According to the framework, beliefs are acquired about an attitude object (e.g., a topic, product, or person), and these beliefs in turn influence the feelings toward the object and the actions taken w.r.t. it.
Our computational model operationalizes this framework mathematically, casting feelings, beliefs, and actions as users’ sentiment, opinion, and action toward a topic on social media.
7. Contributions
Study user attitude toward a controversial topic in terms of
sentiment, opinion, and action.
Discover the relationships among sentiment, opinion and action,
and model them for attitude prediction.
Perform experiments with real-world social media campaign
datasets to demonstrate the model performance.
8. Methodology
Ground Truth Construction
- We treat re-tweeting an opinion about a topic as ground truth and use a supervised approach for labeling tweets.
Model Training
- Model user action (e.g. re-tweeting) as preferences toward a target (e.g. a tweet)
- Adopt a collaborative filtering method (e.g. matrix factorization) for action inference
- Introduce features (e.g. historical content, behavior, profile) into the matrix factorization framework to handle “cold-start” users caused by data sparsity
- Bridge the gap between latent factors and the explicit opinions expected as output
- Introduce a transition matrix to capture overall sentiment from opinions
- Optimization for parameter inference
Attitude Prediction
- Predict users’ sentiment, opinion, and action toward a topic
9. Methodology
Problem: Model user behavior through his/her opinion/sentiment
[Figure: User Attitude (Underlying Factors) and User Behavior]
Inferring User Preferences towards a Set of Targets (Tweets)
Re-tweeting actions: Given a set of tweets, determine whether a user would re-tweet one or more of them.
10. Methodology
Capturing User Retweeting Action with Matrix Factorization
R: User-tweet matrix representing re-tweeting actions
U: Low-rank representation of users’ latent preferences
V: Low-rank representation of tweets’ latent profiles
: Regularization terms
Inferring User Preferences towards a Set of Targets (Tweets)
Observed values in R are factorized into U and V;
the unknown preference of user i towards tweet j, R(i,j), is approximated by $U_i V_j^\top$.
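As a minimal sketch of the factorization described on this slide, the following fits R(i,j) ≈ U_i V_j^T on a toy user-tweet matrix and scores one held-out cell. It uses plain stochastic gradient descent with a small L2 penalty, which is an assumption made for brevity and not the optimizer used in the talk; the names `factorize` and `predict`, the toy matrix, and all hyperparameter values are hypothetical.

```python
import random

def factorize(R, observed, k=2, steps=3000, lr=0.05, reg=0.01, seed=0):
    """Fit R[i][j] ~ dot(U[i], V[j]) on the observed cells with SGD."""
    rng = random.Random(seed)
    n_users, n_tweets = len(R), len(R[0])
    U = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_tweets)]
    for _ in range(steps):
        for i, j in observed:
            pred = sum(U[i][f] * V[j][f] for f in range(k))
            err = R[i][j] - pred
            for f in range(k):
                u, v = U[i][f], V[j][f]
                # Gradient step on squared error plus L2 regularization
                U[i][f] += lr * (err * v - reg * u)
                V[j][f] += lr * (err * u - reg * v)
    return U, V

def predict(U, V, i, j):
    """Approximate preference of user i for tweet j."""
    return sum(u * v for u, v in zip(U[i], V[j]))

# Toy user-tweet matrix: 1 = retweeted, 0 = did not retweet
R = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
# Treat cell (1, 1) as unknown and hold it out of training
observed = [(i, j) for i in range(3) for j in range(3) if (i, j) != (1, 1)]
U, V = factorize(R, observed)
score = predict(U, V, 1, 1)   # estimate for the held-out user-tweet pair
```

Because user 1 behaves like user 0 on the observed cells, the held-out score should come out close to the retweet value rather than to zero.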
11. Methodology: Feature Selection for Preference Approximation
Inferring User Preferences towards a Target
Capturing User Retweeting Action with Matrix Factorization
Introduce features in matrix factorization framework to handle “cold-start”
users due to data sparsity.
User-related features
Feature coefficients
Sparse Feature Space
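One way to read the feature idea on this slide: if a user's latent preferences are modeled as a linear function of the user's features, then a user with no retweet history can still be scored. The sketch below assumes that linear form (u = xW); the slide does not spell out the formula, and the function names and all numbers are hypothetical.

```python
def user_latent(x, W):
    """Map one user's feature vector x (length d) to k latent factors: u = x W."""
    k = len(W[0])
    return [sum(x[a] * W[a][f] for a in range(len(x))) for f in range(k)]

def score(x, W, v):
    """Predicted preference of a user with features x for a tweet with profile v."""
    return sum(u * t for u, t in zip(user_latent(x, W), v))

# Hypothetical coefficients for 3 term features and 2 latent factors
W = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]
v_tweet = [1.0, 0.0]      # hypothetical tweet latent profile
x_new_user = [2, 0, 1]    # term counts of a user with no retweet history
s = score(x_new_user, W, v_tweet)
```

The new user never appears in R, yet still receives a preference score because only the feature vector is needed.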
12. Methodology: Opinion Regularization
Inferring User Preferences towards a Target
Capturing User Retweeting Action with Matrix Factorization
Bridge gaps between latent factors and explicit opinions:
A. Map user latent preferences into opinion space
B. Non-negativity Constraint for opinion interpretation
Observed user opinions
13. Methodology: Sentiment Regularization
Inferring User Preferences towards a Target
Capturing User Retweeting Action with Matrix Factorization
Introduce transition matrix to capture overall sentiment from opinions
A. Introduce transition matrix S to map a user’s opinion
preferences to sentiment polarity
B. Non-negativity constraint for sentiment interpretation
Opinion-sentiment transition matrix
Observed user sentiment polarity
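To illustrate what a transition matrix from opinions to sentiment does, the sketch below multiplies a small user-opinion matrix by a hand-written S whose rows are opinions and whose columns are (positive, negative). In the model S is learned under a non-negativity constraint; the values here, the opinion ordering, and the labeling rule are assumptions for illustration only.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Hypothetical transition matrix: rows = opinions, columns = (positive, negative)
S = [[1.0, 0.0],   # "Fracking is safe"
     [0.0, 1.0],   # "Fracking damages environment"
     [0.0, 1.0]]   # "Fracking should be stopped"

# Hypothetical user-opinion matrix: one row per user, one column per opinion
O = [[1, 0, 0],
     [0, 2, 1],
     [1, 1, 1]]

P_hat = matmul(O, S)   # per-user (positive, negative) scores
labels = ["positive" if pos > neg else "negative" for pos, neg in P_hat]
```

All entries of S are non-negative, so each column of P_hat can be read directly as accumulated support for one polarity.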
14. ATMiner: Modeling User Attitude toward a Topic
Framework: Action Factorization, Opinion Regularization, Sentiment Regularization
Annotations on the objective: Sparse Learning, Avoid Over-fitting, Cold-Start Users, Non-negativity
Use alternating non-negative least squares to infer W, S, and V.
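The objective itself did not survive in this transcript, so the following is only a plausible sketch of how the three components could be combined, not the formula from the talk. It assumes user representations U = XW, an action term comparing U V^T with R, an opinion term comparing U with O, a sentiment term comparing U S with P, an L1 penalty on W for sparsity, and a squared-norm penalty on V and S against over-fitting. The function and weight names are hypothetical, and the non-negativity constraints and the solver are left out.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sq_diff(A, B):
    """Squared Frobenius norm of A - B."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def l1(A):
    return sum(abs(a) for row in A for a in row)

def sq_norm(A):
    return sum(a * a for row in A for a in row)

def attitude_objective(R, X, O, P, W, S, V,
                       w_sparse, w_opinion, w_sentiment, w_ridge):
    """Assumed combined loss; the weights are hypothetical trade-off parameters."""
    U = matmul(X, W)                              # opinion-level user representation
    action = sq_diff(matmul(U, transpose(V)), R)  # action factorization
    opinion = sq_diff(U, O)                       # opinion regularization
    sentiment = sq_diff(matmul(U, S), P)          # sentiment regularization
    return (action + w_opinion * opinion + w_sentiment * sentiment
            + w_sparse * l1(W) + w_ridge * (sq_norm(V) + sq_norm(S)))

# Toy check: identity matrices fit every term exactly
I2 = [[1.0, 0.0], [0.0, 1.0]]
perfect = attitude_objective(I2, I2, I2, I2, I2, I2, I2, 0.0, 1.0, 1.0, 0.0)
with_l1 = attitude_objective(I2, I2, I2, I2, I2, I2, I2, 0.5, 1.0, 1.0, 0.0)
```

Under these assumptions, an alternating scheme would hold two of W, S, V fixed and solve a non-negative least-squares problem for the third, repeating until the objective stops decreasing.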
15. Modeling User Attitude
Input: R: User-Tweet Matrix;
X: User-Feature Matrix;
O: User-Opinion Matrix;
P: User-Sentiment Matrix.
[Figure: model diagram relating R, O, and P to X (user features), W (feature coefficients), S (topic transition matrix), and V (tweet latent profiles)]
Output: W: Feature Coefficients (Opinion Level);
S: Transition Matrix (Sentiment Level);
V: Tweet Latent Profiles (Action Level).
Experiments
Tasks:
1. Opinion Prediction:
2. Sentiment Prediction:
3. Retweeting Action Inference:
16. Experimental Setup – Data Collection
[1] D. Boyd, S. Golder, and G. Lotan. Tweet, tweet, retweet: Conversational aspects of retweeting on twitter. HICSS ’10, 2010.
[4] M. J. Welch, U. Schonfeld, D. He, and J. Cho. Topical semantics of twitter links. In Proc. of the WSDM ’11, 2011.
Selected fracking and vaccination as the controversial topics.
Used Twitter’s streaming API to obtain 1.6 million fracking-related tweets from January 2013 to March 2013, using a set of fracking-related keywords.
For the vaccination dataset, obtained 1.1 million vaccination-related tweets from May 2013 to October 2013, using a set of vaccination-related keywords.
Ranked all crawled tweets by the number of times they were retweeted, and selected those retweeted more than 100 times as our action tweets.
There were 162 action tweets in the fracking dataset and 105 action tweets in the vaccination dataset.
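The ranking-and-threshold step above can be sketched as follows. The function name, the tweet ids, and the counts are hypothetical; the only detail taken from the slide is the "more than 100 retweets" cutoff.

```python
def select_action_tweets(retweet_counts, threshold=100):
    """Rank tweets by retweet count and keep those retweeted more than `threshold` times."""
    ranked = sorted(retweet_counts.items(), key=lambda kv: kv[1], reverse=True)
    return [tweet_id for tweet_id, count in ranked if count > threshold]

# Hypothetical retweet counts keyed by tweet id
counts = {"t1": 250, "t2": 100, "t3": 101, "t4": 7}
action_tweets = select_action_tweets(counts)
```

A tweet with exactly 100 retweets is excluded, matching the strict "more than" wording.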
17. Experimental Setup – Ground Truth Creation
Ground truth for actions is available from re-tweets of the action tweets.
Obtain the users who re-tweeted action tweets; construct R.
- Following the traditional assumption (Boyd, Golder, and Lotan 2010; Conover et al. 2011; Welch et al. 2011), re-tweeting is treated as an endorsement of the original tweet.
Crawl users’ historical tweets; construct X.
Manually label action tweets into eight opinion categories for fracking and six opinion categories for vaccination.
- Instead of manually labeling each user in our dataset, we manually labeled only the action tweets.
Assign users to opinion categories according to their re-tweeting actions; construct O.
Assign users to sentiment categories according to their opinion assignments; construct P.
- If the majority of the opinions assigned to a user are positive, the user is labeled positive; otherwise negative.
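The labeling pipeline on this slide can be sketched roughly as below: opinions flow from manually labeled action tweets to the users who retweeted them, and sentiment follows by majority. Counting one opinion assignment per retweet is an assumption (the slide does not say whether repeated opinions count once or several times), and the function name, users, and tweet ids are hypothetical.

```python
def build_ground_truth(retweets, tweet_opinion, opinion_polarity):
    """retweets: user -> list of retweeted action tweet ids.
    Returns (O, P): per-user opinion counts and per-user sentiment label."""
    O, P = {}, {}
    for user, tweets in retweets.items():
        counts = {}
        for tweet_id in tweets:
            opinion = tweet_opinion[tweet_id]       # label inherited from the tweet
            counts[opinion] = counts.get(opinion, 0) + 1
        O[user] = counts
        positive = sum(n for op, n in counts.items()
                       if opinion_polarity[op] == "positive")
        negative = sum(counts.values()) - positive
        # Majority positive -> positive; anything else (including ties) -> negative
        P[user] = "positive" if positive > negative else "negative"
    return O, P

tweet_opinion = {"t1": "Fracking is safe",
                 "t2": "Fracking damages environment",
                 "t3": "Fracking should be stopped"}
opinion_polarity = {"Fracking is safe": "positive",
                    "Fracking damages environment": "negative",
                    "Fracking should be stopped": "negative"}
retweets = {"joe": ["t2", "t3"], "ann": ["t1"], "bill": ["t1", "t2"]}
O, P = build_ground_truth(retweets, tweet_opinion, opinion_polarity)
```

Only the action tweets need manual labels; every user label is derived from retweeting behavior.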
18. Experimental Setup – Opinions
Opinions in the Fracking Dataset
Opinion | Tweet Example
Fracking benefits economy and energy | Fracking saves us money; fracking creates jobs
Fracking is safe | FACT: fracking has safely produced over 600 trillion cubic feet of #natgas since 1947.
Fracking causes oil spill | Lives in a pineapple under the sea. BP oil spill.
Fracking damages environment | Large earthquake in Oklahoma in 2011 was caused by #fracking
Fracking causes health problems | To anyone speaking of the economic “benefits” of fracking: what use is that money if your food and water are full of poison.
Fracking does not help economy | The amount of money BP lost from the oil spill could buy about 30 ice cream sandwiches for everyone on earth.
Fracking is bad | Yoko Ono took a tour of gas drilling sites in PA to protest fracking.
Fracking should be stopped | Protect our kids and families from #fracking. Please RT!
19. Experimental Setup - Opinions
Opinions in the Vaccination Dataset
Opinion | Tweet Example
Positive Information (Opinion) about vaccination | Vaccination campaign launches with hope of halting measles outbreak http://t.co/H2B6ujFx22
Vaccination should be continued | To not vaccinate is like manslaughter. Vaccinate!
Counter negative information about vaccination | Six vaccination myths - and why they’re wrong. http://t.co/BX7kq0SOjz
Negative Information (Opinion) about vaccination | Vaccination has never been proven to have saved one single life.
Vaccination causes disease | Until the #Vaccination was introduced RT @trutherbot: Cancer was a rarity more than 200 years ago.
Criticize forced vaccination | Police State? Registry System Being Set Up to Track Your Vaccination Status - http://t.co/fkSWDbYAbB
20. Experimental Setup
Datasets Statistics
User features based on users’ historical tweets.
Use unigram model while removing stop-words to construct the feature space, and
use term frequency as feature value.
Cross validation, 70% for training and 30% for testing.
All the parameters of our model are set through cross validation.
- Specifically, we set = 0.5, = 0.5, = 2, and = 0.1.
Statistic | Fracking | Vaccination
No. of Users | 5,387 | 2,593
No. of Positive Users | 1,562 | 1,617
No. of Negative Users | 3,822 | 976
Duration | 1/13-3/13 | 5/13-10/13
No. of Historical Tweets | 458,987 | 226,541
No. of Opinions | 8 | 6
No. of Action Tweets | 162 | 105
No. of Features | 10,907 | 4,803
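The feature construction and the 70/30 split described on this slide can be sketched as follows. The stop-word list, the whitespace tokenizer, and the choice to split by user are assumptions made for illustration; the slide specifies only unigrams, stop-word removal, term frequency as the feature value, and the 70%/30% proportions.

```python
import random
from collections import Counter

STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in"}  # tiny illustrative list

def term_frequencies(tweets):
    """Count unigrams across a user's tweets, skipping stop-words."""
    counts = Counter()
    for tweet in tweets:
        for token in tweet.lower().split():
            token = token.strip(".,!?:;\"'")
            if token and token not in STOP_WORDS:
                counts[token] += 1
    return counts

def build_feature_matrix(history):
    """history: user -> list of historical tweets. Returns (users, vocab, X)."""
    tf = {user: term_frequencies(tweets) for user, tweets in history.items()}
    vocab = sorted({word for counts in tf.values() for word in counts})
    users = sorted(history)
    X = [[tf[user][word] for word in vocab] for user in users]
    return users, vocab, X

def split_users(users, train_fraction=0.7, seed=0):
    """Shuffle users and split them into training and test sets."""
    shuffled = list(users)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

history = {"u1": ["Fracking is safe", "The fracking jobs"],
           "u2": ["Stop fracking!"]}
users, vocab, X = build_feature_matrix(history)
train, test = split_users(["user%d" % n for n in range(10)])
```

Each row of X is one user and each column one vocabulary term, which is why the resulting matrix is sparse for real data.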
24. Conclusions
Presented a model to estimate a user’s attitude in terms of sentiment,
opinion and likelihood of action toward controversial topics in social
media.
Captured the relationships among sentiment, opinions and actions so as
to predict actions and sentiment based on one’s opinions.
Our model extended traditional matrix factorization approach by usage of
features, opinion and sentiment regularization.
Experiments using two real world datasets demonstrate that our model
outperforms baselines in predicting sentiment, opinion and action.
Fracking, or hydraulic fracturing, is the process of extracting natural gas from shale rock layers, and the process has been hotly debated in public due to its potential impact on energy and the environment.
Figure 1 shows an illustrative example of user attitude toward a controversial topic (fracking) on Twitter. At the sentiment level, it shows two sentiments toward fracking (support fracking vs. oppose fracking). Note that a user may neither support nor oppose fracking; for clarity, we do not include such neutral sentiment in the example. At the opinion level, a user may have one or more opinions w.r.t. different facets of fracking. For example, “Fracking damages environment” is an opinion regarding the “environment” facet of fracking, and the example tweet on the left side of Figure 1 contains that opinion. Similarly, “Fracking is safe” is an opinion regarding the “safety” facet, and the example tweet on the right side of Figure 1 contains that opinion. Each opinion has a sentiment associated with it (the first opinion is negative toward “fracking” and the second opinion is positive toward “fracking”). A user who has multiple opinions may have an overall sentiment toward the topic (not shown in Figure 1). Finally, at the action level, a user may retweet/mention/post a tweet containing such opinions.
We then assign these opinions to users based on their corresponding retweeting actions. The assignment follows the traditional assumption in studies of user retweeting behavior: when a user retweets a tweet, we assume the user endorses the tweet content (Boyd, Golder, and Lotan 2010; Conover et al. 2011; Welch et al. 2011; Calais Guerra et al. ). The assignment of each user to different opinions is considered the opinion ground truth. We then label each user’s sentiment based on the user’s opinion assignment, i.e., if the majority of the opinions assigned to the user are positive, the user is labeled as positive, otherwise negative. For each user in our dataset, we also crawled historical tweets (100 max) that were posted before the time when the user first retweeted an action tweet. These historical tweets are used to generate user features.
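The history filter described in this note can be sketched as below. Keeping the most recent tweets when a user has more than 100 is an assumption (the note only gives the cap), and the function name and the toy timeline are hypothetical.

```python
def historical_tweets(timeline, first_action_time, limit=100):
    """timeline: list of (timestamp, text) pairs.
    Keep at most `limit` tweets posted strictly before the first action retweet."""
    earlier = sorted((ts, text) for ts, text in timeline if ts < first_action_time)
    return [text for _, text in earlier[-limit:]]

# Hypothetical timeline with integer timestamps
timeline = [(1, "a"), (5, "b"), (3, "c"), (9, "d")]
recent_two = historical_tweets(timeline, first_action_time=6, limit=2)
all_before = historical_tweets(timeline, first_action_time=6)
```

Restricting features to tweets posted before the first action retweet keeps the action being predicted out of the user's feature vector.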