Unsupervised Sentiment Analysis
- 1. Taras Zagibalov
T.Zagibalov@sussex.ac.uk
PhD candidate at University of Sussex
Brighton, UK
Ford Foundation International Fellowship fellow
Natural languages: Russian, English, Mandarin
Programming: Java, Prolog
Taras Zagibalov© 2009
- 3. Outline
What is Sentiment Analysis
Application of Sentiment Analysis
Who's in the business?
Unsolved Problems
Why unsupervised?
Is it effective?
- 4. Sentiment Analysis
Sentiment Analysis (or Opinion Mining) is a
relatively new research area in Information
Retrieval and Natural Language Processing,
which is concerned not with a document's topic,
but with what opinion it expresses
- 5. What is Sentiment Analysis
Subjectivity Classification
Orientation Detection
Opinion Holder and Target Extraction
Feature-Based Opinion Mining
- 6. What is Sentiment Analysis
Subjectivity Classification:
A car has four wheels.
vs
It's a good car.
- 7. What is Sentiment Analysis
Orientation Detection:
It's a good car.
vs
It's a bad car.
- 8. What is Sentiment Analysis
Opinion Holder and Target Extraction:
Ian says it's a good car.
- 9. What is Sentiment Analysis
Feature-Based Opinion Mining:
The wheels are good, but all the rest is just
unusable.
- 10. Application of Sentiment
Analysis
Where can opinions be found?
News feeds (Google, Yahoo, Reuters etc)
Blogs (LJ, Technorati etc)
Social Networks (Twitter, Facebook...)
Customer review sites (Amazon, eBay...)
- 11. Application of Sentiment
Analysis
Marketing Research
Product Reviews Analysis
Brand Tracking
Influence Analysis
Public Opinion Tracking
Customer correspondence analysis
- 12. Application of Sentiment
Analysis
What questions can a sentiment analysis
system answer?
What do customers think about our product?
Which of our customers are dissatisfied?
What features of our product are the worst?
Who influences our image, and how?
What is the public reaction to some event or
person?
and so on...
- 13. Example 1
Online monitoring (blogs, mass media) of product
promotion campaigns
[Chart: vertical axis from 0 to 10, with labels A and B]
Promotional campaign A is successful, as most online
reviews are positive.
Promotional campaign B needs immediate action, as most
online reviews are negative.
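The comparison in this example can be sketched as a majority count over review labels. This is only an illustration: the review counts are invented, and the function name campaign_status and the labels "positive" and "negative" are assumptions, not part of the original slides.

```python
from collections import Counter


def campaign_status(labels):
    """Return 'successful' if most review labels are positive, else 'needs action'."""
    counts = Counter(labels)
    if counts["positive"] > counts["negative"]:
        return "successful"
    return "needs action"


# Hypothetical review labels, as some sentiment classifier might produce them.
reviews = {
    "A": ["positive"] * 8 + ["negative"] * 2,
    "B": ["positive"] * 3 + ["negative"] * 7,
}

status = {name: campaign_status(labels) for name, labels in reviews.items()}
print(status)
```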
- 14. Example 2
A new product release as mirrored in online customer
reviews
[Chart: vertical axis from 0 to 8, with points A and B marked]
(A) The product release and ad campaign are quite effective, as
public opinion is mostly positive. But the sentiment changes as
sales grow (B): more people are dissatisfied, and this needs to be
analysed (probably some quality-related issues).
- 15. Example 3
Influence analysis by tracking blogs
[Chart: vertical axis from 0 to 9, with points A and B marked]
(A) A negative review in a newspaper does not affect the generally
positive sentiment towards a product, whereas a positive
review in a magazine (B) is quite effective.
- 16. Who's in the business?
BrandWatch
Istrategy Labs
Cataphora
Scoutlabs
Lexalytics
Infonic
Attensity
Open Dover
...
- 17. What's the technology?
Machine Learning
Manually tagged training data sets
User-tagged training data sets (“thumbs up” and
“five stars” ratings)
Knowledge-based Approaches
Manually created word-lists
Generic word-lists (like SentiWordNet or sentiment
vocabularies)
Manual Processing
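As an illustration of the knowledge-based approach listed above, here is a minimal sketch that scores a sentence against two manually created word-lists. The word-lists, the function name orientation and the three-way output are assumptions made for this example; real systems use far larger lists and handle negation, which this sketch does not.

```python
import re

# Tiny hand-made word-lists, for illustration only.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "unusable"}


def orientation(text):
    """Classify text by counting matches against the two word-lists."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no sentiment words found, or the counts cancel out


for sentence in ["It's a good car.", "It's a bad car.", "A car has four wheels."]:
    print(sentence, "->", orientation(sentence))
```

A sentence with one positive and one negative word comes out as "neutral" here, which shows why sentence-level word counting is not enough for feature-based opinion mining.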
- 18. Unsolved Problems
Domain-dependency
Unpredictable evaluation language
Language-dependency
- 19. Unsolved Problems
Domain-dependency:
“The plot was unpredictable”
vs
“The steering was unpredictable”
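The contrast above can be sketched as a lookup in which the same word carries a different polarity in each domain. The domain names, the numeric polarities (+1 for a film plot, -1 for a car's steering) and the function name word_polarity are assumptions chosen to mirror the two example sentences.

```python
# Hypothetical per-domain word polarities: +1 positive, -1 negative.
DOMAIN_LEXICON = {
    "film": {"unpredictable": 1},   # an unpredictable plot reads as praise
    "car": {"unpredictable": -1},   # unpredictable steering reads as a defect
}


def word_polarity(word, domain):
    """Look up a word's polarity in one domain; 0 means unknown."""
    return DOMAIN_LEXICON.get(domain, {}).get(word.lower(), 0)


print(word_polarity("unpredictable", "film"))
print(word_polarity("unpredictable", "car"))
```

A single generic word-list cannot hold both entries, which is the problem this slide points at.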
- 20. Unsolved Problems
Unpredictable evaluation language:
“good” == “bad” on eBay
“3G” (technology for mobile phones) == “good”
- 21. Unsolved Problems
Language-dependency:
Culture-related issues (“good” <> “好”)
Language-related issues (SVO vs SOV)
- 22. Why unsupervised?
Cross-Domain applicability
Multi-Lingual applicability
Cheap Start
- 23. Why unsupervised?
Cross-Domain applicability:
No expensive human annotation is needed:
all information is found in the documents
that need to be processed.
All extracted information is domain-specific
and free from the noise produced by
“generic” word lists and wordnets.
- 24. Why unsupervised?
Multi-Lingual applicability:
Unsupervised systems, being data-independent,
can be easily ported to almost any language.
- 25. Why unsupervised?
Cheap Start:
Once an unsupervised system is developed,
it can be applied to new data almost
immediately, saving the cost of labelling data
and/or writing rules (word-lists).
- 26. Is it effective?
The unsupervised approach was tested on
corpora in different languages (English, Simplified
Chinese, Traditional Chinese, Japanese) and in
many cases compared reasonably well with
supervised methods.
Results were presented at major
international scientific conferences (ACL,
IJCNLP, COLING, NTCIR).
- 27. Is it effective?
The approach can be easily combined with
supervised techniques:
An unsupervised system can provide initial data
for in-depth research (building up
word-lists and rule-sets).
Automatically extracted information can be
used for training machine learning systems.
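One way the second point could work is sketched below: documents are labelled automatically from a couple of seed words, and those labels then train a simple word-count classifier. This is a generic bootstrapping illustration and not the method evaluated in the talk; the seed words, the toy documents, the Naive Bayes-style scoring and all function names are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def seed_label(text, pos_seeds, neg_seeds):
    """Label a document automatically from seed words; None if undecided."""
    tokens = tokenize(text)
    score = sum(t in pos_seeds for t in tokens) - sum(t in neg_seeds for t in tokens)
    if score == 0:
        return None
    return "positive" if score > 0 else "negative"


def train(docs, pos_seeds, neg_seeds):
    """Count words per automatically assigned class (no human labels)."""
    counts = {"positive": Counter(), "negative": Counter()}
    for doc in docs:
        label = seed_label(doc, pos_seeds, neg_seeds)
        if label is not None:
            counts[label].update(tokenize(doc))
    return counts


def classify(text, counts):
    """Naive Bayes-style scoring with add-one smoothing and equal priors."""
    vocab = set(counts["positive"]) | set(counts["negative"])
    scores = {}
    for label, c in counts.items():
        total = sum(c.values()) + len(vocab)
        scores[label] = sum(math.log((c[t] + 1) / total) for t in tokenize(text))
    return max(scores, key=scores.get)


# Toy, invented documents; only "good" and "bad" are given as seeds.
docs = [
    "good car, smooth steering",
    "good engine and smooth ride",
    "bad car, noisy engine",
    "bad brakes and noisy cabin",
]
model = train(docs, pos_seeds={"good"}, neg_seeds={"bad"})
print(classify("smooth ride", model))
print(classify("noisy brakes", model))
```

Neither test sentence contains a seed word; the classifier relies on words such as "smooth" and "noisy" that it picked up from the automatically labelled documents.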
- 28. Conclusion
Unsupervised Sentiment Analysis is an efficient
instrument for keeping track of public opinion in
different domains and languages.
It can be used as an entry point to a new
domain or language.
It can be combined with supervised methods to
increase accuracy.