The big data phenomenon confirms that data access has been transformed. Sentiment analysis (SA) is one of the most exploited areas and is used for profit-making purposes through business intelligence applications. This paper reviews the trends in SA and relates the growth of the area to the big data era.
1. A Review of Sentiment Analysis Approaches in Big Data Era
Nurfadhlina Mohd Sharef
Department of Computer Science
Faculty of Computer Science and Information Technology, Universiti Putra Malaysia
Serdang, Selangor, Malaysia
nurfadhlina@upm.edu.my
4. Sentiment Analysis
Sentiment analysis analyzes people’s sentiments, opinions, appraisals, attitudes, evaluations, and emotions towards entities such as organizations, products, services, individuals, topics, issues, events, and their attributes, as presented online via text, video and other means of communication.
5. Sentiment Analysis
These communications can fall into three broad categories: positive, neutral or negative.
There are also many names and slightly different tasks, e.g., sentiment analysis, opinion mining, opinion extraction, sentiment mining, subjectivity analysis, customer complaint, affect analysis, emotion analysis, review mining, review analysis, etc.
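The three-way categorization above can be illustrated with a minimal lexicon-based sketch. The word list `TOY_LEXICON`, the function name `polarity`, and the averaging rule are assumptions made for illustration only; they are not taken from the paper, and a real system would rely on a resource such as SentiWordNet.

```python
# Hypothetical toy lexicon (an assumption for illustration, not a real resource).
TOY_LEXICON = {
    "good": 1.0, "great": 1.0, "love": 1.0,
    "bad": -1.0, "poor": -1.0, "hate": -1.0,
}

def polarity(text):
    """Label a text as positive, negative or neutral from averaged word scores."""
    tokens = [t.strip(".,;!?") for t in text.lower().split()]
    scores = [TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON]
    if not scores:
        return "neutral"  # no opinion-bearing word was found
    mean = sum(scores) / len(scores)
    if mean > 0:
        return "positive"
    if mean < 0:
        return "negative"
    return "neutral"  # positive and negative words cancel out

print(polarity("I love this phone"))      # positive
print(polarity("The battery is bad"))     # negative
print(polarity("It arrived on Tuesday"))  # neutral
```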
8. Tools
Tool | Description
The Hadoop Distributed File System (HDFS) | HDFS divides the data into smaller parts and distributes them across the various servers/nodes.
SQL Server Integration Service, Apache Flume | These tools allow posts to be downloaded and loaded into Hadoop.
MapReduce | MapReduce is a process that transforms data loaded into Hadoop into a format that can be used for analysis.
Hive | A runtime Hadoop support architecture that leverages Structured Query Language (SQL) with the Hadoop platform.
Jaql | Jaql converts high-level queries into low-level queries and
Zookeeper | Zookeeper coordinates parallel processing across big clusters.
HBase | HBase is a column-oriented database management system that sits on top of HDFS and uses a non-SQL approach.
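The MapReduce idea in the table can be sketched in plain Python as a map step, a grouping step, and a reduce step. This is an in-memory illustration only, under the assumption that posts are short strings; it does not use the Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical.

```python
from collections import defaultdict

def map_phase(post):
    """Map: emit a (word, 1) pair for every word in one post."""
    for word in post.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)

def word_count(posts):
    """Run the three steps in memory over a list of posts."""
    pairs = (pair for post in posts for pair in map_phase(post))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

print(word_count(["good phone", "good price"]))
```

On a real cluster the map and reduce steps run in parallel on different nodes; here they run sequentially so the data flow is easy to follow.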
9. Problem
Which features to use?
Words (unigrams)
Phrases/n-grams
Sentences
How to interpret features for sentiment detection?
Bag of words (IR)
Annotated lexicons (WordNet, SentiWordNet)
Syntactic patterns
Paragraph structure
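The feature choices listed above (unigrams, n-grams, bag of words) can be sketched as follows. The function names `ngrams` and `bag_of_ngrams` and the whitespace tokenization are simplifying assumptions for illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_ngrams(text, n_values=(1, 2)):
    """Count unigrams and bigrams (by default) in a lower-cased text."""
    tokens = text.lower().split()
    bag = Counter()
    for n in n_values:
        bag.update(ngrams(tokens, n))
    return bag

features = bag_of_ngrams("not good at all")
print(features["good"])      # unigram feature
print(features["not good"])  # bigram feature that keeps the negation
```

The bigram "not good" carries information that the unigram "good" alone loses, which is one reason phrase-level features are considered for sentiment detection.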
10. Challenges
Harder than topical classification, for which bag-of-words features perform well
Must consider other features due to…
Subtlety of sentiment expression
irony
expression of sentiment using neutral words
Domain/context dependence
words/phrases can mean different things in different contexts and domains
Effect of syntax on semantics
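The effect of syntax on semantics can be shown with two sentences that share the same bag of words but express opposite sentiment. The sketch below also applies negation marking, a common workaround that is swapped in here for illustration and is not described in the slides; the names `NEGATIONS`, `PUNCT` and `mark_negation` are hypothetical.

```python
from collections import Counter

NEGATIONS = {"not", "no", "never"}  # assumed, deliberately small list
PUNCT = ".,;!?"

def mark_negation(text):
    """Prefix words that follow a negation with NOT_ until the next punctuation mark."""
    marked, negated = [], False
    for token in text.lower().split():
        word = token.strip(PUNCT)
        if word:
            if word in NEGATIONS:
                marked.append(word)
                negated = True
            else:
                marked.append("NOT_" + word if negated else word)
        if token[-1] in PUNCT:
            negated = False  # punctuation closes the negation scope
    return marked

a = "the food was good not bad"
b = "the food was bad not good"

# Plain bag of words cannot tell the two sentences apart.
print(Counter(a.split()) == Counter(b.split()))

# After negation marking the two bags differ.
print(Counter(mark_negation(a)) == Counter(mark_negation(b)))
```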
11. Sentiment Analysis Trends
Year | Quantity | Highlighted Topics
2004 | 4 | Affective computing, sentiment classification, polarity
2005 | 10 | Contextual polarity, phrase level SA, sentiment classification, scores, subject classification
2006 | 10 | Lexicon, feature, summarization, mining, understanding, temporal SA, weighted polarity, user profiling based on SA
2007 | 43 | Lexicon, feature mining, emotion detection, clustering, conjuncts presence
2008 | 72 | Multi-lingual SA, ratings inference, feature mining, word orientation, SentiWordNet, rating weighting, radicalization detection, affective computing, compositional semantics analysis, sentiment-based prediction, concept hierarchy
2009 | 131 | ML approaches for SA, user profiling based on SA, feature association, semantic association, visual SA, cross-linguistic SA, ontology-based SA, polarity lexicon, multi-entity scoring, affective computing
2010 | 216 | Orientation analysis, affective computing, linguistic models, applied visual for SA, semantic role labeling, clustering-based SA, cross-lingual SA, SA-based prediction, Twitter-based SA, global SentiWordNet, intensity classification, cross-domain SA, opinion question-answering, sentiment topic detection, language specific SA
2011 | 297 | Opinion leader identification, social network-based surveillance, product recommendation, terrorism informatics, affective computing, features clustering, political orientation detection, wish identification, sentiment lexicon, influence detection, personality mining, polarity analysis, graph based sentiment representation, semantic based SA, learning models for SA, emotion clustering, ontology based SA, sentence level SA, language specific SA
2012 | 454 | Linguistic features analysis, business and financial forecasting, attitude prediction, sentiment topic detection, verbs polarity disambiguation, SenticNet, semantic orientation, language specific SA, cross lingual SA, emotion recognition, social values and group identification
2013 | 562 | Multilingual, ML-based polarity detection, sentiment evolution modeling, aspect-based sentiment classification, social intelligence, SA-based prediction, computational analysis of public voice, emotion mining, SA-based customer care, security-related intelligence, graph extraction, social network-based SA, linguistic features, statistical approaches for SA, concept-level SA, correlational study between financial sentiment and prices in financial markets, subjectivity detection, cross-domain SA, opinion leaders identification
2014 | 216 | Feature-based SA through ontologies, concept-level SA based on dependency rules, word polarity disambiguation, aspect-oriented SA, sentence-level SA, graph clustering for SA, subjectivity analysis, word sentiment in WordNet 3.0, computational analysis of public voice
Category labels from the slide graphic: text mining techniques, classification, multilingual, linguistics, applied
12. Approaches
Sentiment Analysis
Content-based
Polarity Detection: Positive, Negative, Neutral
Strength Detection: Typically [-1, 1]
SentiWordNet
Feature Mining
Unigram, Bigram
Syntactic, Lexical, Structural
Link-based
Stylistic
Affective Computation
Emotion Classification
Social Network
Influencer
Multilingual
Machine Learning
Naïve Bayes
Support Vector Machine
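As an illustration of the machine learning branch, the following is a minimal sketch of a multinomial Naïve Bayes polarity classifier with add-one smoothing. The class name `NaiveBayes`, the whitespace tokenization, and the four training sentences are assumptions made for this example; they do not come from the paper.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the smoothed log likelihood of each known word.
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical toy training data.
train_texts = [
    "great phone love it",
    "excellent battery great screen",
    "terrible battery hate it",
    "poor screen bad phone",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("great screen"))
print(model.predict("bad battery"))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and add-one smoothing keeps a word seen in only one class from zeroing out the other class.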
13. Conclusion
This paper has discussed the trends in SA. Although the area began before 2004, it has gained even more focus at the climax of the big data era. Advancements in big data technologies have also enabled this area to flourish.
Nevertheless, much room for improvement remains, such as maturing the big data technologies and increasing the alternatives for SA solutions that use the platform.
More infrastructure is also needed so that SA can be exploited for many more applications besides the existing community-centric, product review-based and influence assessment applications.
Techniques for SA on cross-domain and multilingual datasets should also be explored.
Improvements for deeper semantic computation, such as the SenticNet approach, should also be expanded, alongside enriching SentiWordNet for multilingual, more precise and multi-granular representation.