User-generated content (UGC) such as online reviews is freely available on the web. This kind of data has been used to support clients' and managerial decision-making in several industries, e.g. books, tourism, or hospitality. In this workshop, I will introduce a fine-grained characterisation of UGC and a new multidomain and multilingual conceptual data model to represent UGC.
Moreover, I will present a domain-specific ontology for accommodations that can also be used to support managerial decision-making and end-user applications. Instead of the few categories commonly provided by Web 2.0 portals, this ontology enables accommodation managers to find specific information. The ontology is also used as input for an algorithm to recognise sentiment in online reviews. Finally, I will describe some of the main approaches to sentiment analysis.
In short, I will address some of the main challenges of UGC by introducing:
a) A proposal for a fine-grained characterisation of UGC;
b) A structured representation of UGC which leverages the information provided by the use of Web 2.0 applications;
c) The main approaches to perform sentiment analysis;
d) An ontology to represent knowledge in the accommodation sector.
A Fine-Grained Analysis of User-Generated Content to Support Decision Making
1. Workshop
A Fine‐Grained Analysis of User‐Generated
Content to Support Decision Making
Marcirio Silveira Chaves
http://mchaves.wikidot.com
Information Systems Research Group
Business and Information Technology Research Centre (BITREC)
Institute for Scientific and Technological Research of Universidade Atlântica (ISTR)
2. User‐Generated Content (UGC)
• Also known as
– User-Generated Data
– User-Created Content
– User-Contributed Data
– Consumer-Generated Media
– …
• Can be expressed through
– Opinions
– Reviews
– Comments
– Posts
• Notes:
– All the examples described in this workshop are real data.
– Some papers mentioned here are under review.
• Color legend:
– Examples
– Positive feature
– Negative feature
Apr-18-12 Marcirio Chaves - marcirioc@uatlantica.pt
3. Example of UGC
• An opinion posted on Facebook, Dec-10-2011, 12:30 pm
– “would highly recommend Infinity Motorcycles, Southampton for all motorbiking gear. Very reasonable people. Earlier they gave me a full money back for a unused (after explaining why it was unused) ladies motorbike jacket (no defects whatsoever) and today the zipper on my new jacket was broken and they gave me a brand new one (no questions asked, no receipt business and no fuss created). Five Star service.”
– This user had 226 friends.
4. Some statistics about UGC
• More than 50% of all internet visits are now to UGC/social media sites.
• More than 75% of time spent on the internet is “social”.
• Facebook now captures as much time spent on the internet as Google, Yahoo, and AOL.
• More than 80% of consumers are influenced by Social Marketing.
Source: http://www.bbrisco.com/2010/05/social.html
6. Outline
Part 1
• Workshop Context
• User-Generated Content (UGC)
• Characterisation of UGC
• Knowledge Engineering - Ontology Development
• Hands-on Session (Individual Task): Dealing with UGC
Part 2
• Sentiment Analysis/Opinion Mining
• Polarity Recognizer in Portuguese (PIRPO)
• Information Visualisation
7. Context Workshop
A framework for Customer Knowledge Management based on Social Semantic Web.
Chaves, Marcirio Silveira; Trojahn, Cássia and Pedron, Cristiane Drebes. A Framework for Customer Knowledge Management based on Social Semantic Web: A Hotel Sector Approach. In: Customer Relationship Management and the Social and Semantic Web: Enabling Cliens Conexus. Colomo-Palacios, R.; Varajão, J. and Soto-Acosta, P. (Eds.). p. 141-157, Hershey, PA: IGI Global, 2012. ISBN: 978-161-35-0044-6
8. A Fine-grained Analysis of UGC
• The overall opinion about a topic is only part of the information of interest.
• Document-level sentiment classification fails to detect sentiment about individual aspects of the topic. For example, someone who is generally happy with their car might still be dissatisfied with the engine noise.
• To manufacturers, these individual weaknesses and strengths are as important to know as the overall satisfaction level of customers, or even more valuable. (Tang et al. 2009)
9. UGC
An opinion is simply a positive or negative sentiment, view, attitude, emotion, or appraisal about an entity or an aspect of the entity (Hu and Liu, 2004; Liu, 2006) from an opinion holder (Bethard et al., 2004; Kim and Hovy, 2004; Wiebe et al., 2005).
10. Characterisation of UGC
• Opinion's Characterisation
– I use and extend the definition proposed by (Ding et al., 2008; Liu, 2010; Martin and White, 2005) to analyse the sentences of reviews.
– Let the review be r.
– In the most general case, r is characterised as a set of the following elements {O,F,SO,H,S,A,R,I,SG}, where:
12. Characterisation of UGC
1 - Object (O)
– An object is a product (e.g. movie and book) or a service (e.g. hotel and restaurant) under review, which is composed of features.
– Objects are also called entities.
2 - Feature (F)
– A feature is a component or part of an object.
• actor and photography are features of a movie.
• pool and staff are features of a hotel.
– Features are also called attributes or facets.
– A feature can be mentioned explicitly or implicitly in a review (Ding et al. 2008).
13. Characterisation of UGC
2.1 - Explicit Feature (F)
– If a feature f appears in review r, it is called an explicit feature in r.
– Example: The hotel is located very near the center city.
• location is an explicit feature.
2.2 - Implicit Feature (F)
– If f does not appear in r but is implied, it is called an implicit feature in r.
– Example: Hotel is far from public transportation.
• location is an implicit feature.
14. Characterisation of UGC
3 - Sentence-Orientation (SO)
– A review consists of a sequence of sentences r = 〈s1, s2, …, sm〉 (Ding et al., 2008).
– A sentence can be evaluated from the following perspectives:
15. Characterisation of UGC
3.1 Objectivity
– An objective sentence contains or mentions facts.
• Example: This hotel is far from the airport, ca. 15km.
– A subjective sentence does not mention any fact.
• Example: The parking could be free.
3.2 Polarity
– It describes the orientation present in a sentence (i.e. positive, negative, neutral or irrelevant).
16. Characterisation of UGC
3.3 Intensity (strength of the polarity)
– It refers to the strength of the private state that is being expressed, in other words, how strong an emotion or a conviction of belief is (Wilson, 2008).
– It describes how intense the experience of using a product or service was:
• very positive, positive, neutral, negative and very negative.
• Example: "Very kindly staff." refers to a very positive impression of the staff service.
17. Characterisation of UGC
4 - Opinion Holder (H)
– The holder of a particular opinion is the person or the organisation that holds the opinion (Ding et al., 2008).
– A holder is identified with demographic characteristics (e.g. name, city and country).
– Sites such as tripadvisor.com and booking.com classify holders as types including:
• families with older children
• families with young children
• mature couples
• groups of friends
• solo travellers
• young couples
18. Characterisation of UGC
5 - Source (S)
– An information source is a web site which provides a set of reviews.
• tripadvisor.com
• booking.com
• amazon.com
Other elements:
• A: Attitude
• SG: Suggestion
• R: Recommendation
• I: Intention
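The characterisation {O,F,SO,H,S,A,R,I,SG} can be read as a record type. The following is only a minimal sketch of one possible representation, not the conceptual data model presented in the workshop: the class and field names are assumptions, and because the slides only name A, R, I and SG without defining them, those elements are left as optional free text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SentenceOrientation:
    """SO: one sentence of the review and its evaluation."""
    text: str
    objective: bool  # 3.1 Objectivity: True if the sentence mentions facts
    polarity: str    # 3.2 Polarity: positive, negative, neutral or irrelevant
    intensity: str   # 3.3 Intensity: very negative ... very positive


@dataclass
class Review:
    """A review r as the set {O, F, SO, H, S, A, R, I, SG} (hypothetical field names)."""
    obj: str                              # O: product or service under review
    features: List[str]                   # F: explicit or implicit features
    sentences: List[SentenceOrientation]  # SO: one entry per sentence
    holder: str                           # H: opinion holder
    source: str                           # S: web site providing the review
    attitude: Optional[str] = None        # A: only named in the slides
    recommendation: Optional[str] = None  # R: only named in the slides
    intention: Optional[str] = None       # I: only named in the slides
    suggestion: Optional[str] = None      # SG: only named in the slides


# One review built from examples used in the slides.
example = Review(
    obj="hotel",
    features=["staff"],
    sentences=[
        SentenceOrientation("Very kindly staff.", False, "positive", "very positive")
    ],
    holder="solo traveller",
    source="booking.com",
)
```

Keeping the sentence-level evaluation in its own type mirrors the slides, where objectivity, polarity and intensity are all perspectives on a single sentence.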
21. More limitations
• Currently, web agents are unable to answer questions such as:
– Which hotels in Rome have the longest indoor swimming pool opening hours?
– Which hotels in Lisbon have the cheapest breakfast?
– Which are the cheapest hotels in Barcelona with a family suite with a sea view?
22. Knowledge Engineering
• Ontology as a support to evaluate UGC
– Set of concepts for a specific domain
– Human and machine readable
– Support for fine-grained analysis of the instances (e.g. reviews)
– Hontology (H stands for hotel, hostal and hostel)
• A robust, coherent and multilingual representation of the accommodation sector.
24. Knowledge Engineering
• Development Methodology
– Identify existing ontologies on related domains
– Select the main concepts and properties
– Organize concepts and properties hierarchically into categories
– Translate the ontology (manual)
– Expand concepts and properties based on comments
– Translate the new concepts and properties (manual)
– Generate the ontology in several formats
Chaves, M. S. and Trojahn, C. Towards a Multilingual Ontology for Ontology-driven Content Mining in Social Web Sites. Proc. of the ISWC 2010 Workshops, Volume I, 1st International Workshop on Cross-Cultural and Cross-Lingual Aspects of the Semantic Web. Shanghai, China, November 7th, 2010.
25. Knowledge Engineering
• Hontology
– A multilingual ontology for the accommodation sector.
• Demo Protégé
Chaves, M. S.; Freitas, L. A. and Vieira, R. (2012). Hontology: A multilingual ontology for the accommodation sector. 4th International Conference on Knowledge Engineering and Ontology Development, Barcelona, Spain, 4-7 October. (Submitted)
26. Knowledge Engineering
Preliminary Hontology Statistics
Metrics                                 Value
Number of Concepts                      285
Number of Object Properties             10
Number of Data Properties               31
Concept Axioms
  Subconcept axioms                     270
  Equivalent concepts axioms            4
  Disjoint concepts axioms              93
Object Property Axioms
  Functional object property axioms     6
  Object property domain axioms         11
  Object property range axioms          8
Data Property Axioms
  Functional data property axioms       12
  Object data domain axioms             17
  Object data range axioms              1
27. Hands-on Session
• The aim of this hands-on session is to let you think in depth about UGC in the context of the accommodation sector.
• You are going to receive a set of 4 or 5 reviews about accommodations and should evaluate each one according to the following parameters:
– Features present in the review (see the concepts of Hontology)
– Intensity (Strength of the Polarity): (very negative, negative, neutral, positive, very positive)
• Notes:
– Evaluate one feature per line.
– Please save your sheet in another file and send it to mschaves@gmail.com. Subject: UB:GX
– X = number of the group.
29. Sentiment Analysis
• Analysis and automatic extraction of Semantic Orientation
• Semantic orientation refers to the polarity and strength of words, phrases, or texts.
• Approaches
– Lexicon-based
• Dictionaries of words annotated with the word's semantic orientation, or polarity.
• A manually built dictionary provides a solid foundation for a lexicon-based approach (Taboada et al., 2011).
– Statistical or Machine-learning
• Supervised classification
30. Sentiment Analysis
• Lexicon-based Approach
– Sentiment-bearing words: a list of nouns, verbs, adjectives and adverbs.
• Chesley et al. (2006) use verbs and adjectives to classify English opinionated blog texts.
– List of conjunctions and connectives (Liu, 2010).
– Use of auxiliary verbs to get features and opinion-oriented words about products from texts (Khan et al., 2010).
31. Sentiment Analysis
• Seed words
– are a small set of words with strong negative or positive associations, such as excellent or abysmal.
– In principle, a positive adjective should occur more frequently alongside the positive seed words, and thus will obtain a positive score, whereas negative adjectives will occur most often in the vicinity of negative seed words, thus obtaining a negative score (Taboada et al. 2011).
• Example: This restaurant has a bad and expensive food.
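The seed-word idea can be illustrated with a simple co-occurrence count. This is a minimal sketch under stated assumptions, not the method of Taboada et al.: the function name `seed_score` is hypothetical, the seed sets are tiny (excellent and abysmal come from the slide, good and bad are added here for illustration), and "vicinity" is simplified to "same sentence".

```python
import re

# Seed words: "excellent" and "abysmal" are from the slide;
# "good" and "bad" are assumptions added for this illustration.
POSITIVE_SEEDS = {"excellent", "good"}
NEGATIVE_SEEDS = {"abysmal", "bad"}


def seed_score(word, sentences):
    """Score a word by the seed words that share a sentence with it.

    Each positive seed in the same sentence adds 1, each negative seed
    subtracts 1. A score above zero suggests a positive word.
    """
    score = 0
    for sentence in sentences:
        tokens = set(re.findall(r"\w+", sentence.lower()))
        if word in tokens:
            score += len(tokens & POSITIVE_SEEDS) - len(tokens & NEGATIVE_SEEDS)
    return score


corpus = ["This restaurant has a bad and expensive food."]
expensive_score = seed_score("expensive", corpus)
```

With the single example sentence from the slide, expensive shares a sentence with the negative seed bad and so receives a negative score. A realistic corpus would need many sentences before such counts become meaningful.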
32. Sentiment Analysis
• Part-of-Speech (PoS)
– In order to evaluate a sentence in a review, we should consider the parts of speech mentioned, such as adjectives, adverbs and verbs.
– Adjectives are classified as:
• positive (good, excellent and clean),
• negative (awful, boring and terrible),
• neutral (regular and indifferent) and
• dual, which can express positive and negative opinion (small, long).
– In some approaches nouns are represented by concepts of a domain ontology and mapped as features.
33. Sentiment Analysis
• Conjunction and Connective (CC)
– Connectives are words that help identify additional adjective opinion words and their orientations.
– One of the constraints concerns the conjunction and: conjoined adjectives usually have the same orientation (Liu, 2010).
• Example: This room is beautiful and spacious.
– If beautiful is known to be positive, it can be inferred that spacious is also positive.
– Heuristic:
• People usually express the same opinion on both sides of a conjunction.
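The conjunction constraint lends itself to a short sketch. The function below is only an illustration of the heuristic, with assumptions stated plainly: the name `propagate_and` is hypothetical, the input is already tokenised, and any word directly next to "and" is treated as a conjoined adjective (a real system would check part-of-speech tags first).

```python
def propagate_and(tokens, lexicon):
    """Infer the polarity of unknown words conjoined by 'and' with known ones.

    tokens  -- list of lower-cased words of one sentence
    lexicon -- dict mapping known adjectives to a polarity value
    Returns a new dict containing the known and the inferred polarities.
    """
    inferred = dict(lexicon)
    for i in range(1, len(tokens) - 1):
        if tokens[i] != "and":
            continue
        left, right = tokens[i - 1], tokens[i + 1]
        if left in inferred and right not in inferred:
            inferred[right] = inferred[left]   # copy polarity left to right
        elif right in inferred and left not in inferred:
            inferred[left] = inferred[right]   # copy polarity right to left
    return inferred


sentence = "this room is beautiful and spacious".split()
polarities = propagate_and(sentence, {"beautiful": 1})
```

Here spacious inherits the positive polarity of beautiful, as in the example from the slide. Other connectives such as but would need their own rules.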
34. Sentiment Analysis
• Conjunction and Connective (CC)
– Rules or constraints are also designed for other connectives (e.g. or, but, either-or, and neither-nor).
• Example: This hotel is beautiful but difficult to get there.
– What follows the connective but is an indicator of a negative opinion.
35. Sentiment Analysis
• Strength of the Polarity or Intensity or Intensification
– Amplifiers (very, a lot) increase the semantic intensity of a neighboring lexical item;
– Attenuators/Downtoners (a little, slightly) decrease it.
• Some approaches have implemented intensifiers using simple addition and subtraction
– if a positive adjective has an SO value of 2:
• an amplified adjective would have an SO value of 3, and
• a downtoned adjective an SO value of 1.
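The addition and subtraction scheme can be written down directly. This is a minimal sketch: the function name `apply_intensifier` is hypothetical, only single-word modifiers are handled, and the treatment of negative adjectives (moving further from or closer to zero) is an assumption made here by symmetry, since the slide only gives the positive case.

```python
AMPLIFIERS = {"very"}       # the slide also lists the two-word amplifier "a lot"
DOWNTONERS = {"slightly"}   # the slide also lists the two-word downtoner "a little"


def apply_intensifier(so_value, modifier=None):
    """Shift a semantic orientation (SO) value by one step.

    Amplifiers move the value one step away from zero, downtoners one
    step towards zero. Neutral values and unknown modifiers are unchanged.
    """
    if so_value == 0 or modifier is None:
        return so_value
    step = 1 if so_value > 0 else -1   # direction away from zero
    if modifier in AMPLIFIERS:
        return so_value + step
    if modifier in DOWNTONERS:
        return so_value - step
    return so_value


amplified = apply_intensifier(2, "very")      # positive adjective, amplified
downtoned = apply_intensifier(2, "slightly")  # positive adjective, downtoned
```

For the slide's example of a positive adjective with an SO value of 2, the amplified value is 3 and the downtoned value is 1.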
36. Sentiment Analysis
• Negation
– The obvious approach to negation is simply to reverse the polarity of the lexical item next to a negator, changing good (+3) into not good (−3).
– Negators include not, none, nobody, never and nothing, as well as other words such as without or lack.
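Polarity reversal can be sketched in a few lines. The example below is an illustration only: the function name `score_tokens` is hypothetical, the input is already tokenised, and "next to a negator" is simplified to "immediately preceded by a negator".

```python
# Negators listed on the slide.
NEGATORS = {"not", "none", "nobody", "never", "nothing", "without", "lack"}


def score_tokens(tokens, lexicon):
    """Return (word, SO value) pairs, reversing polarity after a negator.

    tokens  -- list of lower-cased words of one sentence
    lexicon -- dict mapping sentiment-bearing words to SO values
    """
    scores = []
    for i, token in enumerate(tokens):
        if token not in lexicon:
            continue
        value = lexicon[token]
        if i > 0 and tokens[i - 1] in NEGATORS:
            value = -value   # simple reversal: good (+3) becomes not good (-3)
        scores.append((token, value))
    return scores


negated = score_tokens("the breakfast was not good".split(), {"good": 3})
```

This reproduces the slide's example, where good (+3) preceded by not becomes −3.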
37. Polarity Recognizer in Portuguese (PIRPO)
• Polarity Recognizer in Portuguese to classify sentiment in online reviews.
• PIRPO was built from the ground up for Portuguese to recognise the polarity of user opinions in accommodation reviews.
• Each review is analysed according to concepts from a domain ontology.
• We decompose the review into sentences in order to assign a polarity to each concept of the ontology in the sentence.
Chaves, M. S., Freitas, L., Souza, M. and Vieira, R. PIRPO: An Algorithm to deal with Polarity in Portuguese Online Reviews from the Accommodation Sector. 17th International Conference on Applications of Natural Language Processing to Information Systems (NLDB), Groningen, The Netherlands, 26-28 June 2012.
39. PIRPO
• Reviews
– Full dataset: 1500 reviews from January 2010 to April 2011 in Portuguese, English and Spanish, of which 180 are in Portuguese.
• Ontology Concepts
– The concepts used to classify the reviews are provided by Hontology, which in its current version has 110 concepts.
40. PIRPO
• List of adjectives: it is composed of sentiment-bearing words.
– This list of polar adjectives in Portuguese
• contains 30,322 entries.
• is composed of the name of the adjective and a polarity that takes one of three values: +1, -1 and 0.
• These values correspond to the positive, negative and neutral senses of the adjective.
– PIRPO uses this list to calculate the semantic orientation of the concepts found in the sentences.
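The general idea of combining ontology concepts with a polar adjective list can be sketched as follows. This is not the published PIRPO algorithm, whose details are not given in these slides; it is a simplified illustration under several assumptions: the function name `concept_polarity` is hypothetical, English words stand in for the Portuguese lexicon, concepts are matched as single words, and every adjective in a sentence is attributed to every concept in that sentence.

```python
import re
from collections import defaultdict


def concept_polarity(review, concepts, adjectives):
    """Sum adjective polarities per ontology concept, sentence by sentence.

    review     -- review text
    concepts   -- set of concept labels (stand-ins for ontology concepts)
    adjectives -- dict mapping adjectives to +1, -1 or 0
    Returns a dict mapping each concept found to its summed polarity.
    """
    totals = defaultdict(int)
    for sentence in re.split(r"[.!?]+", review):
        tokens = re.findall(r"\w+", sentence.lower())
        found = [t for t in tokens if t in concepts]
        score = sum(adjectives.get(t, 0) for t in tokens)
        for concept in found:
            totals[concept] += score
    return dict(totals)


result = concept_polarity(
    "The staff was friendly. The pool was dirty and small.",
    concepts={"staff", "pool"},
    adjectives={"friendly": 1, "dirty": -1, "small": 0},
)
```

In this toy review, staff ends up positive and pool negative, which shows how one review can carry a different polarity for each concept.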
44. PIRPO: Discussion on the Results
• PIRPO reached a better recall for concepts with positive polarity, while mixed polarity had a higher precision.
• The low F-score may be mainly because the algorithm assigned a polarity to a specific concept of the ontology, while the human annotator classified the review as a whole.
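For reference, precision, recall and F-score are usually defined as below, where TP, FP and FN are the numbers of true positives, false positives and false negatives for a given polarity class. The slides do not state which F-score variant was used; the balanced F1 is assumed here.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \, P \, R}{P + R}
```

Because F1 is the harmonic mean of P and R, a class with high recall but low precision (or the reverse) still obtains a low F-score.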
45. Outline
Part 1
• Workshop Context
• User-Generated Content (UGC)
• Characterisation of UGC
• Knowledge Engineering - Ontology Development
• Hands-on Session (Individual Task): Dealing with UGC
Part 2
• Knowledge Engineering - Modelling UGC
• Sentiment Analysis/Opinion Mining
• Polarity Recognizer in Portuguese (PIRPO)
• Information Visualisation
47. Information Visualisation
• What is the visual model of the potential end-user?
• How should we properly map and render:
– the most valued accommodation features?
– the perception of the quality offered by the hotel?
– the correlation between the guest's profile and the most relevant features?
– the intensity of the positivity or negativity of the features?
• Will the use of advanced visual techniques (such as tree-oriented ones) to map the results help accommodation managers and guests gain better insight into the data?
48. Exploring Information Visualisation
• In the next figures
– Color is used to map the polarity and the strength of the polarity of the concepts of the ontology (CO).
– Size is used to map how frequently the CO is mentioned in the reviews.
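The two visual encodings can be sketched as a small mapping function. This is only an illustration, not the rendering used for the figures: the function name `bubble`, the five-step intensity scale from -2 to +2, the colour names and the linear radius scaling are all assumptions.

```python
# Hypothetical palette: intensity from -2 (very negative) to +2 (very positive).
PALETTE = {
    -2: "dark red",
    -1: "light red",
    0: "grey",
    1: "light green",
    2: "dark green",
}


def bubble(concept, intensity, mentions, max_mentions, max_radius=40.0):
    """Describe one bubble: colour encodes polarity strength, size encodes frequency."""
    return {
        "label": concept,
        "colour": PALETTE[intensity],
        "radius": max_radius * mentions / max_mentions,
    }


pool_bubble = bubble("pool", intensity=-2, mentions=5, max_mentions=10)
```

A concept mentioned in half as many reviews as the most frequent one gets half the maximum radius, and its colour depends only on the intensity value.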
49. Exploring Information Visualisation
Result of applying the Bubble Tree visualisation to the relation among concepts of the ontology, polarity (left) and strength of the polarity (right).
• Carvalho, E.; Chaves, M. S., 2012. Exploring User-Generated Data Visualization in the Accommodation Sector. 16th International Conference Information Visualisation, IEEE. (Submitted)
50. Exploring Information Visualisation
Results using the Treemap visualisation of the relation among type of customer, concepts of the ontology and polarity.
52. Final Remarks
• In-depth analysis of UGC can be used as input to improve decision making.
• It is time to think about new models to store UGC data.
• New algorithms need to be built from the ground up to deal with UGC in languages other than English.
• Information visualisation of UGC is in its infancy.
53. Main References
• Bethard, S.; Yu, H.; Thornton, A.; Hatzivassiloglou, V. and Jurafsky, D., 2004. Automatic extraction of opinion propositions and their holders. In Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text.
• Chesley, P.; Vincent, B.; Xu, L. and Srihari, R., 2006. Using verbs and adjectives to automatically classify blog sentiment. In AAAI Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW), 27–29.
• Ding, X.; Liu, B. and Yu, P. S., 2008. A holistic lexicon-based approach to opinion mining. In Proceedings of the Conference on Web Search and Web Data Mining (WSDM).
• Hu, M. and Liu, B., 2004. Mining opinion features in customer reviews. In Proceedings of AAAI, pp. 755–760.
• Kim, S.-M. and Hovy, E., 2004. Determining the sentiment of opinions. In Proceedings of the International Conference on Computational Linguistics (COLING).
• Liu, B., 2010. Sentiment Analysis and Subjectivity. In Handbook of Natural Language Processing, Second Edition (Eds: N. Indurkhya and F. J. Damerau), CRC Press, Taylor and Francis Group, Boca Raton, FL. Chapter 28.
• Martin, J. R. and White, P. R. R., 2005. The Language of Evaluation, Appraisal in English. Palgrave Macmillan, London & New York.
• Taboada, M.; Brooke, J.; Tofiloski, M.; Voll, K. D. and Stede, M., 2011. Lexicon-based methods for sentiment analysis. Computational Linguistics 37(2), 267–307.
• Tang, H.; Tan, S. and Cheng, X., 2009. A survey on sentiment detection of reviews. Expert Systems with Applications 36(7), 10760–10773.
• Whitelaw, C.; Garg, N. and Argamon, S., 2005. Using appraisal groups for sentiment analysis. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management (CIKM '05). ACM, New York, NY, USA, 625-631.
• Wilson, T., 2008. Fine-Grained Subjectivity Analysis. PhD Dissertation, Intelligent Systems Program, University of Pittsburgh.
• Wilson, T.; Wiebe, J. and Hoffmann, P., 2009. Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics 35, 399–433.
• Wu, Y.; Wei, F.; Liu, S.; Au, N.; Cui, W.; Zhou, H. and Qu, H., 2010. OpinionSeer: Interactive Visualisation of Hotel Customer Feedback. IEEE Transactions on Visualization and Computer Graphics, 6, 1109-1118. Nov-Dec.
54. Open-source sentiment-analysis tools
• Python NLTK (Natural Language Toolkit), http://www.nltk.org and http://text-processing.com/demo/sentiment
• R, TM (text mining) module, http://cran.r-project.org/web/packages/tm/index.html
• RapidMiner, http://rapid-i.com/content/view/184/196/
• GATE, the General Architecture for Text Engineering, http://gate.ac.uk/sentiment
• UIMA plug-in annotators for sentiment (Apache UIMA is the Unstructured Information Management Architecture), http://uima.apache.org/
• Sentiment classifiers for the WEKA data-mining workbench, http://www.cs.waikato.ac.nz/ml/weka/
• Stanford NLP tools, http://www-nlp.stanford.edu/software: maximum-entropy classification approach for sentiment.