 Francesco Colace, Massimo De Santo, Luca Greco
DIEM –Università degli Studi di Salerno
{fcolace, desanto, lgreco}@unisa.it
ACII 2013 – Geneva, 2-5 September 2013
 Web 2.0 (or Web X.Y) rules!
 Social networks, blogs, microblogs, review collector sites: a huge
quantity of heterogeneous and opinionated data
 Open issues:
o How to manage this information?
o How to extract the sentiment inside the data?
o How to understand something about the users?
o How to evaluate the opinion of people about some topics or
products?
 Sentiment Analysis
 Brief introduction to Sentiment Analysis
o Related Work
 Towards a Sentiment Analysis Framework
o The Proposed Approach
• The LDA Approach
• The Mixed Graph of Terms
• A sentiment mining algorithm
 Experimental results
 Conclusions and Future Work
 Sentiment:
o a thought, view, or attitude, especially one based mainly on emotion
instead of reason
 Sentiment Analysis (also known as Opinion Mining):
o use of Natural Language Processing (NLP) and computational
techniques to automate the extraction and classification of sentiment
from unstructured texts
 Consumer information
o Product reviews (Amazon, eBay, …)
 Marketing
o Consumer attitudes
o Trends
 Politics
o Politicians want to know voters’ points of view
o Voters want to know politicians’ stances and who else supports them
 Social
o Find like-minded individuals or communities
 Which features to adopt?
o Words
o Sentences
 How to interpret features for sentiment detection?
o As a bag of words
o By the use of annotated lexicons
o According to syntactic patterns
o Analyzing the paragraph structure
 Naïve Bayes
 Maximum Entropy Classifier
 SVM
 Markov Blanket Classifier
 … … …
 Latent Dirichlet Allocation (LDA)
 With the Bag of Words approach, a document is represented as an
unordered collection (multiset) of its words
 Problems:
o Which words best express the sentiment in a text?
o How to compare the various «bags of words» derived from texts with the
same sentiment?
o Can a bag of words represent the documents’ domain of interest?
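The bag-of-words representation discussed above can be sketched in a few lines. This is only an illustration: the regex tokenisation, the lowercasing, the function name `bag_of_words` and the sample sentences are assumptions, not taken from the slides. It shows that two texts with different word order produce the same bag, which is one reason the representation alone says little about sentiment.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Return a multiset of lowercase word tokens; word order is discarded."""
    # Assumed tokenisation: runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Hypothetical sample texts with the same words in a different order.
a = bag_of_words("Great movie, great cast")
b = bag_of_words("great cast, great movie")

# Only the word counts are kept, so the two bags compare equal.
same = (a == b)
```

A `Counter` keeps only word frequencies, so any comparison between texts has to be built on top of those counts.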
 The mixed Graph of Terms (mGT) is a «graph based» representation
of documents
 In the proposed approach, a mixed Graph of Terms is obtained
by automatically extracting words with probabilistic
clustering techniques such as Latent Dirichlet Allocation (LDA)
 In a mixed Graph of Terms the words are linked according to
their mutual occurrence probability, and «aggregating_word»
and «aggregated_words» nodes can be distinguished
 Our proposal: a mixed Graph of Terms can be used as a
«sentiment filter»
 In the proposed approach, two different layers can be recognized
in a mixed Graph of Terms:
 The Aggregator Layer: the words with a higher degree of
interconnection with the other words in the documents
 The “Aggregated Words” Layer: the words with a higher degree of
interconnection with one or more aggregator words
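The two-layer structure described above can be illustrated with a small sketch. Note the assumptions: the slides derive the graph with LDA, whereas this example swaps in plain document-level co-occurrence frequencies as edge weights, and the function name `build_term_graph`, its parameters and the toy documents are all hypothetical. It is meant only to show the shape of the data structure, not the authors' extraction procedure.

```python
from collections import Counter
from itertools import combinations

def build_term_graph(docs, n_aggregators=2, n_aggregated=2):
    """docs: list of token lists. Returns {aggregator: [(word, weight), ...]}."""
    n_docs = len(docs)
    pair_counts = Counter()
    for tokens in docs:
        # Count each unordered word pair once per document.
        for u, v in combinations(sorted(set(tokens)), 2):
            pair_counts[(u, v)] += 1
    # Edge weight (assumption): empirical probability that both words
    # occur in the same document.
    weight = {pair: c / n_docs for pair, c in pair_counts.items()}
    # Degree of interconnection (assumption): sum of a word's edge weights.
    strength = Counter()
    for (u, v), w in weight.items():
        strength[u] += w
        strength[v] += w
    aggregators = [w for w, _ in strength.most_common(n_aggregators)]
    graph = {}
    for agg in aggregators:
        neighbours = []
        for (u, v), w in weight.items():
            if agg in (u, v):
                other = v if u == agg else u
                neighbours.append((other, w))
        # Strongest links first; ties broken alphabetically.
        neighbours.sort(key=lambda item: (-item[1], item[0]))
        graph[agg] = neighbours[:n_aggregated]
    return graph

# Hypothetical toy corpus of tokenised documents.
docs = [
    ["great", "movie", "cast"],
    ["great", "movie", "plot"],
    ["great", "acting"],
]
graph = build_term_graph(docs, n_aggregators=1, n_aggregated=2)
```

In this toy corpus "great" is the most interconnected word, so it becomes the single aggregator, with "movie" as its strongest aggregated word.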
 In natural language processing, Latent Dirichlet Allocation (LDA) is a
generative model that allows sets of observations to be explained by
unobserved groups that explain why some parts of the data are similar
 For example, if observations are words collected into documents, it
posits that each document is a mixture of a small number of topics and
that each word's creation is attributable to one of the document's topics
 The basic idea is that the documents are represented as random
mixtures over latent topics, where a topic is characterized by a
distribution over words
 By the use of the Latent Dirichlet Allocation technique a set of
documents can be represented as a mixed Graph of Terms
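The generative story behind LDA described above can be sketched as follows. This is a simplified illustration under stated assumptions: the two topics and their word probabilities are fixed by hand (in full LDA they are themselves drawn from a Dirichlet prior), and the names `sample_dirichlet` and `generate_document` are hypothetical. Only the Python standard library is used.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, n_words, rng):
    """topics: list of {word: probability} dicts, one per latent topic."""
    # Document-specific mixture over the latent topics.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        # Choose the topic responsible for this word.
        z = rng.choices(range(len(topics)), weights=theta)[0]
        vocab = list(topics[z])
        probs = [topics[z][w] for w in vocab]
        # Choose the word from that topic's word distribution.
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

rng = random.Random(0)
# Hand-written toy topics (assumption): each is a distribution over words.
topics = [
    {"great": 0.5, "love": 0.3, "movie": 0.2},
    {"boring": 0.5, "awful": 0.3, "movie": 0.2},
]
theta, doc = generate_document(topics, alpha=[0.5, 0.5], n_words=8, rng=rng)
```

Fitting LDA runs this story in reverse: given only the observed words, it infers the topic mixtures and the topic-word distributions.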
 Step_1: Learn a mixed Graph of Terms from each class of
labelled documents (i.e. Positive or
Negative), obtaining:
o mGT positive
o mGT negative
 Step_2: Use the mixed Graphs of Terms as a filter
to classify the sentiment of texts
o Comparing the concepts that appear both in the mGTs and
in the text
o Comparing the words that appear both in the mGTs and in
the text
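The second step above can be sketched as a simple scoring rule. The slides do not give the actual comparison formula, so everything here is an assumption: each mGT is represented as a hypothetical dictionary of weighted words and weighted word pairs (standing in for "concepts"), a text is scored by summing the weights of the entries it contains, and the label with the higher score wins.

```python
def score(tokens, mgt):
    """Sum the weights of mGT words and word pairs that also occur in the text."""
    present = set(tokens)
    word_score = sum(w for term, w in mgt["words"].items() if term in present)
    pair_score = sum(
        w for (u, v), w in mgt["pairs"].items() if u in present and v in present
    )
    return word_score + pair_score

def classify(text, mgt_positive, mgt_negative):
    """Label a text by comparing its scores against the two graphs."""
    tokens = text.lower().split()
    pos = score(tokens, mgt_positive)
    neg = score(tokens, mgt_negative)
    if pos == neg:
        return "undecided"
    return "positive" if pos > neg else "negative"

# Hypothetical hand-written graphs; real ones would be learned in Step_1.
mgt_positive = {
    "words": {"great": 1.0, "love": 0.8},
    "pairs": {("great", "movie"): 0.6},
}
mgt_negative = {
    "words": {"boring": 1.0, "awful": 0.9},
    "pairs": {("boring", "plot"): 0.7},
}
label = classify("a great movie with a boring ending", mgt_positive, mgt_negative)
```

Because classification is just a lookup against two precomputed graphs, this kind of filter is cheap at prediction time; the cost sits in building the graphs.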
 Dataset: Movie Reviews

Approach                   Accuracy (%)
Support Vector Machine*    82.90
Naive Bayes*               81.50
Maximum Entropy*           81.00
mGT-LDA                    88.50

*[Bo Pang, 2002]
 Dataset: Real Tweets related to Politics
 Training Set: 3980 Tweets
 Test Set: 32185 Tweets

Approach       Accuracy (%)
mGT-LDA        87.10
SVM            79.20
Naive Bayes    76.60
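The accuracy figures reported above are presumably the share of test items whose predicted label matches the reference label; the slides do not define the metric, so that reading is an assumption. A minimal sketch of the computation, with a hypothetical function name and made-up toy labels:

```python
def accuracy(predicted, gold):
    """Percentage of items whose predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("label lists must have the same length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)

# Toy labels for illustration only; they are not data from the slides.
gold = ["positive", "negative", "positive", "negative"]
predicted = ["positive", "negative", "negative", "negative"]
acc = accuracy(predicted, gold)  # 3 of the 4 labels match
```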
http://193.205.190.209/elezioni2013/
[Figure: accuracy plotted against days]
Masterchef - http://193.205.190.209/tvshow/masterchef/
 Pros:
o Language independent
o Fast classification
o Continuous update
o Small training set
 Cons:
o In general, the mGT building process takes a long time
o An annotated lexicon is needed
 To improve the classification by continuously updating
the training set
 To introduce SentiWordNet as the annotated lexicon
 To adopt an ontological formalism for a better
representation of the mGT
 To build a larger dataset of tweets
Don’t forget to tweet your sentiment!!!

A Probabilistic Approach to Tweets' Sentiment Classification - ACII 2013 Conference
