Tweets Classification
Supervisor - Dr. Vikas Saxena
Name - Shubhangi Agarwal
Varun Ajay Gupta
Enrolment No. – 10104768
10104730
Introduction
• We live in an era of social networking, which is
why our project focuses on Twitter. In this project
we extract tweets and classify them into different
categories. Extracting tweets also gives us a large
amount of associated information.
• Using tweet classification we can identify current
trends, such as the most popular language on
Twitter, the most talked-about people, hot topics
and much more.
5/29/2014Footer Text 2
Problem Statement
• Extraction of tweets.
• Converting unstructured data into structured data.
• Pre-processing of data.
• Finding the most popular language on Twitter.
• Choosing features for the classification.
• Classifying the tweets into different categories.
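The structuring and pre-processing steps listed above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes each raw tweet arrives as one JSON string with the fields `id_str`, `lang`, `text` and `user.screen_name`, and the helper names `to_record` and `preprocess` are hypothetical.

```python
import json
import re

def to_record(raw_line):
    """Turn one raw JSON tweet (a string) into a flat, structured record."""
    tweet = json.loads(raw_line)
    return {
        "id": tweet.get("id_str", ""),
        "user": tweet.get("user", {}).get("screen_name", ""),
        "lang": tweet.get("lang", "und"),  # "und" = undetermined (assumed default)
        "text": tweet.get("text", ""),
    }

def preprocess(text):
    """Lower-case the text and strip URLs, @mentions and extra whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace

# Hypothetical raw tweet, used only to show the shape of the data.
raw = ('{"id_str": "1", "lang": "en", "user": {"screen_name": "fan"}, '
       '"text": "Great MATCH today @team http://t.co/x #cricket"}')
record = to_record(raw)
clean = preprocess(record["text"])
print(record["lang"], "|", clean)
```

Hashtags are kept on purpose here, since the number of hashtags is later used as a feature.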
Algorithm
• SVMs (support vector machines) are supervised
learning models with associated
learning algorithms that analyse data and
recognize patterns; they are used for classification
and regression analysis.
• Given a set of training examples, each marked as
belonging to one of two categories, an SVM
training algorithm builds a model that assigns new
examples to one category or the other.
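To make the two-category idea concrete, here is a minimal sketch of a linear SVM trained by stochastic subgradient descent on the regularised hinge loss. It is an illustration under stated assumptions, not the implementation used in the project: the function names, the hyperparameters (`lam`, `lr`, `epochs`) and the toy data are all hypothetical, and labels are encoded as +1 and -1.

```python
def train_linear_svm(X, y, lam=0.001, lr=0.1, epochs=100):
    """Fit weights w and bias b by subgradient descent on the hinge loss.

    X is a list of feature vectors, y a list of labels in {-1, +1}.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The regulariser always shrinks the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Examples inside the margin (or misclassified) move the boundary.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Toy data (hypothetical): [sports word count, politics word count].
X = [[3, 0], [4, 1], [0, 3], [1, 4]]
y = [1, 1, -1, -1]  # +1 = sports, -1 = politics
w, b = train_linear_svm(X, y)
print(predict(w, b, [6, 0]), predict(w, b, [0, 6]))
```

Because a single SVM separates only two categories, handling three categories such as sports, politics and entertainment needs an extension, for example training one such classifier per category (one-vs-rest); whether the project did this is not stated in the slides.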
Why SVM?
• Widely used for text classification.
• Often achieves high accuracy in comparison to
other algorithms.
• With well-chosen features, an SVM can be robust
even when the training sample has some bias.
Technology Used
• Operating System: Ubuntu 12.04
• Language: Python
• Tools: gedit
• Debugger: Python debugger
Unstructured Tweets
Structured Tweets
Calculating the most popular
language on Twitter
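One simple way to compute this is to count the language code attached to each structured tweet. The sketch below assumes every record is a dictionary with a `lang` key; the function name `language_counts` and the sample records are hypothetical.

```python
from collections import Counter

def language_counts(records):
    """Count how many tweets carry each language code."""
    return Counter(r.get("lang", "und") for r in records)

# Hypothetical structured tweets; only the language field matters here.
sample = [{"lang": "en"}, {"lang": "es"}, {"lang": "en"}, {"lang": "ja"}]
counts = language_counts(sample)
top_language, top_count = counts.most_common(1)[0]

# Share of each language, as a fraction of all tweets.
shares = {lang: n / len(sample) for lang, n in counts.items()}
print(top_language, top_count, shares)
```

The `shares` dictionary is the kind of data a bar or pie chart of language popularity would be drawn from.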
Pictorially showing
popularity of languages
Features Chosen
• Number of sports words.
• Number of politics words.
• Number of entertainment words.
• Lexical complexity.
• Number of hashtags.
• Number of digits.
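The six features above can be turned into a numeric vector per tweet roughly as follows. This is a sketch under assumptions: the three word lists are tiny hypothetical stand-ins for real vocabularies, and since the slides do not define lexical complexity, it is taken here to be the ratio of unique words to total words.

```python
import re

# Hypothetical keyword lists; a real system would use much larger vocabularies.
SPORTS = {"match", "goal", "cricket", "team"}
POLITICS = {"election", "minister", "vote", "party"}
ENTERTAINMENT = {"movie", "song", "actor", "album"}

def extract_features(text):
    """Map a tweet to [sports, politics, entertainment, complexity, hashtags, digits]."""
    words = re.findall(r"[a-z']+", text.lower())
    # Assumed definition: unique words divided by total words.
    complexity = len(set(words)) / len(words) if words else 0.0
    return [
        sum(w in SPORTS for w in words),
        sum(w in POLITICS for w in words),
        sum(w in ENTERTAINMENT for w in words),
        complexity,
        len(re.findall(r"#\w+", text)),
        sum(ch.isdigit() for ch in text),
    ]

features = extract_features("India won the match by 3 goals #cricket #win")
print(features)
```

Vectors of this form are what a classifier such as an SVM takes as input for both the training and the testing set.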
Values of features of
training set
Feature values of testing data
set before application of SVM
Result of classification of
tweets
Graph of SVM and
accuracy
Conclusion
Applied to the testing dataset, the SVM classifies
the tweets into the sports, entertainment and
politics categories with an accuracy of 97.5%.
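Accuracy here means the fraction of test tweets whose predicted category matches the true one. The sketch below shows that calculation only; the labels are hypothetical and are not the project's test data.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy of an empty test set")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical labels: four of the five predictions are correct.
truth = ["sports", "politics", "entertainment", "sports", "politics"]
predicted = ["sports", "politics", "entertainment", "sports", "sports"]
acc = accuracy(truth, predicted)
print("{:.1%}".format(acc))
```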
Future Work
• So far we have implemented the SVM to classify
tweets into general categories such as sports,
politics and entertainment. We will try to extend it
to categorise data into more specific categories so
that it can be used by the marketing and PR teams
of different organizations when choosing
their strategies.
Thank You
