11/18/2015
Analyze Twitter Data with Hortonworks Hadoop
Intermediate Project Report
Bharat Khanna
UNIVERSITY AT BUFFALO
Sentiment Analysis of Mr. Narendra Modi’s Brand Image using Twitter Data
Summary: - I am performing sentiment analysis of Mr. Narendra Modi’s brand image across
different nations using data from Twitter. To fetch the Twitter data, I am using Apache
Flume, which is open source and comes installed by default in the Hortonworks Sandbox
platform 1.3.
After the data is fetched from Twitter, it will be loaded directly into HDFS (Hadoop Distributed
File System). This avoids the extra overhead of transferring the data from the local
system to HDFS.
The data loaded into HDFS is still unstructured and not suitable for ad-hoc analysis, so I will
convert the JSON data to tabular format and store it in Hive. I will also provide
a graphical user interface so that end users can run their own ad-hoc analyses.
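The JSON-to-tabular conversion can be illustrated with a minimal Python sketch. This is not the Hive SerDe used in the project; it only shows the idea of turning one nested tweet record into one flat row. The function name flatten_tweet and the choice of columns are assumptions made for illustration; the field names follow the commonly seen layout of Twitter's JSON tweet objects.

```python
import json


def flatten_tweet(raw_line):
    """Turn one JSON-encoded tweet into a flat dict, i.e. one table row.

    The selected columns are an illustrative assumption, not the
    project's actual Hive schema.
    """
    tweet = json.loads(raw_line)
    user = tweet.get("user") or {}  # nested object; may be missing
    return {
        "id": tweet.get("id"),
        "created_at": tweet.get("created_at"),
        "text": tweet.get("text"),
        "screen_name": user.get("screen_name"),
        "time_zone": user.get("time_zone"),
        "retweet_count": tweet.get("retweet_count", 0),
    }


# Hypothetical sample record, made up for this example.
sample = (
    '{"id": 1, "created_at": "Wed Nov 18 10:00:00 +0000 2015", '
    '"text": "hello", "retweet_count": 2, '
    '"user": {"screen_name": "alice", "time_zone": "Chennai"}}'
)
row = flatten_tweet(sample)
print(row["screen_name"], row["retweet_count"])
```

Each nested attribute becomes an ordinary column, which is what makes the data usable for ad-hoc queries.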
The next step uses the dictionary file to score the sentiment of each tweet by comparing the
number of positive words with the number of negative words, and then assigns a
positive, negative, or neutral sentiment value to each tweet. I downloaded the dictionary
file from the link below.
Click here for Dictionary
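The scoring rule described above can be sketched in Python as follows. This is a minimal illustration, not the project's Hive implementation. The two word sets are tiny hypothetical stand-ins for the downloaded dictionary file, and the tokenization (lowercasing and splitting on non-letters) is an assumption.

```python
import re

# Hypothetical stand-ins for the positive and negative word lists
# that would be loaded from the dictionary file.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def score_tweet(text, positive=POSITIVE, negative=NEGATIVE):
    """Label a tweet by comparing its positive and negative word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if w in positive)
    neg = sum(1 for w in words if w in negative)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # equal counts, including no matches at all


print(score_tweet("Great speech, love it"))
print(score_tweet("Bad and poor planning"))
print(score_tweet("Good idea but bad timing"))
```

A tweet with equal counts, or with no dictionary words at all, is treated as neutral in this sketch.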
The last part of the project is to show the results of the sentiment analysis as visualizations,
for which I will use Tableau. I will connect Tableau to Hive using the Hortonworks ODBC
Driver, which I downloaded from the Hortonworks website (link in the References section).
I will present the results of the analysis as graphs and maps using Tableau’s built-in VizQL
server.
Data sets and Software:
Sentiment Data: - Sentiment data is unstructured data that represents the opinions, emotions,
and attitudes contained in sources such as social media posts, online blogs, and product
reviews.
Why use sentiment data: - Organizations use sentiment data to learn how people feel about
their product and what they can do to market it effectively.
How I fetched the Twitter data: - I created a Twitter app, configured flume.conf with the app
credentials, and ran Flume. All the steps for fetching data from Twitter using Apache Flume
are covered in a YouTube video and a slide deck, linked below. I have also uploaded the
video to the UBlearns discussion forum of DC.
YouTube: - https://youtu.be/E1w5SkE7Cco
SlideShare: - http://www.slideshare.net/bharat3khanna/extracting-twitter-data-using-apache-flume
Source code for Flume-Snapshot.jar: - I downloaded the source code of Flume-snapshot.jar from GitHub
and built the jar with Maven (mvn package) in the Hadoop cluster.
Click here for Flume Source Code
Size of Data: - Although there is no limit on the amount of data I can get from Twitter, for this
project I am going to analyze approximately 100 MB of data.
Algorithms Used: - I am not using a MapReduce algorithm here, since I want to analyze the complete
data and do not want to use aggregated measures. Had I used MapReduce, much of my
data would have been aggregated by the reducer. My source data is in JSON format, and I am using
Hive-serde.jar (SerDe stands for serializer and deserializer), which helps in parsing the JSON data
into Hive tables.
Source code for Hive-serde.jar: - I downloaded the source code of Hive-serde.jar from GitHub and built the
jar with Maven (mvn package) in the Hadoop cluster.
Click here for Hive-serde.jar source code
Analysis to be done on Twitter data: - I am going to perform the following analyses using Hive and Tableau:
a) Maximum tweet count per user.
b) Count of retweets.
c) Geographic mapping of people’s sentiments towards Mr. Modi.
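Analyses a) and b) can be sketched in Python over already flattened tweet rows. In the project these would be Hive queries, so this only illustrates the counting logic. The row layout (screen_name, text) and the rule that a retweet is a tweet whose text starts with "RT @" are assumptions made for this example.

```python
from collections import Counter


def tweets_per_user(rows):
    """Count how many tweets each user posted."""
    return Counter(row["screen_name"] for row in rows)


def count_retweets(rows):
    """Count tweets whose text starts with 'RT @' (assumed retweet marker)."""
    return sum(1 for row in rows if row["text"].startswith("RT @"))


# Hypothetical rows, made up for this example.
rows = [
    {"screen_name": "alice", "text": "Watching the speech"},
    {"screen_name": "bob", "text": "RT @alice: Watching the speech"},
    {"screen_name": "alice", "text": "Great turnout today"},
]

counts = tweets_per_user(rows)
top_user, top_count = counts.most_common(1)[0]
retweets = count_retweets(rows)
print(top_user, top_count, retweets)
```

Here most_common(1) returns the user with the maximum tweet count, which corresponds to analysis a).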
References: -
http://blog.cloudera.com/blog/2012/09/analyzing-twitter-data-with-hadoop
https://github.com/cloudera/cdh-twitter-example
https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#lexicon
http://hortonworks.com/products/releases/hdp-1-3/#add_ons
