The Python NLP Project
- An analysis of Twitter Data
NAIQING LIN & XIAOYE LI
Background & Introduction
Prevalence of food safety-related crises in recent years
Development of social media and social networking sites
Accessibility of Twitter data through the Twitter API
Research Questions
What are the key features of the food safety information communicated by Twitter users
(e.g., volume, variety, etc.)?
What are the most popular words or phrases in tweets related to food safety crises?
What are the key vertices and edges within the food safety network with regard to information
dissemination?
What are the main clusters, and how do they differ across centrality metrics?
Method
Data Collection
Data extraction with the Python Tweepy package
Keywords: foodsafety, globalfoodsupply, Allergen, Foodcontamination,
Yes2safe, and Foodillness
Length of data collection: 18 hours
Method
Data Description
Pilot test conducted, yielding 117 tweets
Dataset size: 2286 tweets extracted
Small dataset: food safety is a less popular topic on Twitter compared
to other topics (e.g., politics, entertainment, etc.)
Data Analysis
Descriptive Analytics
User Analysis
◦ Unique users: 723
◦ Active users: StarzNsky4u with 207 tweets, NimsJane with 141 tweets
◦ Active users mainly post tweets related to environmental conservation and animal rights advocacy
◦ Visible users: users mentioned more than 200 times by other users
◦ Influential users in the discussion of legal regulations for the American Food Exports Act
◦ Popular languages: English, Spanish
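The user counts above can be reproduced with a simple tally. The following is a minimal sketch, not the authors' code: the records and the field names `user` and `lang` are hypothetical, and real tweet objects would need to be mapped into this shape first.

```python
from collections import Counter

# Hypothetical sample records; the field names "user" and "lang" are assumptions.
tweets = [
    {"user": "alice", "lang": "en"},
    {"user": "bob", "lang": "es"},
    {"user": "alice", "lang": "en"},
    {"user": "carol", "lang": "en"},
    {"user": "alice", "lang": "en"},
]

def summarize_users(records, top_n=2):
    """Return the unique-user count, the most active users, and language counts."""
    per_user = Counter(r["user"] for r in records)
    per_lang = Counter(r["lang"] for r in records)
    return len(per_user), per_user.most_common(top_n), per_lang.most_common()

unique_users, active_users, languages = summarize_users(tweets)
print(unique_users, active_users, languages)
```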
Descriptive Analytics
User Relationship Analysis
◦ Follower and following relationship analysis
◦ A Canadian restaurant owner had the most followers (567510)
◦ The non-profit organization ASPCA, which focuses on fighting animal cruelty, was another
popular user
Descriptive Analytics
Tweet Features
◦ Original tweets: 1288
◦ Retweets: 998
◦ Most popular retweet topics: food allergy, foodborne illness, cross-contamination, horsemeat, and
foodborne illness outbreak
◦ Popular URLs: food allergen, food recall, and food safety-related news websites (992 URLs in
total)
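One way to split a collection into original tweets and retweets and to count URLs is sketched below. It assumes that a retweet is any text starting with `RT @`, which is a common convention rather than something the slides confirm, and the sample texts are invented.

```python
import re

# Matches http and https links up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")

def tweet_features(texts):
    """Count originals, retweets, and URLs, treating a leading 'RT @' as a retweet."""
    retweets = sum(1 for t in texts if t.startswith("RT @"))
    urls = sum(len(URL_PATTERN.findall(t)) for t in texts)
    return {"original": len(texts) - retweets, "retweets": retweets, "urls": urls}

# Invented sample texts for illustration only.
sample = [
    "RT @fda_example: New recall notice https://example.com/recall",
    "Check allergen labels before you buy",
    "Food safety tips https://example.com/a https://example.com/b",
]
features = tweet_features(sample)
print(features)
```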
Content Analytics
Data Preprocessing
◦ Remove URLs and user names
◦ Remove non-alphanumeric content
◦ Text preprocessing (tokenization, lowercase conversion, stopword removal, etc.)
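The preprocessing steps listed above can be sketched with the standard library alone. The stopword list here is a tiny illustrative stand-in, and whitespace tokenization is an assumption; the slides do not say which tokenizer or stopword list was used.

```python
import re

# A tiny illustrative stopword list; a real run would use a fuller list.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "is", "we", "for", "rt"}

def preprocess(text):
    """Clean one tweet and return its lowercase tokens without stopwords."""
    text = re.sub(r"https?://\S+", " ", text)    # remove URLs
    text = re.sub(r"@\w+", " ", text)            # remove user names
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # remove non-alphanumeric characters
    tokens = text.lower().split()                # lowercase and tokenize on whitespace
    return [t for t in tokens if t not in STOPWORDS]

tokens = preprocess("RT @user: We want SAFE food! https://example.com #foodsafety")
print(tokens)
```

URLs are removed before the non-alphanumeric filter so that link fragments do not survive as stray tokens.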
Content Analytics
Word Frequency Analysis
Frequent words
◦ Food
◦ Allergen
◦ Americans
◦ Want
◦ Safe
◦ ……
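A frequency ranking like the one above can be computed by counting tokens across all preprocessed tweets. This is a minimal sketch on invented token lists, not the authors' data.

```python
from collections import Counter

def word_frequencies(token_lists, top_n=3):
    """Count every token across all documents and return the most common ones."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    return counts.most_common(top_n)

# Invented token lists standing in for preprocessed tweets.
docs = [
    ["food", "allergen", "recall"],
    ["americans", "want", "safe", "food"],
    ["safe", "food"],
]
top_words = word_frequencies(docs, top_n=2)
print(top_words)
```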
Content Analytics
Word Frequency Analysis
Frequent bigrams
◦ Legacy, sicken
◦ Watching, want
◦ Want, safe
◦ Safe, floor
◦ Americans, watching
◦ ……
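Bigram counts follow the same pattern as single-word counts, pairing each token with its successor within a tweet. The sketch below uses invented token lists and assumes bigrams never span two tweets.

```python
from collections import Counter

def bigram_frequencies(token_lists, top_n=3):
    """Count adjacent token pairs within each document and return the most common."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(zip(tokens, tokens[1:]))  # pair each token with the next one
    return counts.most_common(top_n)

# Invented token lists standing in for preprocessed tweets.
docs = [
    ["americans", "watching", "want", "safe", "food"],
    ["want", "safe", "food"],
    ["want", "safe"],
]
top_bigrams = bigram_frequencies(docs, top_n=2)
print(top_bigrams)
```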
Visualization of Word Frequency
Unsupervised Content Analytics
Clustering Analysis
K = 6 (clusters)
Topics in each cluster
Cluster 0: legal regulations
Cluster 1: government actions
Cluster 2: international issues
Cluster 3: Oklahoma incidents
Cluster 4: negative effects
Cluster 5: horsemeat
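The slides do not name the clustering algorithm, so the following sketch assumes plain k-means over bag-of-words count vectors. It uses invented documents, K = 2 instead of 6 to keep the sample small, and seeds the centroids with the first K vectors so the result is reproducible.

```python
from collections import Counter

def vectorize(token_lists):
    """Turn token lists into bag-of-words count vectors over a shared vocabulary."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([float(counts[w]) for w in vocab])
    return vectors

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=10):
    """Plain k-means; the first k vectors seed the centroids (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Invented token lists standing in for preprocessed tweets.
docs = [
    ["horsemeat", "scandal", "europe"],
    ["allergen", "label", "recall"],
    ["horsemeat", "found", "europe"],
    ["allergen", "recall", "notice"],
]
labels = kmeans(vectorize(docs), k=2)
print(labels)
```

Topic names such as those listed above would then come from inspecting the most frequent words in each cluster by hand.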
Sentiment Analysis
Python analysis & Bing Liu sentiment analysis
◦ Results: the majority of tweets were positive (615) and subjective
◦ Most popular words in positive tweets: safe, yes, foodsafety, americans, vote
◦ Negative: 325
◦ Most popular words in negative tweets: stop, exporting, toxins, fdalabeled, warnings
◦ Neutral: 277
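Lexicon-based sentiment labelling can be sketched as counting positive and negative words per tweet. The two word sets below are tiny hypothetical stand-ins for a full opinion lexicon, and the scoring rule (positive count minus negative count) is an assumption, not a description of the authors' exact procedure.

```python
# Tiny hypothetical word lists standing in for a full opinion lexicon.
POSITIVE = {"safe", "yes", "good", "healthy"}
NEGATIVE = {"stop", "toxins", "warnings", "sick"}

def classify_sentiment(tokens):
    """Label a token list by the sign of (positive hits - negative hits)."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Invented token lists standing in for preprocessed tweets.
docs = [
    ["americans", "want", "safe", "food"],
    ["stop", "exporting", "toxins"],
    ["food", "recall", "notice"],
]
results = [classify_sentiment(d) for d in docs]
print(results)
```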
Network Analytics
Mention Network Analysis
Visualization of the whole network
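A mention network treats each user as a vertex and each mention as a directed edge from the author to the mentioned user. The sketch below builds weighted edges from invented `(author, text)` pairs and computes in-degree as a simple visibility measure; the input shape and the choice of in-degree are assumptions, since the slides do not state which centrality metrics were used.

```python
import re
from collections import Counter

# Captures the user name that follows each "@" sign.
MENTION_PATTERN = re.compile(r"@(\w+)")

def mention_edges(tweets):
    """Count directed edges (author -> mentioned user) across (author, text) pairs."""
    edges = Counter()
    for author, text in tweets:
        for mentioned in MENTION_PATTERN.findall(text):
            edges[(author, mentioned)] += 1
    return edges

def in_degree(edges):
    """Sum edge weights per mentioned user, a simple visibility measure."""
    degree = Counter()
    for (_, target), weight in edges.items():
        degree[target] += weight
    return degree

# Invented sample data for illustration only.
sample = [
    ("alice", "@agency_x please check this recall"),
    ("bob", "RT @alice: @agency_x please check this recall"),
    ("carol", "Thanks @alice for the update"),
]
edges = mention_edges(sample)
visibility = in_degree(edges)
print(visibility.most_common())
```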
Practical Implications
For government agencies:
◦ Promote public Twitter accounts more and post useful information regularly
◦ Use popular hashtags to communicate important food safety-related information to
the public
For commercial business owners:
◦ Share food safety-related information and be transparent in information disclosure
◦ Include website URLs and expand influence by enlarging the networks around them
Limitations and Future Research
Limitations
Time constraints
Data collection obstacles (lack of data sources)
Twitter as the single data source
Limitations and Future Research
Future research
Inclusion of more keywords
Combination with other data sources (e.g., government websites)
More in-depth analysis of important tweets in individual accounts (e.g., visible users)
Utilization of more data-visualization tools
Thank You!
