Created by The Curiosity Bits Blog (curiositybits.com)
The code is provided by Dr. Gregory D. Saxton
Mining Facebook Pages with Python
1
What data are available for mining?
• Posts: all posts on a Facebook page
  • Content, posted time, included URLs, mentioned Facebook friends/pages, etc.
• Comments: comments on the posts
  • Sender, content, posted time, etc.
• Engagement indicators (we will cover this in an upcoming tutorial)
  • The # of likes, shares and comments on a post
  • The lifespan of a post: the time between sending the post and receiving its last comment
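The lifespan definition above can be sketched as a small calculation. This is a minimal illustration, not the tutorial's script: the function name is hypothetical, the timestamps are made up, and the timestamp layout (ISO 8601 with a numeric UTC offset) is an assumption about how the API reports times.

```python
from datetime import datetime

# Assumed timestamp layout: ISO 8601 with a numeric UTC offset,
# e.g. "2014-03-01T15:30:00+0000".
FB_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def post_lifespan_hours(created_time, last_comment_time):
    """Return the hours between sending a post and receiving its last comment."""
    created = datetime.strptime(created_time, FB_TIME_FORMAT)
    last_comment = datetime.strptime(last_comment_time, FB_TIME_FORMAT)
    return (last_comment - created).total_seconds() / 3600.0


# Made-up timestamps, for illustration only.
lifespan = post_lifespan_hours("2014-03-01T15:30:00+0000",
                               "2014-03-02T03:30:00+0000")
print(lifespan)
```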
2
The data are available through the Facebook Graph API
• Use the Facebook Graph API Explorer (http://developers.facebook.com/tools/explorer) to get an access token.
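Once you have a token, a request to the Graph API is just a URL with the token attached. The sketch below only builds such a URL and does not contact Facebook. The helper name, the placeholder token and the choice of the page's posts edge with a limit parameter are assumptions for illustration; the downloadable script may build its requests differently.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def build_posts_url(page_id, access_token, limit=25):
    """Build a request URL for the posts of one page (hypothetical helper)."""
    query = urlencode({"access_token": access_token, "limit": limit})
    return "{0}/{1}/posts?{2}".format(GRAPH_BASE, page_id, query)


# "YOUR_ACCESS_TOKEN" is a placeholder; paste the token from the Explorer.
url = build_posts_url("316579834919", "YOUR_ACCESS_TOKEN")
print(url)
```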
3
The JSON data will be stored in a SQLite database
• Like Twitter data, the raw output from the Facebook API is in JSON format. To see how the JSON output is organized, use JSON Viewer (http://jsonviewer.stack.hu/)
- DOWNLOAD A JSON OUTPUT SAMPLE -
(https://drive.google.com/file/d/0Bwwg6GLCW_IPMWlLNjd2NnplM1k/edit?usp=sharing)
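To get a feel for the format before opening the sample, here is a minimal sketch of reading JSON in Python. The text below is a hand-written stand-in in the general shape of a list of posts; the field values are invented, and real API output contains many more fields.

```python
import json

# Hand-written sample; values are invented for illustration.
raw = """
{
  "data": [
    {"id": "316579834919_1001",
     "message": "Fresh roast today!",
     "created_time": "2014-03-01T15:30:00+0000",
     "type": "status"},
    {"id": "316579834919_1002",
     "created_time": "2014-03-02T09:00:00+0000",
     "type": "photo"}
  ]
}
"""

response = json.loads(raw)

for post in response["data"]:
    # Not every post carries every field, so use .get() with a default.
    print(post["id"], post.get("type"), post.get("message", ""))

messages = [post.get("message", "") for post in response["data"]]
```

Using .get() with a default keeps the loop from failing on posts that have no text, such as the photo post in this sample.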
4
A JSON file looks like this in a web browser

5
Copy the content of the downloaded JSON sample into JSON Viewer
(http://jsonviewer.stack.hu), then click “Viewer”
Download a sample JSON
6
Let’s get down to the Python code

- DOWNLOAD THE PYTHON CODE -
(https://drive.google.com/file/d/0Bwwg6GLCW_IPbmNiQW1EWGotd28/edit?usp=sharing)
7
Let’s walk through each part of the script
Part 1
• These are the necessary Python libraries. You need to install them before running the code.
• Not familiar with installing libraries? Review our previous tutorial on how to install Python libraries (pg. 8): http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-mining-twitter-user-profile/
8
Part 2: Create columns in the database for output variables
The name of a Facebook page; useful when you are mining content from multiple pages
Page ID, a unique identifier for a Facebook page
The URL of a Facebook page post
A unique identifier for a Facebook page post, formatted as Page ID_Status ID
A unique identifier for a Facebook page post, without the page ID in it
The textual content of a post
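The two post identifiers described above are related: the second is the first with the page ID and underscore removed. A minimal sketch of that split, using a hypothetical helper name and a made-up post ID:

```python
def split_post_id(full_post_id):
    """Split 'PageID_StatusID' into its two parts (hypothetical helper)."""
    page_id, _, status_id = full_post_id.partition("_")
    return page_id, status_id


# Made-up identifier in the Page ID_Status ID layout.
page_id, status_id = split_post_id("316579834919_1001")
print(page_id, status_id)
```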
9
Part 2: Create columns in the database for output variables
When the page post is sent
When the post is retrieved from API
When the last comment is posted.
The type of the post – status, link, photo, video, etc.
The included URL(s) to a video
The name of the webpage that the included URL is linking to.
The included URL(s) to a photo
Description of the webpage that the included URL is linking to.
10
Part 2: Create columns in the database for output variables
The number of mentions in a post
The mentioned page/people in a post
11
Part 2: Create columns in the database for output variables
The entire JSON raw output, including information
not parsed to the existing columns
Comments (including sender, content and posted time) on
the first page
Comments on the second page and beyond
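Comments arrive in pages: the first page comes with the post, and later pages have to be requested one by one. The sketch below shows the general idea of following "next" links until none remain. It assumes each response carries a "data" list and an optional "paging" entry with a "next" link; the function names and the fake pages are hypothetical, and no real request is made.

```python
def collect_all_pages(first_page, fetch_page):
    """Gather items from the first page and every page linked after it.

    fetch_page is any function that takes a 'next' link and returns the
    parsed response for that page.
    """
    items = list(first_page.get("data", []))
    next_url = first_page.get("paging", {}).get("next")
    while next_url:
        page = fetch_page(next_url)
        items.extend(page.get("data", []))
        next_url = page.get("paging", {}).get("next")
    return items


# Fake pages standing in for API responses (invented content).
fake_pages = {
    "page2": {"data": [{"message": "c3"}], "paging": {"next": "page3"}},
    "page3": {"data": [{"message": "c4"}]},
}
first = {"data": [{"message": "c1"}, {"message": "c2"}],
         "paging": {"next": "page2"}}

comments = collect_all_pages(first, fake_pages.__getitem__)
print([c["message"] for c in comments])
```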
12
• The columns created here correspond to the output variables returned by the Facebook Graph API. See the definitions of all output variables: https://developers.facebook.com/docs/graph-api/reference/post
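As a rough picture of what "creating columns" means, the sketch below builds a small table with Python's built-in sqlite3 module. The table name and column names are hypothetical stand-ins for the variables described on the previous slides, not the names used in the downloadable script, and the database lives in memory so nothing is written to disk.

```python
import sqlite3

# In-memory database, so this example leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical column names mirroring the variables described above.
conn.execute("""
    CREATE TABLE posts (
        page_name      TEXT,
        page_id        TEXT,
        post_url       TEXT,
        post_id        TEXT,
        status_id      TEXT,
        content        TEXT,
        published_time TEXT,
        retrieved_time TEXT,
        post_type      TEXT,
        num_mentions   INTEGER,
        raw_json       TEXT
    )
""")

# One made-up row; columns left out stay empty (NULL).
conn.execute(
    "INSERT INTO posts (page_name, page_id, post_id, status_id, content, post_type) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("Spot Coffee Elmwood", "316579834919", "316579834919_1001",
     "1001", "Fresh roast today!", "status"),
)
conn.commit()

columns = [row[1] for row in conn.execute("PRAGMA table_info(posts)")]
row_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(columns, row_count)
```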
13
Part 2: the parsed data look like this in a SQLite Database Browser
Not familiar with SQLite Database Browser? Review our first tutorial (pg. 10) at http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-mining-twitter-user-profile/
14
Part 2: the parsed data look like this in a SQLite Database Browser – continued.
15
Part 2: the parsed data look like this in a SQLite Database Browser – continued.
16
Part 2: the parsed data look like this in a SQLite Database Browser – continued.
17
Click to see all comments.
Individual comments are separated
by the symbol ***
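Storing several comments in one cell with a separator can be sketched as follows. The comments are invented, and only the *** separator itself comes from the slide; how the script formats each comment is not shown here.

```python
SEPARATOR = "***"

# Invented comments for illustration.
comments = [
    "Jane Doe: Love this place! (2014-03-01T16:00:00+0000)",
    "John Roe: See you tomorrow. (2014-03-01T18:45:00+0000)",
]

# Pack all comments into one text cell ...
cell = SEPARATOR.join(comments)

# ... and split the cell back into individual comments later.
recovered = cell.split(SEPARATOR)
print(recovered)
```

Note that splitting this way would break a comment that itself contains ***, so treat the recovered list with some care.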
Part 2: the parsed data look like this in a SQLite Database Browser – continued.
18
Q: Why are some columns entirely blank?
A: We have created more columns than we needed
for this round of data-mining. The additional
columns created are for the next iteration of mining
through which we will get Facebook engagement
indicators.
19
Part 3: Set up the access token in the Python script
Paste your own access token here.
20
Part 4: Set up the SQLite database in the Python script
Name your own database. The database
will be saved to the same folder as your
Python script.
Or use a complete file path if you want to
save the database in a different folder.
21
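The difference between a bare file name and a complete path can be sketched with sqlite3 directly. The file name is a hypothetical example, and a temporary folder stands in for "a different folder" so the example does not touch your own files.

```python
import os
import sqlite3
import tempfile

# A bare name such as "facebook_posts.sqlite" would be created in the folder
# the script runs from. A complete path puts the file wherever you choose;
# here a temporary folder stands in for that location.
folder = tempfile.mkdtemp()
db_path = os.path.join(folder, "facebook_posts.sqlite")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE demo (id INTEGER)")  # first write creates the file
conn.commit()
conn.close()

created = os.path.exists(db_path)
print(db_path, created)
```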
Part 3: tell Python which Facebook pages to look for.
The numbers here are Facebook page IDs, wrapped in quotation marks (' ') and separated by commas.
Each Facebook page has a unique page ID, which can be found in the page’s URL.
22
You can also use the page name if it consists of only one word.
Part 3: tell Python which Facebook pages to look for.
23
BUT! Here is a catch:
if a page name contains multiple words, it is recommended that you use the numerical page ID instead. You can find the numerical page ID in the URL: it is the string of numbers after the page name
(e.g. https://www.facebook.com/pages/Spot-Coffee-Elmwood/316579834919)
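Pulling the numerical ID out of a URL like the one above, and putting it into a quoted, comma-separated list, can be sketched as follows. The helper name and the list variable name are hypothetical; the script may name its list differently.

```python
import re


def page_id_from_url(url):
    """Return the trailing run of digits in a page URL, or None if absent."""
    match = re.search(r"/(\d+)/?$", url)
    return match.group(1) if match else None


page_url = "https://www.facebook.com/pages/Spot-Coffee-Elmwood/316579834919"

# Page IDs are kept as quoted strings, separated by commas in a list.
page_ids = [page_id_from_url(page_url)]
print(page_ids)
```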
Part 3: tell Python which Facebook pages to look for.
24
Part 4: run the script
Hit RUN and you will see Anaconda showing the progress of the data mining.
Not familiar with Anaconda? Review our previous tutorial (pg. 3) at
http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-
25
Lastly, a caveat.
If you encounter an error when running the script, a
database (though incomplete) may have been
created. You will need to delete the database file and
re-run the script, or change the database file name in
the script before the second run.
This script does not work on a preexisting database
file.
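One way to avoid the problem described above is to pick a file name that does not exist yet before each run. This is a sketch of that idea, not part of the tutorial's script: the helper name and file name are hypothetical, and a temporary folder is used so the example is safe to run.

```python
import os
import tempfile


def fresh_db_path(path):
    """Return path if unused, otherwise the same name with _1, _2, ... added."""
    if not os.path.exists(path):
        return path
    base, ext = os.path.splitext(path)
    counter = 1
    while os.path.exists("{0}_{1}{2}".format(base, counter, ext)):
        counter += 1
    return "{0}_{1}{2}".format(base, counter, ext)


# Simulate a leftover database file from a failed run.
folder = tempfile.mkdtemp()
existing = os.path.join(folder, "facebook_posts.sqlite")
open(existing, "w").close()

new_path = fresh_db_path(existing)
print(new_path)
```

Deleting the leftover file by hand, as the slide suggests, works just as well; renaming simply keeps the incomplete data around in case you want to inspect it.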
26
COMING UP NEXT

GET FACEBOOK ENGAGEMENT INDICATORS
The # of likes, shares and comments on the posts, and the
lifespan of a post
Stay tuned to our Curiosity Bits Blog (curiositybits.com)
27

Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
Visit to a blind student's school🧑‍🩯🧑‍🩯(community medicine)
Visit to a blind student's school🧑‍🩯🧑‍🩯(community medicine)Visit to a blind student's school🧑‍🩯🧑‍🩯(community medicine)
Visit to a blind student's school🧑‍🩯🧑‍🩯(community medicine)
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptx
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
call girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïž
call girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïžcall girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïž
call girls in Kamla Market (DELHI) 🔝 >àŒ’9953330565🔝 genuine Escort Service đŸ”âœ”ïžâœ”ïž
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
USPSÂź Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPSÂź Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPSÂź Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPSÂź Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 

Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and comments

  • 1. Mining Facebook Page on Python. Created by The Curiosity Bits Blog (curiositybits.com). The code is provided by Dr. Gregory D. Saxton.
  • 2. What data are available for mining?
      • Posts: all posts on a Facebook page (content, posted time, included URLs, mentioned Facebook friends/pages, etc.)
      • Comments: comments on the posts (sender, content, posted time, etc.)
      • Engagement indicators (we will cover this in an upcoming tutorial): the number of likes, shares and comments on a post, and the lifespan of a post, i.e. the time between sending the post and receiving its last comment
  • 3. The data are available through the Facebook Graph API. Use the Facebook Graph API Explorer (http://developers.facebook.com/tools/explorer) to get an access token.
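To make the role of the access token concrete, here is a minimal sketch of how a request URL for a page's posts could be assembled. The helper name `build_posts_url`, the `limit` parameter value and the placeholder token are assumptions for illustration, not part of the tutorial's script. The API version prefix and permission requirements of the current Graph API are left out.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"  # Graph API host


def build_posts_url(page_id, access_token, limit=25):
    """Return a URL for the posts of one page (hypothetical helper)."""
    query = urlencode({"access_token": access_token, "limit": limit})
    return f"{GRAPH_BASE}/{page_id}/posts?{query}"


# The page ID is the example used later in the slides; the token is a placeholder.
url = build_posts_url("316579834919", "YOUR_ACCESS_TOKEN")
print(url)
```

Nothing is sent over the network here; the sketch only shows where the page ID and the token end up in the request.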
  • 4. The JSON data will be stored in a SQLite database. Like Twitter data, the raw output from the Facebook API is in JSON format. To see how the JSON output is organized, use JSON Viewer (http://jsonviewer.stack.hu/). Download a JSON output sample: https://drive.google.com/file/d/0Bwwg6GLCW_IPMWlLNjd2NnplM1k/edit?usp=sharing
  • 5. This is what the JSON looks like in a web browser (screenshot).
  • 6. Copy the content of the downloaded JSON sample into JSON Viewer (http://jsonviewer.stack.hu) and click "Viewer".
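The same structure can also be inspected from Python. The sketch below parses a small, made-up JSON string shaped roughly like a page-posts response (a `data` list of posts plus a `paging` object). The field values and the paging URL are invented for illustration and are not taken from the downloadable sample.

```python
import json

# Made-up sample; a real response carries many more fields per post.
sample = """
{"data": [
   {"id": "316579834919_101", "message": "Hello from the page",
    "created_time": "2014-03-01T12:00:00+0000", "type": "status"},
   {"id": "316579834919_102",
    "created_time": "2014-03-02T09:30:00+0000", "type": "photo"}
 ],
 "paging": {"next": "https://example.invalid/next-page"}}
"""

parsed = json.loads(sample)

# Posts without text (e.g. a bare photo) have no "message" key, so use .get().
rows = [(post["id"], post.get("message", ""), post["type"])
        for post in parsed["data"]]

for row in rows:
    print(row)
```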
  • 7. Let's get down to the Python code. Download the Python code: https://drive.google.com/file/d/0Bwwg6GLCW_IPbmNiQW1EWGotd28/edit?usp=sharing
  • 8. Let's walk through each part of the script. Part 1: these are the necessary Python libraries. You need to install them before running the code. Not familiar with installing libraries? Review our previous tutorial on how to install Python libraries (pg. 8): http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-mining-twitter-user-profile/
  • 9. Part 2: Create columns in the database for output variables
      • The name of a Facebook page; useful when you are mining content from multiple pages
      • Page ID, a unique identifier for a Facebook page
      • The URL of a Facebook page post
      • A unique identifier for a Facebook page post, formatted as Page ID_Status ID
      • A unique identifier for a Facebook page post, without the page ID in it
      • The textual content of a post
  • 10. Part 2: Create columns in the database for output variables
      • When the page post was sent
      • When the post was retrieved from the API
      • When the last comment was posted
      • The type of the post: status, link, photo, video, etc.
      • The included URL(s) to a video
      • The name of the webpage that the included URL links to
      • The included URL(s) to a photo
      • A description of the webpage that the included URL links to
  • 11. Part 2: Create columns in the database for output variables
      • The number of mentions in a post
      • The pages/people mentioned in a post
  • 12. Part 2: Create columns in the database for output variables
      • The entire raw JSON output, including information not parsed into the existing columns
      • Comments (including sender, content and posted time) on the first page
      • Comments on the second page and beyond
  • 13. The columns created here correspond to the output variables returned by the Facebook Graph API. See the definitions of all output variables: https://developers.facebook.com/docs/graph-api/reference/post
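As a rough illustration of Part 2, the following sketch creates a table with one column per output variable described on slides 9 to 12. The table name `posts` and every column name are assumptions chosen for readability; the original script may name them differently. An in-memory database is used so the sketch leaves no file behind.

```python
import sqlite3

# Hypothetical column names, one per variable listed on slides 9-12.
COLUMNS = [
    ("page_name", "TEXT"),          # name of the Facebook page
    ("page_id", "TEXT"),            # unique page identifier
    ("post_url", "TEXT"),           # URL of the page post
    ("post_id", "TEXT"),            # formatted as PageID_StatusID
    ("status_id", "TEXT"),          # post identifier without the page ID
    ("message", "TEXT"),            # textual content of the post
    ("created_time", "TEXT"),       # when the post was sent
    ("retrieved_time", "TEXT"),     # when the post was retrieved from the API
    ("last_comment_time", "TEXT"),  # when the last comment was posted
    ("post_type", "TEXT"),          # status, link, photo, video, etc.
    ("video_url", "TEXT"),          # included URL(s) to a video
    ("link_name", "TEXT"),          # name of the linked webpage
    ("picture_url", "TEXT"),        # included URL(s) to a photo
    ("link_description", "TEXT"),   # description of the linked webpage
    ("num_mentions", "INTEGER"),    # number of mentions in the post
    ("mentions", "TEXT"),           # mentioned pages/people
    ("raw_json", "TEXT"),           # entire raw JSON output
    ("comments", "TEXT"),           # comments on the first page
    ("comments_more", "TEXT"),      # comments on the second page and beyond
]

conn = sqlite3.connect(":memory:")
ddl = ", ".join(f"{name} {sql_type}" for name, sql_type in COLUMNS)
conn.execute(f"CREATE TABLE posts ({ddl})")

# Read the schema back to confirm the columns exist in the declared order.
created = [row[1] for row in conn.execute("PRAGMA table_info(posts)")]
print(created)
```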
  • 14. Part 2: the parsed data look like this in a SQLite Database Browser. Not familiar with SQLite Database Browser? Review our first tutorial (pg. 10) at http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-mining-twitter-user-profile/
  • 15. Part 2: the parsed data look like this in a SQLite Database Browser (continued).
  • 16. Part 2: the parsed data look like this in a SQLite Database Browser (continued).
  • 17. Part 2: the parsed data look like this in a SQLite Database Browser (continued).
  • 18. Part 2: the parsed data look like this in a SQLite Database Browser (continued). Click a cell to see all comments. Individual comments are separated by the symbol ***.
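A sketch of how several comments can share one database cell with `***` as the separator, and how to split them apart again for analysis. Only the separator comes from the slides. The layout of each comment (sender, time, text) and the helper names `pack_comments` and `unpack_comments` are assumptions.

```python
SEPARATOR = "***"  # separator named on the slide


def pack_comments(comments):
    """Join comment dicts into one string for a single database cell."""
    return SEPARATOR.join(
        f"{c['sender']} ({c['time']}): {c['text']}" for c in comments
    )


def unpack_comments(cell):
    """Split a stored cell back into individual comment strings."""
    return cell.split(SEPARATOR) if cell else []


# Made-up comments for illustration.
comments = [
    {"sender": "Alice", "time": "2014-03-01T12:05:00+0000", "text": "Great post"},
    {"sender": "Bob", "time": "2014-03-01T12:10:00+0000", "text": "Thanks for sharing"},
]

cell = pack_comments(comments)
restored = unpack_comments(cell)
print(len(restored))
```

A comment whose own text contains `***` would be split in two by this approach, so treat counts derived this way with some care.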
  • 19. Q: Why are some columns entirely blank? A: We created more columns than this round of data mining needs. The additional columns are for the next iteration of mining, in which we will get Facebook engagement indicators.
  • 20. Part 3: Set up the access token in the Python script. Paste your own access token here.
  • 21. Part 4: Set up the SQLite database in the Python script. Name your own database. The database will be saved to the same folder as your Python script. Alternatively, use a complete file path if you want to save the database in a different folder.
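The difference between a bare file name and a complete path can be sketched with the standard `sqlite3` module. The file name `facebook_pages.sqlite` and the one-column table are placeholders, and a temporary folder stands in for "a different folder". Note that a bare file name is resolved against the current working directory, which matches the script's folder only when the script is run from there.

```python
import os
import sqlite3
import tempfile

# A temporary folder stands in for the folder you would choose yourself.
folder = tempfile.mkdtemp()
db_path = os.path.join(folder, "facebook_pages.sqlite")  # complete file path

# sqlite3.connect creates the file if it does not exist yet.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE posts (post_id TEXT)")  # placeholder table
conn.commit()
conn.close()

print(db_path, os.path.exists(db_path))
```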
  • 22. Part 3: Tell Python which Facebook pages to look for. The numbers here are Facebook page IDs, each wrapped in single quotes and separated by commas. Each Facebook page has a unique page ID, which can be found in the page's URL.
  • 23. Part 3: Tell Python which Facebook pages to look for. You can also use the page name if it consists of only one word.
  • 24. Part 3: Tell Python which Facebook pages to look for. But here is a catch: if a page name contains multiple words, we recommend using the numerical page ID instead. You can find the numerical page ID in the URL; it is the string of numbers after the page name (e.g. https://www.facebook.com/pages/Spot-Coffee-Elmwood/316579834919).
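As a small illustration of the page list, the sketch below pulls the numerical ID out of a page URL that ends in digits and puts it in a list of quoted IDs. The helper `page_id_from_url` and the variable name `pages` are hypothetical; the script's own variable may be named differently.

```python
import re


def page_id_from_url(url):
    """Return the run of digits that ends a page URL, or None if there is none."""
    match = re.search(r"/(\d+)/?$", url)
    return match.group(1) if match else None


example_url = "https://www.facebook.com/pages/Spot-Coffee-Elmwood/316579834919"

# Page IDs are kept as quoted strings, separated by commas, as on the slide.
pages = [page_id_from_url(example_url)]
print(pages)
```

A URL that ends in a page name rather than digits returns `None`, which signals that the ID has to be looked up another way.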
  • 25. Part 4: Run the script. Hit RUN and you will see Anaconda showing the progress of the data mining. Not familiar with Anaconda? Review our previous tutorial (pg. 3) at http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-
  • 26. Lastly, a caveat. If you encounter an error when running the script, a database file (though incomplete) may already have been created. Delete that file and re-run the script, or change the database file name in the script before the second run. This script does not work on a preexisting database file.
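One way to automate the clean-up described in this caveat is to delete any leftover file before the script opens the database. This is a sketch with a hypothetical helper `fresh_db_path`, not something the original script does. It removes the file without asking, so use it only when the partial data can be discarded.

```python
import os
import tempfile


def fresh_db_path(path):
    """Delete a leftover database file, if any, and return the same path."""
    if os.path.exists(path):
        os.remove(path)
    return path


# Simulate an incomplete database left behind by a failed run.
leftover = os.path.join(tempfile.mkdtemp(), "facebook_pages.sqlite")
open(leftover, "w").close()

existed_before = os.path.exists(leftover)
fresh_db_path(leftover)
existed_after = os.path.exists(leftover)
print(existed_before, existed_after)
```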
  • 27. Coming up next: get Facebook engagement indicators, i.e. the number of likes, shares and comments on the posts, and the lifespan of a post. Stay tuned to our Curiosity Bits Blog (curiositybits.com).