Studying Facebook via Data Extraction: a Netvizz tutorial at the Digital Methods Summer School 2013
1. Studying Facebook via Data Extraction: The Netvizz Application
Bernhard Rieder
Universiteit van Amsterdam, Mediastudies Department
2. Overview
Compared to Twitter, Facebook is difficult to study through data extraction but also has important advantages:
☉ complicated API, very complex and opaque privacy regime, constant changes, etc.
☉ rich and detailed data, access to full timelines, etc.
Goal: lower the threshold for working with quantitative and computational approaches, thereby fostering transversal thinking; open the walled garden.
Netvizz is a Facebook application that exports a variety of data files in common formats for a variety of sections of the Facebook platform.
Humanists and social scientists are often interested in descriptive statistics rather than models or advanced metrics; data stays close to the medium.
3. Two kinds of quantitative analysis
Statistics
Observed: objects and properties
Inferred: relations
Data representation: the table
Visual representation: quantity charts
Grouping: class (similar properties)
Graph theory
Observed: objects and relations
Inferred: structure
Data representation: the matrix
Visual representation: network diagrams
Grouping: clique (dense relations)
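The contrast between the two data representations can be illustrated with a minimal sketch. The users, the "locale" property, and the friendship ties below are hypothetical toy data, not taken from any real export.

```python
from collections import Counter

# Hypothetical toy data: three users with one property each.
users = [
    {"id": "a", "locale": "en_US"},
    {"id": "b", "locale": "de_DE"},
    {"id": "c", "locale": "en_US"},
]
# Hypothetical undirected friendship ties between them.
friendships = [("a", "b"), ("b", "c")]

# Statistics: a table of objects and properties, grouped by class.
locale_counts = Counter(row["locale"] for row in users)

# Graph theory: an adjacency matrix of objects and relations.
ids = [row["id"] for row in users]
index = {uid: i for i, uid in enumerate(ids)}
matrix = [[0] * len(ids) for _ in ids]
for u, v in friendships:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1  # undirected: mirror the entry

# Degree is the row sum of the matrix.
degrees = {uid: sum(matrix[index[uid]]) for uid in ids}
```

The table answers "how many users share a property", while the matrix answers "who is connected to whom".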
4.
5. Personal network
Nodes: users / links: "friendship"
Good starting point for learning network analysis
6. Personal "like" network
Nodes: users & liked objects ("bipartite graph") / links: "liking"
A post-demographic view on social relations and culture
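A bipartite graph of this kind can be sketched in a few lines. The user and page names are hypothetical, and the projection step (tying users by shared likes) is one common way to read such a graph, not necessarily what Netvizz does.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical "liking" links: (user, liked object).
likes = [
    ("user_1", "page_x"),
    ("user_2", "page_x"),
    ("user_2", "page_y"),
    ("user_3", "page_y"),
    ("user_3", "page_x"),
]

# Bipartite structure: every link connects a user to an object.
fans = defaultdict(set)
for user, obj in likes:
    fans[obj].add(user)

# One-mode projection: count the objects each pair of users both like.
shared = defaultdict(int)
for obj_fans in fans.values():
    for u, v in combinations(sorted(obj_fans), 2):
        shared[(u, v)] += 1
```

Pairs with a high count in `shared` have similar tastes, whether or not they are "friends".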
7.
8. FB group "Islam is dangerous"
Friendship network, color: betweenness centrality
2,339 members
Average degree of 39.69
81.7% have at least one friend in the group
55.4% have five or more
37.2% have 20 or more
Founder and admin has 609 friends
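Figures of this kind can be derived from a member list and a friendship edge list. The sketch below uses hypothetical toy data and assumes every edge connects two listed members; the function name `degree_summary` is illustrative only.

```python
def degree_summary(members, edges, thresholds=(1, 5, 20)):
    """Return the average degree and the share of members with >= k friends."""
    degree = {m: 0 for m in members}
    for u, v in edges:
        # Each undirected friendship adds one to both endpoints.
        degree[u] += 1
        degree[v] += 1
    n = len(members)
    average = sum(degree.values()) / n
    shares = {
        k: sum(1 for m in members if degree[m] >= k) / n for k in thresholds
    }
    return average, shares


# Hypothetical group of four members, one of them without in-group friends.
members = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "c")]
average, shares = degree_summary(members, edges, thresholds=(1, 2))
```

Here `average` is 1.5 and three of the four members have at least one friend in the group.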
9. FB group "Islam is dangerous"
Friendship network, color: interface language
en_us, de, en_uk, it dominate
10. Mapping European Extremism (aggregate groups)
Friendship relations of 18 extreme-right groups
User names are unique! (Gephi can fuse networks)
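Because identifiers are unique, separately extracted group networks can be merged by matching nodes on their identifier. The sketch below shows the idea in plain Python with hypothetical data; it is not how Gephi implements the merge.

```python
def fuse_networks(*networks):
    """Union several (nodes, edges) networks, matching nodes by identifier."""
    nodes, edges = set(), set()
    for net_nodes, net_edges in networks:
        nodes.update(net_nodes)
        # frozenset treats (u, v) and (v, u) as the same undirected edge.
        edges.update(frozenset(edge) for edge in net_edges)
    return nodes, edges


# Two hypothetical group networks that share members u2 and u3.
group_a = ({"u1", "u2", "u3"}, [("u1", "u2"), ("u2", "u3")])
group_b = ({"u2", "u3", "u4"}, [("u3", "u4"), ("u3", "u2")])

nodes, edges = fuse_networks(group_a, group_b)
# Members present in both groups connect the two networks.
bridging = group_a[0] & group_b[0]
```

The friendship between u2 and u3 appears in both files but is kept only once in the fused network.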
12. Facebook Page "ElShaheeed", June 2010 – June 2011 (Poell / Rieder, forthcoming)
7K posts, 700K users, 3.6M comments, 10M likes, work in progress!
13. Data and tools
New media platforms funnel practices into reduced and largely formal "grammars of action" (Agre 1989); data is therefore very clean, very complete, and very detailed.
Can be imported with great ease into standard packages for statistical (e.g. R, Excel, RapidMiner) or network analysis (e.g. Gephi, Pajek).
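As a minimal sketch of such an import, the snippet below reads tab-separated data with Python's standard library. The column names and values are hypothetical and stand in for an exported file; real files may be laid out differently.

```python
import csv
import io
from statistics import mean

# Stand-in for an exported tab-separated file; columns are hypothetical.
sample = io.StringIO(
    "post_id\ttype\tlikes\tcomments\n"
    "1\tlink\t10\t7\n"
    "2\tphoto\t40\t2\n"
    "3\tphoto\t25\t4\n"
)

# Each row becomes a dict keyed by the header line.
rows = list(csv.DictReader(sample, delimiter="\t"))
likes = [int(row["likes"]) for row in rows]

summary = {
    "posts": len(rows),
    "mean_likes": mean(likes),
    "max_likes": max(likes),
}
```

To read from disk instead, replace `sample` with an opened file object, e.g. `open(path, newline="")`.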
20. FB page "Educate children about the evils of Islam"
1,586 likes, 253 users commenting or liking on last 200 posts
21. FB page "Educate children about the evils of Islam"
Links receive more comments; photos receive more likes.
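A comparison like this comes down to grouping posts by type and averaging their engagement counts. The sketch below uses invented numbers purely to show the mechanics; it does not reproduce the page's data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical posts: (type, likes, comments).
posts = [
    ("link", 4, 12),
    ("link", 6, 8),
    ("photo", 20, 3),
    ("photo", 30, 1),
]

# Group engagement counts by post type.
by_type = defaultdict(list)
for post_type, likes, comments in posts:
    by_type[post_type].append((likes, comments))

# Average likes and comments per type.
averages = {
    post_type: {
        "likes": mean(l for l, _ in values),
        "comments": mean(c for _, c in values),
    }
    for post_type, values in by_type.items()
}
```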
22. FB pages of New York Times and Wall Street Journal (aggregate pages)
30 latest posts, 27K users liking or commenting (user ids are unique!)
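Unique user ids make it possible to measure how far two pages' active audiences overlap. The following is a minimal sketch with hypothetical ids; the Jaccard index is one possible overlap measure and is not named in the slides.

```python
# Hypothetical (page, user_id) activity records from two page extractions.
nyt = [("nyt", "101"), ("nyt", "102"), ("nyt", "103"), ("nyt", "101")]
wsj = [("wsj", "103"), ("wsj", "104")]

# Sets collapse repeated activity by the same user.
nyt_users = {user for _, user in nyt}
wsj_users = {user for _, user in wsj}

# Users active on both pages, and their share of all active users.
both = nyt_users & wsj_users
jaccard = len(both) / len(nyt_users | wsj_users)
```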
23. Facebook page like network
Seed: Stop Islamization of the World
Crawl depth: 2
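A depth-limited crawl from a seed page can be sketched as a breadth-first search. The `get_liked_pages` lookup and the page names are hypothetical stand-ins for whatever returns the pages a page likes; how Netvizz itself counts depth is an assumption here.

```python
from collections import deque


def crawl(seed, get_liked_pages, max_depth=2):
    """Breadth-first crawl of page-likes-page links, max_depth steps from seed."""
    depth = {seed: 0}
    edges = []
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        if depth[page] == max_depth:
            continue  # pages at the depth limit are not expanded further
        for liked in get_liked_pages(page):
            edges.append((page, liked))
            if liked not in depth:
                depth[liked] = depth[page] + 1
                queue.append(liked)
    return depth, edges


# Hypothetical like relations between pages.
likes = {"seed": ["a", "b"], "a": ["c"], "b": ["a"], "c": ["d"]}
depth, edges = crawl("seed", lambda page: likes.get(page, []), max_depth=2)
```

With depth 2, page "c" is included but not expanded, so "d" never enters the network.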
24. Studying extremism on Facebook
Some examples from the Digital Methods Initiative's data sprint on anti-Islamism and right-wing extremism.
Four aspects of SNS we wanted to study:
☉ Coordination, social networking, and social support for extremists
☉ Broadcasting and mobilization channel for extremists
☉ Expressions from diffuse publics
☉ Debate and encounter around Islam
25. Conclusions
Netvizz exports a variety of data files in common formats for a variety of sections of the Facebook platform and can be used in many different research designs.
Netvizz attempts to lower the threshold for quantitative work on Facebook, allowing for closer connections with qualitative, interpretative thinking.
Easy access to visualization techniques is crucial for this approach.
26. Thank You
https://apps.facebook.com/netvizz/
rieder@uva.nl
https://www.digitalmethods.net
http://thepoliticsofsystems.net
"Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise. Data analysis must progress by approximate answers, at best, since its knowledge of what the problem really is will at best be approximate." (Tukey 1962)