1. Measuring the dynamic bi-directional influence between content and social networks
Shenghui Wang and Paul Groth
{swang,pgroth}@few.vu.nl
@shenghui, @pgroth
ISWC 2010
Shanghai, China
2. Outline
• Influence over time
– Social Networks
– Content Networks
• Influence Framework
1. Network Generation
2. Measuring Network Properties
3. Time Series Analysis
• Results
11. Social Networks and Content Networks
[Figure: two clouds of stemmed keywords (e.g. queri, semant, ontolog, sparql, folksonomi, recommend) under the panel titles "Social Networks" and "Content Networks"]
12. Question: What is that influence?
• If a researcher identifies a new topic one year,
does that result in the researcher having more
coauthors the next?
• Does an informative post on a microblogging
service lead to a user gaining followers?
– If a user is popular in a social network, will their new
status updates be widely quoted?
14. 1. Network Generation
• Input: Domain data with identified social actors
and content elements
• Outputs:
– Series of social networks in time
– Series of content networks in time
– Bindings between these networks
• Key point: Networks must evolve
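The inputs and outputs listed above can be sketched roughly as follows. This is only a minimal illustration, assuming a hypothetical input of (year, authors, keywords) records; the record layout, the sample data, and the name `build_networks` are assumptions made for the example, not part of the original framework.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical domain data: one (year, authors, keywords) record per paper.
papers = [
    (2007, ["alice", "bob"], ["rdf", "sparql"]),
    (2007, ["bob", "carol"], ["sparql", "ontolog"]),
    (2008, ["alice", "carol"], ["rdf", "tag"]),
]


def build_networks(papers):
    """Return per-year social edges, per-year content edges, and bindings."""
    social = defaultdict(set)    # year -> set of co-author pairs
    content = defaultdict(set)   # year -> set of co-occurring keyword pairs
    bindings = defaultdict(set)  # (year, author) -> keywords used that year
    for year, authors, keywords in papers:
        # Social network: every pair of co-authors on a paper is an edge.
        for a, b in combinations(sorted(set(authors)), 2):
            social[year].add((a, b))
        # Content network: every pair of keywords on a paper is an edge.
        for k, l in combinations(sorted(set(keywords)), 2):
            content[year].add((k, l))
        # Binding: connect each actor to the content elements they used.
        for a in authors:
            bindings[(year, a)].update(keywords)
    return social, content, bindings


social, content, bindings = build_networks(papers)
```

Keying everything by year yields one network per time step, which is what lets the later steps treat the network properties of each actor as a time series.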
18. Semantic Web makes this easy!
• Content and social networks are already bound
– E.g. a resource that I can point at can represent
a person or a paper
• SPARQL queries easily extract the separate
networks
19. Semantic Web makes this easy!
• SPARQL queries extract co-author pairs and co-occurring keywords
• Traditionally, this takes a lot of extraction effort in terms of
content analysis
20. 2. Measuring network properties
• Various measures of the centrality of a node determine
its relative importance in a network
• The local clustering coefficient is an indication of the
embeddedness of single vertices, i.e., the degree to
which individuals tend to cluster together.
[Figure: example network with nodes labelled 1 to 7]
21. 2. Measuring network properties
• Standard network properties, such as degree
centrality, betweenness centrality, clustering coefficient
• Domain specific network properties or content
variables can be used to gain additional insight
• Need an interpretation for social reality
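Two of the standard properties named above, degree centrality and the local clustering coefficient, can be computed with a few lines of plain Python. The sketch below assumes an undirected graph without self-loops given as an edge list; the function names and the toy graph are illustrative only. Betweenness centrality is left out because it needs a shortest-path algorithm.

```python
from itertools import combinations


def to_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(nb) / (n - 1) for v, nb in adj.items()}


def local_clustering(adj):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    result = {}
    for v, nb in adj.items():
        k = len(nb)
        if k < 2:
            result[v] = 0.0  # fewer than two neighbours: no pairs to close
            continue
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        result[v] = 2 * links / (k * (k - 1))
    return result


# Toy graph (an assumption for illustration): a triangle 1-2-3 plus a pendant node 4.
edges = [(1, 2), (1, 3), (2, 3), (1, 4)]
adj = to_adjacency(edges)
dc = degree_centrality(adj)
cc = local_clustering(adj)
```

In this toy graph node 1 touches every other node, so its degree centrality is maximal, while only one of its three neighbour pairs is connected, so its clustering coefficient is lower than that of nodes 2 and 3.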
22. Output of the previous steps
before time series analysis
Author                                               Year  Social bc  Social dc  Social cc  Content dc  Content bc
http://data.semanticweb.org/person/daqing-he         2007  0          0.0118     0          0           0
http://data.semanticweb.org/person/daqing-he         2008  0          0.0123     1.0000     0.0189      0.0372
http://data.semanticweb.org/person/daqing-he         2009  0          0          0          0           0
http://data.semanticweb.org/person/daqing-he         2010  0          0          0          0           0
http://data.semanticweb.org/person/chengxiang-zhai   2007  0          0.0118     1.0000     0           0
http://data.semanticweb.org/person/chengxiang-zhai   2008  0.0005     0.0123     1.0000     0.0184      0.0289
http://data.semanticweb.org/person/chengxiang-zhai   2009  0          0          0          0           0
http://data.semanticweb.org/person/chengxiang-zhai   2010  0          0.0031     1.0000     0.0093      0.0147
(bc = betweenness centrality, dc = degree centrality, cc = clustering coefficient)
23. 3. Time Series Analysis
• Fit the data to multilevel time series models
• Use autoregressive methods
• Must deal with:
– Fixed effects in general
– Random effects considering differences between
individuals
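To make the autoregressive idea concrete, here is a deliberately simplified sketch. It fits a pooled first-order model y_t = b0 + b1*y_{t-1} + b2*x_{t-1} by ordinary least squares, where y could be a social metric and x a content metric of the same actor. This is a swapped-in simplification: it estimates only fixed effects and ignores the per-individual random effects that a multilevel model would add. The function names and the noise-free toy data are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_cross_lagged(series):
    """series: {actor: [(y_t, x_t), ...]} ordered by time.

    Returns [b0, b1, b2] for y_t = b0 + b1*y_{t-1} + b2*x_{t-1},
    pooled over all actors (fixed effects only).
    """
    rows, targets = [], []
    for obs in series.values():
        for (y_prev, x_prev), (y_now, _) in zip(obs, obs[1:]):
            rows.append([1.0, y_prev, x_prev])
            targets.append(y_now)
    k = 3
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def simulate(y0, xs, b0=0.1, b1=0.5, b2=0.3):
    """Generate a noise-free toy series that follows the model exactly."""
    obs, y = [], y0
    for x in xs:
        obs.append((y, x))
        y = b0 + b1 * y + b2 * x
    return obs


demo = {
    "a": simulate(0.0, [1.0, 0.0, 2.0, 1.0]),
    "b": simulate(1.0, [0.0, 3.0, 1.0, 0.0]),
}
coeffs = fit_cross_lagged(demo)
```

A b2 that is reliably different from zero would suggest that last period's content metric helps predict this period's social metric; swapping the roles of x and y tests the opposite direction.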
25. 3. Time Series Analysis
• Produces an influence network
[Figure: influence network between the social network and the content network; node labels: Degree, clustering coefficient, Betweenness, Degree; edge weights not recoverable]
26. Results in two domains
• Influence between co-authors of academic
papers and the topics they address
• Influence between social status of online forum
participants and the attention they give to
particular parties
28. Results for Political Forum
[Figure: influence network between the social network and the content network; node labels: in-degree, out-degree, Betweenness, degree, Betweenness; edge weights not recoverable]
• Discussions from online forum nl.politiek
• Time series of 259 weeks
• More than 21,000 participants
• The content is the attention that 19 Dutch political parties receive
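As a rough illustration of how forum data of this kind could be turned into weekly variables, the sketch below counts replies sent and received per participant and the party mentions in each participant's posts. Everything here is an assumption: the record layout (week, author, replied_to, text), the use of reply counts as a proxy for social status, the substring matching, and the three party names, which are an arbitrary example and not the study's list of 19.

```python
from collections import Counter, defaultdict

# Hypothetical reply records: (week, author, replied_to, text).
posts = [
    (1, "ann", "ben", "PvdA and VVD disagree again"),
    (1, "ben", "ann", "VVD has a point"),
    (1, "cor", "ann", "what about CDA?"),
    (2, "ann", "cor", "CDA is silent"),
]
PARTIES = ["PvdA", "VVD", "CDA"]  # illustrative subset only


def weekly_variables(posts, parties):
    """Return weekly reply counts (in and out) and party attention per author."""
    in_deg = defaultdict(Counter)     # week -> replies received per participant
    out_deg = defaultdict(Counter)    # week -> replies sent per participant
    attention = defaultdict(Counter)  # (week, author) -> mentions per party
    for week, author, target, text in posts:
        out_deg[week][author] += 1
        in_deg[week][target] += 1
        lowered = text.lower()
        for party in parties:
            if party.lower() in lowered:  # naive substring match
                attention[(week, author)][party] += 1
    return in_deg, out_deg, attention


in_deg, out_deg, attention = weekly_variables(posts, PARTIES)
```

Note that these counts are weighted by the number of replies rather than the number of distinct neighbours; either choice would give one value per participant per week to feed into the time series step.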
30. Conclusion
• The world is full of dynamic networks
• The Semantic Web is all about networks
• The influence networks show us how those
networks influence each other