T. Takaguchi, M. Nakamura, N. Sato, K. Yano, N. Masuda.
Predictability of conversation partners.
Physical Review X, 1, 011008 (2011).
PDF is available free at:
http://prx.aps.org/abstract/PRX/v1/i1/e011008
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Predictability of conversation partners
1. Predictability of
conversation partners
Phys. Rev. X 1, 011008 (2011)
1Taro Takaguchi, 1Mitsuhiro Nakamura,
2Nobuo Sato, 2Kazuo Yano,
and 1,3Naoki Masuda
1 Univ. of Tokyo
2 Central Research Lab., Hitachi, Ltd.
3 JST PRESTO
2. Conventional assumptions
Modeling human behavior such as
Mobility patterns Temporal order of
in the physical space selecting interaction partners
by random (or Lévy) walk by Poisson process
Actual behavior: to what degree random / deterministic ?
3. Predictability of mobility patterns (Song et al., 2010)
Mobility patterns are largely predictable.
Method: measuring the entropy of
the sequence of locations of each cell-phone user
How about other components of human behavior?
4. Predictability of interaction partners
interacts with
2 3 2 2 3 1
time
discard the timing of events
“partner sequence”
- Characterize individuals in a social organization
- Related to epidemics and information spreading
5. Data set: face-to-face interaction log in an office
Recording from two offices in Japanese companies
- subjects: 163 individuals
- period: 73 days
- total conversation events: 51,879 times
We used the data collected by World Signal Center, Hitachi, Ltd., Japan.
(株)日立製作所 ワールドシグナルセンタにより収集された
データセットを使用した。
Name tag with an infrared
module
(The Business Microscope system)
http://www.hitachi-hitec.com/jyouhou/business-microscope/
6. Details of data set
- A conversation event = communication between close tags
- Conversation events: undirected
- Time resolution = one minute
-Multiple events initiated within the same minute
→ determine their order at random
7. Three entropy measures (cf. Song et al., 2010)
- for individual who has partners in the recording period,
Random entropy: interaction in a totally random manner
Uncorrelated entropy: heterogeneity among
: the probability to talk with
Conditional entropy: second-order correlation
: to talk with immediately after with
8. cf. Entropy
“The uncertainty of the random variable”
Ex.) coin toss
: always tail
11. Quantifying the predictability
Mutual information of the partner sequence
Interpretation:
The information about the NEXT partner
that is earned by knowing the PREVIOUS partner
rando predictable
m
12. Aim of the study
Examine the predictability of conversation partners
- similar to that of mobility patterns
“Predictability” = the mutual information of the partner sequence
Data set: face-to-face interaction log from two offices in Japan
i
interacts with
2 3 2 2 3 1
events
13. Histogram of the entropies
, regardless of the value of
→ Partner sequences are predictable to some extent.
40
(a) H i0 (b)
number of individuals
H i1 4.0
30 H i2
3.5
20 H i2
3.0
10
2.5
0 2.0
1 2 3 4 5 6 7 2.0 2.5 3.0 3.5 4.0
Hi H i1
14. Significance of
Null hypothesis:
“ is positive only because of the small data size.”
Compare with the value obtained from shuffled partner sequences
3
Error bars: 99% confidential intervals of shuffled sequences
(a)
2
^
Ii Empirical
1
0
1 50 100 150
i in the ascending order of I i
15. Main cause of the predictability
The bursty human activity pattern:
characterized by the long-tailed inter-event intervals
“ tends to talk with within a short period after talking with .”
time
of a typical individual
16. Test 1 (1)
Purpose: examine the contribution of the bursty activity
Partner sequence in a given day
2 1 3 3 2 1 2 1 3 2 1 3 1
time
extract the sequences with each partner
1
with
2
with
3
with
17. Test 1 (2)
Next, shuffle the intervals within each partner
1
with
2
with
3
with
merge
Calculate of this shuffled sequence
1
2 1 3 1 2 2 3 1 3 2 1 3 1
2 time
18. Result of Test 1
Generate 100 shuffled sequence
Contribution of burstiness
3
(a) bars: mean ± sd for the shuffled sequences
Error
2
I iburst Empirical
1
0
1 50 100 150
i in the ascending order of I i
19. Test 2
Purpose: examine the predictability after omitting the burstiness
2 3 2 2 2 3 3 1
events
merge into one
Calculate of the merged sequence
20. Result of Test 2
Shuffle the merged sequence with replacement
conditioned that no partner ID appears successively
is significantly large.
→ Predictability remains without the effect of the burstiness.
3 Error bars: 99% confidential intervals of shuffled sequences
(b)
2
^merge
Ii
Original
1
0
1 50 100 150
merge
i in the ascending order of Ii
21. Summary so far
Partner sequence: predictable to some extent
Main cause of predictability: the bursty activity pattern
- Predictability remains after omitting the burstiness.
4.0
(b)
3.5
H i2
3.0
2.5
2.0
2.0 2.5 3.0 3.5 4.0
H i1
22. Predictability depends on individuals
Depends on ’s position in the social network?
Histogram of
40
number of individuals
30
20
10
0
0.5 1.0 1.5 2.0
Ii
23. Conversation network
Undirected and weighted network
Node : individual
Link : a pair of individuals having at least one event
Link weight : The total number of events for the pair
Node attributions
Degree
Strength
Mean weight
25. Hypotheses
- Fix , small → abundance of weak links (with small weights)
□ Hypothesis about links
- “Strength of weak ties” (Granovetter, 1973)
□ Hypothesis about nodes
26. Strength of weak ties (Granovetter, 1973)
Weak links offer non-redundant contacts.
Weak links tend to connect different communities.
“bridge”
27. Hypotheses
- Fix , small → abundance of weak links (with small weights)
□ Hypothesis about links
- Weak links connect different communities.
□ Hypothesis about nodes
• Individuals connecting different communities → large
• Individuals concealed in a community → small
28. Hypothesis about links
Purpose: quantify the degree of being concealed in a community
Relative neighborhood overlap of a link (Onnela et al., 2007)
29. Hypothesis about links
“Strength of weak ties” hypothesis;
a link with large tends to have a large
→ increases with
: averaged over the links with
0.45 (a)
0.40
<O>w
0.35
Consistent with the hypothesis
0.30
0.2 0.4 0.6 0.8 1.0
P cum(w ) fraction of the links with
30. Hypotheses
- Fix , small → abundance of weak links (with small weights)
□ Hypothesis about links
- Weak links connect different communities.
□ Hypothesis about nodes
• Individuals connecting different communities → large
• Individuals concealed in a community → small
31. Hypothesis about nodes
Purpose: quantify the degree of being concealed in a community
Clustering coefficient of a node (Watts and Strogatz, 1998)
32. Abundance of triangles
- 3-clique percolation community (Palla et al., 2005)
- Hierarchical structure (Ravász and Barabási, 2003)
Nodes connecting communities → small
33. Clustering coefficient with link-weight threshold
: after eliminating the links with
Original CN
10 1 3
1
5
2
7 12
5
Expectation: triangles persist around individuals with small
for a fixed
34. Hypothesis about nodes
■: Pearson correlation coefficient between and
○: Partial correlation coefficient and
between
with and fixed
35. Hypotheses
- Fix , small → abundance of weak links (with small weights)
□ Hypothesis about links
- Weak links connect different communities.
□ Hypothesis about nodes
• Individuals connecting different communities → large
• Individuals concealed in a community → small
0.45 (a)
0.40
<O>w
0.35
0.30
0.2 0.4 0.6 0.8 1.0
P cum(w )
36. Conclusions
Predictability of partner sequence
- Face-to-face interaction log in offices in Japanese companies
Conversation partners: predictable to some extent
- A main cause: the bursty activity pattern
- Significant predictability after omitting the burstiness
Dependence on the individual’s position in the conversation network
- “Strength of weak ties”
- INTER-community individuals → large
- INTRA-community individuals → small
Full paper is available
Taro Takaguchi, Mitsuhiro Nakamura, Nobuo Sato, Kazuo Yano,
and Naoki Masuda,
“Predictability of conversation partners”,
Phys. Rev. X 1, 011008 (2011).