1. The document presents a method for detecting social events using mobile phone data by analyzing the mobility and social behavior of mobile phone users.
2. It aims to detect unusual large gatherings of people (social events) and identify frequent locations like home or work.
3. The method uses Bayesian inference on antenna connections to estimate user locations and identifies events as weeks with unusually high numbers of probable attendees based on comparing ordinary and event presence probabilities.
Social Event Detection
1. Social Event Detection
V.A. Traag¹, A. Browet¹, F. Calabrese², F. Morlot³
¹ Department of Applied Mathematics, UCL, Louvain-la-Neuve, Belgium
² SENSEable City Lab, MIT, Cambridge, USA
³ Orange Labs, Issy-les-Moulineaux, France
24 February 2011
2. Outline
1 Motivation
2 Bayesian Location Inference
3 Identification of frequent locations
4 Event detection
5 Presence probability
3. Introduction
Purpose
Analyze mobility and social behaviour of mobile phone users:
1 Detect social events, i.e. unusually large gatherings of people.
2 Identify frequent locations such as home or office.
Motivation
1 Between 70% and 80% of human mobility is explained by the daily
home-office routine (Barabasi et al.). We analyze the
out-of-ordinary behaviour.
2 Anticipate the impact of large events on urban transit, for traffic
regulation or public transportation.
3 Identification/classification of users and their habits for
telecommunication companies.
7. Introduction
Available data
1 Precise location of antennas but no orientation information.
2 A record for each connection to the network (calls, text
messages, mobile internet, ...)
Compute two probability measures
1 $\phi_i(x)$: the probability of being connected to antenna $i$ given a position $x$
2 $\psi_i(x)$: the probability of being at position $x$ given that the user was connected to
antenna $i$
8. Location Inference
The signal strength at position $x$ of an antenna $i$ located at $X_i$ is
determined by:
• the power $p_i$ of the antenna, taken to be the same for all antennas: $p_i = p$;
• the loss of signal strength over distance:
$$L_i(x) = \frac{1}{\|x - X_i\|^{\beta}};$$
• a stochastic fading of the signal, i.e. the Rayleigh fading $R_i$:
$$\Pr(R_i \le r) = F(r) = 1 - e^{-r}.$$
9. Location Inference
The signal strength of antenna $i$ is then given by
$$S_i(x) = p_i L_i(x) R_i.$$
Further assumptions:
• $R_i \perp\!\!\!\perp R_j \quad \forall i \ne j$.
• given a position $x$, the user connects to the antenna $i$ with the
highest signal strength:
$$S_i(x) \ge S_j(x) \quad \forall j \in \mathcal{X}, \qquad \text{i.e.} \qquad S_i(x) = \max_{j \in \mathcal{X}} S_j(x)$$
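The connection rule above can be simulated directly. The following is a minimal sketch, not the authors' implementation: the function names, the exponent `beta=3.0`, the unit power and the toy antenna layout are all assumptions made for illustration. Since $F(r) = 1 - e^{-r}$ is the distribution of a unit-rate exponential variable, the fading is drawn with `expovariate(1.0)`.

```python
import math
import random

def path_loss(x, antenna, beta):
    """L_i(x) = 1 / ||x - X_i||^beta (undefined exactly at the antenna position)."""
    return 1.0 / math.dist(x, antenna) ** beta

def connected_antenna(x, antennas, beta=3.0, power=1.0, rng=random):
    """Draw one independent fading value per antenna and return the index
    of the antenna with the highest signal strength S_i(x) = p L_i(x) R_i."""
    signals = [power * path_loss(x, a, beta) * rng.expovariate(1.0)
               for a in antennas]
    return max(range(len(antennas)), key=signals.__getitem__)

# Toy layout (assumed): three antennas, user standing close to antenna 0.
antennas = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(10_000):
    counts[connected_antenna((0.2, 0.1), antennas, rng=rng)] += 1
# counts[i] / 10_000 is an empirical estimate of Pr(a_i | x).
```

Because of the fading, the user does not always connect to the nearest antenna; the empirical frequencies estimate the connection probabilities derived on the next slides.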
10. Location Inference
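Let $a_i$ denote the event that a user connects to antenna $i$.
$$\Pr(a_i \mid x) = \Pr\Big(S_i(x) = \max_{j \in \mathcal{X}} S_j(x)\Big) = \Pr\big(p_i L_i(x) R_i \ge p_j L_j(x) R_j \ \ \forall j \in \mathcal{X},\ j \ne i\big)$$
If we assume that the random variable $R_i$ realizes a specific value $r$, the events for different $j$ are independent and, since $p_i = p_j = p$,
$$\Pr(a_i \mid x, R_i = r) = \prod_{\substack{j \in \mathcal{X} \\ j \ne i}} \Pr\left(R_j \le \frac{L_i(x)}{L_j(x)}\, r\right) = \prod_{\substack{j \in \mathcal{X} \\ j \ne i}} F\left(\frac{L_i(x)}{L_j(x)}\, r\right)$$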
11. Location Inference
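Then, it follows that
$$\begin{aligned}
\phi_i(x) = \Pr(a_i \mid x) &= \int_0^{\infty} f(r)\, \Pr(a_i \mid x, R_i = r)\, dr \\
&= \int_0^{\infty} e^{-r} \prod_{\substack{j \in \mathcal{X} \\ j \ne i}} \left[1 - \exp\left(-r\, \frac{\|x - X_j\|^{\beta}}{\|x - X_i\|^{\beta}}\right)\right] dr \\
&\approx \int_0^{\infty} e^{-r} \prod_{\substack{j \in \mathcal{X}_i \\ j \ne i}} \left[1 - \exp\left(-r\, \frac{\|x - X_j\|^{\beta}}{\|x - X_i\|^{\beta}}\right)\right] dr
\end{aligned}$$
where $f(r) = F'(r) = e^{-r}$ is the density of the fading and $\mathcal{X}_i$ is a local neighborhood of antenna $i$.
How to choose the local neighborhood, and what is its impact?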
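The integral for $\phi_i(x)$ can be evaluated numerically. The sketch below uses a plain trapezoidal rule; the truncation point `r_max`, the number of steps, the exponent `beta=3.0` and the optional `neighbors` argument (standing in for the local neighborhood) are assumptions for illustration, not values from the slides.

```python
import math

def phi(i, x, antennas, beta=3.0, neighbors=None, r_max=40.0, steps=4000):
    """Approximate phi_i(x) = Pr(a_i | x) with the trapezoidal rule on [0, r_max].

    `neighbors` restricts the product to a local neighborhood of antenna i;
    by default all other antennas are used. x must not coincide with X_i.
    """
    candidates = neighbors if neighbors is not None else range(len(antennas))
    d_i = math.dist(x, antennas[i])
    # L_i(x) / L_j(x) = (||x - X_j|| / ||x - X_i||)^beta for every competitor j.
    ratios = [(math.dist(x, antennas[j]) / d_i) ** beta
              for j in candidates if j != i]

    def integrand(r):
        value = math.exp(-r)                      # fading density f(r)
        for ratio in ratios:
            value *= 1.0 - math.exp(-r * ratio)   # F(L_i / L_j * r)
        return value

    h = r_max / steps
    total = 0.5 * (integrand(0.0) + integrand(r_max))
    total += sum(integrand(k * h) for k in range(1, steps))
    return total * h

antennas = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
values = [phi(i, (0.2, 0.1), antennas) for i in range(len(antennas))]
```

When all antennas are included, the values $\phi_i(x)$ sum to one over $i$ (up to discretization error), which gives a simple sanity check. Restricting the product to a neighborhood drops factors that are at most one, so the approximation can only overestimate $\phi_i(x)$.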
12. Location Inference
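Delaunay radius:
$$\rho_i = \max\{\, d(X_i, X_j) \mid j \text{ is a Delaunay neighbor of } i \,\}$$
The domain $D_i$ is defined, for a scaling factor $r$, by
$$D_i = \{\, x \mid r\rho_i \ge d(x, X_i) \,\}$$
The neighborhood is computed as
$$\mathcal{X}_i = \{\, j \mid X_j \in D_i,\ j \in \mathcal{X} \,\}$$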
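As an illustration of the three definitions above, the sketch below computes the neighborhood of an antenna from a precomputed Delaunay adjacency. The adjacency dictionary, the function name and the toy coordinates are assumptions; in practice the adjacency would come from a triangulation routine, which is not shown here.

```python
import math

def local_neighborhood(i, antennas, delaunay_neighbors, r=2.0):
    """Return the indices j with X_j inside D_i = {x : d(x, X_i) <= r * rho_i}.

    rho_i is the Delaunay radius: the largest distance from antenna i
    to one of its Delaunay neighbors. The result includes i itself.
    """
    rho_i = max(math.dist(antennas[i], antennas[j])
                for j in delaunay_neighbors[i])
    return [j for j, position in enumerate(antennas)
            if math.dist(antennas[i], position) <= r * rho_i]

# Toy layout (assumed) with a hand-written Delaunay adjacency:
# triangles (0, 1, 2) and (1, 2, 3).
antennas = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
delaunay_neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}

near_origin = local_neighborhood(0, antennas, delaunay_neighbors, r=2.0)
near_far_antenna = local_neighborhood(3, antennas, delaunay_neighbors, r=1.0)
```

A larger scaling factor `r` enlarges the domain and includes more antennas in the product, at a higher computational cost per evaluation of $\phi_i$.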
14. Location Inference
Based on Bayes' rule, we obtain
$$\psi_i(x) = \Pr(x \mid a_i) = \frac{\Pr(a_i \mid x)\, \Pr(x)}{\Pr(a_i)}$$
The ratio $\Pr(x) / \Pr(a_i)$ is not known but can be assumed constant over
the domain $D_i$. It follows that
$$\psi_i(x) = \frac{\phi_i(x)}{\int_{D_i} \phi_i(y)\, dy}$$
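The normalization over $D_i$ can be approximated on a grid. To keep the sketch short it assumes a single competing antenna, in which case the integral for $\phi_i$ has the closed form $a / (1 + a)$ with $a = L_i(x) / L_j(x)$; the grid resolution, the disc radius and `beta=3.0` are likewise assumptions for illustration.

```python
import math

def phi_two_antennas(x, own, other, beta=3.0):
    """phi_i(x) with exactly one competitor: integral of e^{-r}(1 - e^{-a r}) = a / (1 + a)."""
    d_own, d_other = math.dist(x, own), math.dist(x, other)
    if d_own == 0.0:
        return 1.0
    a = (d_other / d_own) ** beta          # L_i(x) / L_j(x)
    return a / (1.0 + a)

def psi_on_grid(own, other, radius, n=41, beta=3.0):
    """Discretize D_i (a disc of `radius` around `own`) and normalize phi_i over it.

    Returns a list of (point, density) pairs and the grid step.
    """
    step = 2.0 * radius / (n - 1)
    cells = []
    for ix in range(n):
        for iy in range(n):
            x = (own[0] - radius + ix * step, own[1] - radius + iy * step)
            if math.dist(x, own) <= radius:
                cells.append((x, phi_two_antennas(x, own, other, beta)))
    # Riemann-sum approximation of the integral of phi_i over D_i.
    norm = sum(value for _, value in cells) * step * step
    return [(x, value / norm) for x, value in cells], step

psi, step = psi_on_grid(own=(0.0, 0.0), other=(1.0, 0.0), radius=1.5)
best_point, best_density = max(psi, key=lambda cell: cell[1])
```

The resulting densities integrate to one over the discretized domain, and the density is highest at the antenna itself and decreases toward the competing antenna.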
16. Frequent Location Identification
Probability that a user at position $x$ connects to antenna $i$ is $\phi_i(x)$
Probability that the user made $k_i$ calls with antenna $i$ is then $\phi_i(x)^{k_i}$
The likelihood of observing those calling frequencies is
$$\mathcal{L}(x \mid k) = \prod_{i \in H} \phi_i(x)^{k_i}$$
$$\log \mathcal{L}(x \mid k) = \sum_{i \in H} k_i \log \phi_i(x)$$
Maximum Likelihood Estimator (MLE)
$$\hat{x}_h(u) = \arg\max_{x}\ \log \mathcal{L}(x \mid k(u))$$
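The estimator can be sketched as a grid search over candidate positions. This is an illustrative sketch, not the authors' optimizer: the grid, `beta=3.0` and the toy call counts are assumptions. For speed, $\phi_i$ is evaluated exactly by expanding the product inside the integral (inclusion-exclusion), which is only practical for a handful of antennas.

```python
import math
from itertools import combinations

def phi(i, x, antennas, beta=3.0):
    """phi_i(x), expanding the product in the integral term by term:
    each subset S of competitors contributes (-1)^|S| / (1 + sum of a_j over S)."""
    d_i = math.dist(x, antennas[i])
    if d_i == 0.0:
        return 1.0
    a = [(math.dist(x, antennas[j]) / d_i) ** beta
         for j in range(len(antennas)) if j != i]
    total = 0.0
    for size in range(len(a) + 1):
        for subset in combinations(a, size):
            total += (-1) ** size / (1.0 + sum(subset))
    return total

def log_likelihood(x, calls, antennas, beta=3.0):
    """log L(x | k) = sum over antennas of k_i * log phi_i(x)."""
    total = 0.0
    for i, k in calls.items():
        p = phi(i, x, antennas, beta)
        if p <= 0.0:
            return -math.inf
        total += k * math.log(p)
    return total

def mle_location(calls, antennas, grid, beta=3.0):
    """Return the grid point maximizing the log-likelihood."""
    return max(grid, key=lambda x: log_likelihood(x, calls, antennas, beta))

# Toy data (assumed): most calls go through antenna 0.
antennas = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
calls = {0: 30, 1: 5, 2: 5}
grid = [(-1.0 + 0.25 * ix, -1.0 + 0.25 * iy)
        for ix in range(25) for iy in range(25)]
estimate = mle_location(calls, antennas, grid)
```

The estimate is pulled toward the most used antenna but not placed on it, because a user standing exactly at antenna 0 would have probability zero of connecting through the other two.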
17. Overview Event Detection
General
• Looking for unusual large gatherings of people.
• Which people are likely to be attending a (possible) event?
• Should be present at the event location with high probability.
• Should not be often there.
Presence probability
Given calls in the neighbourhood, what is the probability the user
was present during the time interval of an event?
Ordinary probability
What is the average probability that the user was present during other
weeks?
18. Presence probability
Derivation
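• Probability that the user is in area $A$ at the time $t_c$ of a call $c$ is $p_c$.
• Assume a constant leave and arrival rate $\gamma$.
• Then for $t \ne t_c$ we have $e^{-\gamma |t - t_c|}\, p_c$.
• Take the maximum over all calls $c$ of a user and average over the event interval $[t_s, t_e]$:
$$p_p = \frac{1}{t_e - t_s} \int_{t_s}^{t_e} \max_{c}\ e^{-\gamma |t - t_c|}\, p_c \, dt$$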
Motivation
• More calls ⇒ higher presence probability
• Calls close by ⇒ higher presence probability
• Don’t take into account calls outside of area.
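The presence probability can be approximated with a simple midpoint rule. In this sketch the representation of calls as `(t_c, p_c)` pairs, the time unit (hours) and the value of `gamma` are assumptions chosen for illustration.

```python
import math

def presence_probability(calls, t_start, t_end, gamma, steps=1000):
    """Average over [t_start, t_end] of max_c exp(-gamma * |t - t_c|) * p_c.

    `calls` is a list of (t_c, p_c) pairs for calls made inside the area;
    calls outside the area are simply left out of the list.
    """
    if not calls:
        return 0.0
    h = (t_end - t_start) / steps
    total = 0.0
    for k in range(steps):
        t = t_start + (k + 0.5) * h          # midpoint of the k-th subinterval
        total += max(p_c * math.exp(-gamma * abs(t - t_c))
                     for t_c, p_c in calls)
    return total * h / (t_end - t_start)

# One call at 16:00 with p_c = 0.3, event from 13:00 to 19:00, gamma = 1 per hour.
one_call = presence_probability([(16.0, 0.3)], 13.0, 19.0, gamma=1.0)
# A second call at 18:00 can only raise the result.
two_calls = presence_probability([(16.0, 0.3), (18.0, 0.3)], 13.0, 19.0, gamma=1.0)
```

For the single call the integral has the closed form $0.1\,(1 - e^{-3})$, which the numerical value should match closely.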
19. Presence probability
[Figure: presence probability (vertical axis, 0 to 0.35) against time (horizontal axis, 13 to 19), with the first and second call marked.]
20. Ordinary probability
How regularly is user in the area?
(Consider only same weekday, same time of day)
[Figure: April calendar with three legend entries: was not present, i.e. $p_p(i) = 0$; was in area with probability $p_p(2)$; was in area with probability $p_p(5)$.]
Ordinary probability
Ordinary probability defined as the average presence probability over $W$ weeks, i.e.
$$p_o = \frac{1}{W} \sum_{i=1}^{W} p_p(i)$$
21. Probability of attending
Maximum ordinary probability
• Should be present with relatively high probability
• Relatively rarely present ⇒ small $p_o$ (i.e. only for the event)
• What is the theoretical maximum ordinary probability $\bar{p}_o$?
• Theoretical maximum: make an infinite number of calls with the ‘best’
antenna.
Probability of attending
• Probability that the user attended is then calculated as
$$p_a = p_p \left(1 - \frac{p_o}{\bar{p}_o}\right)$$
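A small worked sketch of the ordinary and attendance probabilities follows. The numbers are made up for illustration, and the theoretical maximum $\bar{p}_o$ is passed in as a parameter because its computation is not spelled out on the slide.

```python
def ordinary_probability(weekly_presence):
    """p_o: mean of the presence probabilities p_p(i) over the W comparison weeks."""
    return sum(weekly_presence) / len(weekly_presence)

def attendance_probability(p_present, p_ordinary, p_ordinary_max):
    """p_a = p_p * (1 - p_o / p_o_max); zero for a user who is always there."""
    return p_present * (1.0 - p_ordinary / p_ordinary_max)

# Hypothetical user: seen in the area in two of five comparison weeks.
p_o = ordinary_probability([0.0, 0.2, 0.0, 0.0, 0.4])
p_a = attendance_probability(p_present=0.8, p_ordinary=p_o, p_ordinary_max=0.6)
```

A user who is present during the event but rarely otherwise keeps most of the presence probability, while a regular visitor with $p_o$ close to $\bar{p}_o$ is discounted toward zero.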
22. Event detection
Number of attendees
• Mark user as (possible) attendee if $p_a$ is high enough
• Number of (possible) attendees in week $w$ given by $n_w$
• Mark week $w$ as event if $n_w$ is high enough.
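The two thresholding steps can be sketched as follows. The slides do not state what "high enough" means, so both criteria here are assumptions: a fixed cut-off on $p_a$, and a z-score rule that flags weeks whose attendee count lies more than `z` standard deviations above the mean over all weeks.

```python
from statistics import mean, pstdev

def count_attendees(attendance_probabilities, threshold=0.5):
    """n_w: number of users whose p_a reaches the (assumed) threshold."""
    return sum(1 for p_a in attendance_probabilities if p_a >= threshold)

def detect_event_weeks(attendees_per_week, z=2.0):
    """Return the indices of weeks whose attendee count is unusually high."""
    mu = mean(attendees_per_week)
    sigma = pstdev(attendees_per_week)
    if sigma == 0:
        return []
    return [week for week, n in enumerate(attendees_per_week)
            if (n - mu) / sigma > z]

# Hypothetical counts n_w for eight weeks at one location.
attendees_per_week = [3, 4, 2, 3, 40, 3, 2, 4]
event_weeks = detect_event_weeks(attendees_per_week)
```

Other choices, such as a robust median-based rule, would fit the same structure; only the definition of "high enough" changes.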
29. Conclusions
Conclusions
• Possible to detect ‘social events’ in mobile phone data
• Robust to antenna positioning and switching
• Interesting observation: non-routine behaviour seems massive
Further considerations
• Use simpler (faster) method to detect irregularities
• Refine location estimation by likelihood inference
Questions? Suggestions? Remarks?