3. Trivia
Taxi Observations* by Location and Booking Frequency by Zone in Singapore [1]
* Sampled dataset (~10,000 observations)
15 April 2010
4. Outline
• Reality Mining
• Applications
• Holy Grail!
• Challenges
• Discussion and Q&A
5. Reality Mining Study [2]
• “ … collection and analysis of machine-sensed
environmental data pertaining to human social
behavior, with the goal to identify predictable patterns
of future human behavior …”
• “… extracting information from real-world sensor data …”
• Reality Mining vs. Data Mining
• Nathan Eagle, Alex (Sandy) Pentland, MIT, 2005
• 100 mobile phones, 9 months, 45,000 hours of
communication logs, location and proximity data
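As a rough illustration of how proximity logs like these could be turned into a social graph, the sketch below counts how often pairs of phones are seen together. The record layout (observer, observed, hour of scan), the sample values and the `min_count` threshold are all assumptions made for this example, not the study's actual data format or method.

```python
from collections import Counter

# Hypothetical proximity sightings: (observer_id, observed_id, hour_of_scan).
sightings = [
    ("A", "B", 9), ("B", "A", 9), ("A", "B", 10),
    ("A", "C", 10), ("C", "B", 14), ("A", "B", 14),
]

def proximity_graph(records, min_count=2):
    """Return undirected edges weighted by the number of distinct hours
    in which the two phones were seen near each other."""
    seen = set()
    for observer, observed, hour in records:
        if observer == observed:
            continue  # ignore self-sightings
        pair = tuple(sorted((observer, observed)))
        seen.add((pair, hour))  # reciprocal sightings in one hour count once
    counts = Counter(pair for pair, _ in seen)
    # Keep only pairs that co-occur often enough to suggest a real tie.
    return {pair: n for pair, n in counts.items() if n >= min_count}

edges = proximity_graph(sightings)
print(edges)
```

Raising `min_count` trades recall for precision: occasional co-presence (a shared elevator ride) is filtered out, while repeated co-presence survives as an edge.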
7. Key Results
Social Network Analysis in the wild! [3]
8. Why do we care?
• Social Science
– Social Network Analysis
– Behavioral Modeling
– Human Mobility
• Systems Research
– Transportation
– Environmental Modeling
– Healthcare
13. Mobile Millennium, UC Berkeley [7]
• 100 probe vehicles carrying GPS-enabled Nokia N95 phones
• San Francisco Bay Area, California
• Virtual Trip Lines (VTL)
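A virtual trip line can be thought of as a line segment drawn across a road, with a phone reporting only when its trajectory crosses one. The following is a minimal sketch of such a crossing test on planar coordinates. The function names, the sample coordinates and the flat-plane simplification are assumptions for illustration, not the Mobile Millennium implementation.

```python
def _orient(p, q, r):
    """Cross product (q - p) x (r - p): > 0 if r is left of p->q, < 0 if right, 0 if collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def crosses_vtl(prev_fix, curr_fix, vtl_start, vtl_end):
    """True if the move prev_fix -> curr_fix strictly crosses the trip line."""
    d1 = _orient(vtl_start, vtl_end, prev_fix)
    d2 = _orient(vtl_start, vtl_end, curr_fix)
    d3 = _orient(prev_fix, curr_fix, vtl_start)
    d4 = _orient(prev_fix, curr_fix, vtl_end)
    # The two fixes lie on opposite sides of the line, and the line's
    # endpoints lie on opposite sides of the vehicle's path.
    return d1 * d2 < 0 and d3 * d4 < 0

# Hypothetical trip line across a road running along the x-axis.
vtl = ((0.0, -1.0), (0.0, 1.0))
# Hypothetical sequence of GPS fixes, already projected to a local plane.
track = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

crossings = [i for i in range(1, len(track))
             if crosses_vtl(track[i - 1], track[i], *vtl)]
print(crossings)  # indices of the fixes that completed a crossing
```

Only the step that actually passes over the line triggers a report, so the phone stays silent elsewhere along the route.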
17. Holy Grail!
• Urban Planning and Management
– Real time city
• Are the sidewalks along the Bellevue lake good for jogging today,
given the air and noise pollution levels?
– Macroscopic vs. Microscopic
• Is there a need for running supplementary tram services (or sending
an additional fleet of taxis) towards the end of a soccer match
between Switzerland and Germany?
– Emergency/Crisis Response
• 2009 Mumbai terrorist blasts
– Disease Outbreak/Epidemic Modeling
18. Challenge #1: Big Data
• How big is big enough?
– Wal-Mart: 100-400 GB/day of RFID data [8]
– LHC: 40 TB/day [9]
• Storage is cheap!
• Stream data mining
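One common building block for mining a stream that is too large to store is reservoir sampling, which keeps a fixed-size uniform random sample in a single pass. The sketch below is a generic illustration of that idea (Algorithm R), not a method taken from the talk. The stream, the sample size and the seed are made up.

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Return a uniform random sample of up to k items from an iterable,
    reading it once and holding at most k items in memory."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)  # fill the reservoir first
        else:
            j = rng.randint(0, i)   # inclusive on both ends
            if j < k:
                reservoir[j] = item  # replace with probability k / (i + 1)
    return reservoir

# Hypothetical stream of one million sensor readings, sampled down to ten.
sample = reservoir_sample(range(1_000_000), k=10, seed=42)
print(len(sample))
```

Because memory use depends only on `k`, the same function works whether the stream holds a thousand readings or a day's worth of sensor traffic.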
20. Challenge #3: Privacy(!)
• A “new deal” on data? [10]
– right to possess your data
– right to control the use of your data
– right to distribute or dispose of your data
• How thin or thick is the line between publicity
and privacy?
• Trivia again!
– Is Erica travelling to Helsinki in May 2010?
– Did Florian and Stephan visit Brussels in February 2010?
21. Big Money!
• IBM Smarter Planet
• HP CeNSE
23. Takeaway Message
The last five years have spurred an industrial revolution in sensor
data. I believe that applying empirical (and, later, computational)
methodologies to this real-world data would help us better
understand the underlying cognitive, social, policy and
engineering issues present in our socio-technical systems.
Reality Mining, which sits at the intersection of computer
science, statistics and social science, fits this role nicely.
24. References
1. D. Santani, R. K. Balan, and C. J. Woodard. Understanding and improving a GPS-based taxi system. In 6th USENIX International Conference on Mobile Systems, Applications, and Services (MobiSys), Breckenridge, Colorado, June 2008.
2. N. Eagle and A. (Sandy) Pentland. Reality mining: sensing complex social systems. Personal and Ubiquitous Computing, 10(4):255–268, 2006.
3. N. Eagle, A. S. Pentland, and D. Lazer. Inferring friendship network structure by using mobile phone data. Proceedings of the National Academy of Sciences, 106(36):15274–15278, 2009.
4. C. Song, Z. Qu, N. Blumm, and A.-L. Barabási. Limits of predictability in human mobility. Science, 327(5968):1018–1021, 2010.
5. N. Maisonneuve, M. Stevens, M. E. Niessen, and L. Steels. NoiseTube: Measuring and mapping noise pollution with mobile phones. In I. N. Athanasiadis, P. A. Mitkas, A. E. Rizzoli, and J. M. Gómez, editors, ITEE, pages 215–228. Springer, 2009.
6. J. Yoon, B. Noble, and M. Liu. Surface street traffic estimation. In MobiSys ’07: Proceedings of the 5th International Conference on Mobile Systems, Applications and Services, pages 220–232, New York, NY, USA, 2007.
7. J. C. Herrera, D. B. Work, R. Herring, X. J. Ban, and A. M. Bayen. Evaluation of traffic data obtained via GPS-enabled mobile phones: the Mobile Century field experiment. Working Paper UCB-ITS-VWP-2009-8, August 2009.
8. I. Alexander, G. Andrea, M. Florian, and E. Fleisch. Estimating data volumes of RFID-enabled supply chains. In AMCIS 2009 Proceedings, page 636, 2009.
9. CERN LHC Computing. http://public.web.cern.ch/public/en/LHC/Computing-en.html, April 2010.
10. A. (Sandy) Pentland. Reality Mining for Companies. In O’Reilly Where 2.0 Conference, San Jose, CA, May 19-21, 2009.