38. 5th International Conference of Crisis Mappers
Panel I—What is so Big about Big Data?
4 questions on the Big Data-Rich
Future of Humanitarian Assistance
Emmanuel Letouzé
Fellow, Harvard Humanitarian Initiative
PhD Candidate, UC Berkeley
Non-Resident Adviser, International Peace Institute
eletouze@berkeley.edu
Nairobi, November 21st, 2013
40. 1. What is ‘Big Data’ about—and not about?
① Big Data as data == “traces of human actions picked up
by digital devices” (Letouzé, Meier and Vinck)
1. “Digital breadcrumbs” (Sandy Pentland)
2. Open web data (social media, online news, etc.)
3. Sensing (satellite, meters, etc.)
② Big Data as data is not ‘about’ size—it’s a primarily
qualitative shift
③ Big Data is “not about the data” (Gary King)
41. 1. What is Big Data about—and not about?
• Big Data as data doesn’t have to be big to be different
• Big Data as data is about very many very small data
produced by / about connected individuals (big data is
small data; it can also be slow data)
• Big Data takes intent and capacities
[Figure: Movement of an individual in Rwanda over 4 years. Source: J. Blumenstock]
43. 2. How will Big Data grow & age?
[Figure: Stock of world data, circa 1980 (assumed) vs. stock of world data, circa 2020? Labels on the figure: “90 days”, “50 years”, “Unknown data”]
44. 3. How has / may it be used for
humanitarian assistance purposes?
① Descriptive analysis (e.g. maps)
② Predictive analysis (proxying vs. forecasting)
③ Diagnostics (causal inference)
45. 3. How has / may it be used for
humanitarian assistance purposes?
Pattern recognition + anomaly detection: Violent events in ACLED
data vs. cell-phone call volume in Ivory Coast
Source: Letouzé and Prydz, 2013
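The anomaly-detection idea on this slide can be sketched as follows. This is a minimal illustration, not the method of Letouzé and Prydz: it assumes a trailing-window z-score rule, and the function name `flag_anomalies`, the window length, the threshold, and the daily call counts are all hypothetical.

```python
from statistics import mean, stdev

def flag_anomalies(volumes, window=7, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` observations."""
    flagged = []
    for i in range(window, len(volumes)):
        baseline = volumes[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: the z-score is undefined, skip
        if abs(volumes[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Made-up daily call counts for one cell tower (illustrative only).
daily_calls = [100, 102, 98, 101, 99, 103, 100, 97, 180, 101, 99]
anomalous_days = flag_anomalies(daily_calls)
print(anomalous_days)  # indices of days flagged as unusual
```

Flagged days would then be compared against dated event records (such as ACLED entries) to see whether spikes in call volume coincide with reported events.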
46. Example: “Prediction of Socio-Economic Levels Using Cell-Phone Records”
(Telefonica research, 2011)
National Statistical Institutes carry out surveys
Survey from “a major city in Latin America”
Telefonica team used their data to ‘predict’ SELs from cell-phone usage
Predict the present (SELs for non-surveyed regions)
and monitor the future (track changes over time)
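The "predict the present" step can be sketched as: fit a model where survey data exist, then apply it where they do not. The example below is an assumption-laden illustration, not the Telefonica team's method: it swaps in a one-feature least-squares line, and the usage feature, the SEL scores, and the region names are all made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical surveyed regions: average weekly calls per user and
# the socio-economic level (SEL) score measured by a survey.
surveyed_usage = [10.0, 20.0, 30.0, 40.0]
surveyed_sel = [1.5, 2.5, 3.5, 4.5]

slope, intercept = fit_line(surveyed_usage, surveyed_sel)

# Hypothetical non-surveyed regions: only phone usage is observed.
unsurveyed_usage = {"region_E": 25.0, "region_F": 35.0}
predicted_sel = {
    region: slope * usage + intercept
    for region, usage in unsurveyed_usage.items()
}
print(predicted_sel)
```

Re-running the prediction step on fresh usage data is one way to read "monitor the future": the fitted relationship is held fixed while the inputs change over time.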
47. 4. What are the traps and priorities ahead?
i. Main risks are
① Creation of a ‘new’ digital divide
=> Recentralization of decision-making, reversing recent trends/efforts
② Dehumanization / de-democratization of
decision-making (cf. drones, killer robots)
③ Confidentiality / security: e.g. de-anonymization of CDRs (call detail records) and identification
48. 4. What are the traps and priorities ahead?
ii. Main challenges/questions are
① Political: Engaging with & empowering at-risk / affected
people and communities for community
resilience, feedback loops, agile response... (urgency vs.
sustainability?)
② Legal-institutional: Devising principles and frameworks
for ‘responsible’ data sharing and analysis (D4D team)
③ Theoretical-methodological: further research / progress
to take place on
1. Sample bias correction
2. Privacy: erasable future, noise in data
3. Models of human response to emergencies
4. Causal inference
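As one illustration of item 1 (sample bias correction), the sketch below uses post-stratification: group averages from a skewed sample are reweighted by known population shares. This is a standard technique chosen here for illustration, not one named in the slides, and the groups, values, and shares are hypothetical.

```python
def poststratified_mean(sample, population_shares):
    """Weight each group's sample mean by its known population share.

    sample: list of (group, value) pairs.
    population_shares: dict mapping group -> share of the population
    (shares are assumed to sum to 1; every group must appear in the
    sample, otherwise a KeyError is raised).
    """
    values_by_group = {}
    for group, value in sample:
        values_by_group.setdefault(group, []).append(value)

    estimate = 0.0
    for group, share in population_shares.items():
        values = values_by_group[group]
        estimate += share * sum(values) / len(values)
    return estimate

# Hypothetical phone-based sample that over-represents urban users.
sample = [("urban", 10.0), ("urban", 12.0), ("urban", 14.0), ("rural", 4.0)]
population_shares = {"urban": 0.5, "rural": 0.5}

naive_mean = sum(value for _, value in sample) / len(sample)
adjusted_mean = poststratified_mean(sample, population_shares)
print(naive_mean, adjusted_mean)
```

The naive mean leans toward the over-sampled urban group, while the adjusted mean gives each group the weight it has in the population. The correction is only as good as the population shares and the assumption that sampled members resemble their group.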
49. [Figure: balance between personal data collected and personal data shared. Source: Letouzé and Vinck, 2013]
Axes: % personal data collected; % personal data shared
No data collected, no data shared: extreme individual considerations / full privacy
All data collected, no data shared: extreme commercial considerations / surveillance
All data collected, all data shared: extreme societal considerations / Open Data society
Right balance?