Big Data Socio-Economic Externalities – the BYTE Case Studies
1. BYTE: Big Data Socio-Economic Externalities – the BYTE Case Studies
Anna Donovan
Trilateral Research & Consulting, LLP
BYTE project coordinator
Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities
BDVA Summit
17-19 June 2015
2. @BYTE_EU www.byte-project.eu
Objectives
The BYTE project has three main objectives:
1. To produce a research and policy roadmap and recommendations to support European stakeholders in increasing their share of the big data market by 2020 and in capturing and addressing the positive and negative societal externalities associated with the use of big data.
2. To involve all of the European actors relevant to big data in order to identify concrete current and emerging problems to be addressed in the BYTE roadmap. The stakeholder engagement activities will lead to the creation of the Big Data Community, a sustainable platform from which to measure progress in meeting the challenges posed by societal externalities and to identify new and emerging challenges.
3. To disseminate the BYTE findings and recommendations, and to publicise the existence of the BYTE Big Data Community, to a larger population of stakeholders in order to encourage them to implement the BYTE guidelines and participate in the Big Data Community.
3. Project details: BYTE
• Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities (BYTE) project
• March 2014 – Feb 2017; 36 months
• Funded by DG CNECT: €2.25 million (Grant agreement no: 619551)
• 11 Partners
• 10 Countries
4. Case studies: big data practitioners help to identify externalities
Environmental data
Energy
Utilities / Smart Cities
Cultural Data
Health
Crisis informatics
Transport
5. Case studies
• QUESTION(S):
  • Which positive and negative societal externalities are associated with the use of big data in each sector example?
  • Who are the (positively and negatively) affected parties?
  • How might potential positive impacts be captured, and how might challenges associated with negative impacts be addressed or diminished?
• METHODS:
  • Desk-based research;
  • Semi-structured interviews with high-level industry big data practitioners; and
  • Expert focus groups to test and validate research findings.
6. UNDERSTANDING ‘EXTERNALITIES’
In BYTE we consider the externalities, or impacts, of big data:
• Positive: effects or benefits realised by a third party
• Negative: costs (or harm) that affect a third party
Externalities relate to social processes linked to big data, as well as the opportunities and risks that may arise as a result of the existence of the data.
Some effects may be unexpected or unintentional.
Impact categories: economic, social, legal, ethical, political
7. Common externalities across case studies

Economic
  Examples of externalities:
  • Boost to the economy
  • Innovation
  • Increase efficiency
  • Smaller actors left behind
  • Shrink economies
  Positive: New business models with social and economic considerations, and increased innovation through open data and source material and by infrastructure and technology improvements
  Negative: Private companies gaining revenue from organisations that can least afford to pay a premium, humanitarian organisations providing access to data during crises

Legal
  Examples of externalities:
  • Privacy
  • Data protection
  • Data ownership
  • Copyright
  • Risks associated with inclusion & exclusion
  Positive: Organisations implementing measures to support data protection, data security and other legal issues, e.g. licensing frameworks for cultural data
  Negative: Access to proprietary data restricted outside of organisations

Social & ethical
  Examples of externalities:
  • Transparency
  • Discrimination
  • Methodological difficulties
  • Spurious relationships
  • Consumer manipulation
  Positive: Improved services across the sectors, e.g. health services enhanced by improved diagnostic testing; increased awareness of the need for socially responsible and ethical data practices, e.g. the importance of verifiable social media data in crises
  Negative: Continued issues raised by the use of personal data, data accompanied by intellectual property rights, data sharing, etc.

Political
  Examples of externalities:
  • Reliance on US services
  • Services have become utilities
  • Legal issues become trade issues
  Positive: International cooperation through data sharing
  Negative: Cross-national flows of data, tensions between for-profit and non-profit organisations
8. Case study example key findings: big data and health
• Generally, data utilisation in the healthcare sector is developed and widespread across a number of health areas, especially in terms of medical research and diagnostic testing that translates into improved, more specialised care for patients.
• Genetic data use is maturing and focused on high-grade analytics and the discovery of rare genes and genetic disorders.
• The key improvements include timely and more accurate diagnosis, the development of personalised medicines, and the development of drugs and other treatments and therapies, which can save lives.
• Key innovations include the development of privacy-protecting and secure databases for genetic data samples, which is vital given the highly sensitive nature of the personal data utilised, and new business models focused on big genetic data sequencing.
• However, public sector initiatives tend to be reluctant to share data on open databases or in collaborations with private organisations (big pharma etc.) due to legal and ethical constraints (e.g. consent, privacy) and the public sector ethos (public good v. profit generation).
9. Examples of Externalities (Positive / Negative)

Economic
  Examples of externalities:
  • Boost to the economy
  • Innovation
  • Increase efficiency
  Quotes:
  “one of the things that we’ve been working on here is trying to develop a database of possible deletions or duplications because the software and the data doesn’t allow that […] as soon as we are confident that we have found something that would be helpful, we would publish it and make it available definitely.” (Translational medicine specialist)
  “One area for development as a potential business opportunity is [to] deal with the challenge of interoperability of big health data.” (FG)

Legal
  Examples of externalities:
  • Data protection
  • Privacy
  • Data ownership
  Quotes:
  “Big data demands the development of new legal frameworks in order to […] enhance and formalise how to share data among countries for improving research and healthcare.” (FG)
10. Case study example key findings: big data and crisis informatics
• Crisis informatics is in the early stages of integrating big data into standard operations and is primarily focussed on integrating social media and geographical data. There has not yet been much progress in integrating other data types, e.g. environmental measurements or meteorological data.
• The key improvement is that the analysis of this data improves situational awareness more quickly after an event has occurred.
• A key innovation is the use of human computing, primarily through digital volunteers, to validate the data collected and determine how trustworthy it is.
• Stakeholders in this area are making progress in addressing privacy and data protection issues, which are significant and complex, given their focus on data from social media sources.
11. Examples of Externalities (Positive / Negative)

Social & ethical
  Examples of externalities:
  • Transparency
  • Discrimination
  • Methodological difficulties
  • Spurious relationships
  • Consumer manipulation
  Quotes:
  “I worked on a project called the ethics of data conference where we brought in one hundred people from different areas of knowledge to talk about data ethics. And to infuse our projects and understand and build road maps. There is something called Responsible Data Forum which is working on templates in projects, to be able to help people incorporate those kind of personal data. My colleague has been working on something called ethical data checklists as part of the code of conducts for the communities that he has cofounded. So these code of conducts I have written one for Humanitarian OpenStreetMap about how we manage data.” (Program Manager, RICC)

Political
  Examples of externalities:
  • Reliance on US services
  • Services have become utilities
  • Legal issues become trade issues
  Quotes:
  “Humanitarian organisations and others are very worried about creating technology dependence [on] one particular vendor, so they find that our platforms are open source make them more comfortable with adopting our process and our technology because they know that we don’t hold a leverage over their activity.” (SS, RICC)
  “Difficulty of potential reliance on US based infrastructure services.” (D, RICC)
12. BYTE project key outputs
• Define research efforts and policy measures necessary for responsible participation in the big data economy
• Vision for Big Data for Europe for 2020, incorporating externalities
  • Amplify positive externalities
  • Diminish negative ones
• Roadmap
  • Research Roadmap
  • Policy Roadmap
• Formation of a Big Data community
  • Implement the roadmap
  • Sustainability plan
13. THANK YOU
Any questions?
Key contacts:
◦ Anna Donovan – anna.donovan@trilateralresearch.com
◦ Kush Wadhwa – kush.wadhwa@trilateralresearch.com
◦ Rachel Finn – rachel.finn@trilateralresearch.com
Editor's Notes
Innovative tools combine human computing (crowdsourcing) and machine computing (artificial intelligence) to evaluate citizens’ needs during or immediately after a crisis. This includes mining “open” social media data, including text feeds, images, videos, and location and temporal information, to gather information, identify needs and assess damage.
Project 1 uses a combination of crowd sourcing and AI to automatically classify millions of tweets and text messages per hour during crisis situations. These tweets could be about issues related to shelter, food, damage, etc., and this information is used to identify areas where response activities should be targeted. Project 2 examines multi-media and the photos and messages in social media feeds to identify damage to infrastructure. This is a particularly important project as the use of satellite imagery to identify infrastructure damage is only 30-40% accurate and there is a generalised difficulty surrounding extracting meaningful data from this source (Director, RICC).
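The combination of crowdsourcing and machine classification described for Project 1 can be illustrated with a minimal sketch. This is not the project's actual system: the labelled messages, the function names (`train`, `classify`, `triage`), the word-count scoring and the 0.6 confidence threshold are all assumptions made for illustration. The idea shown is only that volunteer-labelled messages train a simple classifier, and messages the classifier is unsure about are routed back to human volunteers.

```python
from collections import Counter, defaultdict

# Hypothetical volunteer-labelled messages (illustrative only, not project data).
LABELLED = [
    ("we need shelter tonight, our house collapsed", "shelter"),
    ("no tents or shelter in the camp", "shelter"),
    ("children have no food or water", "food"),
    ("food supplies ran out yesterday", "food"),
    ("bridge damaged and road blocked", "damage"),
    ("school building damaged by the quake", "damage"),
]


def tokenize(text):
    """Lower-case the text and strip basic punctuation from each word."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def train(examples):
    """Count how often each word appears under each volunteer-assigned label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(tokenize(text))
    return counts


def classify(counts, text, threshold=0.6):
    """Return (label, confidence); label is None when confidence is too low."""
    words = tokenize(text)
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    total = sum(scores.values())
    if total == 0:
        return None, 0.0  # no known words at all
    best = max(scores, key=scores.get)
    confidence = scores[best] / total
    if confidence < threshold:
        return None, confidence
    return best, confidence


def triage(counts, messages, threshold=0.6):
    """Split messages into machine-labelled ones and ones needing human review."""
    auto, review = [], []
    for message in messages:
        label, _ = classify(counts, message, threshold)
        if label is None:
            review.append(message)
        else:
            auto.append((message, label))
    return auto, review


if __name__ == "__main__":
    model = train(LABELLED)
    auto, review = triage(model, [
        "food supplies ran out",
        "no food or shelter in the damaged school",
    ])
    print("machine-labelled:", auto)
    print("sent to volunteers:", review)
```

In this sketch, labels confirmed by volunteers during review could be appended to the labelled set and the model retrained, which is one simple way the human and machine components might reinforce each other.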
Primary research: interviews, focus group
Positive externalities occur when a product, activity or decision by an actor causes positive effects or benefits realised by a third party resulting from a transaction in which they had no direct involvement.
Negative externalities occur when a product, activity or decision by an actor causes costs (or harm) that are not entirely borne by that actor but that affect a third party, e.g. citizens (Business Dictionary, 2014).
Externalities are related to processes (i.e., production, service, use) and not to the product itself. That is, it is not big data per se that causes a particular externality; rather, it is the social processes employed via big data that can produce externalities. Furthermore, these externalities may result from the direct collection or processing of data (e.g., privacy infringements), as well as from the opportunities and risks that may arise as a result of the existence of the data (e.g., linking data sets). In addition, as externalities may have unexpected effects on third parties, a central task in BYTE is the identification of the processes involved, their effects and the potentially affected parties.
Production of a roadmap outlining a plan of action to enable European scientists and industry to capture a proportionate share of the big data market.
Provision of assistance to industry in capturing positive externalities (efficiencies, new business models, etc.) and addressing potential negative externalities before beginning a project, initiative or programme.
A series of clear and precise future research needs and policy steps.