Presentation: The BYTE Project - by Rachel Finn, Trilateral Research & Consulting (UK), given at the European Data Economy Workshop, held back to back with SEMANTiCS2015 on 15 September 2015 in Vienna
1. BYTE:
Big Data Externalities – the BYTE Case Studies
Rachel Finn
Trilateral Research & Consulting, LLP
Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities
European Data Economy Workshop
15 September 2015
2. @BYTE_EU www.byte-project.eu
Project details: BYTE
• Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities (BYTE) project
• March 2014 – February 2017 (36 months)
• Funded by DG CNECT: €2.25 million (Grant agreement no: 619551)
• 11 Partners
• 10 Countries
3. Objectives
The BYTE project has three main objectives:
1. To produce a research and policy roadmap and recommendations to support European stakeholders in increasing their share of the big data market by 2020 and in capturing and addressing the positive and negative societal externalities associated with use of big data.
2. To involve all of the European actors relevant to big data in order to identify concrete current and emerging problems to be addressed in the BYTE roadmap. The stakeholder engagement activities will lead to the creation of the Big Data Community, a sustainable platform from which to measure progress in meeting the challenges posed by societal externalities and identify new and emerging challenges.
3. To disseminate the BYTE findings, recommendations and the existence of the BYTE Big Data Community to a larger population of stakeholders in order to encourage them to implement the BYTE guidelines and participate in the Big Data Community.
4. Case studies: big data practitioners help to identify externalities
Environmental data
Energy
Utilities / Smart Cities
Cultural Data
Health
Crisis informatics
Transport
5. Understanding ‘externalities’
In BYTE we consider the externalities or impacts of big data
Positive effects or benefits realised by a third party
Negative costs (or harm) that affect a third party
Externalities relate to social processes linked to big data, as well as the opportunities & risks that may arise as a result of the existence of the data.
Some effects may be unexpected or unintentional
IMPACT
ECONOMIC
SOCIAL
LEGAL
ETHICAL
POLITICAL
6. Big data concerns: externalities
Economic
• Boost to the economy
• Innovation ✔
• Increase efficiency ✔
• Smaller actors left behind
• Shrink economies
Legal
• Privacy ✔
• Data protection ✔
• Data ownership ✔
• Copyright
• Risks associated with inclusion & exclusion
Social & Ethical
• Transparency ✔
• Discrimination
• Methodological difficulties
• Spurious relationships
• Consumer manipulation
• Improved services ✔
Political
• Reliance on US services ✔
• Services have become utilities ✔
• Legal issues become trade issues
• Dependent on public funding ✔
7. Select horizontal findings
Positive externalities
• Efficiencies
• Product and service innovation
• New business models
• Societal benefits (improved decision-making in healthcare, crisis management, commercial organisations; personalised services)
Negative externalities
• Dependence on public funding to create the environment in which big data business models can flourish
• Privacy concerns
• Fear of losing proprietary information
• Outdated legislation
• Difficulty in adapting business models
8. Case study-specific findings: health
• Big data in healthcare is quite well developed and widespread across a number of health areas.
• Genetic data use is maturing and focused on high-grade analytics and the discovery of rare genes and genetic disorders.
• The key improvements include timely and more accurate diagnosis, the development of personalised medicines, and the development of drugs and other treatments and therapies, which can save lives.
• Key innovations include the development of privacy-protecting and secure databases for genetic data samples.
• However, public sector initiatives tend to be reluctant to share data due to legal and ethical constraints.
“So in our own consent we never say that data will be fully anonymous. We do everything in our power so that it is deposited in an anonymous fashion and […] when we consent we are very careful in saying look it’s very unlikely that anyone is going to actively identify information about you” (Program head, Clinical geneticist)
9. Case study-specific findings: crisis informatics
• Crisis informatics is in the early stages of integrating big data.
• Currently, its primary focus is on integrating social media and geographical data.
• The key improvement is that the analysis of this data improves situational awareness more quickly after an event has occurred.
• A key innovation is the combination of human computing and machine computing, primarily through digital volunteers, to validate the data collected and determine how trustworthy it is.
• Stakeholders in this area are making progress in addressing privacy and data protection issues.
• There is some evidence of reliance on US cloud and technology services.
“And I have seen this on multiple occasions from […] big private companies in this, they’ll deal with their own huge amount of data and response to crisis and so on. But [then] become very unpredictable unsustainable outside of an emergency, do a good job of talking about what they do during a crisis but then sort of disappear in-between.” (Programme manager, International Governmental Organisation)
10. BYTE project key outputs
• Define research efforts and policy measures necessary for responsible participation in the big data economy
• Vision for Big Data for Europe for 2020, incorporating externalities
  ◦ Amplify positive externalities
  ◦ Diminish negative ones
• Roadmap
  ◦ Research Roadmap
  ◦ Policy Roadmap
• Formation of a Big Data community
  ◦ Implement the roadmap
  ◦ Sustainability plan
11. Next event
Validating case study externalities
Dublin
14th October 2015, 9am-5pm
Presentations by:
Sonja Zillner, SIEMENS
Big Data in a Digital City
Knut Sebastian Tungland, Statoil
Big data in the energy sector
12. THANK YOU
Any questions?
Key contacts:
◦ Rachel Finn – rachel.finn@trilateralresearch.com
◦ Kush Wadhwa – kush.wadhwa@trilateralresearch.com
Editor's notes
Positive externalities occur when a product, activity or decision by an actor produces positive effects or benefits for a third party that had no direct involvement in the transaction.
Negative externalities occur when a product, activity or decision by an actor causes costs (or harm) that are not entirely borne by that actor but that affect a third party, e.g., citizens (Business Dictionary, 2014).
Externalities are related to processes (i.e., production, service, use) and not to the product itself. That is, it is not big data per se that causes a particular externality; rather, it is the social processes employed via big data that can produce externalities. Furthermore, these externalities may result from the direct collection or processing of data (e.g., privacy infringements), as well as from the opportunities and risks that may arise as a result of the existence of the data (e.g., linking data sets). In addition, because externalities may have unexpected effects on third parties, a central task in BYTE is to identify the processes involved, their effects and the potentially affected parties.
Bullet one – how we define an externality – as an “impact”
Public opinion surveys reveal that citizens are concerned about many of these issues, especially privacy and data protection.
Generally, data utilisation in the healthcare sector is developed and widespread across a number of health areas, especially in terms of medical research and diagnostic testing that translates into improved, more specialised care for patients.
Genetic data use is maturing and focused on high-grade analytics and the discovery of rare genes and genetic disorders.
The key improvements include timely and more accurate diagnosis, the development of personalised medicines, and the development of drugs and other treatments and therapies, which can save lives.
Key innovations include the development of privacy-protecting and secure databases for genetic data samples, which is vital given the highly sensitive nature of the personal data utilised, and new business models focused on big genetic data sequencing.
However, public sector initiatives tend to be reluctant to share data on open databases or in collaborations with private organisations (big pharma etc.) due to legal/ethical constraints (e.g. consent/privacy) and the public sector ethos (public good v. profit generation).
Crisis informatics is in the early stages of integrating big data into standard operations and is primarily focussed on integrating social media and geographical data. There has not yet been much progress in integrating other data types (e.g., environmental measurements, meteorological data).
The key improvement is that the analysis of this data improves situational awareness more quickly after an event has occurred.
A key innovation is the use of human computing, primarily through digital volunteers, to validate the data collected and determine how trustworthy it is.
Stakeholders in this area are making progress in addressing privacy and data protection issues, which are significant and complex, given their focus on data from social media sources.
Production of a roadmap outlining a plan of action to enable European scientists and industry to capture a proportionate share of the big data market.
Provision of assistance to industry in capturing positive externalities (efficiencies, new business models, etc.) and addressing potential negative externalities before beginning a project, initiative or programme.
A series of clear and precise future research needs and policy steps.