Data for Impact (Data4Impact) is a Horizon 2020 project pioneering big data approaches to improve the assessment of societal impact in the health, demographic change and well-being societal challenge at national and EU levels. Data4Impact aspires to develop a set of new indicators for assessing research and innovation performance based on a hands-on and data-driven approach.
Here is the presentation from the Data4Impact workshop, which took place on 24 September 2018.
2. Data4Impact: the basics
Call: CO-CREATION-08-2016-2017: Better integration
of evidence on the impact of research and innovation
in policy making
Expected impacts:
– Improved monitoring of R&I activities: new indicators
for assessing research and innovation performance,
including the impact of research and innovation policies
– Prove value to society: determining the societal impact
of research and innovation funding in order to better
justify research and innovation spending
Data4Impact addresses the key challenges and expected impacts of
CO-CREATION-08-2016-2017 through a data-driven approach
4. Data4Impact: the basics
Definition of Big Data:
"Big Data is high-volume, high-velocity and/or high-variety
information assets that demand cost-effective, innovative forms of
information processing that enable enhanced insight, decision making,
and process automation."
Key properties of Big Data:
– Volume, i.e. no sampling is generally applied
– Variety, i.e. structured and unstructured data from various sources,
in different formats
– Velocity, i.e. real-time/rapid data
– Veracity, i.e. variations in data quality, cleaning, processing, etc.
Non-intrusiveness -> Big Data is a byproduct of digital interaction and
communication
5. Data4Impact: the basics
Project mission: create new knowledge by applying big data
approaches to:
Improve the monitoring of EU and national R&I programmes
Better assess the societal impact of research funding
Is Data4Impact Big Data? We think so:
Volume -> entire funding programmes (e.g. FP7, H2020,
national programmes) and their outputs/results/impacts will
be covered
Variety of sources -> publication, IPR, company, project,
clinical guidelines, social media/media/fora and other data to
be mined
Velocity -> real-time data will be tracked from several sources
Veracity -> strong emphasis on analysis of unstructured data;
data will come in many different forms and will require
extensive processing
6. Data4Impact: the basics
Rapid rise of big data in EU policymaking
First applications of machine learning techniques in EU
evaluations/impact assessments in 2014-2015
Big data workshop in 2016: big data is 3 years away from
being 3 years away.
BUT, several important developments over the last 2 years:
– Piloting of big data approaches for the European Innovation
Scoreboard (DG RTD, 2017)
– JRC working on various big data projects, incl. on ICT projects in EU
FPs (see EU Science Hub)
– Funding two big data projects under TRANSFORMATIONS
(EURITO, D4I)
– Launching of a project tracking research results in FP7 (DG RTD,
2018)
Key driver: growing need for timely data & non-intrusive
data collection approaches
7. Data4Impact: the basics
Why the health domain?
Key challenge with big data is actually to make it
‘small’
Need to understand the subject area; this implies
specialisation in specific programmes/areas
Health has some attractive properties: highly
relevant to society, impacts are generally well
understood and defined, many data sources are
open
Knowledge of the subject field
8. User-centered approach
Over 30 national and international funders covered (incl. EU FPs, NIH, WT,
various national funders, UK charities, etc.)
Our aim is to provide data for some of the key questions asked by
policymakers and citizens
Relevance:
– E.g. extent to which the funded research activities correspond to societal needs;
– Adequacy of programme and priority setting;
– Timeliness of research
Coherence: synergies/overlaps between different programmes
(highly relevant for Joint Programming & related policies)
Results: does the research feed into clinical practice?
Impacts:
– On innovation
– On policy
– On Societal Challenges (health & wellbeing)
17. Data4Impact: objectives
Objective 1: define, develop, analyse new indicators for assessing
the performance of EU and national R&I systems.
[Figure: indicator chain Input -> Throughput -> Output/result -> Societal impact. Labels shown in the figure: R&D funding; human capital; research systems; knowledge; patents; publications; … (tbd); added value; trademarks; new technologies; … (tbd); clinical guidelines; economic results; standards; … (tbd); improved health, demographic change & wellbeing.]
Data source: OpenAIRE’s funding/project/programme +
publication + patent data + mining of additional data to
extend coverage across countries/disciplines (WP3)
Data source: mining of company, clinical guidelines,
finalised project, social media, media, etc. data using
algorithms developed by Athena RC, UoB and Qualia SA
(WP3)
Project/programme monitoring data +
publication & patent full text data
Clinical guidelines data + webometrics:
company/web + final project (text) + social
media/media/etc. data
18. Data4Impact: objectives
Objectives 2+3: gather data at input, throughput, output and impact
levels, derive facts and understand impact on health-related
challenges
19. Data4Impact: objectives
Objectives 4+5: perform community-driven validation and develop
user-centered tools
2 core topics:
– New indicators for assessing research and innovation performance
– Determining the societal impact of research and innovation funding
Methodology + analysis (Objectives 1-3):
– Landscaping: analysis of existing systems, indicators, methods and tools
– User-centred analysis & definition of needs for specific research areas
– Big data collection, cleaning, augmentation and analysis
Validation through community-driven cases (Objective 4):
– United Kingdom
– Germany
– Sweden
Objective 5: development of new methodologies, indicators and tools