Framework and Roadmap towards an Open Science Infrastructure/Simon Hodson
1. Framework and Roadmap towards an
Open Science Infrastructure
Simon Hodson, Executive Director, CODATA
www.codata.org
AOSP Workshop: Framework and Roadmap towards an
Open Science Infrastructure
Centurion Lake Hotel
14 May 2018
2. Vision of a coordinating activity to help put in place and link the enabling practices,
capacities and technologies for Open Science.
Pan African in ambition.
Funded by Department of Science and Technology via National Research Foundation;
delivered by ASSAf, directed by CODATA.
Current three year pilot preparing the foundations for a broader initiative.
Successful first strategy workshop (March 2018) followed by a stakeholder workshop (Sept
2018) to prepare the platform initiative.
Aim for this to be launched at Science Forum South Africa, Dec 2018.
African Open Science
Platform
3. Key deliverables of the pilot project will be foundations for the platform in these four key
areas:
1. Frameworks and guidance to assist policy development at national and institutional
level.
2. Study and recommendations to reduce barriers and provide constructive incentives for
Open Science.
3. Framework for data science training (including RDM, data stewardship and science of
data); curriculum framework, training materials, recommendations for training
initiatives.
4. Framework and roadmap for data infrastructure development: emphasising
economies of scale, and partnerships and de-duplication between national systems,
institutions and domain initiatives.
Framework for Policies, Incentives,
Training and Technical Infrastructures
4. Developing a Framework and Roadmap
for Open Science Infrastructure
Today’s meeting: to help inform the project on matters of data infrastructure and to benefit
from your expertise.
A preliminary document identifying a set of priorities and a plan for development to inform
discussions in September.
Virtualised network, compute and storage: delivered in such a way as to achieve
economies of scale (regional, national and institutional dimensions).
Open Science Infrastructure: including international ecosystem for FAIR data,
requirements of data stewardship, specialised Research Infrastructures.
A final project output which will lay out a vision and set of priorities and actions for data
infrastructure to inform the activities of a proposed phase two.
5. The Case for Open Data
in a Big Data World
• Science International Accord on Open Data in a Big Data
World: http://www.science-international.org/
• Supported by four major international science
organisations.
• Presents a powerful case that the profound
transformations mean that data should be:
• Open by default: as open as possible, as closed as
necessary
• Intelligently open: FAIR data
• Lays out a framework of principles, responsibilities and
enabling practices for how the vision of Open Data in a
Big Data World can be achieved.
• Campaign for endorsements: over 150 organisations so
far.
• Please consider endorsing the Accord:
http://www.science-international.org/#endorse
6. Framework for Regional, National and
Institutional Data Strategies
National / Institutional Open Science and FAIR Data Strategy
Consultative forum, stakeholder engagement.
Open data policies and guidance at national and institutional
level.
Clarify the boundaries of open (particularly privacy, IPR).
Clarify the data in scope, guidelines on selection.
Develop incentives and reward systems.
Mechanisms (infrastructure and policy) to ensure
concurrent publication of data as research output.
Data ‘publication’ and citations of data included in
assessment of research contribution.
Promotion of data skills:
Essential data skills for researchers.
Develop skills and competencies for data stewards, data
scientists.
7. Framework for Regional, National and
Institutional Data Strategies
Scope, roadmap and implement data infrastructure.
Network, compute and storage: key components of
national, regional infrastructure (network / NREN,
economies of scale for storage and compute).
Engagement with international FAIR Data / Open
Science data ecosystem components: permanent
identifiers, metadata standards, standards for TDRs,
etc.
Data Stewardship Infrastructure: Development of
regional, national and institutional infrastructure(s)
for data stewardship and Open Science (RDM, generic
and specialised research platforms/environments,
trusted digital repositories).
Collaborative Research Infrastructures: RIs and
research tools for certain research disciplines,
nationally, regionally to pool expertise and lower
costs.
8. Vision and Mission of an
African Open Science Platform
African scientists are at the cutting edge of contemporary, data-intensive science as a
fundamental resource for a modern society.
A digital ecosystem with five complementary aims governed by a set of common principles
and practices:
1. A virtual space for scientists to find, deposit, manage, share and reuse data, software
and metadata;
2. A means of continually developing capacities at all levels of national science systems
and amongst professionals and their institutions operating in the public and private
domain;
3. A basis for multi-stakeholder consortia that wish to utilise powerful digital tools in
addressing major common problems, and for work in the trans-disciplinary mode;
4. A forum for exchange of ideas, best practices and opportunities amongst Platform
partners and with the international data-science community.
5. An African Data Science Institute, to advance the frontiers of data science and provide
support for interdisciplinary research domains where there are particularly strong data
assets in Africa.
9. African Open Science Platform:
Suggested Phase Two Activities
1. Registry of African data initiatives, collections and services
2. Coordination and provision of network, compute and storage (building on current work of
NRENs, targeting needs of Open Science, achieving economies of scale).
3. A virtual space for scientists to find, deposit, manage, share and reuse data, software and
metadata (i.e. support for / or provision of FAIR data components, data stewardship and
Research Infrastructures).
4. An African Data Science Institute (to develop African capacities at the international cutting
edge of research in data analytics, artificial intelligence, machine learning and data
stewardship).
5. Major data-intensive programmes in science areas where Africa is data-asset rich (process
for identifying these areas, obtaining funding, ensuring that RIs are in place).
6. Network for Education and Skills in Data and Information (training programmes in data
science, data stewardship, data literacy, targeted at all stages of education).
7. Network for Open Science Access and Dialogue (building full engagement and joint action in
transdisciplinary and citizen science initiatives as an essential component of Open Science).
10. Emerging Policy Consensus? FAIR Data
• FAIR Data (see original guiding principles at https://www.force11.org/node/6062)
• Findable: have sufficiently rich metadata and a unique and persistent identifier.
• Accessible: retrievable by humans and machines through a standard protocol;
open and free by default; authentication and authorization where necessary.
• Interoperable: metadata use a ‘formal, accessible, shared, and broadly applicable
language for knowledge representation’.
• Reusable: metadata provide rich and accurate information; clear usage license;
detailed provenance.
11. European Commission Expert Group
on FAIR Data
Core Deliverables
1. To develop recommendations on what
needs to be done to turn each
component of the FAIR data principles
into reality
2. To propose indicators to measure
progress on each of the FAIR components
3. Actively support the creation of the FAIR
Data Action Plan, by proposing a list of
concrete actions as part of its Final
Report
4. Draft for consultation, released 11 June
2018, final report October 2018.
5. Support Commission in presentation of
FAIR Data Action Plan in Autumn 2018.
Report Structure
1. Concepts: Why FAIR?
2. Creating a culture of FAIR data
3. Making FAIR data a reality: technical
perspective
4. Skills and capacities for FAIR data
5. Measuring Change
6. Facilitating Change: a FAIR Data
Action Plan
12. FAIR Guiding Principles (1)
• To be Findable:
• F1. (meta)data are assigned a globally unique and persistent identifier
• F2. data are described with rich metadata (defined by R1 below)
• F3. metadata clearly and explicitly include the identifier of the data it describes
• F4. (meta)data are registered or indexed in a searchable resource
• To be Accessible:
• A1. (meta)data are retrievable by their identifier using a standardized
communications protocol
• A1.1 the protocol is open, free, and universally implementable
• A1.2 the protocol allows for an authentication and authorization procedure,
where necessary
• A2. metadata are accessible, even when the data are no longer available
(Wilkinson, M. D., et al., The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data,
http://dx.doi.org/10.1038/sdata.2016.18)
13. FAIR Guiding Principles (2)
• To be Interoperable:
• I1. (meta)data use a formal, accessible, shared, and broadly applicable language
for knowledge representation.
• I2. (meta)data use vocabularies that follow FAIR principles
• I3. (meta)data include qualified references to other (meta)data
• To be Reusable:
• R1. (meta)data are richly described with a plurality of accurate and relevant
attributes
• R1.1. (meta)data are released with a clear and accessible data usage license
• R1.2. (meta)data are associated with detailed provenance
• R1.3. (meta)data meet domain-relevant community standards
(Wilkinson, M. D., et al., The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data,
http://dx.doi.org/10.1038/sdata.2016.18)
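The guiding principles above can be read as a checklist against a dataset's metadata record. The following is a minimal sketch in Python, not an established FAIR assessment tool: the function name `fair_gaps` and every field name (`identifier`, `title`, `description`, `indexed_in`, `access_url`, `license`, `provenance`) are assumptions made for illustration, and the checks only test whether a field is present, not whether its content is adequate.

```python
from typing import Dict, List


def fair_gaps(record: Dict[str, object]) -> List[str]:
    """Return the FAIR checks that a metadata record does not obviously pass.

    All field names are hypothetical; a real repository would map these
    checks onto its own metadata schema.
    """
    access_url = str(record.get("access_url", ""))
    checks = {
        # F1: a globally unique and persistent identifier is assigned.
        "F1 persistent identifier": bool(record.get("identifier")),
        # F2: the data are described with (at least some) metadata.
        "F2 descriptive metadata": bool(record.get("title"))
        and bool(record.get("description")),
        # F4: the record is registered in a searchable resource.
        "F4 indexed in a searchable resource": bool(record.get("indexed_in")),
        # A1: retrievable over a standardised protocol (HTTP assumed here).
        "A1 standard retrieval protocol": access_url.startswith(
            ("http://", "https://")
        ),
        # R1.1: a clear usage license is stated.
        "R1.1 usage license": bool(record.get("license")),
        # R1.2: provenance information is attached.
        "R1.2 provenance": bool(record.get("provenance")),
    }
    # Dictionaries keep insertion order, so gaps are reported in FAIR order.
    return [label for label, passed in checks.items() if not passed]
```

Calling `fair_gaps` on a record that lacks a `license` field would report the "R1.1 usage license" gap; an empty list means every presence check passed. Interoperability (I1 to I3) is left out because it cannot be judged from field presence alone.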
14. International ‘ecosystem’ of open science
and FAIR data components
Open Science infrastructure is not just the network, storage and compute.
Ecosystem of components which are created and governed internationally.
Reporting Research Outputs: information systems for research output reporting (CRIS), metadata
standards e.g. CERIF, managed by euroCRIS.
Persistent and Unique Identifiers: DOIs for articles (CrossRef); DOIs for data sets (DataCite); author IDs
(ORCID).
Data and Metadata Standards: CIF in crystallography, FITS in astronomy, DDI in social science surveys,
Darwin Core in biodiversity, etc.
DCC Registry of Metadata Standards http://www.dcc.ac.uk/resources/metadata-standards ; now
maintained by RDA IG http://rd-alliance.github.io/metadata-directory/
Data Repositories: listed in Re3Data, registry of data repositories: https://www.re3data.org/
Trusted Data Repositories: Core Trust Seal https://www.coretrustseal.org/, a merger of Data Seal of
Approval and the World Data System criteria.
Criteria for Trustworthy Digital Archives (DIN 31644)
http://www.data-archive.ac.uk/curate/trusted-digital-repositories/standards-of-trust?index=3
Audit and certification of trustworthy digital repositories (ISO 16363)
http://www.data-archive.ac.uk/curate/trusted-digital-repositories/standards-of-trust?index=2
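To make the persistent identifier components listed above more concrete, here is a minimal Python sketch of two routine operations: checking the final check character of an ORCID iD (ORCID documents an ISO 7064 MOD 11-2 checksum for this) and turning a DOI into a resolvable URL via the doi.org resolver. The function names are illustrative assumptions, not part of any ORCID, CrossRef or DataCite library, and the code makes no network calls, so it says nothing about whether an identifier is actually registered.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, the digits and the check character of an ORCID iD."""
    compact = orcid.replace("-", "")
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15].upper()
    if not all(ch in "0123456789" for ch in base):
        return False
    return orcid_check_character(base) == check


def doi_to_url(doi: str) -> str:
    """Build a resolver URL for a DOI (no check that the DOI exists)."""
    return "https://doi.org/" + doi.strip()


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
    # The DOI of the FAIR Guiding Principles paper cited in these slides.
    print(doi_to_url("10.1038/sdata.2016.18"))
```

A check of this kind only catches mistyped identifiers; confirming that an iD or DOI resolves to the intended person or dataset still requires a lookup against the relevant registry.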
17. RDM lifecycle diagram for maturity assessment, DCC
2018, based on Hodson and Molloy 2013
• Full lifecycle data infrastructures:
• Preparation of DMPs
• Management of active data
• Appraisal and selection
• Stewardship and preservation
• Ensuring the data is FAIR (discovery metadata, identifier, access mechanisms and controls,
usage license, domain and provenance metadata…)
Open Science and FAIR Data Services
18. Where should research data go?
• Homogeneous data collections essential for research (Earth observation data; genetic data;
social science survey data…): national and international data archives.
• Significant data outputs of publicly funded research (significant data outputs from funded
projects; raw and analysed experimental data…): national or institutional data archives; data
papers.
• Data underpinning research publications (raw and analysed data for reproducibility (evidence);
data behind the graph…): dedicated data archives (e.g. Dryad).
19. Open Science, FAIR Data:
Commons, Clouds, Platforms…
Commons: ‘collectively owned and managed by a community of users’
Clouds: European Open Science Cloud (not just European, not entirely Open, not just for
science and not exclusively cloud technology)…
Platform Approaches:
brokerage for discovery and access, reinforced by the development of common
standards and principles or policies (e.g. GEOSS, Research Data Australia);
brokerage of services: approaches for discovery and access, augmented by the
provision of services for particular research disciplines, including the promotion of
skills, training, competences, standards, tools for analysis etc (e.g. Elixir, CESSDA and
other ESFRIs, CGIAR on a global scale);
platform environment: utilizing the capacity of Cloud Computing for efficiency, access
management, analysis across vast numbers of datasets, marketisation of services in a
platform economy in which standards and common rules minimize vendor lock-in (e.g.
NIH Data Commons, European Open Science Cloud).
20. EOSC Declaration
[EOSC architecture] The EOSC will be developed as a data
infrastructure commons serving the needs of scientists. It should
provide both common functions and localised services delegated to
community level. Indeed, the EOSC will federate existing resources
across national data centres, European e-infrastructures and
research infrastructures
[Service deployment] The EOSC shall support different deployment
models (e.g. Infrastructure as a Service, Platform as a Service,
Software as a Service), to meet the needs of communities at
different levels of maturity in the provision and use of research data
service. The EOSC shall support the whole research lifecycle by
strong development at platform level that facilitate the provision of
a wide set of software, infrastructure, protocols, methods,
incentives, training, services.
[Thematic areas] The EOSC shall promote the co-ordination and
progressive federation of open data infrastructures developed in
specific thematic areas (e.g. health, environment, food, marine,
social sciences, transport). The EOSC will implement a common
reference scheme to ensure FAIR data uptake and compliance by
national and European data providers in all disciplines.
21. EOSC Declaration
[FAIR principles] Implementation of the FAIR principles must be pragmatic
and technology-neutral, encompassing all four dimensions: findability,
accessibility, interoperability and reusability. FAIR principles are neither
standards nor practices. The disciplinary sectors must develop their specific
notions of FAIR data in a coordinated fashion and determine the desired level
of FAIR-ness. FAIR principles should apply not only to research data but also
to data-related algorithms, tools, workflows, protocols, services and other
kinds of digital research objects.
[Research data repositories] Trusted research data repositories play a
fundamental role in modern science. Scientists must be able to find, re-use,
deposit and share data via trusted data repositories that implement FAIR
data principles and that ensure long-term sustainability of research data
across all disciplines.
[Data Management Plans] A key element of good data management is a
Data Management Plan (DMP); the use of DMPs should become obligatory in
all research projects generating or collecting publicly funded research data,
based on online tools conforming to common methodologies. Funder and
institutional requirements must be aligned and minimum conditions for
DMPs must be defined. Researchers' host institutions have a responsibility to
oversee and complete the DMPs and hand them over to data repositories.
22. EOSC Declaration
[Citation system] A data citation system should be put in place to
reward the provision of excellent open data. This will assist both
the assessment of researchers and their projects, and help
implementing the findability, accessibility, interoperability and
reusability of research data.
[Common catalogues] There must be catalogues (e.g. for datasets,
services, standards) based on machine readable metadata and
identifiable by means of a common and persistent identification
mechanism that will make research data findable via an 'EOSC
Portal'.
[Semantic layer] Research data must be both syntactically and
semantically understandable, allowing meaningful data exchange
and reuse among scientific disciplines and countries.
[FAIR tools and services] Easy access must be available to a
common set of FAIR tools and services, to guide the curation of
FAIR data for re-use and to assess FAIR compliance.
23. INTERNATIONAL DATA WEEK
IDW 2018
Gaborone, Botswana: 5-8 November 2018
Information: http://internationaldataweek.org/
Deadline for abstracts, 31 May:
https://www.scidatacon.org/IDW2018/
24. CODATA-RDA School of
Research Data Science
• Annual foundational school at ICTP, Trieste (with the
objective to build a network of partners, train-the-trainers).
• Advanced workshops, ICTP, Trieste, following the
foundational school.
• National or regional schools, organised with local
partners.
2018
• Next #DataTrieste Summer School, 6-17 August 2018.
• Next #DataTrieste Advanced Workshops 20-24 August
2018.
• Call for applications, deadline 21 May:
http://www.codata.org/datatrieste2018
• Schools in Brisbane (UQ and Australian Academy of
Sciences); ICTP Kigali (October); ICTP São Paulo
(December)
25. Simon Hodson
Executive Director CODATA
www.codata.org
http://lists.codata.org/mailman/listinfo/codata-international_lists.codata.org
Email: simon@codata.org
Twitter: @simonhodson99
Tel (Office): +33 1 45 25 04 96 | Tel (Cell): +33 6 86 30 42 59
CODATA (ICSU Committee on Data for Science and Technology), 5 rue Auguste Vacquerie, 75016 Paris,
Thank you for your attention!
28. SciDataCon part of
International Data Week
SciDataCon aims to help this community ensure that it has a concrete scientific record of its
work: peer reviewed abstracts > presentations > Special Collection in the Data Science
Journal.
Themes and Scope: see
https://www.scidatacon.org/conference/IDW2018/conference_themes_and_scope/
Approved Sessions: https://www.scidatacon.org/conference/IDW2018/approved_sessions/
Incredibly rich range of topics. If you do not find a topic there you can submit an abstract
to the general submissions.
Abstracts can be submitted to Approved Sessions or to General Submissions. All abstracts will be
peer reviewed and distributed across the programme.
Abstracts for presentations and lightning talks/posters.
Deadline is 31 May: https://www.scidatacon.org/conference/IDW2018/call_for_papers/
29. International Data Week
Keynotes
Joy Phumaphi, former Minister of Health,
Botswana; co-chair of WHO Group on
Family and Community Health.
Rob Adam, Director of SKA South Africa, a
major African science and data initiative.
Ismail Serageldin, founding Director of the
new Bibliotheca Alexandrina, noted thinker
on science policy issues.
Elizabeth Marincola, former CEO of PLOS;
now leading the African Academy of
Sciences publication initiatives (see AAS
Open Research).
Tshilidzi Marwala, VC of University of
Johannesburg, noted thinker in Big Data
and AI.
30. What is Open Science? (1)
Open access to research literature.
Data that is as Open as possible, as closed as necessary.
FAIR Data (Findable, Accessible, Interoperable,
Reusable).
Data is a recognised and important output of research.
A culture and methodology of open discussion and
enquiry (including methodology, lab notebooks, pre-prints).
Data, code and analysis processes are shared for
reproducibility.
Engagement with society and the economy in research
activities (citizen science, co-design / transdisciplinary
research, interface between research, development and
innovation).
31. What is Open Science? (2)
Open Science is not just Open Access + Open Data.
Individuals, institutions and the science system benefit
from putting research outputs (including data) in the
open: shop window and repository of all research
outputs.
Important role of open processes, open data and
reproducibility / replicability.
Role of AI / Machine Learning: analysis at scale.
Open innovation and transdisciplinary research.
The Open Science ethos and co-design help build
collaboration between research institutions, societal
groups, government agencies, third sector and industry.
32. CODATA-RDA School of Research
Data Science
• Contemporary research – particularly
when addressing the most significant,
interdisciplinary research challenges –
increasingly depends on a range of skills
relating to data.
• These skills include the principles and
practice of Open Science; research data
management and curation, how to
prepare a data management plan and to
annotate data; software and data
carpentry; principles and practices of
visualisation; data analysis, statistics and
machine learning; use of computational
infrastructures. The ensemble of these
skills, relating to data in research, can
usefully be called ‘Research Data Science’.
33. DataTrieste Film on Vimeo: https://vimeo.com/232209813