Australia invests AUD 1-2B per annum in research data. Like most countries, it wants the best possible return on this investment; Europe is spending €1.4B on its open data “pilot”. Getting that return means the data should be FAIR: findable, accessible, interoperable, and reusable. Part of this is that data should be routinely “published” and available in a “data repository”. But what does this mean?
Ross Wilkinson
CEO, Australian National Data Service
Presented at the 2015 Wiley Publishing Seminar, 5 November, Melbourne, Australia.
2. Outline
Trends
The research data assets of Australia
International trends
The challenges for the publication process
The opportunity
Conclusions
3. Some Trends:
Reproducible Science
Open Science
Open Data
Data Citation
Data Citation Bibliometrics
Data Journals
Data Repositories
Trusted Data Repositories
FAIR Data
Funded FAIR Data
4. What’s going on?
Data is no longer a by-product of research
Data is valuable
Funders and governments want more from their research investments
So do the leaders of research institutions
6. The Value of Open Data Report
The analysis in the report suggests that the value of
data in Australia’s public research is at least $1.9
billion per annum and possibly up to $6 billion per
annum – at 2012-13 levels of expenditure and
activity.
The report discusses the implications for Australia,
including what happens if this value is not realised,
while recognising the potential costs of leveraging
that value effectively.
8. Australian Research Data Activity
Data Policy
Capturing data valuable over long periods in Marine,
Astronomy, Earth Sciences, Ecosystems …for a wide
range of research purposes
Supporting the storage of data
Supporting the management of data
Supporting the enhancement of data
Building Institutional Research Data Capacity
Developing data partnerships with industry?
9. Research Data Policy
ARC and NHMRC: Treat data as an asset
Department of Environment: Requirement that
data is open, discoverable, and available
Department of Education: The Australian Research
Data Infrastructure Strategy provides
recommendations for a coherent approach to
research data and research data infrastructure
10. Data is Transformative
Governments are not investing in research data to
make life easier for researchers
They invest in research data so that societal
problems can be addressed
This requires data to be in a form that allows a
wide variety of uses
11. Data as a research output – and more
• Funders are seeing research data as a publishable output
• They expect data to be managed
• They expect it to be available for industry, education, the public and further research
[Diagram: Data at the centre, connected to Industry, Education, Public and Research]
12. AURIN – Urban data infrastructure
How can I increase the value of my suburban
property development?
How do I make it more “liveable” to attract more
buyers?
Integrate data from developers, local government,
state government, federal government, mapping
data, roads data, public transport maps….
Apply the “walkability” index developed at the
University of Melbourne
13. How do you develop suburbs that work for residents, developers and local government?
Along the Maribyrnong River, 10 km from Melbourne’s CBD, 128 ha of government land is
ripe for redevelopment
It could accommodate 3000 dwellings and offices for 3000 people
Planning a sustainable, liveable community integrated into its urban surrounds demands
information on transport, health services, environment, housing prices, recreation
facilities and more
This comes from Federal and State government agencies, local councils, utilities and
private companies
For Maribyrnong, data and 80 tools to manage it are being made available through the
Australian Urban Research Infrastructure Network (AURIN) and the Australian National Data
Service (ANDS)
New tools—such as employment opportunities and walkability—are being added
Similar projects can facilitate development across Australia’s cities and towns
14. Data Value
Stronger research
More efficient research
Stronger partnerships
More industry engagement – data as a trust builder
15. Australian National Data Service:
To make Australia’s research data assets
more valuable for its researchers, research
institutions and the nation
16. So we need to transform:
Data that are:
Unmanaged
Disconnected
Invisible
Single use
To Structured Collections that are:
Managed
Connected
Findable
Reusable
so that researchers can easily publish, discover, access
and use research data.
Value
17. Major Open Data Program
Connecting mining data, to research techniques, to industry exploration
Connecting Twitter data to a Jakarta map, to analytics for managing flooding
Collecting tropical data to institutional strategy
Collecting ancient DNA for forming international partnerships for new results
18. Data Opportunities – and threats
Data sharing is great for trust development
Data openness challenges traditional business
models
Data partners can be anywhere – EU is investing
€1.4B in open data to drive jobs and innovation
The research data environment in Australia is
world-leading
19. Back to some Trends:
Reproducible Science
Open Science
Open Data
Data Citation
Data Citation Bibliometrics
Data Journals
Data Repositories
Trusted Data Repositories
FAIR Data
Funded FAIR Data
21. Royal Society publishes “Science as an open enterprise” – written by Geoffrey Boulton
Influential in EU/UK
22. EU Open Data “Pilot”
1.4B Euros as part of H2020
80% take up
23. Data citation
Data that is used should be cited – just as other work is cited
Provides appropriate credit
Enables reproduction
DataCite provides reliability
Agreed basic information: Creator (Publication year), Title, Publisher, Identifier
Suitably formatted DOI
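The agreed basic information above can be assembled into a citation string mechanically. The following is a minimal sketch, not DataCite's own tooling: the `DatasetRecord` class, the function names, the punctuation between fields, and the sample DOI in the usage note are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Prefixes people commonly write in front of a DOI (all the same once lower-cased).
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical container for the basic citation fields named on the slide."""
    creators: tuple
    publication_year: int
    title: str
    publisher: str
    doi: str


def normalise_doi(doi: str) -> str:
    """Return the DOI as an https://doi.org/ URL, whatever form it arrived in."""
    doi = doi.strip()
    for prefix in _DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"


def format_citation(record: DatasetRecord) -> str:
    """Creator (Publication year): Title. Publisher. Identifier (assumed layout)."""
    creators = "; ".join(record.creators)
    return (
        f"{creators} ({record.publication_year}): {record.title}. "
        f"{record.publisher}. {normalise_doi(record.doi)}"
    )
```

For example, `format_citation(DatasetRecord(("Smith, J.",), 2015, "Example Survey", "Example Repository", "doi:10.5555/example"))` yields one line ending in a resolvable DOI link; the record and its DOI are invented placeholders.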
24. Data citation works with..
ORCID – for people
Crossref – for papers
FundRef – for funders
IGSN – for specimens
…
Can we measure the value? Bibliometricians arise!
Connection is key
And the connections should be machine operable
Research is more valuable if it is more connected
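One way to picture "machine operable" connections is as typed links between identifiers that software can traverse. The sketch below is illustrative only: the relation names, the `LINKS` list, and every identifier in it are invented placeholders, not records from ORCID, Crossref, FundRef or IGSN.

```python
from collections import defaultdict, deque

# (subject, relation, object) links. All identifiers are made-up placeholders.
LINKS = [
    ("doi:10.5555/example-dataset", "createdBy", "orcid:0000-0002-1825-0097"),
    ("doi:10.5555/example-paper", "cites", "doi:10.5555/example-dataset"),
    ("doi:10.5555/example-dataset", "fundedBy", "funder:example-funder"),
    ("doi:10.5555/example-dataset", "derivedFrom", "igsn:EXAMPLE0001"),
]


def build_index(links):
    """Map each identifier to the identifiers it is directly linked to."""
    index = defaultdict(set)
    for subject, _relation, obj in links:
        index[subject].add(obj)
        index[obj].add(subject)  # treat links as navigable in both directions
    return index


def connected(start, links):
    """Return every identifier reachable from `start` by following links."""
    index = build_index(links)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in index[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen
```

Starting from the paper, `connected("doi:10.5555/example-paper", LINKS)` reaches the dataset and, through it, the person, funder and specimen: the more links that are recorded, the more of the research is discoverable from any one starting point.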
25. Data Journals
Geoscience Data Journal (Wiley)
Scientific Data (Nature)
Journal of Open Archaeology Data (Ubiquity)
Biodiversity Data Journal (Pensoft)
A means of describing the data – its formation, properties, usage
Enables recognition of a contribution
Enhances usage of the data
Enables “traditional” bibliometrics
26. Data Repositories:
Provide:
Data storage
Metadata storage
Data access methods
Data management software
But also:
Integrated approach to content and metadata
Policies, processes, services, and people
Overall commitment to the stewardship of digital materials
27. Trusted data repositories
Need for reliable data
Trusted repositories:
Trustworthy Repositories Audit & Certification (TRAC), ISO 16363
Data Seal of Approval, e.g. Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC)
Often required by publishers
May be increasingly required (and funded) by research funders
28. FAIR Data – (FORCE 11)
To be Findable:
(meta)data are assigned a globally unique and eternally persistent identifier.
data are described with rich metadata.
(meta)data are registered or indexed in a searchable resource.
metadata specify the data identifier.
To be Accessible:
(meta)data are retrievable by their identifier using a standardized communications protocol.
the protocol is open, free, and universally implementable.
the protocol allows for an authentication and authorization procedure, where necessary.
metadata are accessible, even when the data are no longer available.
To be Interoperable:
(meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
(meta)data use vocabularies that follow FAIR principles.
(meta)data include qualified references to other (meta)data.
To be Re-usable:
(meta)data have a plurality of accurate and relevant attributes.
(meta)data are released with a clear and accessible data usage license.
(meta)data are associated with their provenance.
(meta)data meet domain-relevant community standards.
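A repository could run a rough first-pass check of a metadata record against these principles. The sketch below is a crude proxy under stated assumptions, not a FORCE11 assessment method: the field names (`identifier`, `access_url`, `vocabulary`, and so on) and the choice of which fields stand in for which principle are hypothetical.

```python
def _present(record: dict, *fields: str) -> bool:
    """True when every named field exists in the record and is non-empty."""
    return all(bool(record.get(field)) for field in fields)


def fair_report(record: dict) -> dict:
    """Map each FAIR heading to a pass/fail flag using assumed field names."""
    return {
        # Findable: an identifier plus descriptive metadata.
        "findable": _present(record, "identifier", "title", "description"),
        # Accessible: retrievable over an open, widely implemented protocol.
        "accessible": str(record.get("access_url", "")).startswith(
            ("http://", "https://")
        ),
        # Interoperable: a declared format and a shared vocabulary.
        "interoperable": _present(record, "format", "vocabulary"),
        # Re-usable: a clear usage license and recorded provenance.
        "reusable": _present(record, "license", "provenance"),
    }


def missing_principles(record: dict) -> list:
    """List the FAIR headings the record does not yet satisfy."""
    return [name for name, ok in fair_report(record).items() if not ok]
```

A record that has an identifier, description, access URL, format and vocabulary but no license would come back with `["reusable"]`, which tells the depositor what to fix before publication. Passing this check does not make data FAIR; it only flags obvious gaps.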
29. Funded FAIR data
All of the data that support a research finding
should be FAIR
It should be stored in a trusted repository
It should be funded
31. The Opportunity
Fully integrated publication of all outputs of a
scholarly endeavour with rich connection
FAIR data in a trusted repository
Fully explorable scholarly journals
Researchers get much better exposure of their
research
The outcomes are defensible
New research and partners become available
32. Conclusions
Research data is valuable
It should be expected that the data underpinning
findings are available for scrutiny
Far greater value is available, especially if the data is
findable, accessible, interoperable and reusable
This is helped if data is published
33. This work is licensed under a Creative Commons Attribution 3.0 Australia License
ANDS is supported by the Australian Government through the National Collaborative
Research Infrastructure Strategy (NCRIS).
Thank you!