1. #OAER12
22/10/2012
Afternoon session
Opening up Science
2. Open, if it’s possible
and
closed, if it has to be
Archiving research data and making it available @DANS
Henk Harmsen
Open Access to Excellence in Research
October 22, 2012 - 9.15 – 17.00 h
KVAB, Brussels
3. Contents
• Data is hot!
• About DANS
• Storing & Sharing
• Linked resources
• Modes of access
6. Data is hot!
• Article on “trends for 2012”:
“Keeping your research data
secret until they are finally
printed in a scientific
journal is so 2011”
• Neelie Kroes (Vice-President
of the European
Commission responsible for
the Digital Agenda): “Data is
the new gold”
7. What is DANS?
• Institute of Dutch Academy and Research
Funding Organisation (KNAW & NWO) since 2005
• First predecessor dates back to 1964 (Steinmetz
Foundation), Historical Data Archive 1989
• Mission: promote and provide permanent access
to digital research information (started with data
archives in the humanities and social sciences)
8. Our main activities and services
• Encourage researchers to self-archive and reuse data by means of our
Electronic Archiving SYstem EASY
• Our largest digital collections are in archaeology, social sciences and
history (moving into other domains as well)
• Provide access, through Narcis.nl, to thousands of scientific datasets, e-
publications and other research information in the Netherlands
• Data projects in collaboration with research communities and partner
organisations
• Advice, training and support (Data Seal of Approval,
Persistent Identifier Infrastructure)
• R&D into archiving of and access to digital information
9. Collaboration DANS – University
Libraries
• Starting with Delft, Leiden, Wageningen…
• UL: front offices - DANS: back office
• Roles:
– DANS: long-term archiving of research data (like KB e-depot
for publications), providing expertise, training, standards
– UL: data lab services (VRE, repository) for local researchers
• Possibility to archive data from University repositories:
– Challenges explored in Podium Plus project (SURF Share)
– Auto-ingest from Dataverses
– Stumbling blocks not technical, but copyright
– IPR issues can be solved if university, researchers and funders
agree
10. Data @ DANS is not “up for grabs”!
Modes of access
• Open (after registration)
• Restricted (depositor is the access authority)
• Other (DANS as security backup)
Archiving system EASY facilitates
- Access management easy and fast
- Embargo for limited time period
- Data reviews
- See who used “your” data
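The access modes and the embargo option above can be read as a small decision rule. The following Python sketch is only an illustration of that reading, not how EASY is implemented. The names `AccessMode`, `Dataset` and `may_download` are hypothetical. Two rules in it are assumptions: that restricted data also require registration, and that an embargo lifts on its end date.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class AccessMode(Enum):
    OPEN = "open"              # open after registration
    RESTRICTED = "restricted"  # depositor is the access authority
    OTHER = "other"            # archive acts as security backup only


@dataclass
class Dataset:
    mode: AccessMode
    embargo_until: Optional[date] = None  # None means no embargo


def may_download(dataset, registered, depositor_approved, today):
    """Return True if a user may download the dataset on the given day."""
    # Assumption: an embargo blocks all downloads until its end date.
    if dataset.embargo_until is not None and today < dataset.embargo_until:
        return False
    if dataset.mode is AccessMode.OPEN:
        return registered
    if dataset.mode is AccessMode.RESTRICTED:
        # Assumption: registration is required here as well.
        return registered and depositor_approved
    # "Other": stored as a backup, not offered for download.
    return False
```

In this sketch the embargo is checked first, so even an open dataset stays closed until the embargo date has passed.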
11. Why is digital preservation of data
important?
• Storage of data makes research more
transparent
• Checks on claims made in publications
• Replication research is possible
• However, data re-use for comparative studies
is much more important
12. How does it work?
• NWO investments:
before a grant is awarded,
there is an agreement on
access
– At DANS
– Or other repository with
Data Seal of Approval
• Archaeological research:
deposit obligation
13. Cultures of data sharing differ across
disciplines, but also change over time
14. Six reasons not to share your data
1. No one else can understand the complexity of
my data.
2. If someone analyzes my data he/she might come
to other conclusions.
3. Someone else might even discover new findings.
4. I have not yet finished the analysis of my data.
5. I’ve worked hard to collect the data. They’re
mine!
6. I cannot trust data that has been produced
somewhere else.
19. Data reviews
• Pilot
• 92% recommend the re-used dataset
• Average rating is about 4 (scale 1-5)
• 70% state that the specific dataset helps to answer
questions
20. Data Seal of Approval
5 Criteria
16 Guidelines
The research data:
• can be found on the
Internet
• are accessible (clear
rights and licenses)
• are in a usable format
• are reliable
• can be referred to
(persistent identifier)
www.datasealofapproval.org
26-10-2012
22. Open Data: how we cope with them…
“OPEN ACCESS TO EXCELLENCE IN RESEARCH”
October 22, 2012
Brussels
Jan Haspeslagh
Librarian
Heike Lust
Information manager
23. Overview
The process
• Archiving
• Documenting
• QC & Integration
• Publishing
• Redistribution
The data policy
26. Metadata discovery:
• Responsible parties
• Access rights
• Parameters
• Coverage: time, geography,
taxonomy, …
• Relations to other datasets
• Publications
Goal: maximum searchability
and retrieval
Process steps shown: Archiving, Documenting
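The discovery metadata listed above can be pictured as one record per dataset, with search running over those fields. The Python sketch below is a minimal illustration under that assumption. `DatasetRecord` and `search` are hypothetical names, and the field names are simplified labels, not the element names of any metadata standard or of the actual archive.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    title: str
    responsibles: list          # people or institutes responsible
    access_rights: str
    parameters: list            # what was measured
    temporal_coverage: tuple    # (start_year, end_year)
    geographic_coverage: str
    taxonomic_coverage: list = field(default_factory=list)
    related_datasets: list = field(default_factory=list)
    publications: list = field(default_factory=list)


def search(records, parameter=None, year=None, taxon=None):
    """Return the records that match every criterion that is given."""
    hits = []
    for rec in records:
        if parameter is not None and parameter not in rec.parameters:
            continue
        if year is not None:
            start, end = rec.temporal_coverage
            if not (start <= year <= end):
                continue
        if taxon is not None and taxon not in rec.taxonomic_coverage:
            continue
        hits.append(rec)
    return hits
```

The more of these fields a depositor fills in, the more of these filters can find the dataset, which is the point of the "maximum searchability" goal.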
29. QC: are all elements available for correct reading, use and
analysis of the data?
Integration: combining data from different sources and
providing users with a unified view of these data
Process steps shown: Archiving, Documenting, QC, Integration
30. IMIS: Integrated Marine Information System
→ Module Datasets: ISO 19115 discovery metadata
→ Module Literature: ISBD & ASFIS metadata standards
Process step shown: Publishing
31. Open Marine Archive & Open Data
→ Module Datasets
→ Module Literature
Cross-referenced!
Process steps shown: Publishing, Redistribution
39. WDS Data Policy
There will be full and open exchange of data, metadata and
products shared within WDS, … All shared data, metadata and
products being free of charge or no more than cost of
reproduction will be encouraged for research and education.
IOC Oceanographic Data Exchange Policy
Member States shall provide timely, free and unrestricted
access to all data, associated metadata and products generated
under the auspices of IOC programmes.
Member States are encouraged to provide timely, free and
unrestricted access to relevant data and associated metadata
from non-IOC programmes …. for non-commercial use by the
research and education communities, provided that any
products or results of such use shall be published in the open
literature without delay or restriction.
40. Data policy at VLIZ
(under development)
VLIZ advocates free data exchange and supports the IOC
Oceanographic Data Exchange Policy. Wherever possible
and relevant, the data from the databases will be made
available online through the Internet. Naturally, restrictions
may apply, as a result of which we cannot offer unlimited
access. This is for example the case for data of which VLIZ
is not the primary source: in this case the data exchange
policy of the originator of the data will apply.
41. Data policy at VLIZ – practical
MDA & OMA
• Permanent archive for data and publications
• Fully documented
• Easy online archival & information tool
Main challenges:
• Convincing scientists to openly share their data:
no mandates, all is voluntary!
• Effort involved in properly describing data so it
can be re-used by others → need for dedicated
data centers!
1. No one else can understand the complexity of my data [practice shows that others can, provided the circumstances of the research and the data themselves are well documented and described]
2. If someone else analyzes my data, he or she may find a different answer that could undermine my findings [being able to falsify claims is precisely the foundation of the scientific method. Researchers who make falsification impossible hinder the progress of scientific knowledge]
3. Someone else might find something in my data that I overlooked [that actually increases the return on the investment in the data collection and the research]
4. I have not yet finished the analysis of my data [research is never finished; with this argument, making data available can be postponed forever until mañana. A publication about the data is the latest moment to make them available for a second opinion]
5. They are my data, which I worked hard to collect, and no one else has a right to them [data collection financed with public funds should also be publicly accessible. NWO can certainly claim rights to data from NWO-funded research. Moreover: if the results of research are made public in books and journals, why not the data that underlie them?]
6. I cannot trust or understand data that were produced elsewhere [if that is not possible, can the research results in the literature be trusted and understood? This is, incidentally, the mirror image of 1]