1. Documentation of Scientific Results
and the Handling and Keeping of Scientific Data
Thea M Drachen, Marine biologist, PhD
Research Librarian, University of Southern Denmark
2. Course lecturer
Thea Drachen
Ph.D. in marine biology 2008
(University of Technology, Sydney)
Information specialist at the Research
Support Service Section at
Copenhagen University Library and
Information Service
Research information scientist at
Global Information and Analysis, Novo
Nordisk
Research librarian at University of
Southern Denmark
E-mail: thmd@bib.sdu.dk
3. Morning schedule
09:30 Documentation of Scientific Results and the Handling
and Keeping of Scientific Data:
Welcome and introduction
Part 1: Expectations and the fitness of scientific results
Buzzing Break
Part 2: Rules and regulations regarding research data
Buzzing Break
Part 3: Research data management (handling + keeping)
Suggestion for further homework
10:30 Coffee break
8. Method not fit for purpose
1998 Andrew Wakefield and
Lancet paper
claiming that the MMR vaccine causes autism
widespread scare of vaccination
– with severe direct and indirect
effects
2010 retracted - found to be
bad scientific practice (no
control group and causality
unclear)
Excerpts from American Academy of Pediatrics
http://www2.aap.org/immunization/families/autismwakefield.html
accessed 6 March 2014
10. Aim of the course and learning goals:
You must be able to explain and apply the norms
of Responsible Conduct of Research in relation to:
• Documentation of your scientific results
• The handling and keeping of your scientific data
(aka. research data management…)
11. Take home messages and intended learning
outcome:
• Students should know the requirements for
documentation of research as described in
Guidelines for Good Scientific Practice
The PhD School of SCIENCE
• Students should be able to reflect on and
describe own scientific practice in relation to
requirements described above.
• Students should be able to recognize and
discuss the underlying principles for creating
trustworthy scientific results.
(cf. The Danish Committees on Scientific
Dishonesty, for instance Guidelines for research
protocols, data documentation and data
storage).
16. You must properly document your scientific results!
- for without documentation of your research there
can be no perceived quality of that research!
18. Three elements from the Singapore Statement
on Research Integrity and data
1. Integrity: Researchers should take responsibility for the
trustworthiness of their research.
4. Research Records: Researchers should keep clear,
accurate records of all research in ways that will allow
verification and replication of their work by others.
5. Research Findings: Researchers should share data and
findings openly and promptly, as soon as they have had
an opportunity to establish priority and ownership claims.
Great Expectations…
19. Expectation: The Vancouver Rules
Est. 1978 by a group of editors of general medical journals. 5th
edition from 1997.
Describes the do’s and don'ts of article publication in medicine.
Authorship is substantial participation, where all the following
conditions are met:
1. conception and design, or analysis and interpretation
of data
AND
2. drafting the article or revising it critically for important
intellectual content
AND
3. final approval of the version to be published.
There are no demands of data management, but managing
your data will make it easier to document adherence to the
protocol.
20. Expectation: Follow the principles of how to perform
valid measurements
1. Analytical measurements should be made to satisfy an agreed
requirement.
2. Analytical measurements should be made using methods and
equipment which have been tested to ensure they are fit for purpose.
3. Staff making analytical measurements should be both qualified and
competent to undertake the task.
4. There should be regular independent assessment of the technical
performance of a laboratory.
5. Analytical measurements made in one location should be consistent
with those made elsewhere.
6. Organizations making analytical measurements should have well
defined quality control and quality assurance procedures.
Source:
http://www.nmschembio.org.uk/GenericArticle.aspx?m=108&amid=1285
(accessed 11 June 2014)
21. General expectations of high quality research
Meeting required expectations (i.e. rules etc.) from:
• Scientific journals / publishers
• Local standards / institutions
• Funders’ requirements
• Research archives
• Ethical guidelines
• Authoring (Vancouver rules)
• Client standards
• Society
• The Law
• …
Expectation of scientific truth – whatever that may be (cf.
your scientific method)…
23. The Danish Committees on Scientific Dishonesty
Guidelines for Good Scientific Practice
• for research protocols and reports, data documentation and data
storage (Chapters 1 and 2)
• for agreements at the initiation of research projects (Chapter 3)
• relating to rights and duties concerning storage and use of research
data (Chapter 4)
• on publication matters (Chapter 5)
• the Act on Processing of Personal Data and research
projects (Chapter 7)
http://ufm.dk/en/research-and-innovation/councils-and-commissions/the-danish-committees-on-scientific-dishonesty/the-danish-committees-on-scientific-dishonesty
24. The Danish Committees on Scientific Dishonesty
Guidelines for research protocols and reports,
data documentation and data storage in basic health research
• All persons participating in a project shall be able to see and
understand the original trial results, their processing and
interpretation.
• The research results ought to be available in the longer term so
as to be reassessed or applied for further research.
• An appropriate design and storage of research protocols and
research reports, data documentation and data storage are
accordingly crucial.
Guidelines for Good Scientific Practice, continued
28. Some laws, rules and regulations
Good scientific practice requires research to comply
with the requirements of, e.g.:
• The Danish Data Protection Agency (Datatilsynet)
• The Danish National Committee on Biomedical Research
Ethics
• The regional committees on research ethics (Regionerne).
• The Danish Agency on Animal Experiments
(Dyreforsøgstilsynet)
• The Danish Committees on Scientific Dishonesty (UVVU)
• The extended practice committee at UCPH
• The forthcoming Danish Code of Conduct
29. Research Data Management at UCPH
In 2013 The extended practice committee on responsible
conduct of research (Det Udvidede Praksisudvalg for god
videnskabelig praksis) recommended that the university board
create a policy on scientific data management.
Excerpts from recommendations
• Before research projects involving research data are started, it is the
responsibility of the PI to ensure that there is a written
clarification of the circumstances under which the research data is
collected, stored and secured (i.e. a data management plan, DMP)
• Research data should be stored so that the results can be
recreated if necessary (integrity)
• The responsible researcher is to back up her data (preservation)
• Research data should be stored in an adequate form and in a safe
place, i.e. it must not be possible subsequently to manipulate or
falsify data or to gain unauthorized access (security)
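One way to make later manipulation of stored data detectable is to record a checksum for every file when it is archived and to re-check it later. A minimal Python sketch, assuming the data sits in an ordinary folder; the function names `build_manifest` and `changed_files` are illustrative and not part of any UCPH policy or tool:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files altered, removed or added since the manifest was made."""
    current = build_manifest(data_dir)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

A checksum only detects changes, it does not prevent them, so the manifest should be kept somewhere other than the data folder itself (for instance with the lab notebook or the data management plan).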
30. The Ministry of Higher Education and Science has a group working
on a proposal for a Danish Code of Conduct for Research Integrity.
As part of a broader hearing during April/May a conference was held
in May 2014. The hearing has just reached its deadline.
Chapter 2. Data management
“Responsible conduct of research includes proper management of
primary materials and data. The key purpose of data
management is to guarantee credible and transparent research.”
• Researchers are responsible for storing their data and primary
materials.
• Researchers are – unless otherwise regulated – responsible for
deciding the extent to which primary material should be retained.
When deciding this, researchers should consider the value of the
primary materials for assessing the results of the research and
take account of the physical and technical possibility of storage at
the institution.
Recent developments: A Danish Code of Conduct
31. Responsibilities – excerpts …
i. Data and primary materials should be retained, stored and
managed in a clear and accurate form that allows the result to be
assessed, the procedures to be retraced and – when applicable – the
research to be reproduced.
ii. Data should be kept for a period of at least five years from the
date of publication…
iv. The data records should enable identification of persons having
conducted the actual research…
v. Results should be kept irrespective of whether or not they
were published, and should contain a precise and traceable
reference to the source.
vii. Data and primary material should be retained in a way that
makes them available for use by other researchers…
Recent developments: A Danish Code of Conduct
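The five-year minimum in point ii translates into a simple date calculation. A minimal sketch in Python, assuming the retention period is counted in calendar years from the publication date; the function name `earliest_disposal_date` and the example date are made up for illustration, and local or funder rules may demand longer periods:

```python
from datetime import date

RETENTION_YEARS = 5  # minimum from point ii; an institution or funder may require more


def earliest_disposal_date(published: date, years: int = RETENTION_YEARS) -> date:
    """Return the first date on which the minimum retention period has passed."""
    try:
        return published.replace(year=published.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 1 March instead
        return date(published.year + years, 3, 1)


print(earliest_disposal_date(date(2014, 6, 12)))  # 2019-06-12
```

Recording this date next to each dataset makes it harder to delete data too early by accident.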
33. Buzzing topic
Which rules regarding
data are most critical
to your research
project?
Which of them
are the most difficult
to follow?
You have 5 minutes
We will discuss these
in plenum afterwards.
have 10 min.
Creative Ignition (Flickr: Frustration) [CC-BY-2.0
(http://creativecommons.org/licenses/by/2.0)], via
Wikimedia Commons
35. Perceived quality is fundamental to research,
integrity and the ethos of science
Quality assurance through data management should
be common sense as most modern research
involves:
Large amounts of data
Many scientists and complex organizations
Advanced and complex analytical methods
Economic and public investments
36. What documentation?
• Quality manual for your lab, project or institution.
• Basic documentation (equipment etc).
• Organization plan (who is responsible for what?).
• Description of methods and procedures.
• Overview of material (documents and inventory).
• Scholarly publications (authoring).
• Records of primary material - any material that forms
the basis of the research (e.g. biological material, notes,
interviews, texts and literature, or recordings).
• Data - detailed records of the primary materials that
comprise the basis for the analysis that generates the
results.
37. Why?
→ Documentation ensures integrity of data (QA)
→ Cf. rules and regulations!
→ Meet grant and funder requirements
38. Research Data Management
Again: so why manage your research data?
Primarily:
Because you must (internal and external requirements)!
• Documentation ensures integrity of data (QA)
• Cf. rules and regulations!
• Meet grant and funder requirements
Secondarily:
• To increase the visibility of your research¹
• Save time, money, resources
• Preserve your data for yourself and others
• Increase your research efficiency through documentation
• Facilitate new discoveries
1. Dorch, Bertil. "On the Citation Advantage of linking to data" (5 July 2012).
http://hprints.org/hprints-00714715.
39. Spinoffs of quality assurance through data management
• Save time, money, resources
• Preserve your data for yourself and others
• Increase your research efficiency through documentation
• Facilitate new discoveries
• Increase the visibility of your research
40. Cancer clinical trials
Research data management
Piwowar, Heather A., Roger S. Day, and Douglas B. Fridsma. "Sharing Detailed Research
Data Is Associated with Increased Citation Rate". PLoS ONE 2, no. 3 (2007): e308.
41. Cancer clinical trials
Research data management
“We found that cancer clinical trials which share their
microarray data were cited about 70% more frequently
than clinical trials which do not.”
“This result held even for lower-profile publications and
thus is relevant to authors of all trials.”
Piwowar, Heather A., Roger S. Day, and Douglas B. Fridsma. "Sharing Detailed Research
Data Is Associated with Increased Citation Rate". PLoS ONE 2, no. 3 (2007): e308.
43. All data necessary to understand, assess, and extend the
conclusions of the manuscript must be available to any reader
of Science. All computer codes involved in the creation or
analysis of data must also be available to any reader of
Science. After publication, all reasonable requests for data and
materials must be fulfilled. Any restrictions on the availability
of data, codes, or materials, including fees and original data
obtained from other sources (Materials Transfer Agreements),
must be disclosed to the editors upon submission. If there are
any MTAs pertaining to data or materials produced in this
research, or that you have agreed to in conducting the
research that restrict you from providing data or materials,
please describe these and send the editor of your manuscript
a copy of these specific MTAs when you submit your
manuscript. Fossils or other rare specimens must be deposited
in a public museum or repository and available for research.
Scientific community, Science (accessed 11 June 2014)
http://www.sciencemag.org/site/feature/contribinfo/prep/gen_info.xhtml#dataavail
Data and materials
44. Science supports the efforts of databases that aggregate
published data for the use of the scientific community.
Therefore, appropriate data sets (including microarray
data, protein or DNA sequences, atomic coordinates or
electron microscopy maps for macromolecular structures,
and climate data) must be deposited in an approved
database, and an accession number or a specific access
address must be included in the published paper. We
encourage compliance with MIBBI guidelines (Minimum
Information for Biological and Biomedical Investigations).
Scientific community, Science, continued
http://www.sciencemag.org/site/feature/contribinfo/prep/gen_info.xhtml#dataavail,
accessed 12 June 2014
45. Details include but are not limited to:
• Molecular structure data. … Approved databases are the
Worldwide Protein Data Bank, BioMag Res Bank, and Electron
Microscopy Data Bank (MSD-EBI), and for synthetic molecules,
the Cambridge Crystallographic Data Centre.
• DNA and protein sequences. Approved databases are GenBank
or other members of the International Nucleotide Sequence
Database Collaboration and SWISS-PROT.
• Microarray data. Data should be presented in MIAME-compliant
standard format. Approved databases are Gene Expression
Omnibus and ArrayExpress.
• Climate data. Data should be archived in the NOAA climate
repository or other public databases.
• Ecological data. We recommend deposition of data in Dryad.
Large data sets with no appropriate approved repository must be
housed as supplemental materials at Science, or only when this is
not possible, on an archived institutional Web site, provided a copy
of the data is held in escrow at Science to ensure availability to
readers.
Scientific community, Science, continued
http://www.sciencemag.org/site/feature/contribinfo/prep/gen_info.xhtml#dataavail,
accessed 12 June 2014
As of January 2013, the BMJ will no
longer publish any trial of drugs or
devices where the authors do not
commit to making the relevant
anonymised patient level data
available, on reasonable request
49. Research data management:
A state of mind!
Goodman et al. (2014) Ten Simple Rules for the Care and
Feeding of Scientific Data. PLoS Comput Biol 10(4):
e1003542. doi:10.1371/journal.pcbi.1003542
Rule 1. Love Your Data, and Help Others Love It, Too
Rule 2. Share Your Data Online, with a Permanent Identifier
Rule 3. Conduct Science with a Particular Level of Reuse in Mind
Rule 4. Publish Workflow as Context
Rule 5. Link Your Data to Your Publications as Often as Possible
Rule 6. Publish Your Code (Even the Small Bits)
Rule 7. State How You Want to Get Credit
Rule 8. Foster and Use Data Repositories
Rule 9. Reward Colleagues Who Share Their Data Properly
Rule 10. Be a Booster for Data Science
50. Research data management:
Use a checklist for your data management life cycle!
Steps in a typical data management plan:
1. Introduction and context
2. Data types, formats, standards and capture methods
3. Ethics and intellectual property
4. Access, data sharing and re-use
5. Short-term storage and data management
Cf. DCC. (2013). Checklist for a Data Management Plan. v.4.0.
Edinburgh: Digital Curation Centre. Available online:
http://www.dcc.ac.uk/resources/data-management-plans
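The five headings above can be kept as a small machine-readable checklist, so that an unfinished plan is easy to spot. A minimal sketch in Python; the section titles are taken from the list above, while `missing_sections` and the example answers are hypothetical and not part of the DCC checklist:

```python
DMP_SECTIONS = [
    "Introduction and context",
    "Data types, formats, standards and capture methods",
    "Ethics and intellectual property",
    "Access, data sharing and re-use",
    "Short-term storage and data management",
]


def missing_sections(plan: dict[str, str]) -> list[str]:
    """Return the headings that are absent or left empty in a draft plan."""
    return [s for s in DMP_SECTIONS if not plan.get(s, "").strip()]


# Made-up draft: one section filled in, one left empty, three not started
draft = {
    "Introduction and context": "Short description of the project and its data.",
    "Ethics and intellectual property": "",
}
for heading in missing_sections(draft):
    print("To do:", heading)
```

Revisiting the list at each stage of the project keeps the plan in step with the data life cycle.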
58. Suggested extra homework
Open Researcher and Contributor ID = ORCID
• an open standard for uniquely identifying scientists and
science writers
• based on Thomson Reuters' ResearcherID
• links an author's ResearcherID and Scopus ID (Elsevier)
I.e. ORCID is a kind of social security number for
publishing, and can be used as a unique and persistent
identity, e.g. corresponding to DOI for digital objects,
articles etc.
Register at ORCID and complete publication list by
importing from Web of Science and / or Scopus:
http://orcid.org
It only takes 5 minutes
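An ORCID iD has 16 characters in four groups, and the last character is a check digit computed with the ISO 7064 MOD 11-2 algorithm, so a mistyped iD can often be caught before it is used. A minimal sketch in Python; the function names are illustrative, and 0000-0002-1825-0097 is the sample iD that ORCID uses in its own documentation:

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for char in base_digits:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check length, characters and check digit of an iD like 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # True
print(is_valid_orcid("0000-0002-1825-0098"))  # False
```

The check only shows that an iD is well formed; it does not confirm that the iD belongs to the person named.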
60. List of useful references
Udvalgene Vedrørende Videnskabelig Uredelighed (2009):
Vejledninger i God Videnskabelig Praksis med særlig fokus på
sundhedsvidenskab, naturvidenskab og teknisk videnskab.
Forsknings- og Innovationsstyrelsen, København.
The Danish Committees on Scientific Dishonesty (2009): Guidelines
for Good Scientific Practice with special focus on health science,
natural science and technical science. Forsknings- og
Innovationsstyrelsen, København.
Daasnes, Camilla (2008) Persondataloven – regler og praksis for god
databehandlingsskik, DDA Nyt 94, 4-9
http://samfund.dda.dk/ddakatalog//MogD/md94.pdf
EURACHEM / CITAC Working Group (1998) Quality Assurance for
Research and Development and Non-routine Analysis. CITAC,
Switzerland.
Practice Committee (2011) Practical advice regarding good scientific
practice. University of Copenhagen, Denmark.
Prichard & Barwick (2008) Quality Assurance in Analytical Chemistry.
John Wiley & Sons, Ltd, Chichester, England.
61. Further information about accessibility of data and materials
Cech, T. R. (2003), Sharing Publication-Related Data and
Materials: Responsibilities of Authorship in the Life
Sciences.
American Psychological Association, Responsible Conduct of
Research: Data Sharing and Data Archiving.
National Science Foundation Policy on Data Sharing [PDF].
Committee on Ensuring the Utility and Integrity of Research
Data in a Digital Age and Committee on Science,
Engineering, and Public Policy, Ensuring the Utility and
Integrity of Research Data in a Digital Age (National
Academy Press, Washington, DC, 2009).
62. CONSORT
Consolidated Standards of Reporting Trials
http://www.consort-statement.org/
various initiatives to alleviate the problems arising from
inadequate reporting of randomized controlled trials
64. 1960s to explain some of the
horrors taking place during WW2
Question:
“For how long will someone
continue to give shocks to
another person if they are told to
do so, even if they thought they
could be seriously hurt?” i.e. will
people do morally wrong things
just because an authority figure
tells them to?
Stanley Milgram Experiment
65. 75-450 V
Prerecorded audio
40 subjects
all 40 gave up to 300 V
26 continued to 450 V
Repeated in 2009 by BBC
12 subjects
9 continued to 450 V
Stanley Milgram Experiment