October 17, 2013 @ Robert E. Kennedy Library, Data Studio, California Polytechnic State University.
Many funders now require researchers to submit a Data Management Plan alongside their project proposals. The DMPTool is a free, online wizard that helps you create a data management plan specific to your project, and provides you with links and resources for ensuring your plan is successful.
7. What is a data management plan?
A document that describes what you will do with your data both during your research and after you complete your project
From Flickr by Barbies Land
8. For funders: A short plan submitted alongside grant applications
An outline of
– what will be collected
– methods
– Standards
– Metadata
– sharing/access
– long-term storage
Includes how and why
But they all have different requirements and express them in different ways
From Flickr by 401(K) 2013
9. Why prepare a DMP?
• Saves time
• Increases research efficiency
• Satisfies requirements
From Flickr by natalinha
14. Small & Simple
• Document what you know now
• Share the plan with your team
• Avoid procrastination and immobilization
Where to start?
15. Diagram labels (courtesy of Martin Donnelly): Research Support Office; Data Library / Repository; Researcher; DMP?; Unruly Data; Computing Support; Faculty Ethics Committee; Etc...
16. Who: Support Services & Collaborators
Plan section: Support
• Information about data: PI, co-PIs, research staff
• Metadata content & format: Librarians, data repositories
• Policies for access, sharing, reuse: Funder, institute, HIPAA, IRB, users
• Long-term storage and data management: Librarians, IT staff, data repositories
• Budget: Sponsored programs office, funder
17. DMPs: A Short History
Liz Lyon: Dealing with Data
2008
UK funder expectations
2009
2009-10
18. DMPs: A Short History
Across the Pond…
Federal Funding Accountability and Transparency Act
2006
2010 – present
2010
19. NSF DMP Requirements
From Grant Proposal Guidelines: DMP supplement may include:
1. the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project
2. the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies)
3. policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements
4. policies and provisions for re-use, re-distribution, and the production of derivatives
5. plans for archiving data, samples, and other research products, and for preservation of access to them
20. 1. Types of data & other information
• Types of data produced
• Relationship to existing data
• How/when/where will the data be captured or created?
• How will the data be processed?
• Quality assurance & quality control measures
• Security: version control, backing up
• Who will be responsible for data management during/after project?
Image credits: C. Strasser; biology.kenyon.edu; From Flickr by Lazurite
21. 2. Data & metadata standards
• What metadata are needed to make the data meaningful?
• How will you create or capture these metadata?
• Why have you chosen particular standards and approaches for metadata?
Image credit: Wired.com
22. 3. Policies for access & sharing
4. Policies for re-use & re-distribution
• Are you under any obligation to share data?
• How, when, & where will you make the data available?
• What is the process for gaining access to the data?
• Who owns the copyright and/or intellectual property?
• Will you retain rights before opening data to wider use? How long?
• Are permission restrictions necessary?
• Embargo periods for political/commercial/patent reasons?
• Ethical and privacy issues?
• Who are the foreseeable data users?
• How should your data be cited?
23. 5. Plans for archiving & preservation
• What data will be preserved for the long term? For how long?
• Where will data be preserved?
• What data transformations need to occur before preservation?
• What metadata will be submitted alongside the datasets?
• Who will be responsible for preparing data for preservation? Who will be the main contact person for the archived data?
From Flickr by theManWhoSurfedTooMuch
24. Don’t forget: Budget
• Costs of data preparation & documentation
– Hardware, software
– Personnel
– Archive fees
• How costs will be paid
– Request funding!
Image credit: dorrvs.com
25. NSF’s Vision*
• DMPs and their evaluation will grow & change over time
• Peer review will determine next steps
• Community-driven guidelines
• Evaluation will vary with directorate, division, & program officer
*Unofficially
26. A DMP Example (1)
• Project name: Effects of temperature and salinity on population growth of the estuarine copepod, Eurytemora affinis
• Project participants and affiliations:
– Carly Strasser (University of Alberta and Dalhousie University)
– Mark Lewis (University of Alberta)
– Claudio DiBacco (Dalhousie University and Bedford Institute of Oceanography)
• Funding agency: CAISN (Canadian Aquatic Invasive Species Network)
• Description of project aims and purpose: We will rear populations of E. affinis in the laboratory at three temperatures and three salinities (9 treatments total). We will document the population from hatching to death, noting the proportion of individuals in each stage over time. The data collected will be used to parameterize population models of E. affinis. We will build a model of population growth as a function of temperature and salinity. This will be useful for studies of invasive copepod populations in the Northeast Pacific.
• Video Source: Plankton Copepods. Video. Encyclopædia Britannica Online. Web. 13 Jun. 2011
27. A DMP Example (2)
1. Information about data
• Every two days, we will subsample E. affinis populations growing at our treatment conditions. We will use a microscope to identify the stage and sex of the subsampled individuals. We will document the information first in a laboratory notebook, then copy the data into an Excel spreadsheet. For quality control, values will be entered separately by two different people to ensure accuracy. The Excel spreadsheet will be saved as a comma-separated value (.csv) file daily and backed up to a server. After all data are collected, the Excel spreadsheet will be saved as a .csv file and imported into the program R for statistical analysis. Strasser will be responsible for all data management during and after data collection.
• Our short-term data storage plan, which will be used during the experiment, will be to save copies of 1) the .txt metadata file and 2) the Excel spreadsheet as .csv files to an external drive, and to take the external drive off site nightly. We will use the Subversion version control system to update our data and metadata files daily on the University of Alberta Mathematics Department server. We will also have the laboratory notebook as a hard copy backup.
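The double-entry quality check described in this example plan could be automated along the following lines. This is a minimal sketch, not the authors' procedure: the column names, the sample values, and the helper names `read_rows` and `find_mismatches` are all hypothetical, and it assumes both people save their entries as .csv text with the same columns and the same row order.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def find_mismatches(entry_a, entry_b):
    """Return (row_number, column, value_a, value_b) for every cell that differs."""
    if len(entry_a) != len(entry_b):
        raise ValueError("the two entries have different numbers of rows")
    mismatches = []
    for row_number, (row_a, row_b) in enumerate(zip(entry_a, entry_b), start=1):
        for column in row_a:
            if row_a[column] != row_b.get(column):
                mismatches.append((row_number, column, row_a[column], row_b.get(column)))
    return mismatches


# Hypothetical sample data: two people typed in the same notebook page.
PERSON_A = (
    "treatment,stage,sex,count\n"
    "T1-S1,nauplius,NA,14\n"
    "T1-S1,adult,F,3\n"
)
PERSON_B = (
    "treatment,stage,sex,count\n"
    "T1-S1,nauplius,NA,14\n"
    "T1-S1,adult,F,8\n"
)

mismatches = find_mismatches(read_rows(PERSON_A), read_rows(PERSON_B))
for row_number, column, value_a, value_b in mismatches:
    print(f"row {row_number}, column {column!r}: {value_a!r} vs {value_b!r}")
```

Each reported cell would then be checked against the laboratory notebook, which remains the reference copy in this plan.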
28. A DMP Example (3)
2. Metadata format & content
• We will first document our metadata by taking careful notes in the laboratory notebook that refer to specific data files and describe all columns, units, abbreviations, and missing value identifiers. These notes will be transcribed into a .txt document that will be stored with the data file. After all of the data are collected, we will then use EML (Ecological Metadata Language) to digitize our metadata. EML is one of the accepted formats used in Ecology, and works well for the type of data we will be producing. We will create these metadata using Morpho software, available through the Knowledge Network for Biocomplexity (KNB). The documentation and metadata will describe the data files and the context of the measurements.
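The plain-text metadata file mentioned in this example plan (columns, units, and missing value identifiers) might be generated with something like the sketch below. The file name, column names, units, and the function `render_data_dictionary` are illustrative assumptions; the slides do not specify a layout for the .txt document, and this does not produce EML.

```python
def render_data_dictionary(data_file, columns, missing_value):
    """Build a plain-text description of a data file, one line per column."""
    lines = [
        f"Data file: {data_file}",
        f"Missing value identifier: {missing_value}",
        "",
        "Columns:",
    ]
    for name, unit, description in columns:
        lines.append(f"  {name} [{unit}]: {description}")
    return "\n".join(lines) + "\n"


# Hypothetical columns for the copepod counts; names and units are made up.
COLUMNS = [
    ("temperature_c", "degrees Celsius", "rearing temperature of the treatment"),
    ("salinity", "practical salinity units", "rearing salinity of the treatment"),
    ("stage", "category", "life stage identified under the microscope"),
    ("sex", "category", "sex of the individual, if identifiable"),
    ("count", "individuals", "number of individuals in the subsample"),
]

text = render_data_dictionary("counts.csv", COLUMNS, "NA")
print(text)
```

Writing `text` to a file stored next to the data file would keep the description and the data together, as the plan describes.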
29. A DMP Example (4)
3. Policies for access, sharing & reuse
• We are required to share our data with the CAISN network after all data have been collected and metadata have been generated. This should be no more than 6 months after the experiments are completed. In order to gain access to CAISN data, interested parties must contact the CAISN data manager (data@caisn.ca) or the authors and explain their intended use. Data requests will be approved by the authors after review of the proposed use.
• The authors will retain rights to the data until the resulting publication is produced, within two years of data production. After publication (or after two years, whichever is first), the authors will open data to public use. After publication, we will submit our data to the KNB allowing discovery and use by the wider scientific community. Interested parties will be able to download the data directly from KNB without contacting the authors, but will still be encouraged to give credit to the authors for the data used by citing a KNB accession number either in the publication’s text or in the references list.
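The "after publication or after two years, whichever is first" rule in this example plan amounts to taking the earlier of two dates. The sketch below works that out; the function names and the sample dates are hypothetical, and treating "two years" as the same calendar day two years later is an assumption the plan does not spell out.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def public_release_date(data_produced, published=None):
    """Earlier of the publication date and two years after data production."""
    deadline = add_years(data_produced, 2)
    if published is None:
        return deadline
    return min(published, deadline)


# Hypothetical dates for illustration only.
produced = date(2011, 6, 13)
print(public_release_date(produced))                     # no paper yet
print(public_release_date(produced, date(2012, 1, 15)))  # paper within two years
print(public_release_date(produced, date(2014, 3, 1)))   # paper after two years
```

Stating the rule this explicitly in a plan helps reviewers and later data users see exactly when the data become open.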
30. A DMP Example (5)
4. Long-term storage and data management
• The data set will be submitted to KNB for long-term preservation and storage. The authors will submit metadata in EML format along with the data to facilitate its reuse. Strasser will be responsible for updating metadata and data author contact information in the KNB.
5. Budget
• A tablet computer will be used for data collection in the field, which will cost approximately $500. Data documentation and preparation for reuse and storage will require approximately one month of salary for one technician. The technician will be responsible for data entry, quality control and assurance, and metadata generation. These costs are included in the budget in lines 12-16.
35. DMPTool Project
• Partners started working in January 2011
• Developed requirements, divided work
• Self-funded / In-kind
36. dmptool.org
• Free
• Guides through creating a DMP
• Helps meet funder requirements
• Supplies questions
• Includes explanation/context provided by the agency
• Provides links to the agency website
37. Wait!
Data management planning is complex & requires dialog
Range of support & understanding
Our focus:
• simplify & scale the common parts
• develop community
• provide incremental improvement in functionality
From Flickr by ChrisGoldNY
38. Roadmap
• Background
• Current DMP tools
• A walkthrough of DMPTool
• DMPTool2
From Flickr by (Luciano)
51. Access
DMPTool can be added to campus single sign-on service
Researchers use campus login to access tool
Researchers like it here
From Flickr by Clonny
52. Customized Resources
Institution-specific…
• Help text
• Links to resources & services
• Suggested answers
…at different levels
• All DMPs
• All DMPs for a particular funding agency
• Question within a data management plan
From Flickr by lumachrome
58. DMPTool2: Improvements for Plan Creators
• Collaborative plan creation
• Role-based user authorization & access
• Better plan templates & resources
59. DMPTool2: New administrator Interface
• Template creation:
– Better plan template granularity: discipline, funder, question
– Better institution granularity: department, college, lab group, …
• Enhanced search and browse of plans
• Access to metrics for reporting & follow-up
What this means for plan creators:
• Better plans
• More granular help
• Local input & assistance
62. DMPTool API
Other stuff
• Carefully thought out code
• Invisible to user
• Expose specific functionality and/or data
• Other functionality/data protected
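The idea of exposing specific functionality and data while protecting the rest can be illustrated with a small sketch. Nothing here reflects the real DMPTool code or its actual API: the class names `PlanStore` and `PublicApi`, the field names, and the sample plans are all hypothetical.

```python
class PlanStore:
    """Stand-in for the tool's internals (the "other stuff"); purely illustrative."""

    def __init__(self):
        self._plans = {
            1: {"title": "Copepod growth study", "funder": "CAISN",
                "owner_email": "owner@example.org", "public": True},
            2: {"title": "Draft plan", "funder": "NSF",
                "owner_email": "pi@example.org", "public": False},
        }

    def all_plans(self):
        return self._plans


class PublicApi:
    """Thin layer that exposes only selected fields of plans marked public."""

    EXPOSED_FIELDS = ("title", "funder")

    def __init__(self, store):
        self._store = store

    def list_public_plans(self):
        # Private plans and unlisted fields (such as owner_email) never leave the store.
        return [
            {field: plan[field] for field in self.EXPOSED_FIELDS}
            for plan in self._store.all_plans().values()
            if plan["public"]
        ]


api = PublicApi(PlanStore())
print(api.list_public_plans())
```

Callers only ever see what `list_public_plans` returns, so the internals can change without affecting them.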
63. API Benefits
• Interactions
• Improve functionality
• Add more functionality
• Combine with their services
• Popularity
66. IMLS Grant
Improving Data Stewardship with the DMPTool
Provide librarians with the tools and resources to claim the data management education space