Webinar presented on December 5, 2012, by Joan Starr and Perry Willett of CDL/UC3, and Lisa Federer and Claudia Horning from UCLA. Part of the ACRL Digital Curation Interest Group (DCIG) Webinar Series.
4. Image credit: http://www.flickr.com/photos/vixon/116447718/ by barryegan (Vitor Leite)
Why should researchers bother with DATA CITATION? What is their motivation?
To provide fair credit to those responsible: exposure
To ensure scientific transparency and reasonable accountability for authors and stewards: transparency
To aid in tracking the impact of the work: citation tracking
To help data authors verify how their data are being used: verification
To aid scientific reproducibility through direct, unambiguous connection to the precise data used in a particular study: scientific re‐use
Source: ESIP—Earth Science Information Partners
(http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/Citations/provider_guidelines)
14. Supporting the MANAGE and SHARE stages are the Merritt curation and
preservation repository and our long-term identifier service, EZID.
Baking data curation into the COLLECTION phase: DataUp.
And enhancing collection of e-science in the web environment:
the Web Archiving Service, or WAS.
To facilitate data publication, we are exploring this new Data Paper model.
We are engaged in a number of network-level collaborations and
partnerships, but these two have particular relevance to the data
management space, with DataONE focused on distributed data networks and
DataCite on persistent identifiers.
And lastly, we have partnered with UVA and many others to develop and
launch an easy-to-use Data Management Plan Tool.
Let’s take a brief look at all of these things, and then we’ll talk about what
this means to you.
16. In its capacity as a data management tool, Merritt can function in one of
several ways:
it can be a “dark” or inaccessible archive for important digital assets;
it can serve as a “bright” archive with direct discovery and access;
it can be the preservation back-end for existing or new discovery and
content management systems;
or it can integrate with distributed data grids.
Current work includes the Datashare project at UC San Francisco (UCSF).
Datashare, as the name suggests, encourages researchers to share their
data. See the Datashare website at http://datashare.ucsf.edu
20. Preservation: Curation microservices and Merritt
To bake data curation into data creation: DCXL (Data Curation XL Plug-In)
To enhance data sharing, collecting and gathering: WAS service
To facilitate data publication, we are exploring this new Data Paper model.
And behind many of these steps, the EZID service.
We are engaged in a number of network-level collaborations and
partnerships, but these two have particular relevance to the data
management space, with DataONE focused on distributed data networks and
DataCite on persistent identifiers.
And lastly, we have partnered with UVA and many others to develop and
launch an easy-to-use Data Management Plan Tool.
So let’s take a brief look at all of these things, and while I’m there, I’ll dive
more deeply into EZID, which is the service I manage.
21. Nobody thinks of Excel as a preservation-ready tool, but everybody uses
it! The KEY IDEA in keeping this EASY here is: let them use the tools they
are used to using. (Get out of the way of that elephant!)
The Gordon and Betty Moore Foundation and Microsoft Research are funding
this.
Our part is requirements gathering; Microsoft will do the development. It
will be an open-source plug-in.
22. WAS allows curators to collect and manage web-published content so
that scholars can use the content for private research and/or publish the
content for public access.
The archives contain eScience content as well as government documents,
event captures, and archives for specific research communities, such as
unique data sets, collections of sites not otherwise grouped together,
and the sites resulting from grant activity.
27. DataONE is an NSF-funded virtual data center for biology, ecology, and
environmental sciences.
DataONE has the overarching goal of building a new culture of data
access and data sharing. This is an international collaboration working
with scientists and librarians, as well as other stakeholders.
1. Engaging the scientist in the data curation process
2. Supporting the full data life cycle
3. Encouraging data stewardship and sharing
4. Promoting best practices
5. Engaging citizens
6. Developing domain-agnostic solutions
29. How can EZID be in the business of issuing DataCite DOIs? California
Digital Library was one of the founding members.
DataCite was formed in 2009 by 10 libraries and research
centers with a mission: “Helping you find, access, and reuse data.”
The number has now grown to 16. In addition there are 3 associate
members, including the Korea Institute of Science and Technology
Information and BGI, so there is a presence in Asia.
DataCite’s primary method for achieving this mission: issuing DOIs
(Digital Object Identifiers) for datasets.
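As a rough illustration of how a dataset DOI is used once it has been issued, the Python sketch below turns a DOI into a resolvable link through the public doi.org resolver. The function name and the sample identifier are hypothetical, chosen only for this example; they are not taken from EZID or DataCite documentation.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI such as 'doi:10.5072/FK2EXAMPLE'."""
    doi = doi.strip()
    # Drop an optional 'doi:' scheme prefix (case-insensitive).
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    # A DOI begins with '10.' and separates prefix and suffix with '/'.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping '/'.
    return DOI_RESOLVER + quote(doi, safe="/")


# Hypothetical identifier, for illustration only.
print(doi_to_url("doi:10.5072/FK2EXAMPLE"))
```

Because the link goes through the resolver rather than to a repository address, a citation built this way can keep working if the dataset later moves.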
30. These are the factors driving the collaboration:
1. Institutions rely on soft funding. Agencies have created a new
demand: meet the demand or don’t get funded.
2. The approach is to work collaboratively to consolidate expertise and
reduce costs
3. Libraries plus, plus
4. Provide an environment that allows researchers to focus on research
32. Image credit:
http://content.cdlib.org/ark:/28722/bk0007s853c/?query=tools&brand=calisphere
Courtesy of UC Berkeley, Bancroft Library; United Aircraft Corporation:
Joint War Production Drive Committee
DATA CURATION LEADS TO GOOD OUTCOMES FOR RESEARCHERS.
• They’ll be motivated routinely to deposit in stable
public storage data products (datasets and
processing information) and the data papers that
reward them with authorship credit.
• Data journals will spring up around disciplines, even if
disciplinary data papers are scattered across
geographically distributed repositories.
• Data products will be re‐used, annotated, corrected,
and linked to/from traditional publications.