“Hot Topics: The DuraSpace Community Webinar Series,” Series Six: “Research Data in Repositories.” Curated by David Minor, Research Data Curation Program, UC San Diego Library. Webinar 1: “Research Data Curation at UC San Diego: An Overview”
Presented by David Minor & Declan Fleming, Chief Technology Strategist, UC San Diego Library
10-1-13 “Research Data Curation at UC San Diego: An Overview” Presentation Slides
1. October 1, 2013 Hot Topics: DuraSpace Community Webinar Series
Hot Topics: The DuraSpace
Community Webinar Series
Series Six:
“Research Data in Repositories”
Curated by David Minor
2. October 1, 2013 Hot Topics: DuraSpace Community Webinar Series
Webinar 1: Research Data Curation
at UC San Diego: An Overview
Presented by:
David Minor, Research Data Curation Program, UC San
Diego Library
Declan Fleming, Chief Technology Strategist, UC San Diego
Library
3. Hot Topics Web Seminar Series:
Research Data in Repositories
The UC San Diego Experience
First Webinar: Introduction and Framing
4. General Series Intro
• First webinar: Intro and Framing: UC San
Diego decisions and planning
• Second Webinar: Deep dive into technology
and metadata
• Third Webinar: The perspective from
researchers, next steps
5. Your esteemed presenters …
First webinar:
David Minor – Program Director, Research Data Curation
Declan Fleming - Chief Technology Strategist
Second webinar:
Arwen Hutt - Metadata Librarian
Matt Critchlow - Manager of Development and Web Services
Third webinar:
Dick Norris – Professor, Scripps Institution of Oceanography
Rick Wagner – Data Scientist at San Diego Supercomputer Center
6. Today we will …
• Provide background on who we are
• Talk about how we got here
• Look at what differentiates research data
• Examine how we approached this work
• Point to the future
8. • Social Sciences (38.1%)
• Engineering (20.3%)
• Biology (18.7%)
• Science/Math (10.3%)
• Special/Undeclared (6.3%)
• Humanities (3.3%)
• Arts (3.0%)
30,000 Students
$1 Billion in annual research
9. UC San Diego Library
• 590,000+ e-books
• 43,000 electronic periodicals and journals
• 262,000+ digital media resources
• 50+ unique digital collections
10. How did we get here?
Photo courtesy Declan Fleming, All Rights Reserved
11. 2008 Campus-wide survey
Indicated desire for data services
Management of active data
Long-term preservation of data
Data Description
2009 “Blueprint for the Digital University”
2012 Data Management Plan requirements
Changed thinking of many on campus
Push for integrated campus services
12. Our researchers said they need:
Places to put things
Ways to point to things
Help organizing things
Photo courtesy Jenny Reiswig, All Rights Reserved
13. Integrated campus services
Research Cyberinfrastructure Program (RCI)
http://rci.ucsd.edu
Research Data Curation Program in Library
http://libraries.ucsd.edu/services/data-curation
14. RCI is “By campus, for campus”
Started in 2011; RCI priorities are driven by researcher
requirements
RCI is managed by an Oversight Committee,
representing campus units, which sets strategic
directions and oversees implementation
Implementation partners from across campus
o Administrative Computing & Telecommunications
o Calit2
o San Diego Supercomputer Center
o UCSD Library
17. Data Curation Pilots
Two-year pilot process with selected
researchers (started September 2011)
Targeted domains representing campus
Explicitly required researcher participation
18. The curation pilot goals
Investigate what it means to make a variety of research
data discoverable and reusable
Investigate current UC San Diego tools for this work
Learn how researchers, information technologists,
and librarians work together with data
Recommend production services
Develop budget and cost models
19. The curation pilot researchers
The Brain Observatory
NSF OpenTopography Facility
Levantine Archaeology Laboratory
Scripps Institution of Oceanography
Geological Collections
The Laboratory for Computational Astrophysics
21. What’s the deal with research data?
What makes it different from other data types?
What challenges does it present to a
repository?
What are the data owners’ expectations?
What are the institutional responsibilities?
22. What makes it different from other data types?
Complexity
Size of data
Number of files
Variety of file types
New presentation methods
Mixed mode presentation methods
Photo courtesy Jenny Reiswig, All Rights Reserved
23. What challenges does it present to a
repository?
How do we represent complex data models?
How do we meet the needs of a diverse user community
(including data owners)?
How do we interact with diverse systems?
How does it play with library data?
24. What are the data owners’ expectations?
Long term preservation of data
Long term availability of data
Reference-ability of data
Lots of cool whiz bang features
Actual research data!
25. What are the institutional responsibilities?
Preserve the intellectual capital of the university
Take advantage of institutional commitments
Provide a base level of data services for campus
Create new research opportunities by bringing diverse data
together
26. How did we begin to approach this work?
Actual research data!
27. How did we begin to approach this work?
By providing an integrated stack of services:
A tool for search and discovery
A digital preservation service
Identifier service
Training classes
Not research data!
28. The integrated stack
DAMS for search and discovery
Chronopolis for digital preservation
EZID service for object identifiers
Training classes on Data Management Plan Tool
Nope!
29. DAMS for search and discovery
Begun 10 years ago
RDF-based metadata allows for:
o multiple standards support (MADS, PREMIS, MIX, etc.)
o local attributes easily added
o linked data
Local and cloud based storage
Needed changes made to accommodate complex
objects and collections
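A minimal sketch of why RDF-style metadata makes it easy to mix standards and add local attributes: every statement is an independent (subject, predicate, object) triple, so a new vocabulary or a new local field is just another triple. The identifiers, prefixes, and property names below are hypothetical stand-ins for illustration; they are not the actual DAMS data model.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple.
# Prefixes such as "mads:" and "local:" are illustrative only.
triples = [
    ("obj:1", "mads:authoritativeLabel", "Sediment core photographs"),
    ("obj:1", "premis:hasSize", 1048576),
    ("obj:1", "local:cruiseId", "EXAMPLE-01"),
    ("obj:1", "local:partOf", "coll:geo"),
    ("coll:geo", "mads:authoritativeLabel", "Geological Collections"),
]


def describe(subject, graph):
    """Group every predicate/object pair stated about one subject."""
    out = defaultdict(list)
    for s, p, o in graph:
        if s == subject:
            out[p].append(o)
    return dict(out)


def vocabularies(graph):
    """Return the set of vocabulary prefixes used as predicates."""
    return {p.split(":", 1)[0] for _, p, _ in graph}


# Adding a local attribute is one more triple; no schema change is needed.
triples.append(("obj:1", "local:reviewStatus", "curated"))
```

Calling `describe("obj:1", triples)` gathers statements from all three vocabularies into one view of the object, and following `local:partOf` to `coll:geo` is the same mechanism that lets an object link to a collection or to external linked data.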
31. UCSD Library DAMS
As other collections are added, they will
be listed here. Cross-collection
discoverability is key.
Complex research collections will be
“mixed in” with regular digital
collections.
33. Chronopolis for digital preservation
• TRAC-certified preservation system
• Data replicated in three locations across the country
• Active preservation
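As a rough illustration of what replication plus active preservation generally involves, the sketch below audits three copies of a file against a stored checksum and repairs a damaged copy from an intact one. It is an assumption-laden toy, not a description of how Chronopolis is implemented; the site names and functions are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest used here as the fixity value."""
    return hashlib.sha256(data).hexdigest()


def audit(replicas: dict, expected: str) -> dict:
    """Report, per site, whether the stored copy still matches the digest."""
    return {site: digest(data) == expected for site, data in replicas.items()}


def repair(replicas: dict, expected: str) -> dict:
    """Overwrite every copy with the first replica that passes the audit."""
    good = next((d for d in replicas.values() if digest(d) == expected), None)
    if good is None:
        raise ValueError("no intact replica available")
    return {site: good for site in replicas}


# Hypothetical example: one of three copies has been silently corrupted.
original = b"research dataset bytes"
expected = digest(original)
replicas = {"site_a": original, "site_b": original, "site_c": b"bit rot"}
report = audit(replicas, expected)
repaired = repair(replicas, expected)
```

With three independent copies, a single damaged replica can be detected by the audit and restored from either of the other two, which is the practical argument for geographic replication.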
34. EZID for DOI services
As part of an arrangement with the California Digital
Library, we provide free DOIs to campus researchers.
38. We faced challenges in these processes
Expression of complex objects in metadata
Presentation of complex objects in DAMS
Organization of divergent datasets into a coherent unit
Best ways to provide data management services
39. Our challenges for tomorrow
Move from boutique, pilot services to a scalable
series of processes.
Work with additional researchers in same domains.
Work with new domains.
Broaden lifecycle management
mindset on campus.
40. Stay tuned …
Next Webinar (October 15)
Deep dive into our two core developments
Arwen Hutt - Metadata Librarian
Matt Critchlow - Manager of Development and
Web Services
41. Stay tuned …
Final Webinar (October 31)
The researcher perspective from two of our
pilot participants
Dick Norris – Professor, Scripps Institution of
Oceanography
Rick Wagner – Data Scientist at San Diego Supercomputer
Center