LIBBIE STEPHENSON, DATA ARCHIVIST (RETIRED)
UCLA SOCIAL SCIENCE DATA ARCHIVE
LIBBIE@G.UCLA.EDU
HTTPS://DATAVERSE.HARVARD.EDU/DATAVERSE/SSDA_UCLA
Data Curation for Quantitative Social Science Research: A Case Study
NISO Virtual Conference: Data Curation – Cultivating Past Research Data for Future Consumption
August 31, 2016
DISCLAIMER
• I am retired from UCLA, so my comments reflect my own experience and expertise. They do not necessarily reflect the ideas, opinions, or practices of anyone at UCLA.
• These materials are free for you to use, but please cite accordingly.
NISO - AUGUST 31, 2016
OVERVIEW
• About the Archive
• About the data we manage
• What we are trying to do
• What we actually do
• Some illustrations
ABOUT THE ARCHIVE
• Operating since 1964, before email, PCs, the Internet, laptops, and smartphones. Manages survey/quantitative data stored on media from punch cards to the cloud.
• Staff have library science degrees, statistical and technical expertise, and quantitative social science backgrounds.
• Serves all UCLA quantitative researchers: provides reference, cataloging/metadata, and long-term archiving; supports data rescue, management, and security.
https://dataverse.harvard.edu/dataverse/ssda_ucla
SURVEY/QUANTITATIVE RESEARCH
• Carried out in the U.S. since the 1940s, post-WW2
• 1960s-70s: ICPSR and academic archives
• 1970s: growth of data-oriented professional associations (IASSIST, APDU, IFDO, CESSDA)
• Focused on society and social norms
• Predict outcomes; test assumptions; study change over time; run experiments
Note: in any discipline we also need to understand the workflow of the research and the way individuals approach their work.
CURATION GOALS
• Researcher-driven philosophy of open access, data sharing, and reuse
• Collaborative, multi-unit or multi-institutional
• Ensure data conservation and long-term usability, as well as discovery and access
• Processes and workflows support disaster planning
• Use of best and trusted digital repository policies, models, practices, and workflows
• Reflect values of accountability and integrity
POLICIES SUPPORT PRACTICE
• Policies are foundational and essential to a strong data curation infrastructure.
• They encompass what is acquired/collected and the curation levels and scope; they ensure long-term usability and drive processes and workflows.
• Social Science Data Archive policy
• TOOL: Policy-making for Research Data in Repositories by Ann Green, Stuart Macdonald and Robin Rice.
OUR STEPS IN CURATION
• Initial contact
• Data Quality Review and Appraisal
• Ingest
  - Verification
  - Metadata
  - Physical storage
• Access
• Preservation
INITIAL CONTACT
• Data Curation Profile
• Data Management Plan
• Guide to Social Science Data Preparation and Archiving
APPRAISAL
• Archival Collection Policy
• Also depends on:
  - Resources to process
  - Long-term resources
  - Fitness, usefulness
• Data Deposit Form signatures and completeness; commitment to share data; privacy and confidentiality
DATA QUALITY REVIEW
Use of statistical packages, emulator, Adobe Pro, Excel, Colectica, text editor
• Verify the deposit package and checksums, run frequencies, and compare data to documentation
• Check completeness of the codebook: question text, sampling, weighting, recodes, methods
• Disclosure analysis: check for personal identifiers and assess privacy/confidentiality of respondents
• Documentation converted to PDF/A
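As a rough illustration of the "verify deposit package, checksums" step, the sketch below compares files in a deposit folder against a list of expected checksums. It is not the Archive's actual tooling: the function names, the manifest structure, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_package(package_dir: Path, manifest: dict) -> dict:
    """Compare each file named in the manifest with what is on disk.

    `manifest` is a hypothetical mapping of file name -> expected digest,
    for example as supplied by the depositor. Returns a mapping of
    file name -> "ok", "missing", or "mismatch".
    """
    results = {}
    for name, expected in manifest.items():
        target = package_dir / name
        if not target.is_file():
            results[name] = "missing"
        elif sha256_of(target) == expected.lower():
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Any result other than "ok" would be a reason to go back to the depositor before ingest continues.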
EXAMPLE: WHAT KIND OF DATA?
CODEBOOK DOCUMENTS THE COLUMNS
5002 01 01 302000 001 101 10004B121068965
Each item is called a variable. We refer to the numeric content of each item as a value.
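To make the idea of a codebook documenting columns concrete, here is a minimal sketch that slices a fixed-width record into variables. The layout is hypothetical: the slide does not give the real column positions or variable names, so `V1` to `V3` and their positions are invented for illustration, and the spacing of the sample record may not match the original file.

```python
# Hypothetical layout: (variable name, first column, last column),
# 1-based and inclusive, the way codebooks usually list column locations.
LAYOUT = [("V1", 1, 4), ("V2", 6, 7), ("V3", 9, 10)]


def parse_record(line, layout):
    """Slice one fixed-width record into a dict of variable name -> value."""
    record = {}
    for name, start, end in layout:
        if end > len(line):
            raise ValueError(f"record too short for {name} (needs column {end})")
        # Convert 1-based inclusive columns to a Python slice.
        record[name] = line[start - 1:end]
    return record


sample = "5002 01 01 302000 001 101 10004B121068965"
print(parse_record(sample, LAYOUT))
```

A record that is shorter than the codebook says it should be raises an error, which is one quick way to spot a data file that does not match its documentation.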
COMPARE FREQS TO CODEBOOK
[Screenshot callouts: variable, values, value labels]
RUN MARGINALS/FREQUENCIES

Sex of Respondent
               Frequency  Percent  Valid Percent  Cumulative Percent
Valid  MALE          856     45.1           45.1                45.1
       FEMALE       1041     54.9           54.9               100.0
       Total        1897    100.0          100.0

What is your race - ethnicity
                                                     Frequency  Percent  Valid Percent  Cumulative Percent
Valid  White                                               618     32.6           32.6                32.6
       Hispanic                                            475     25.0           25.0                57.6
       Black                                               474     25.0           25.0                82.6
       Asian or Pacific Islander                           282     14.9           14.9                97.5
       Native American or Alaskan native                    17       .9             .9                98.4
       Identifies more than one of the above groups         20      1.1            1.1                99.4
       DON'T KNOW                                            2       .1             .1                99.5
       REFUSED                                               9       .5             .5               100.0
       Total                                              1897    100.0          100.0
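The comparison of frequencies to the codebook can be sketched in a few lines. The example below is only an illustration, not the statistical package output shown on the slide: it tallies the values of one variable, computes percentages, and flags any code that the codebook's value labels do not document. The numeric codes 1 and 2 for MALE and FEMALE are assumptions, since the slide shows labels only.

```python
from collections import Counter


def frequency_table(values, value_labels):
    """Build marginals for one variable and list codes missing from the codebook."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    for code in sorted(counts):
        n = counts[code]
        percent = 100.0 * n / total
        cumulative += percent
        rows.append({
            "code": code,
            "label": value_labels.get(code, "UNDOCUMENTED"),
            "frequency": n,
            "percent": round(percent, 1),
            "cumulative": round(cumulative, 1),
        })
    undocumented = sorted(code for code in counts if code not in value_labels)
    return rows, undocumented


# Assumed coding for "Sex of Respondent"; counts are taken from the table above.
labels = {1: "MALE", 2: "FEMALE"}
values = [1] * 856 + [2] * 1041
rows, undocumented = frequency_table(values, labels)
for row in rows:
    print(row)
print("Codes not in codebook:", undocumented)
```

An empty list of undocumented codes, together with counts that match the depositor's documentation, is the outcome a reviewer would hope to see.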
INGEST – PHYSICAL FORMATS
• Virus check, run checksums, address versioning, fixity, file naming conventions
• Convert files to archival formats if required
• Back up copies to external media
• Copy datasets to Dataverse; Safe Archive tool
• Use of secure file transfer client
• SQL/PHP scripts for local holdings file
• Compression software (7-zip)
Address disaster plan and file access (public and local); security requirements; LOCKSS
INGEST – BIBLIOGRAPHIC METADATA
Bibliographic metadata enables search and discovery:
• Establish bibliographic-level identity for unique items
• Bibliographic record to WorldCat/Voyager
• Add record to holdings database (SQL)
• Create Dataverse record; assign persistent identifier
Produce and review with investigator
WHAT ELSE DO WE NEED TO KNOW ABOUT THE DATA?
• Description of the study
• Citation
• Funding source
• Methodology
• Sampling
• Publications
EXAMPLE - DATAVERSE
[Screenshot callouts:]
Links to tools to manage collections
Navigate to and search for studies
Studies can be downloaded or analyzed online
VARIABLE LEVEL SEARCH CAPABILITIES
• Enables searching across many studies at once.
• Enables searching shared catalogs of multiple archives.
• TOOLS: Colectica Repository and NESSTAR
• Requires local or remote hosting of software.
• Can share the metadata files for repurposing.
DATA DOCUMENTATION INITIATIVE
Document, Discover, and Interoperate
• “International standard for describing data that result from observational methods in the social, behavioral, economic, and health sciences”
• “Facilitates interpretation and understanding -- both by humans and computers”
http://www.ddialliance.org/
INGEST – VARIABLE LEVEL METADATA
Descriptive metadata with detailed information about the data makes the data understandable and reusable:
• Create variable-level metadata, using Colectica or NESSTAR to produce standardized metadata records
• Create DDI record; full DDI codebook
• Migrate DDI to Colectica Repository
Produce and review with investigator
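For readers who have not seen variable-level metadata, the sketch below builds a small XML fragment for one variable. It is a simplified illustration, not output from Colectica or NESSTAR: the element names (`var`, `labl`, `qstn`, `qstnLit`, `catgry`, `catValu`) follow the general shape of DDI Codebook as best understood here, the fragment carries no namespace and is not schema-validated, and the variable name `SEX`, the question text, and the codes 1 and 2 are placeholders.

```python
import xml.etree.ElementTree as ET


def variable_to_ddi(name, label, question_text, categories):
    """Build a simplified, DDI-Codebook-style <var> element for one variable.

    `categories` is a list of (value, value label) pairs.
    """
    var = ET.Element("var", {"name": name})
    ET.SubElement(var, "labl").text = label
    qstn = ET.SubElement(var, "qstn")
    ET.SubElement(qstn, "qstnLit").text = question_text
    for value, value_label in categories:
        catgry = ET.SubElement(var, "catgry")
        ET.SubElement(catgry, "catValu").text = str(value)
        ET.SubElement(catgry, "labl").text = value_label
    return var


# Placeholder content: only the label "Sex of Respondent" comes from the slides.
var = variable_to_ddi(
    "SEX",
    "Sex of Respondent",
    "(question text as recorded in the codebook)",
    [(1, "MALE"), (2, "FEMALE")],
)
print(ET.tostring(var, encoding="unicode"))
```

Because the record is structured, the same metadata can feed a codebook, a catalog, or a variable-level search index without retyping.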
NESSTAR EXAMPLE - IMPORTING DATA
• Use the Data tab to import files from SPSS or Stata formats.
[Screenshot callouts: label, question text, numeric values]
Variable Details include variable name, label, description or question text, and types of coding.
EXAMPLE DDI FROM COLECTICA
DDI fields are in red; used to create documentation; can be repurposed
PRESERVATION AND CURATION
• Continuous monitoring of file formats; migrate to new formats when there is:
  - A new operating system or a new version of statistical software
  - A new mode of file transfer or a code change
• Monitoring of database function; software updates or redesigns
• Monitoring of servers and external media health; replace as needed
• Data forensics; checksums; validation; authentication; version control; format migration; refreshing media; recording preservation metadata (DDI)
• Review disaster plan and collection policy at regular intervals
• Review new or revised regulations for intellectual property; security; data producers/distributors; funding agencies
• Review with original depositors their data management plans and any changes in access or user permissions
Focus is on functional-level preservation and long-term usability through use of DDI and continuous review.
UNCOMFORTABLE TRUTHS
• Data management in institutions requires high-level administrative participation; new, sustained funding; and differently trained staff
• Data management planning is not a static event but a continuous process to ensure long-term, independently understandable, informed reuse of research
• There is an urgent need for standards, tools, and best practice models for many different file formats and disciplines
NEXT STEPS FOR PRACTITIONERS
“Crucial metadata about data are not always being captured or created and linked to data in repositories. Storage and persistence of data submissions isn't enough. We need data archivists and librarians to commit to partnering with researchers to curate data -- to review incoming data for usability, confidentiality, and completeness of descriptive information.”
Ann Green (2016), email communication. Used with permission.
ANY QUESTIONS?
THANK YOU!
Social Science Data Archive, UCLA
Box 951484
Los Angeles, CA 90095-1484
310-825-0716
LINKS
Social Science Data Archive: dataverse.harvard.edu/dataverse/ssda_ucla
Data Seal of Approval: www.datasealofapproval.org/en/
National Digital Stewardship Alliance: ndsa.org/activities/levels-of-digital-preservation/
Open Archival Information System: www.oclc.org/research/publications/library/2000/lavoie-oais.html
Social Science Data Archive Policy: data-archive.library.ucla.edu/SSDA_collectionAndArchivingPolicy.pdf?_ga=1.3255478.786669706.1378228281
Data Curation Profile: datacurationprofiles.org/
Data Management Planning at ICPSR: www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/index.html
ICPSR Guide to Data Preparation: www.icpsr.umich.edu/icpsrweb/content/deposit/guide/
Colectica: www.colectica.com/
NESSTAR: www.nesstar.com/index.html
DDI: www.ddialliance.org/
Dataverse: dataverse.org/

 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxNikitaBankoti2
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docxPoojaSen20
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfChris Hunter
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 

Kürzlich hochgeladen (20)

ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural ResourcesEnergy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 

Stephenson - Data Curation for Quantitative Social Science Research

  • 1. LIBBIE STEPHENSON, DATA ARCHIVIST (RETIRED), UCLA SOCIAL SCIENCE DATA ARCHIVE. LIBBIE@G.UCLA.EDU. HTTPS://DATAVERSE.HARVARD.EDU/DATAVERSE/SSDA_UCLA. Data Curation for Quantitative Social Science Research: A Case Study. NISO Virtual Conference: Data Curation – Cultivating Past Research Data for Future Consumption, August 31, 2016.
  • 2. DISCLAIMER
      • I am retired from UCLA, so my comments reflect my own experience and expertise. They do not necessarily reflect the ideas, opinions or practices of anyone at UCLA.
      • These materials are free for you to use, but please cite accordingly.
  • 3. OVERVIEW
      • About the Archive
      • About the data we manage
      • What we are trying to do
      • What we actually do
      • Some illustrations
  • 4. ABOUT THE ARCHIVE
      • Operating since 1964, before email, PCs, the Internet, laptops, and smartphones; manages survey/quantitative data stored on media from punch cards to the cloud
      • Staff have library science degrees; statistical and technical expertise; quantitative social science background
      • Serve all UCLA quantitative researchers: provide reference, cataloging/metadata, long term archiving; support in data rescue, management, security
      https://dataverse.harvard.edu/dataverse/ssda_ucla
  • 5. SURVEY/QUANTITATIVE RESEARCH
      • Carried out in the U.S. since the 1940s, post-WW2
      • 1960s-70s: ICPSR and academic archives
      • 1970s: growth of data-oriented professional associations (IASSIST, APDU, IFDO, CESSDA)
      • Focused on society and social norms
      • Predict outcomes; test assumptions; study change over time; run experiments
      Note: in any discipline we also need to understand the work flow of the research and the way individuals approach their work.
  • 6. CURATION GOALS
      • Researcher driven philosophy of open access, data sharing, reuse
      • Collaborative, multi-unit or multi-institutional
      • Ensure data conservation and long term usability, as well as discovery and access
      • Processes and work flows support disaster planning
      • Use of best and trusted digital repository policies, models, practices, and work flows
      • Reflect values of accountability and integrity
  • 7. POLICIES SUPPORT PRACTICE
      • Foundational, essential to a strong data curation infrastructure
      • Encompasses what is acquired/collected, curation levels and scope; ensures long term usability; drives processes and work flows
      • Social Science Data Archive policy
      • TOOL: Policy-making for Research Data in Repositories, by Ann Green, Stuart Macdonald and Robin Rice
  • 8. OUR STEPS IN CURATION
      • Initial contact
      • Data Quality Review and Appraisal
      • Ingest: verification, metadata, physical storage
      • Access
      • Preservation
  • 9. INITIAL CONTACT
      • Data Curation Profile
      • Data Management Plan
      • Guide to Social Science Data Preparation and Archiving
  • 10. APPRAISAL
      • Archival Collection Policy
      • Also depends on:
          • Resources to process
          • Long term resources
          • Fitness, usefulness
      • Data Deposit Form signatures and completeness; commitment to share data; privacy and confidentiality
  • 11. DATA QUALITY REVIEW
      Use of statistical packages, emulator, Adobe Pro, Excel, Colectica, text editor
      • Verify deposit package, checksums, frequencies; compare data to documentation
      • Completeness of codebook, question text, sampling, weighting, recodes, methods
      • Disclosure analysis: check for personal identifiers and assess privacy/confidentiality of respondents
      • Documentation converted to PDF/A
  • 12. EXAMPLE: WHAT KIND OF DATA?
  • 13. CODEBOOK DOCUMENTS THE COLUMNS
      Sample record: 5002 01 01 302000 001 101 10004B121068965
      Each item is called a variable. We refer to the numeric content of each item as a value.
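
The idea that a codebook maps column positions to variables can be illustrated with a minimal Python sketch. The column positions and the variable names `V1` to `V3` below are hypothetical; the slide does not give the actual layout, which would come from the study's own codebook.

```python
# Hypothetical codebook: variable name -> (start column, end column),
# 1-based and inclusive, as codebooks usually list them.
# These positions are illustrative only, not the real study layout.
CODEBOOK = {
    "V1": (1, 4),
    "V2": (6, 7),
    "V3": (9, 10),
}


def parse_record(line, codebook):
    """Slice one fixed-width record into {variable name: raw value string}."""
    return {name: line[start - 1:end] for name, (start, end) in codebook.items()}


record = "5002 01 01 302000 001 101 10004B121068965"
values = parse_record(record, CODEBOOK)
print(values)
```

Values are kept as strings so that leading zeros and non-numeric codes survive until they are checked against the documentation.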
  • 14. COMPARE FREQS TO CODEBOOK
      Callouts on the slide image: VARIABLE, VALUES, VALUE LABELS
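
One way to picture the comparison of frequencies to the codebook is a check for values that occur in the data but have no value label. This is only a sketch: the numeric codes and the helper name `undocumented_values` are assumptions, not taken from the slides.

```python
from collections import Counter


def undocumented_values(observed, codebook_labels):
    """Return {value: count} for observed values that the codebook does not label."""
    counts = Counter(observed)
    return {value: n for value, n in counts.items() if value not in codebook_labels}


# Hypothetical value labels for one variable (the codes 1 and 2 are assumed).
codebook_labels = {1: "MALE", 2: "FEMALE"}
observed = [1, 2, 2, 1, 9, 2]

print(undocumented_values(observed, codebook_labels))
```

Any value reported here would be a question to take back to the depositor before the data are released.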
  • 15. RUN MARGINALS/FREQUENCIES

      Sex of Respondent
      Valid                                           Frequency  Percent  Valid Percent  Cumulative Percent
        MALE                                                856     45.1           45.1                45.1
        FEMALE                                             1041     54.9           54.9               100.0
        Total                                              1897    100.0          100.0

      What is your race - ethnicity
      Valid                                           Frequency  Percent  Valid Percent  Cumulative Percent
        White                                               618     32.6           32.6                32.6
        Hispanic                                            475     25.0           25.0                57.6
        Black                                               474     25.0           25.0                82.6
        Asian or Pacific Islander                           282     14.9           14.9                97.5
        Native American or Alaskan native                    17       .9             .9                98.4
        Identifies more than one of the above groups         20      1.1            1.1                99.4
        DON'T KNOW                                            2       .1             .1                99.5
        REFUSED                                               9       .5             .5               100.0
        Total                                              1897    100.0          100.0
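
The percent and cumulative percent columns in a marginals table follow directly from the counts. As a rough sketch (the function name `frequency_table` is an assumption, and a statistical package would normally produce this output), the Sex of Respondent counts from the slide can be reproduced like this:

```python
def frequency_table(counts):
    """Build (label, frequency, percent, cumulative percent) rows from category counts."""
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for label, n in counts.items():
        cumulative += n
        rows.append((label, n, round(100 * n / total, 1), round(100 * cumulative / total, 1)))
    return rows, total


# Counts taken from the Sex of Respondent table on the slide.
rows, total = frequency_table({"MALE": 856, "FEMALE": 1041})
for label, n, pct, cum in rows:
    print(f"{label:<8}{n:>6}{pct:>8.1f}{cum:>8.1f}")
print(f"{'Total':<8}{total:>6}{100.0:>8.1f}")
```

Cumulative percent is computed from the running count rather than by summing rounded percents, which avoids small rounding drift in longer tables.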
  • 16. INGEST – PHYSICAL FORMATS
      • Virus check, run checksums, address versioning, fixity, file naming conventions
      • Convert files to archival formats if required
      • Back up copies to external media
      • Copy datasets to Dataverse; Safe Archive tool
      • Use of secure file transfer client
      • SQL/PHP scripts for local holdings file
      • Compression software (7-zip)
      Note: Address disaster plan and file access (public and local); security requirements; LOCKSS
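
Running checksums at ingest and re-checking them later is what makes fixity verifiable. The sketch below uses SHA-256 from the Python standard library; the choice of algorithm and the helper names (`sha256_of`, `build_manifest`, `fixity_failures`) are assumptions for illustration, not a description of the Archive's own scripts.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def fixity_failures(folder, manifest):
    """List files from the manifest that are now missing or whose checksum changed."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

A manifest built at ingest could be stored with the deposit package and passed to `fixity_failures` at each scheduled review; an empty list means every recorded file is still present and unchanged.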
  • 17. INGEST – BIBLIOGRAPHIC METADATA
      Bibliographic metadata enables search and discovery:
      • Establish bibliographic-level identity for unique items
      • Bibliographic record to WorldCat/Voyager
      • Add record to holdings database (SQL)
      • Create Dataverse record; assign persistent identifier
      Note: Produce and review with investigator
  • 18. WHAT ELSE DO WE NEED TO KNOW ABOUT THE DATA?
      • Description of the study
      • Citation
      • Funding source
      • Methodology
      • Sampling
      • Publications
  • 19. EXAMPLE - DATAVERSE
      Callouts on the slide image:
      • Links to tools to manage collections
      • Navigate to and search for studies
      • Studies can be downloaded or analyzed online
  • 20. VARIABLE LEVEL SEARCH CAPABILITIES
      • Enables searching across many studies at once
      • Enables searching shared catalogs of multiple archives
      • TOOLS: Colectica Repository and NESSTAR
      • Requires local or remote hosting of software
      • Can share the metadata files for repurposing
  • 21. DATA DOCUMENTATION INITIATIVE
      Document, Discover, and Interoperate
      • “International standard for describing data that result from observational methods in the social, behavioral, economic, and health sciences”
      • “Facilitates interpretation and understanding, both by humans and computers”
      http://www.ddialliance.org/
  • 22. INGEST – VARIABLE LEVEL METADATA
      Descriptive metadata with detailed information about the data enables understandability and reuse:
      • Create variable-level metadata, using Colectica or NESSTAR to produce standardized metadata records
      • Create DDI record; full DDI codebook
      • Migrate DDI to Colectica Repository
      Note: Produce and review with investigator
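
To give a sense of what a variable-level record carries, here is a simplified sketch that builds one variable description as XML. The element names (`var`, `labl`, `qstn`, `qstLit`, `catgry`, `catValu`) follow the DDI Codebook vocabulary as I understand it, but the fragment omits the namespace and the enclosing document structure, has not been validated against the schema, and is not how Colectica or NESSTAR generate their output. The variable name, codes, and question text are hypothetical.

```python
import xml.etree.ElementTree as ET


def variable_element(name, label, question, categories):
    """Build a simplified, DDI Codebook-style <var> element (not schema-validated)."""
    var = ET.Element("var", {"name": name})
    ET.SubElement(var, "labl").text = label
    qstn = ET.SubElement(var, "qstn")
    ET.SubElement(qstn, "qstLit").text = question
    for value, value_label in categories.items():
        catgry = ET.SubElement(var, "catgry")
        ET.SubElement(catgry, "catValu").text = str(value)
        ET.SubElement(catgry, "labl").text = value_label
    return var


# Label taken from the slide; name, question text, and codes are hypothetical.
var = variable_element(
    name="SEX",
    label="Sex of Respondent",
    question="Hypothetical question text",
    categories={1: "MALE", 2: "FEMALE"},
)
print(ET.tostring(var, encoding="unicode"))
```

Because the record is structured rather than free text, the same content can be rendered as a codebook, indexed for variable-level search, or shared with another archive.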
  • 23. EXAMPLE - IMPORTING DATA
      • Use the Data tab to import files from SPSS or Stata formats.
  • 24. Callouts on the slide image: Label, Question text, Numeric values, Variable
      Details include variable name, label, description or question text, and types of coding.
  • 25. EXAMPLE DDI FROM COLECTICA
      DDI fields are in red; used to create documentation; can be repurposed
  • 26. PRESERVATION AND CURATION
      • Continuous monitoring of file formats; migrate to new formats when there is a new operating system, a new version of statistical software, a new mode of file transfer, or a code change
      • Monitoring of database function; software updates or redesigns
      • Monitoring of servers and external media health; replace as needed
      • Data forensics; checksums; validation; authentication; version control; format migration; refresh media; record preservation metadata (DDI)
      • Review disaster plan and collection policy at regular intervals
      • Review new or revised regulations for intellectual property; security; data producers/distributors; funding agencies
      • Review with the original depositor their data management plans and any changes in access or user permissions
      Note: Focus is on functional-level preservation and long term usability through use of DDI and continuous review.
  • 27. UNCOMFORTABLE TRUTHS
      • Data management in institutions requires high level administrative participation; new, sustained funding; and differently trained staff
      • Data management planning is not a static event but a continuous process to ensure long term, independently understandable, informed reuse of research
      • There is an urgent need for standards, tools, and best practice models for many different file formats and disciplines
  • 28. NEXT STEPS FOR PRACTITIONERS
      “Crucial metadata about data are not always being captured or created and linked to data in repositories. Storage and persistence of data submissions isn't enough. We need data archivists and librarians to commit to partnering with researchers to curate data -- to review incoming data for usability, confidentiality, and completeness of descriptive information.”
      Ann Green (2016), email communication. Used with permission.
  • 29. ANY QUESTIONS? THANK YOU!
      • Social Science Data Archive, UCLA
      • Box 951484, Los Angeles, CA 90095-1484, 310-825-0716
  • 30. LINKS
      • Social Science Data Archive: dataverse.harvard.edu/dataverse/ssda_ucla
      • Data Seal of Approval: www.datasealofapproval.org/en/
      • National Digital Stewardship Alliance: ndsa.org/activities/levels-of-digital-preservation/
      • Open Archival Information System: www.oclc.org/research/publications/library/2000/lavoie-oais.html
      • Social Science Data Archive Policy: data-archive.library.ucla.edu/SSDA_collectionAndArchivingPolicy.pdf?_ga=1.3255478.786669706.1378228281
      • Data Curation Profile: datacurationprofiles.org/
      • Data Management Planning at ICPSR: www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/index.html
      • ICPSR Guide to Data Preparation: www.icpsr.umich.edu/icpsrweb/content/deposit/guide/
      • Colectica: www.colectica.com/
      • NESSTAR: www.nesstar.com/index.html
      • DDI: www.ddialliance.org/
      • Dataverse: dataverse.org/