Some archaeological research requires destructive methodologies such as excavation, and archaeology as a whole faces a race against time as rapid development, urbanization, war, and antiquities trading erase so much of the material record of the past. For these reasons, archaeology needs better approaches to document and meaningfully preserve the digital record of the past. Our project explores archaeological data creation and management practices in three geographical areas (Africa, Europe, and South America) to understand how to align these practices with the data reuse needs of a broader research community. With funding from the National Endowment for the Humanities, the Secret Life of Data (SLO-data) project follows the lifecycle of data from the field to the digital repository to better understand opportunities and challenges in data interpretation, publication, and preservation. Our “slow data” approach focuses not on maximizing the speed and quantity of data, but on curation, contextualization, communication, and broader understanding. Our work will guide archaeologists in creating higher-quality and more easily understood data and will expand data publishing services provided by Open Context (opencontext.org).
The project team includes researchers at Stanford University, OCLC Research, the University of Michigan, and the Institute for Field Research. During the first year of work, our team conducted 21 interviews at the participating field sites. We analyzed the transcripts using a codebook and coding protocols that we developed as a team during a week-long meeting in September 2016. We are using these interviews, along with field observations, sample excavation data, documentation, and guidelines, to establish a baseline for each excavation project. Each baseline describes the project's current data collection and management practices, including the features and functions of the tools the team uses. Analysis of these studies guides our current work to recommend changes (both technical and organizational) that will improve data creation and management practices.
Beyond Management: Data Curation as Scholarship in Archaeology
1. Beyond Management: Data Curation as Scholarship in Archaeology
Sarah Whitcher Kansa, The Alexandria Archive Institute
with Anne Austin, Ixchel Faniel, Beth Yakel, Ran Boytner, Eric Kansa, Jennifer Jacobs & Phoebe France
2. Open Context: 10 years of iterative development
Linked: Links with other systems & data (tDAR, EOL, ORCID, etc.)
Open: Code, data (mainly CC-BY) on GitHub, machine-readable formats, APIs
Long-term: NSF, NEH data management. California Digital Library archiving.
Global: Mirroring, collaboration with the German Archaeological Institute (DAI)
Recognition: Awards from Digital Curation (2014), Archaeological Institute of America (2016), and the White House (2013)
3. Why a Publishing Metaphor?
1. Editorial (curatorial) co-production
2. Promote vision of data as more than a “residue” of research
4. [Diagram: Data Reuse, Data Creation, Usable / Useful Data]
Tension between what data creators do and what data reusers need.
Need to better align data creation and reuse!
5. Important to get it right because you can only excavate a site once!
Data issues are central to the practice of research in the 21st century.
6. SLO-data (Secret Life of Data)
[Diagram: data lifecycle: Create, Process, Analyze, Preserve, Share, Reuse]
● Builds on previous qualitative research (DIPIR, NEH, EOL)
● Explores relationships between data creation practices and reuse
● Better align data creation with reuse needs
● Encourage more thoughtful data creation and dissemination, training in digital literacy
● Guidance, exemplars, and “recipes” for creating high-quality, usable data
7.
8. SLO-data (Secret Life of Data)
Researcher Interviews
Field Observations
Database Analysis
Reuser Interviews
https://alexandriaarchive.org/secret-life-of-data/
9. Year 1 observations explored how data are:
● Structured
● Stored
● Accessed
● Modified
● Backed-up/Secured
● Made Consistent
● Identified
…or not!
14. How to maintain knowledge continuity when team has contingent involvement.
16. Recommendations:
● Broader data literacy & more formal processes
● Establish expectations and protocols for specialists to share data
● Better identifier management
● Data validation
● Promote shared controlled vocabularies at the time of data creation (especially referenced by URIs for Linked Data).
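Three of the recommendations above (identifier management, data validation, and controlled vocabularies referenced by URIs) can be illustrated with a small check run at the time of data entry. The following is only a minimal sketch, not a tool used by the project: the field names `id` and `material_uri`, the `VOCAB` table, and the `example.org` URIs are all hypothetical placeholders.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical controlled vocabulary (label -> URI). The example.org URIs are
# placeholders, not entries from any real vocabulary.
VOCAB = {
    "bone": "https://example.org/vocab/material/bone",
    "ceramic": "https://example.org/vocab/material/ceramic",
}


def is_http_uri(value):
    """Return True if value looks like an http(s) URI with a host."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def validate_records(records, vocab=VOCAB):
    """Return a list of (record_index, message) problems found in records."""
    problems = []
    # Count identifiers up front so duplicates can be flagged on every copy.
    id_counts = Counter(r.get("id", "").strip() for r in records)
    allowed_uris = set(vocab.values())
    for i, rec in enumerate(records):
        rec_id = rec.get("id", "").strip()
        if not rec_id:
            problems.append((i, "missing identifier"))
        elif id_counts[rec_id] > 1:
            problems.append((i, f"duplicate identifier: {rec_id}"))
        uri = rec.get("material_uri", "")
        if not is_http_uri(uri):
            problems.append((i, "material is not referenced by a URI"))
        elif uri not in allowed_uris:
            problems.append((i, f"URI not in controlled vocabulary: {uri}"))
    return problems


# Hypothetical sample records: two share an identifier, one has no identifier
# and uses a free-text term instead of a vocabulary URI.
records = [
    {"id": "F-001", "material_uri": VOCAB["bone"]},
    {"id": "F-001", "material_uri": VOCAB["ceramic"]},
    {"id": "", "material_uri": "pottery"},
]
problems = validate_records(records)
for index, message in problems:
    print(index, message)
```

Reporting problems as a list, rather than rejecting records outright, lets a field team review and correct entries while the context is still fresh.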
17. Year 2 & Beyond:
● Provide recommendations for workflow and database changes; Observe in field
● Establish expectations from a project's inception (including training, specialists, database)
● Code/analyze Year 2 field & data reuser interviews
● Create exemplars and guidelines (universal handbook for data management)