Data Publishing in Archaeozoology
or “Everybody knows that a 14 is a Sheep”


           Sarah Whitcher Kansa
            Alexandria Archive Institute
                 OpenContext.org

                                            Unless otherwise indicated, this work is licensed
                                           under a Creative Commons Attribution 3.0 License
                                             <http://creativecommons.org/licenses/by/3.0/>
Main Points
- Reproducibility and new
  research opportunities
  require data sharing
- Raw data are not sufficient
- Publishing open data on the
  Web is a solution
- Publishing data takes special
  expertise
Good scientific practice requires
         data sharing.

We cannot trust results based on
         hidden data.
The Challenges


 • Limits of print
   (entrenched practice
   but not best practice)
 • Data preservation
   crisis (wasted effort)
 • Hard to compare and
   integrate data now
Policy Consensus:

  Urgent Need for
Better Data Practices!
DIPIR (http://www.dipir.org)

   3-Year project, Oct. 2010-Sept. 2013
   National Leadership Grant from the Institute for
    Museum and Library Services (LG-06-10-0140-10)
   Ixchel Faniel (PI), Elizabeth Yakel (Co-PI)
Raw Data Can Be Unappetizing
Data Documentation Practices
“I use an Excel spreadsheet…which I … inherited from my
research advisers. …my dissertation advisor was still recording
data for each specimen on paper when I was in graduate school
so that's what I started …then quickly, I was like, ‘This is
ridiculous.’… I just started using an Excel spreadsheet that has
sort of slowly gotten bigger and bigger over time with more
variables or columns…I've added …color coding…I also use…a very
sort of primitive numerical coding system, again, that I inherited
from my research advisers…So, this little book that goes with me
of codes which is sort of odd, but …we all know that a 14 is a
sheep.” (CCU13)

                                    A long way to go before we
                                    get usable, intelligible data
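
The private coding scheme described in the quote can be made explicit with a small machine-readable codebook. The sketch below is only an illustration: the one mapping taken from the quote is 14 = sheep, while the function name, the extra code 27, and the way unknown codes are flagged are assumptions.

```python
# Hypothetical codebook: only the entry 14 -> "sheep" comes from the quote above.
CODEBOOK = {14: "sheep"}


def decode_taxon(code, codebook=CODEBOOK):
    """Return the label for a numeric taxon code, flagging undocumented codes."""
    if code in codebook:
        return codebook[code]
    # An undocumented code stays visible instead of being silently guessed.
    return f"UNDOCUMENTED CODE {code}"


records = [14, 14, 27]  # 27 is a made-up code with no codebook entry
decoded = [decode_taxon(c) for c in records]
print(decoded)
```

Shipping the codebook with the data is what lets a reader outside the original lab interpret the numbers.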
Sometimes data is
better served
cooked.
Adapt “publishing” metaphor
       to digital data
What is Data Publication?

     Putting editorially-vetted data on the Web
 • Cleaned, described, organized
 • More intelligible and cohesive
 • Open access
 • Linked to other resources (including print
   publications)
 • Machine-readable for discovery and reuse
 • Archived and curated (CDL)
Benefits & Challenges
The Good:
   • Enhanced presentation
   • Enhanced search, discovery, understanding
   • Depth & breadth (linked to project data, other
     datasets, print publications, etc.)
   • Allows for Linked Open Data, which facilitates future use
   • Professional advancement

The Bad:
   • Takes time, effort
   • Requires informatics expertise

     Benefits need to outweigh challenges
Thousand Flowers




          Started in 2007
          Integrates and publishes
           various forms of archaeological
           documentation (structured
           data, media, documents)
          Not a repository, but archived
           with California Digital Library
          Interoperability via web
           services, increasing emphasis
           on Linked Data
Data Publishing




                  Data Quality and
                  Standards Alignment
                  (1) Check consistency
                  (2) Edit functions
                  (3) Align to common standards
                      (“Linked Data” if applicable)
                  (4) Issue tracking, version
                      control
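
Step (1), checking consistency, can be partly automated. The following is a minimal sketch, not Open Context's actual tooling: it assumes a made-up column of taxon labels and reports values that differ only by letter case or spacing, so that an editor can decide how to reconcile them.

```python
from collections import defaultdict


def find_inconsistent_values(values):
    """Group raw values that differ only by case or spacing."""
    groups = defaultdict(set)
    for value in values:
        # Collapse runs of whitespace and ignore case to build a comparison key.
        key = " ".join(value.split()).lower()
        groups[key].add(value)
    # Keep only keys that appear with more than one raw spelling.
    return {key: sorted(raw) for key, raw in groups.items() if len(raw) > 1}


# Made-up column of taxon labels, as they might appear in a spreadsheet.
taxon_column = ["Sheep / Goat", "sheep / goat", "Cattle", "Sheep /  Goat", "Cattle"]
result = find_inconsistent_values(taxon_column)
print(result)
```

The check only reports candidates; deciding which spelling is correct remains an editorial task.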
Data Publishing
    Comprehensive (Kenan Tepe: 30K
     photos, documents, object
     descriptions)
    Added capabilities
     (search, analysis, visualization)
    More attractive, usable data
    Interactions with data editors
     improve data
• Citation provided for
  each item
• CDL archival service to
  give permanence
Beyond the Silo


          Often too much emphasis on
           single systems; need to consider
           relationships across systems

          Even if one reaches some
           scale, it can't be isolated from
           the rest of the Web

          Machines are important
           “audiences” (e.g. RESTful
           Services:
           Atom, AtomPub, JSON, etc.)
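
To illustrate what treating machines as an audience can mean, here is a minimal sketch of a record serialized as JSON and read back by a client program. The field names and the example.org record URI are hypothetical and do not describe Open Context's actual service; only the EOL concept URI appears elsewhere in this talk.

```python
import json

# Hypothetical record; the field names and record URI are illustrative only.
record = {
    "uri": "http://example.org/records/bone-0001",
    "label": "Bone 1",
    "taxon": "http://eol.org/pages/328699/",
}

payload = json.dumps(record, sort_keys=True)  # what a web service could return
parsed = json.loads(payload)                  # what a client program would read
print(parsed["taxon"])
```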
Linked Open Data



                    Regarded as best
                   practice for sharing
                      data (among
                       informatics
                      researchers)
Web of Data (2009)




     Growing, Decentralized Innovation
Web of Data (2011)




       Need Archaeology on the Map

       Contributions should not be isolated
       from other communities
Open Context: Record
           HTTP URIs to identify resources
            at a meaningful level of granularity
            (“a URL per potsherd”)

           Use HTTP URIs published by
            others

           URIs act as “primary keys” that
            allow data to be related
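
The "primary key" idea can be sketched as follows, assuming two made-up project datasets whose record ids and structure are hypothetical. The two concept URIs are the EOL links used in this talk; because both projects point at the same URIs, their records can be counted together even though the local labels differ.

```python
from collections import Counter

BOS_TAURUS = "http://eol.org/pages/328699/"   # EOL concept URI used in this talk
CAPRINAE = "http://eol.org/pages/2851411/"    # EOL concept URI used in this talk

# Two made-up project datasets with different local labels
# but a shared concept URI on each record.
project_a = [
    {"id": "A-1", "label": "Taxon : Bos taurus", "concept": BOS_TAURUS},
    {"id": "A-2", "label": "Species : Sheep / Goat", "concept": CAPRINAE},
]
project_b = [
    {"id": "B-1", "label": "Common name : Cattle, domestic", "concept": BOS_TAURUS},
]

# The shared URI acts as the join key across projects.
counts = Counter(rec["concept"] for rec in project_a + project_b)
print(counts[BOS_TAURUS])
```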
Concept: Bos taurus (http://eol.org/pages/328699/)
Open Context Entity Reconciliation
    Authors / Editors
 relate project-specific
    terminologies to
  global terminologies




                           “Common name : Cattle, domestic”
                             = http://eol.org/pages/328699/
                                       (Bos taurus)
Open Context Entity Reconciliation
    Authors / Editors relate project-specific terminologies to global terminologies
    Many project-specific terms related to global terminologies



  Project Specific Property   EOL Link (Global Terminology)
  Species : Sheep / Goat      http://eol.org/pages/2851411/ (Caprinae)
  Taxon : Bos taurus          http://eol.org/pages/328699/ (Bos taurus)
  Species : Deer              http://eol.org/pages/38816/ (Dama sp.)
  Type : Deer                 http://eol.org/pages/34547/ (Odocoileus sp.)
  Taxon : Ovis / Capra        http://eol.org/pages/2851411/ (Caprinae)
  Species : Cattle            http://eol.org/pages/34548/ (Bos taurus)
  Species : Goat              http://eol.org/pages/328660/ (Capra hircus)
Open Context Entity Reconciliation
    Editorial workflow helps annotate data for interoperability
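
The reconciliation table can be read as a simple lookup from project-specific terms to global URIs. The sketch below copies the term and URI pairs from the table above; the function name and the grouping step are illustrative assumptions, not Open Context's implementation.

```python
from collections import defaultdict

# Term-to-URI pairs copied from the reconciliation table above.
RECONCILIATION = {
    "Species : Sheep / Goat": "http://eol.org/pages/2851411/",
    "Taxon : Bos taurus": "http://eol.org/pages/328699/",
    "Species : Deer": "http://eol.org/pages/38816/",
    "Type : Deer": "http://eol.org/pages/34547/",
    "Taxon : Ovis / Capra": "http://eol.org/pages/2851411/",
    "Species : Cattle": "http://eol.org/pages/34548/",
    "Species : Goat": "http://eol.org/pages/328660/",
}


def terms_by_concept(mapping):
    """Invert the lookup: list the project terms that share each global URI."""
    by_uri = defaultdict(list)
    for term, uri in mapping.items():
        by_uri[uri].append(term)
    return dict(by_uri)


grouped = terms_by_concept(RECONCILIATION)
print(grouped["http://eol.org/pages/2851411/"])
```

Grouping by URI shows which local terms from different projects an editor has judged to mean the same thing, here the two labels linked to Caprinae.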
Data Publishing Projects




          EOL (2012) funding for publishing
          additional zooarchaeology datasets
          (Neolithic Anatolia), in a project led by Ben
          Arbuckle (Baylor University)
Data Publishing Projects




          NEH (2012) funding for publishing trade-
          and exchange-related datasets
          (Bronze-Iron Age Mediterranean)
Data Publishing Projects




             Complement Conventional
             Publishing
                 Lockwood Press
                  (“Archaeobiology
                  Series”), Cotsen Institute Press
                  (UCLA)
Data Publishing Projects




       Driven by research interests and
       publication goals among researchers
       wanting to compare datasets, create
       reference collections, and have citable, full
       datasets linked to synthetic publications.
Summary




 Outcomes of Publishing Data:

  (1) Make “datasets” first-class citizens in the
      world of scholarly communication
  (2) Provide needed transparency to
      published interpretations
  (3) Enable new kinds of multi-disciplinary
      research across many datasets
Thank you!




Special Thanks!

Canan Çakırlar, RCAC, Koç
   University, ICAZ, and other
   sponsors
