Sharing Between Data Repositories

Kevin S. Clarke
ksclarke@nescent.org

Thanks to the Dryad Data Repository contributors and funders:

Ryan Scherle, Todd J. Vision, Hilmar Lapp (NESCent)
Ben Bosman, Mark Diggory, Kevin Van de Velde (@mire, Inc.)

NESCent
The Bio-Reposphere

[Figure: three repositories, labeled (Generic Subject Repository), (General Scholarly Repository), and (Subject Specific Repository)]

Generic vs. Specific Repos

Generic                   Specific
✔ Easy submission         ✔ Complex submission
✔ Simple metadata         ✔ More useful metadata
✔ Data is a “black box”   ✔ Well structured data
✔ No “orphaned” data      ✔ Specific type of data

A Dryad Data Package
One Possible Workflow
“Save the Time of the User” #1
“Save the Time of the User” #2
Three Simple Steps
Case 1: TreeBASE Data Import
Harvesting and Web Services

  OAI-PMH
  PhyloWS
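
As a rough illustration of the harvesting side, the sketch below builds OAI-PMH `ListRecords` request URLs. The `verb`, `metadataPrefix`, and `resumptionToken` arguments come from the OAI-PMH protocol itself; the base URL and the function name `list_records_url` are hypothetical, and nothing here is taken from the Dryad or TreeBASE code.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument: it is sent
        # with the verb only, without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, used only to show the shape of the request.
first_page = list_records_url("http://repository.example.org/oai")
next_page = list_records_url("http://repository.example.org/oai",
                             resumption_token="abc123")
print(first_page)
print(next_page)
```

A harvester would fetch the first URL, read any `resumptionToken` from the response, and repeat with the second form until no token is returned.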
Case 2: Data Uploaded to Dryad
Partner Repository Upload
BagIt Disseminator
  (implements DSpace PackageDisseminator)

[Diagram: DSpace Metadata goes through an XSLT Crosswalk into the Dryad Application Profile (Dryad Publication, Dryad Data Package, and several Dryad Data Files); the result is placed in the Bag together with the Data from DSpace]

A BagIt Bag

  bagit.txt
  bag-info.txt
  manifest-md5.txt
  tagmanifest-md5.txt
  data
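
The layout above can be sketched with the Python standard library alone. This is a minimal illustration of the BagIt structure, not the Dryad disseminator (which is a DSpace plugin): the function names, the `BagIt-Version` value, the `bag-info.txt` contents, and the sample payload are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def write_manifest(bag_dir, manifest_name, files):
    """Write one '<checksum>  <path relative to bag>' line per file."""
    lines = [f"{md5_of(f)}  {f.relative_to(bag_dir).as_posix()}\n" for f in files]
    (bag_dir / manifest_name).write_text("".join(lines), encoding="utf-8")


def make_bag(bag_dir, payload):
    """Create a bag; payload maps paths under data/ to bytes."""
    bag_dir = Path(bag_dir)
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    for rel_path, content in payload.items():
        target = data_dir / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
    # Bag declaration; the version number here is an assumption.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8")
    # Free-form bag metadata; the value is a placeholder.
    (bag_dir / "bag-info.txt").write_text(
        "Source-Organization: Example Repository\n", encoding="utf-8")
    # Payload manifest covers everything under data/.
    payload_files = sorted(p for p in data_dir.rglob("*") if p.is_file())
    write_manifest(bag_dir, "manifest-md5.txt", payload_files)
    # Tag manifest covers the other tag files.
    tag_files = [bag_dir / n for n in ("bag-info.txt", "bagit.txt", "manifest-md5.txt")]
    write_manifest(bag_dir, "tagmanifest-md5.txt", tag_files)
    return bag_dir


with tempfile.TemporaryDirectory() as tmp:
    bag = make_bag(Path(tmp) / "example-bag",
                   {"datafile-1/ApineCYTB.nexus": b"#NEXUS\n"})
    bag_listing = sorted(p.name for p in bag.iterdir())
    manifest_text = (bag / "manifest-md5.txt").read_text(encoding="utf-8")

print(bag_listing)
print(manifest_text, end="")
```

A receiving repository can recompute the checksums listed in `manifest-md5.txt` to confirm that the payload arrived intact.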
Dryad Data in the Bag

  dryadpub.xml
  dryadpkg.xml
  datafile-1
      dryadfile-1.xml
      ApineCYTB.nexus
  datafile-2
      dryadfile-2.xml
      ApineDNA.nexus
HTTP PUT Handshake

  TreeBASE URL
  Email
  BagIt Upload
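
The slide gives only the labels of the handshake, so the following is a sketch of the upload step alone, under stated assumptions: that the bag is serialized to a single file (a zip archive here) and sent with HTTP PUT. The URL, the content type, and the function name `build_bag_put_request` are hypothetical; the request is built but not sent.

```python
import urllib.request


def build_bag_put_request(upload_url, bag_bytes):
    """Build (but do not send) an HTTP PUT request carrying a serialized bag."""
    return urllib.request.Request(
        upload_url,
        data=bag_bytes,
        method="PUT",
        # Assumption: the bag travels as a zip archive.
        headers={"Content-Type": "application/zip"},
    )


# Placeholder bytes and a hypothetical partner-repository URL.
request = build_bag_put_request(
    "http://partner.example.org/upload/example-bag.zip", b"PK...")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the upload; how the partner repository then answers (for example with a URL for the new record) is not specified by the slide.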
Lessons Learned

✔   Just enough to get the job done and no more
✔   Fewer local conventions and more “standards”
✔   There will always be custom solutions
✔   Options are developing quickly in this space

Future Directions
Less reliance on local conventions
✔   Plan to use OAI-ORE and Pairtree(s) within BagIt

OAI-ORE: Because it's Linked Data

Pairtree Filesystem
✔   So we can dereference URIs in ORE Resource Maps

    URI:        http://dx.doi.org/10.5061/dryad.8343
    URI prefix: http://dx.doi.org/10.5061/dryad.
    Path:       83/43
                83/43/Arctostaphylos.nex
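
The mapping from URI to path shown above can be sketched as follows: strip the URI prefix, then split what remains into two-character segments. This is a simplified illustration; the character-escaping step that Pairtree applies to identifiers is left out, and the function name `pairtree_path` is made up for the example.

```python
def pairtree_path(uri, uri_prefix, segment_length=2):
    """Map a URI to a Pairtree-style path by splitting its suffix into segments."""
    if not uri.startswith(uri_prefix):
        raise ValueError(f"{uri!r} does not start with prefix {uri_prefix!r}")
    identifier = uri[len(uri_prefix):]
    # Note: identifier character escaping is omitted in this sketch.
    segments = [identifier[i:i + segment_length]
                for i in range(0, len(identifier), segment_length)]
    return "/".join(segments)


prefix = "http://dx.doi.org/10.5061/dryad."
path = pairtree_path("http://dx.doi.org/10.5061/dryad.8343", prefix)
print(path)                              # 83/43
print(path + "/Arctostaphylos.nex")      # 83/43/Arctostaphylos.nex
```

Because the path is derived from the identifier alone, a URI found in an ORE Resource Map can be resolved to a location inside the bag without a separate lookup table.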
Other Interesting Developments

DataONE
✔ Distributing data files and metadata
✔ May support packages in the future

“Dropbox of Bags”
  or Bag replication network (BagNet?)

METS in Bags (in contrast to ORE)
The End




          The cake was a lie
References
Dryad Code
 http://dryad.googlecode.com

Dryad Data Repository
  http://datadryad.org

BagIt
  http://en.wikipedia.org/wiki/BagIt

OAI-ORE Primer
 http://www.openarchives.org/ore/1.0/primer

OAI-ORE in BagIt
 http://groups.google.com/group/oai-ore/browse_thread/thread/3ebfa7fcb4588048

ADMIRAL Data Packages (Planning ORE in BagIt)
  http://imageweb.zoo.ox.ac.uk/wiki/index.php/ADMIRAL_data_packages

DSpace Packagers
  https://wiki.duraspace.org/display/DSPACE/PackagerPlugins
