DataONE Preservation
and Metadata Working Group

September 2012
DataONE All Hands Meeting
DataONE Preservation in a Nutshell*
 1. Keep the bits safe
     • Replicate the data and metadata
     • Do local security and media refresh
 2. Protect their form and meaning
     • Know what you have, and know your rights
     • Know when to migrate and emulate
 3. Safeguard the guardians
     • Organizational and network sustainability

 * DataONE Preservation Strategy, PWG
    workshop, Chicago, December 5-6, 2010
DataONE Metadata WG Goals
 1. Build an e-dictionary to look up metadata terms
    and to publish your own terms
 2. Develop community focusing on data curation,
    citation, and discovery for DataONE
 3. Develop a community to sustain it
Agreeing on terms: a totally different take

    • Traditional metadata standards are controlled
    • Change by committee is ugly, costly, and slow
    • Example: Dublin Core, 15 cross-domain terms
     • 5 years to agree, highly divergent local
       use, change relegated to external ontologies




The Metadata Universe




       Jenn Riley, IU
Metadata Vision
  Instead, create one dictionary
  • Crowd sourced plus lightly supervised canon
  • Anyone can look up terms
  • Any part of “metadata speech”
  • Anyone can propose and refine their terms
  • Strong terms rise, weak terms decline
Greenberg, J., Murillo, A. and Kunze, J. (in press). Ontological
Empowerment: Sustainability via Ownership. In K. LeBarre and J. Tennis (Eds.), Advances
in Classification Research, 23rd Annual ASIS SIG/CR Workshop, 26 October 2012,
Baltimore, MD.

What we did
• Met
• Laughed, Talked, Cried, Hugged
• Conquered




Use cases
Six solid cases, e.g.,
• Sally Scientist is about to enter column headers
  for observational data on pikas in the alpine,
  for data to go into Dryad
• Doug Data wants to use Sally’s observations and
  needs to look up the definition of one of her
  column headers
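
A minimal sketch of how the Sally and Doug use cases might look against a crowd-sourced term dictionary. Everything here is hypothetical and for illustration only: the `TermDictionary` and `Definition` names, the example column header `elev_m`, and the use of net votes as the signal by which "strong terms rise, weak terms decline". The deck does not specify a data model or a ranking rule.

```python
from dataclasses import dataclass, field


@dataclass
class Definition:
    text: str
    author: str
    score: int = 0  # net votes; an assumed ranking signal, not from the deck


@dataclass
class TermDictionary:
    # Maps a lower-cased term to every definition proposed for it.
    entries: dict = field(default_factory=dict)

    def propose(self, term, text, author):
        """Anyone can propose a term with a definition."""
        definition = Definition(text, author)
        self.entries.setdefault(term.lower(), []).append(definition)
        return definition

    def vote(self, definition, delta):
        """Community feedback raises or lowers a definition's score."""
        definition.score += delta

    def lookup(self, term):
        """Anyone can look up a term; highest-scoring definitions come first."""
        found = self.entries.get(term.lower(), [])
        return sorted(found, key=lambda d: d.score, reverse=True)


# Sally publishes the meaning of one of her column headers.
yamz = TermDictionary()
sally_def = yamz.propose("elev_m", "Elevation above sea level, in meters", "sally")
other_def = yamz.propose("elev_m", "Elevator count", "anonymous")
yamz.vote(sally_def, +1)
yamz.vote(other_def, -1)

# Doug looks the header up before reusing Sally's observations.
for d in yamz.lookup("elev_m"):
    print(d.score, d.text)
```

Keeping competing definitions side by side, rather than overwriting, is one way to stay "not completely flat, not completely crowd-sourced": moderators could later promote a definition to the supervised canon without discarding alternatives.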




Mockup
Work packages in the next 2 years
Move from pre-proof-of-concept to Beta
• Software development
• Assessment (e.g., students)
• Moderation protocols – community elders
• Establish community identity and rhythm
   • Not completely flat, not completely crowd-sourced

Editor's notes

  1. We’re a sort of cluster group, which really consists of two parts: a preservation subgroup and a metadata subgroup. They are different, and I’ll spend one slide on Preservation and the rest on the exciting work in Metadata that’s just starting up.
  2. If we had just one slide on Preservation, this pretty much summarizes the whole story. To meet the objective of “easy, secure, and persistent storage of data”, DataONE adopts a simple 3-tiered approach.
Retaining the actual bits that comprise the data is paramount, as all other preservation and access questions are moot if the bits are lost. A cornerstone of this tier is replication. We attempt to make our replicas “de-correlated”, in the sense that we hold the copies in places where they are unlikely to be subject to the same power failure, same earthquake, same funding loss, etc. Coordinating Nodes (CNs) hold a copy of all science metadata, so that we always know what DataONE has. An extra copy of Member Node (MN) data is held by each of two other MNs. Damage or corruption in those copies is detected by periodically re-computing checksums (e.g., SHA-256 digests) for randomly selected datasets and comparing them with checksums securely stored at the CNs; any bit-level change can be corrected by copying from an unchanged copy. This kind of “pop quiz” cannot be cheated by simply reporting back a previously computed checksum, as it’s the actual MN replica data that’s requested. The audit samples only a subset of the data because exhaustively checking the amount of content that DataONE anticipates holding is not feasible: it would effectively keep the MNs and CNs busy all the time. Local Information Technology (IT) standards at the MNs are important, and there will be more about this in a later slide. MN guidelines also call for the common-sense and usual practice of periodic “media refresh”, which is the copying of data from old physical recording devices to new physical recording devices to avoid errors due to media degradation and vendor de-support.
Assuming the bits are kept safe, one also has to be able to make sense of them into the future, so protecting their form, meaning, and behavior is critical. This we accomplish first by fully knowing the form and structure of the data, in other words, by collecting accurate characterization metadata. Sources of this metadata include scientists, MN curators, and the output from automated characterization tools such as JHOVE. We also encourage use of widely supported formats. Finally, we will use standardized format names from the Unified Digital Format Registry (UDFR), which enables automated notification of obsolescence through services such as AONS (Automated Obsolescence Notification System) and Plato (Planets Preservation Planning Tool). I’ll note that both JHOVE and UDFR are maintained by the California Digital Library, which is a DataONE partner.
Migration and emulation are sub-strategies that DataONE will use in the event that formats become obsolete. At some time in the future, one may expect that available contemporary hardware and software will be unable to render or otherwise use bits saved in some formats. Migration is used to convert from older to newer formats; all converted content is subject to “before” and “after” characterization to ensure semantic invariance. Emulation effectively preserves older computing environments in order to retain the experience of rendering older formats; once considered a specialized intervention, emulation has become a more viable technique with recent developments in consumer and enterprise server virtualization solutions.
Ultimately, having the bits and their meaning is useless if we don’t also have the legal right (a) to hold the data, (b) to make copies and derivatives in performance of preservation management (such as replication and migration), and (c) to transfer those same rights to a successor archive. Just as important is to know specifically who owns the original data and whether those rights have been granted. As a start we strongly encourage providers to assign “Creative Commons Zero” (CC0) licenses to all contributed data, which facilitates preservation while still permitting an attribution requirement.
Of course the DataONE organization and network itself needs to be preserved. No network, no MNs, no data. This topic has considerable cross-over with what the Governance and Sustainability working group is doing, and I’ll say more about it in a subsequent slide.
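
The checksum “pop quiz” from the bit-safety tier can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not DataONE's actual implementation: the function names (`sha256_digest`, `audit_replicas`, `repair`) are hypothetical, nodes are modeled as in-memory dicts, and the trusted checksums are assumed to have been recorded once and stored separately, as the CNs do in the description above.

```python
import hashlib
import random


def sha256_digest(data: bytes) -> str:
    """Recompute the SHA-256 hex digest from the replica's actual bytes."""
    return hashlib.sha256(data).hexdigest()


def audit_replicas(replicas, trusted_checksums, sample_size, rng=None):
    """Spot-check a random sample of the datasets held by one node.

    replicas: dict of dataset id -> bytes held by the node (hypothetical model)
    trusted_checksums: dict of dataset id -> digest stored at the coordinator
    Returns the sorted ids whose recomputed digest does not match.
    """
    rng = rng or random.Random()
    ids = sorted(replicas)
    sample = rng.sample(ids, min(sample_size, len(ids)))
    return sorted(
        i for i in sample
        if sha256_digest(replicas[i]) != trusted_checksums.get(i)
    )


def repair(dataset_id, nodes, trusted_checksums):
    """Overwrite damaged copies from any node whose copy still matches."""
    expected = trusted_checksums[dataset_id]
    good = next(
        (n[dataset_id] for n in nodes
         if dataset_id in n and sha256_digest(n[dataset_id]) == expected),
        None,
    )
    if good is None:
        return False  # no unchanged copy is left to restore from
    for n in nodes:
        if dataset_id in n and sha256_digest(n[dataset_id]) != expected:
            n[dataset_id] = good
    return True


# Two member nodes hold the same data; one copy then suffers a bit-level change.
mn_a = {"ds1": b"pika counts", "ds2": b"alpine temps"}
mn_b = dict(mn_a)
trusted = {k: sha256_digest(v) for k, v in mn_a.items()}
mn_b["ds2"] = b"alpine temqs"

damaged = audit_replicas(mn_b, trusted, sample_size=2)
for dataset_id in damaged:
    repair(dataset_id, [mn_a, mn_b], trusted)
```

Because the digest is always recomputed from the bytes the node returns, a node cannot pass the audit by replaying a stored checksum. The `sample_size` parameter reflects the trade-off noted above between audit coverage and node load.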
  3. Goals:
     • Develop and implement a sustainable, effective metadata registry framework.
     • Identify a core, foundational, yet flexible set of metadata properties (elements, attributes, and other sub-vocabularies) supporting basic curation and interoperability. This work will explore bridges with the Dublin Core Metadata Initiative (DCMI) and the DataCite consortium.
     • Survey and assess metadata generation approaches (automatic, semi-automatic, derived, manual) and models to support the above stated goals.
     Purpose: to assist DataONE in recording and maintaining via metadata (as structured, named information elements) sufficient, sustainable functional information about data sets to support discovery, life-cycle management, citation, and general interoperation. Interoperation is a core value for any federation of autonomous nodes such as DataONE, and has separate consequences for every working group; for the MWG, general interoperation is meant to address data discovery across nodes and disciplines, as well as data re-use within the earth sciences (to the extent that this can be generalized).
     Scope: While metadata is a vast subject comprising, in principle, every piece of structured data bearing any relationship to any other piece of data, the MWG focuses on expressing technical and scientific metadata (DataONE’s “system” and “science” metadata). This emphasis combines the main metadata requirements from the core cyberinfrastructure team (CCIT) with relevant sources of minimal metadata requirements. Because the CCIT is best qualified to focus on technical metadata, the MWG will give priority to metadata that supports data preservation, curation, citation, and discovery in general. Of special interest will be the publication of spreadsheet data and data papers.
  4. Traditional metadata standards are controlled by panels of experts, e.g., FGDC, EML, Darwin Core. Change by committee is ugly, costly, and slow. Example: perhaps the most widely used cross-domain vocabulary is Dublin Core, with 15 cross-domain terms. It took 5 years to agree on, and there is lots of local divergence. “I love the 15, but my domain needs these 2 terms. How do we add them?” A: Make your own ontology! Multiply by 200 domains and the result is 200 ontologies, 200 panels, 200 islands of non-interoperation.
  5. Something between crowd-sourcing and an exclusive club. Learn from Wikipedia, Internet RFCs, and the American Heritage Dictionary. Greenberg, J., Murillo, A. and Kunze, J. (in press). Ontological Empowerment: Sustainability via Ownership. In K. LeBarre and J. Tennis (Eds.), Advances in Classification Research, 23rd Annual ASIS SIG/CR Workshop, 26 October 2012, Baltimore, MD.
  6. First meeting. Re-affirmed our vision. Met with the Semantics WG and Provenance WG. Got scared, got over it, because this is hard hard hard. Pre-proof of concept. Aggressive plan in the next 2 months to develop a 0.1 prototype using:
     • either Drupal or StackOverflow
     • Sally Scientist