1
Preservation Metadata: an introduction to PREMIS
and
its application in audiovisual archives
Karin Bredenberg, The National Archives of Sweden
Member of the PREMIS Editorial Committee
2013-05-16
2
The Challenge of Digital Preservation
3
How to access the material 1 month, 1 year,
10 years from now?
• Information about the material
–Intellectual information (Descriptive Metadata)
• Who
• Why
• and so on
–”Physical” information (Digital Preservation Metadata)
• Which kind of file am I?
• What has happened to me during the years?
• Who can look at me?
• And so on
Metadata = data about data
Digital Preservation Metadata =
metadata that is essential to ensure long-term accessibility of digital resources
4
What Digital Preservation Metadata to store?
• A best guess on the future
– little experience validating the longevity of digital objects
– uncertain future technical possibilities
– uncertain future legal framework
• Digital objects must be self-descriptive
• Must be able to exist independently from the systems
which were used to create them
– XML (machine and human readable)
5
OAIS
Open Archival Information System, also known as the ISO Reference Model for an OAIS (ISO 14721)
(A simple OAIS explanation by
Richard Pearce-Moses and more)
6
The PREMIS Data Dictionary
• Information you need to know for preserving
digital objects
• Available online through the PREMIS website
• Preservation Metadata:
Implementation Strategies
– Includes PREMIS Data Dictionary, context/assumptions, data
model, usage examples
– XML schema to support implementation
7
PREMIS Web and PREMIS EC
• Web site:
– Permanent Web presence
(http://www.loc.gov/standards/premis/ ),
hosted by Library of Congress
– Central destination for PREMIS-related info,
announcements, resources
– Home of the PREMIS Implementers’ Group (PIG)
discussion list (pig@loc.gov)
• PREMIS Editorial Committee:
– Sets directions/priorities for PREMIS development
– Considers proposals for changes
– Coordinates revisions of Data Dictionary and XML schema
– Consists of members with different affiliations from all over the world.
– Meetings once a month (sometimes more)
– Hosts PREMIS events, e.g. the PREMIS Implementation Fair at iPRES
8
OAIS Reference Model and PREMIS
• OAIS reference model specifies the Preservation Description Information (PDI)
• PREMIS used the OAIS information model as a starting point
• PREMIS Data Dictionary consolidated and further developed the conceptual types of
information objects into more than 100 structured and logically integrated semantic
units.
• PREMIS Data Dictionary provided detailed descriptions and guidelines to implement
these semantic units.
• PREMIS Data Dictionary does not provide semantic units for Intellectual Entities, but
provides semantic units to link to other metadata sources for Intellectual Entities (this
will change in version 3)
• All entities have reference (identification) information.
• No “packaging information” that links content with metadata, but PREMIS can be
used with container schemas
• PREMIS deals mostly with representation, context, provenance, and fixity
information, in keeping with PREMIS definition of preservation metadata.
9
The PREMIS data model: 5 interacting entities
• Intellectual Entity
• Object
• Event
• Agent
• Rights
[Figure: the five entities drawn as linked boxes, with the label "identifier"]
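The way the entities point at each other through identifiers can be sketched with a few Python dataclasses. This is only an illustration: the class names, field names (`linked_event_ids` and so on) and identifier values are assumptions made for the example, not names taken from the Data Dictionary. Intellectual Entities appear only as linked identifiers here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    identifier: str
    name: str


@dataclass
class Event:
    identifier: str
    event_type: str
    # Links to the agents that carried out the event, by identifier.
    linked_agent_ids: List[str] = field(default_factory=list)


@dataclass
class RightsStatement:
    identifier: str
    basis: str


@dataclass
class PremisObject:
    identifier: str
    category: str  # e.g. "file"
    linked_event_ids: List[str] = field(default_factory=list)
    linked_rights_ids: List[str] = field(default_factory=list)
    linked_intellectual_entity_ids: List[str] = field(default_factory=list)


def events_for(obj, events):
    """Resolve an object's event links to the actual Event records."""
    by_id = {e.identifier: e for e in events}
    return [by_id[i] for i in obj.linked_event_ids if i in by_id]


# Hypothetical records, for illustration only.
agent = Agent("agent-1", "Ingest service")
ingest = Event("event-1", "ingestion", linked_agent_ids=[agent.identifier])
check = Event("event-2", "fixity check", linked_agent_ids=[agent.identifier])
tiff = PremisObject("object-1", "file", linked_event_ids=["event-1", "event-2"])

history = [e.event_type for e in events_for(tiff, [ingest, check])]
print(history)
```

Keeping the links as plain identifiers, rather than nesting the records, means each entity can be stored and exchanged on its own.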
10
For Example: Object Entity semantic units
1.1 objectIdentifier
1.2 objectCategory
1.3 preservationLevel
1.4 significantProperties
1.5 objectCharacteristics
    1.5.1 compositionLevel
    1.5.2 fixity
    1.5.3 size
    1.5.4 format
    1.5.5 creatingApplication
    1.5.6 inhibitors
    1.5.7 objectCharacteristicsExtension
1.6 originalName
1.7 storage
1.8 environment
    1.8.1 environmentCharacteristic
    1.8.2 environmentPurpose
    1.8.3 environmentNote
    1.8.4 dependency
    1.8.5 software
    1.8.6 hardware
    1.8.7 environmentExtension
1.9 signatureInformation
1.10 relationship
1.11 linkingEventIdentifier
1.12 linkingIntellectualEntityIdentifier
1.13 linkingRightsStatementIdentifier
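To make a few of these semantic units concrete, here is a minimal sketch that builds a PREMIS file object as XML with Python's standard library. It assumes the PREMIS 2.x namespace and covers only a handful of units (1.1 and parts of 1.5). The helper names, the identifier type "local" and the placeholder digest are assumptions for illustration, and the output has not been validated against the schema.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlschema/premis-v2"  # PREMIS 2.x namespace (assumed version)
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("premis", PREMIS_NS)
ET.register_namespace("xsi", XSI_NS)


def el(parent, name, text=None):
    """Append a premis:<name> child, optionally with text, and return it."""
    child = ET.SubElement(parent, f"{{{PREMIS_NS}}}{name}")
    if text is not None:
        child.text = text
    return child


def build_file_object(identifier, size, digest, format_name):
    """Build a small premis:object element of category 'file'."""
    obj = ET.Element(f"{{{PREMIS_NS}}}object")
    obj.set(f"{{{XSI_NS}}}type", "premis:file")

    ident = el(obj, "objectIdentifier")            # 1.1
    el(ident, "objectIdentifierType", "local")
    el(ident, "objectIdentifierValue", identifier)

    chars = el(obj, "objectCharacteristics")       # 1.5
    el(chars, "compositionLevel", "0")             # 1.5.1
    fixity = el(chars, "fixity")                   # 1.5.2
    el(fixity, "messageDigestAlgorithm", "SHA-256")
    el(fixity, "messageDigest", digest)
    el(chars, "size", str(size))                   # 1.5.3
    fmt = el(chars, "format")                      # 1.5.4
    designation = el(fmt, "formatDesignation")
    el(designation, "formatName", format_name)
    return obj


# The digest below is a placeholder, not a real checksum.
obj = build_file_object("0001.tiff", 2038927, "0" * 64, "TIFF")
xml_text = ET.tostring(obj, encoding="unicode")
print(xml_text)
```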
11
Sample Data Dictionary Entry
Semantic unit: size
Semantic components: None
Definition: The size in bytes of the file or bitstream stored in the repository.
Rationale: Size is useful for ensuring the correct number of bytes from
storage have been retrieved and that an application has
enough room to move or process files. It might also be used
when billing for storage.
Data constraint: Integer

Object category   Representation   File             Bitstream
Applicability     Not applicable   Applicable       Applicable
Repeatability                      Not repeatable   Not repeatable
Obligation                         Optional         Optional

Examples: 2038927
Creation/Maintenance notes: Automatically obtained by the repository.
Usage notes: Defining this semantic unit as size in bytes makes it
unnecessary to record a unit of measurement. However, for
the purpose of data exchange the unit of measurement should
be stated or understood by both partners.
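Since size is meant to be obtained automatically by the repository, a minimal sketch of how that could look in Python follows. It also computes a message digest, because size and fixity are usually captured in the same pass. The function name, the dictionary keys and the choice of SHA-256 are assumptions for the example, not requirements of the Data Dictionary.

```python
import hashlib
import os
import tempfile


def describe_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the size in bytes and a message digest for one file."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large audiovisual files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "size": os.path.getsize(path),  # integer, in bytes
        "messageDigestAlgorithm": algorithm,
        "messageDigest": digest.hexdigest(),
    }


# Demonstration on a throwaway file of known length.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 2048)
    demo_path = tmp.name

info = describe_file(demo_path)
os.remove(demo_path)
print(info["size"])
```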
12
Scope
• What PREMIS DD is:
– Common data model for organizing/thinking about
preservation metadata
– Implementable
– Standard for exchanging information packages between
repositories
– Technically neutral
– Core metadata
13
Scope
• What PREMIS DD is not:
– Out-of-the-box solution
– All needed metadata
– Lifecycle management of objects outside repository
– Rights management
14
Technology Dependence
[Figure: a row of sequentially numbered TIFF files, 0001.tiff to 0006.tiff and onward]
• No direct access
• Not self-descriptive
• Complex formats
• Complex environments
15
Information packages
• Information about owner; what the package is and more
• The files, checksums, file names, use and more
• Technical information like Digital Preservation Metadata, what has happened to
the files and more
– need for detailed rendering information
» Software
» Hardware
» Other dependencies: schemas, style sheets, encodings, etc.
– need for format information
• Information about structure, how are the files related?
16
Standards for Information Packages
• One commonly used standard is METS
(Metadata Encoding and Transmission Standard)
• PREMIS can be used together with METS
<mets>
  <metsHdr>      METS header
  <dmdSec>       descriptive metadata section
  <amdSec>       administrative metadata section
  <fileSec>      file section
  <structMap>    structural map section
  <structLink>   structural link section
  <behaviorSec>  behavior section
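The section layout of a METS document can be sketched in Python with the standard library. This is a skeleton under stated assumptions, not a schema-valid METS file: the ID values are made up, the optional structLink and behaviorSec are left out, and wrapping PREMIS in a techMD/mdWrap inside amdSec is one common placement rather than the only one.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def mets_el(parent, name, **attrs):
    """Append a mets:<name> child with the given attributes."""
    return ET.SubElement(parent, f"{{{METS_NS}}}{name}", attrs)


def build_mets_skeleton(premis_xml=None):
    """Build the top-level METS sections, optionally embedding PREMIS."""
    root = ET.Element(f"{{{METS_NS}}}mets")
    mets_el(root, "metsHdr")
    mets_el(root, "dmdSec", ID="dmd-1")   # hypothetical ID values
    amd = mets_el(root, "amdSec", ID="amd-1")
    # PREMIS is commonly carried inside the administrative metadata section.
    tech = mets_el(amd, "techMD", ID="tech-1")
    wrap = mets_el(tech, "mdWrap", MDTYPE="PREMIS:OBJECT")
    xml_data = mets_el(wrap, "xmlData")
    if premis_xml is not None:
        xml_data.append(premis_xml)
    mets_el(root, "fileSec")
    mets_el(root, "structMap")
    return root


root = build_mets_skeleton()
section_names = [child.tag.split("}")[1] for child in root]
print(section_names)
```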
17
Technical metadata for audio and video
• A “new” need: objects are now created digitally and digitization has increased
• Has not developed as fast as other technical metadata schemes
• Complexities of file formats require expertise to develop and implement
these
• Few standards available for metadata about audio and video
– AES (will be briefly introduced)
– audioMD and videoMD (will be briefly introduced)
– Material Exchange Format (MXF)
– Technical metadata in EBUCore, PBCore
– In the US, the Federal Agencies Digitization Guidelines Initiative (FADGI)
– MPEG-7 and MPEG-21 for video
• Programs creating audio and/or video often can export metadata.
Question: Is this exported information sufficient?
Answer: Needs to be evaluated at the archives and a decision taken!
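One way to start that evaluation is to compare what a tool exports with what can be read from the file itself. The sketch below reads basic technical properties from a PCM WAV header using Python's standard `wave` module. The dictionary keys are illustrative names, not element names from AES or audioMD, and other audio and video formats would need other tools.

```python
import os
import tempfile
import wave


def wav_technical_metadata(path):
    """Read basic technical properties from a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_seconds": frames / rate,
        }


# Demonstration: write one second of 16-bit mono silence, then read it back.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(48000)
    out.writeframes(b"\x00\x00" * 48000)

metadata = wav_technical_metadata(demo_path)
print(metadata)
```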
18
AES
• Audio Engineering Society (http://www.aes.org/ )
• AES-X098B superseded by:
– AES57-2011-f (2011)
AES standard for audio metadata - Audio object structures for preservation and
restoration
– AES60-2011-f (2011)
AES standard for audio metadata - Core audio metadata
• Two XML schemas available
• According to earlier information, 98C (video) was planned to
follow once 98B had been established
• Some educationally oriented presentations can be found.
19
audioMD and videoMD (AMD and VMD)
• Hosted by Library of Congress
(http://www.loc.gov/standards/amdvmd/index.html )
• Simple schemas developed over 10 years
• Current version published in spring 2011
• Information about one use case together with METS
• Mailing list exists, but rarely used
• Of interest to archives that want schemas that are not too complex for
preservation purposes
20
Tools
• PREMIS in METS toolbox
• The controlled vocabularies database
• Some institutions are making repository software available
that implements PREMIS
– DAITSS Digital Preservation Repository Software
– Archivematica
21
The controlled vocabularies database
• Library of Congress is establishing databases with controlled vocabulary values
for standards that it maintains
• http://id.loc.gov
• Now also specific vocabularies for PREMIS semantic units:
preservationLevelRole, cryptographicHashAlgorithm, eventType
• Additional PREMIS controlled lists to be made available with the PREMIS OWL
ontology
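A controlled vocabulary is most useful when values are checked against it before they are stored. The sketch below does this for eventType with a small hard-coded subset of terms. The subset, the function name and the normalisation rule are assumptions for illustration; the authoritative terms and their spelling are those published at id.loc.gov.

```python
# A small, illustrative subset of eventType terms; a real deployment would
# load the full vocabulary rather than hard-code it.
EVENT_TYPES = {
    "capture",
    "fixity check",
    "ingestion",
    "migration",
    "validation",
    "virus check",
}


def normalise_event_type(value, vocabulary=EVENT_TYPES):
    """Return the vocabulary term matching value, or raise ValueError."""
    # Lower-case and collapse whitespace before comparing.
    candidate = " ".join(value.strip().lower().split())
    if candidate not in vocabulary:
        raise ValueError(f"not a controlled eventType term: {value!r}")
    return candidate


accepted = normalise_event_type("  Fixity   Check ")
try:
    normalise_event_type("looked at file")
    rejected = False
except ValueError:
    rejected = True
print(accepted, rejected)
```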
22
PREMIS Web Ontology Language (OWL) ontology
• Initiated by the Archipel project to use PREMIS in Open Archives Initiative
Object Reuse and Exchange (OAI-ORE)
(description/exchange of Web resources)
• Resource Description Framework (RDF) serialization of preservation
metadata as a data management function in a preservation repository
• Interoperate with other preservation Linked Data efforts such as UDFR
(Unified Digital Formats Registry)
• Interoperate with PREMIS controlled vocabularies at http://id.loc.gov
23
PREMIS OWL ontology in a
nutshell
• Purpose
– Providing the community with an RDF serialization of
the PREMIS data model and dictionary
– While remaining as close as possible to the data
dictionary’s clearly defined semantics
RDF modelling in 3 words:
• Everything is modelled in the form
subject-verb-object
• But what subjects? what verbs? what
objects?
role of vocabularies & ontologies
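The subject-verb-object idea can be shown without any RDF library, using plain tuples. Everything named here is hypothetical: the `example.org` namespaces and the property names `hasSize`, `hasEvent` and `hasEventType` are made up for the illustration and are not terms from the PREMIS OWL ontology.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "verb", "object"])

# Hypothetical namespaces and property names, for illustration only.
EX = "http://example.org/repository/"
VOCAB = "http://example.org/vocab#"

triples = [
    Triple(EX + "object-1", VOCAB + "hasSize", "2038927"),
    Triple(EX + "object-1", VOCAB + "hasEvent", EX + "event-1"),
    Triple(EX + "event-1", VOCAB + "hasEventType", "ingestion"),
]


def objects_of(subject, verb, data):
    """Return every object stated for a given subject and verb."""
    return [t.object for t in data if t.subject == subject and t.verb == verb]


events = objects_of(EX + "object-1", VOCAB + "hasEvent", triples)
print(events)
```

The verbs are where a shared ontology matters: two repositories can only combine their statements if they agree on what each verb means.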
24
Implementation issues:
Conformance
• Conformant Implementation of the PREMIS Data Dictionary
http://www.loc.gov/standards/premis/premis-conformance-oct2010.pdf
• What does "being conformant to PREMIS"
mean?
• Conformant at which level?
– semantic unit: conformant implementation of the
information defined in a particular semantic unit
– data dictionary: conformant implementation of all
semantic units
• Conformant from what perspective?
– internal: conformant implementation at semantic units and
data dictionary levels
– external (exchanging PREMIS descriptions):
import = the repository can manage PREMIS conformant
information
export = the repository can provide others with PREMIS
25
Implementation issues: Technical
• Which semantic units to use besides the
mandatory ones?
• Create own vocabularies?
• Where to store the metadata?
– In an XML document?
– In one or more databases?
• Which events to store?
• How to store agents, rights management?
• In short:
A lot of decision making needs to be performed!
26
Conclusion
• PREMIS is widely implemented as the basis for digital preservation
metadata
• Both IT and the archives need to work together.
Different kinds of expertise are needed.
• Complexities of audio and video increase the need for
technical and structural metadata
• Increasing use of digital preservation metadata for archiving audio
and video is expected
• Examples of use of PREMIS together with audio and video
metadata are needed
27
Thank you!
Karin Bredenberg, The National Archives of Sweden
karin.bredenberg@riksarkivet.se
Presentation made with the help of:
Angela Dappert
Sébastien Peyrard
Rebecca Guenther

Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

Presentation 16 may keynote karin bredenberg

  • 1. Preservation Metadata: an introduction to PREMIS and its application in audiovisual archives. Karin Bredenberg, The National Archives of Sweden, Member of PREMIS Editorial Committee. 2013-05-16
  • 2. The Challenge of Digital Preservation
  • 3. How to access the material 1 month, 1 year, 10 years from now? • Information about the material – Intellectual information (Descriptive Metadata) • Who • Why • And so on – "Physical" information (Digital Preservation Metadata) • Which kind of file am I? • What has happened to me during the years? • Who can look at me? • And so on Metadata = data about data. Digital Preservation Metadata = metadata that is essential to ensure long-term accessibility of digital resources
  • 4. What Digital Preservation Metadata to store? • A best guess on the future – little experience validating the longevity of digital objects – uncertain future technical possibilities – uncertain future legal framework • Digital objects must be self-descriptive • Must be able to exist independently from the systems which were used to create them – XML (machine and human readable)
  • 5. OAIS: Open Archival Information System, also known as the ISO Reference Model for an OAIS (A simple OAIS explanation by Richard Pearce-Moses and more)
  • 6. The PREMIS Data Dictionary • Information you need to know for preserving digital objects • Available online through the PREMIS website • Preservation Metadata: Implementation Strategies – Includes PREMIS Data Dictionary, context/assumptions, data model, usage examples – XML schema to support implementation
  • 7. PREMIS Web and PREMIS EC • Web site: – Permanent Web presence (http://www.loc.gov/standards/premis/), hosted by Library of Congress – Central destination for PREMIS-related info, announcements, resources – Home of the PREMIS Implementers’ Group (PIG) discussion list (pig@loc.gov) • PREMIS Editorial Committee: – Sets directions/priorities for PREMIS development – Considers proposals for changes – Coordinates revisions of Data Dictionary and XML schema – Consists of members with different affiliations from all over the world – Meets once a month (sometimes more) – Hosts PREMIS events, e.g. the PREMIS Implementation Fair at iPRES
  • 8. OAIS Reference Model and PREMIS • OAIS reference model specifies the Preservation Description Information (PDI) • PREMIS used the OAIS information model as a starting point • PREMIS Data Dictionary consolidated and further developed the conceptual types of information objects into more than 100 structured and logically integrated semantic units • PREMIS Data Dictionary provides detailed descriptions and guidelines to implement these semantic units • PREMIS Data Dictionary does not provide semantic units for Intellectual Entities, but provides semantic units to link to other metadata sources for Intellectual Entities (this will change in version 3) • All entities have reference (identification) information • No “packaging information” that links content with metadata, but PREMIS can be used with container schemas • PREMIS deals mostly with representation, context, provenance, and fixity information, in keeping with the PREMIS definition of preservation metadata
  • 9. The PREMIS data model: 5 interacting entities: Intellectual Entity, Object, Event, Agent, Rights (diagram label: identifier)
  • 10. For example: Object Entity semantic units • 1.1 objectIdentifier • 1.2 objectCategory • 1.3 preservationLevel • 1.4 significantProperties • 1.5 objectCharacteristics – 1.5.1 compositionLevel – 1.5.2 fixity – 1.5.3 size – 1.5.4 format – 1.5.5 creatingApplication – 1.5.6 inhibitors – 1.5.7 objectCharacteristicsExtension • 1.6 originalName • 1.7 storage • 1.8 environment – 1.8.1 environmentCharacteristic – 1.8.2 environmentPurpose – 1.8.3 environmentNote – 1.8.4 dependency – 1.8.5 software – 1.8.6 hardware – 1.8.7 environmentExtension • 1.9 signatureInformation • 1.10 relationship • 1.11 linkingEventIdentifier • 1.12 linkingIntellectualEntityIdentifier • 1.13 linkingRightsStatementIdentifier
  • 11. Sample Data Dictionary Entry • Semantic unit: size • Semantic components: None • Definition: The size in bytes of the file or bitstream stored in the repository. • Rationale: Size is useful for ensuring the correct number of bytes from storage have been retrieved and that an application has enough room to move or process files. It might also be used when billing for storage. • Data constraint: Integer • Object category: Representation / File / Bitstream • Applicability: Not applicable / Applicable / Applicable • Examples: 2038927 • Repeatability: (Representation: none) / Not repeatable / Not repeatable • Obligation: (Representation: none) / Optional / Optional • Creation/Maintenance notes: Automatically obtained by the repository. • Usage notes: Defining this semantic unit as size in bytes makes it unnecessary to record a unit of measurement. However, for the purpose of data exchange the unit of measurement should be stated or understood by both partners.
  • 12. Scope • What PREMIS DD is: – Common data model for organizing/thinking about preservation metadata – Implementable – Standard for exchanging information packages between repositories – Technically neutral – Core metadata
  • 13. Scope • What PREMIS DD is not: – Out-of-the-box solution – All needed metadata – Lifecycle management of objects outside repository – Rights management
  • 14. Technology Dependence (figure: image files 0001.tiff, 0002.tiff, 0003.tiff, 0004.tiff, 0005.tiff, 0006.tiff, 000156.tiff) • No direct access • Not self-descriptive • Complex formats • Complex environments digital …
  • 15. Information packages • Information about the owner, what the package is and more • The files, checksums, file names, use and more • Technical information like Digital Preservation Metadata, what has happened to the files and more – need for detailed rendering information » Software » Hardware » Other dependencies: schemas, style sheets, encodings, etc. – need for format information • Information about structure: how are the files related?
  • 16. Standards for Information Packages • One commonly used standard is METS (Metadata Encoding and Transmission Standard) • PREMIS can be used together with METS • Structure of a <mets> document: <metsHdr> METS header; <dmdSec> descriptive metadata section; <amdSec> administrative metadata section; <fileSec> file section; <structMap> structural map section; <structLink> structural link section; <behaviorSec> behavior section
  • 17. Technical metadata for audio and video • A “new” need: objects are now created digitally and digitization has increased • Not developed as quickly as other technical metadata schemes • Complexities of file formats require expertise to develop and implement these • Few standards available for metadata about audio and video – AES (will be briefly introduced) – audioMD and videoMD (will be briefly introduced) – Material Exchange Format (MXF) – Technical metadata in EBUCore, PBCore – In the US, the Federal Agencies Digitization Guidelines Initiative (FADGI) – MPEG-7 and MPEG-21 for video • Programs creating audio and/or video can often export metadata. Question: Is this exported information sufficient? Answer: It needs to be evaluated at the archives and a decision taken!
  • 18. AES • Audio Engineering Society (http://www.aes.org/) • AES-X098B superseded by: – AES57-2011-f (2011) AES standard for audio metadata - Audio object structures for preservation and restoration – AES60-2011-f (2011) AES standard for audio metadata - Core audio metadata • Two XML schemas available • According to earlier information, 98C (video) was planned to follow once 98B had been established • Some educationally oriented presentations can be found
  • 19. audioMD and videoMD (AMD and VMD) • Hosted by Library of Congress (http://www.loc.gov/standards/amdvmd/index.html) • Simple schemas developed over 10 years • Current version published in spring 2011 • Information about one use case together with METS • Mailing list exists, but is rarely used • For archives interested in using not too complex schemas for preservation purposes
  • 20. Tools • PREMIS in METS toolbox • The controlled vocabularies database • Some institutions are making repository software available that implements PREMIS – DAITSS Digital Preservation Repository Software – Archivematica
  • 21. The controlled vocabularies database • Library of Congress is establishing databases with controlled vocabulary values for standards that it maintains • http://id.loc.gov • Now also specific vocabularies for PREMIS semantic units: preservationLevelRole, cryptographicHashAlgorithm, eventType • Additional PREMIS controlled lists to be made available with the PREMIS OWL ontology
  • 22. PREMIS Web Ontology Language (OWL) ontology • Initiated by the Archipel project to use PREMIS in Open Archives Initiative Object Reuse and Exchange (OAI-ORE) (description/exchange of Web resources) • Resource Description Framework (RDF) serialization of preservation metadata as a data management function in a preservation repository • Interoperate with other preservation Linked Data efforts such as UDFR (Unified Digital Formats Registry) • Interoperate with PREMIS controlled vocabularies at http://id.loc.gov
  • 23. PREMIS OWL ontology in a nutshell • Purpose – Providing the community with an RDF serialization of the PREMIS data model and dictionary – While remaining as close as possible to the data dictionary’s clearly defined semantics • RDF modelling in 3 words: – Everything is modelled in the form subject-verb-object – But what subjects? What verbs? What objects? This is the role of vocabularies and ontologies
  • 24. Implementation issues: Conformance • Conformant Implementation of the PREMIS Data Dictionary http://www.loc.gov/standards/premis/premis-conformance-oct2010.pdf • What does "being conformant to PREMIS" mean? • Conformant at which level? – semantic unit: conformant implementation of the information defined in a particular semantic unit – data dictionary: conformant implementation of all semantic units • Conformant from what perspective? – internal: conformant implementation at semantic units and data dictionary levels – external (exchanging PREMIS descriptions): import = the repository can manage PREMIS conformant information; export = the repository can provide others with PREMIS
  • 25. Implementation issues: Technical • Which semantic units to use besides the mandatory ones? • Create own vocabularies? • Where to store the metadata? – In an XML document? – In one or more databases? • Which events to store? • How to store agents, rights management? • In short: A lot of decision making needs to be performed!
  • 26. Conclusion • PREMIS is widely implemented as the basis for digital preservation metadata • Both IT and the archives need to work together; different kinds of expertise are required • Complexities of audio and video increase the need for technical and structural metadata • Increasing use of digital preservation metadata for archiving audio and video is expected • Examples of use of PREMIS together with audio and video metadata are needed
  • 27. Thank you! Karin Bredenberg, The National Archives of Sweden, karin.bredenberg@riksarkivet.se. Presentation made with the help of: Angela Dappert, Sébastien Peyrard, Rebecca Guenther
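
As an illustration of the Object entity semantic units and the sample "size" entry in the slides above, the following is a minimal Python sketch of how a repository might automatically obtain size and fixity for a file and record them in a PREMIS-style description. It is not part of the presentation. The function names, the "local" identifier type, the "demo-0001" identifier and the choice of SHA-256 are assumptions made for illustration; the namespace is assumed to be the PREMIS 2.x one. The output is deliberately simplified (it omits format and other units) and has not been validated against the PREMIS XML schema.

```python
import hashlib
import os
import tempfile
import xml.etree.ElementTree as ET

# Assumption: PREMIS 2.x namespace; adjust for the PREMIS version in use.
PREMIS_NS = "info:lc/xmlns/premis-v2"


def describe_file(path, identifier):
    """Return a simplified PREMIS-style <object> element for one file."""
    # Fixity: compute a message digest by reading the file in chunks.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)

    def child(parent, name, text=None):
        # Helper (hypothetical): add a namespaced child element.
        element = ET.SubElement(parent, f"{{{PREMIS_NS}}}{name}")
        if text is not None:
            element.text = text
        return element

    obj = ET.Element(f"{{{PREMIS_NS}}}object")
    ident = child(obj, "objectIdentifier")
    child(ident, "objectIdentifierType", "local")  # illustrative value
    child(ident, "objectIdentifierValue", identifier)
    chars = child(obj, "objectCharacteristics")
    child(chars, "compositionLevel", "0")
    fixity = child(chars, "fixity")
    child(fixity, "messageDigestAlgorithm", "SHA-256")
    child(fixity, "messageDigest", digest.hexdigest())
    # Size is recorded in bytes, so no unit of measurement is stored.
    child(chars, "size", str(os.path.getsize(path)))
    return obj


def make_sample():
    """Write a small temporary file and describe it (illustration only)."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
        tmp.write(b"hello premis")
        path = tmp.name
    try:
        return describe_file(path, "demo-0001")
    finally:
        os.remove(path)


if __name__ == "__main__":
    ET.register_namespace("premis", PREMIS_NS)
    print(ET.tostring(make_sample(), encoding="unicode"))
```

Whether such a description is stored as an XML document, inside a METS container, or in a database is one of the implementation decisions the slides leave to each archive.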