Presentation 16 may keynote karin bredenberg
1. 1
Preservation Metadata: an introduction to PREMIS
and
its application in audiovisual archives
Karin Bredenberg, The National Archives of Sweden
Member of PREMIS Editorial Committee
2013-05-16
3. 3
How to access the material 1 month, 1 year,
10 years from now?
• Information about the material
–Intellectual information (Descriptive Metadata)
• Who
• Why
• and so on
–”Physical” information (Digital Preservation Metadata)
• Which kind of file am I?
• What has happened to me during the years?
• Who can look at me?
• And so on
Metadata = data about data
Digital Preservation Metadata =
metadata that is essential to ensure long-
term accessibility of digital resources
4. 4
• A best guess on the future
– little experience validating the longevity of digital objects
– uncertain future technical possibilities
– uncertain future legal framework
• Digital objects must be self-descriptive
• Must be able to exist independently from the systems
which were used to create them
– XML (machine and human readable)
What Digital Preservation
Metadata to store?
5. 5
OAIS
Open Archival Information System, also published as the ISO Reference Model for an OAIS
(A simple OAIS explanation by
Richard Pearce-Moses and more)
6. 6
The PREMIS Data Dictionary
• Information you need to know for preserving
digital objects
• Available online through the PREMIS website
• Preservation Metadata:
Implementation Strategies
– Includes PREMIS Data Dictionary, context/assumptions, data
model, usage examples
– XML schema to support implementation
7. 7
PREMIS Web and PREMIS EC
• Web site:
– Permanent Web presence
(http://www.loc.gov/standards/premis/ ),
hosted by Library of Congress
– Central destination for PREMIS-related info,
announcements, resources
– Home of the PREMIS Implementers’ Group (PIG)
discussion list (pig@loc.gov)
• PREMIS Editorial Committee:
– Set directions/priorities for PREMIS development
– Considers proposals for changes
– Coordinates revisions of Data Dictionary and XML schema
– Consists of members with different affiliations from all over the world.
– Meetings once a month (sometimes more)
– Hosts PREMIS events, e.g. the PREMIS Implementation Fair at iPRES
8. 8
OAIS Reference Model and PREMIS
• OAIS reference model specifies the Preservation Description Information (PDI)
• PREMIS used the OAIS information model as a starting point
• PREMIS Data Dictionary consolidated and further developed the conceptual types of
information objects into more than 100 structured and logically integrated semantic
units.
• PREMIS Data Dictionary provided detailed descriptions and guidelines to implement
these semantic units.
• PREMIS Data Dictionary does not provide semantic units for Intellectual Entities, but
provides semantic units to link to other metadata sources for Intellectual Entities (this
will change in version 3)
• All entities have reference (identification) information.
• No “packaging information” that links content with metadata, but PREMIS can be
used with container schemas
• PREMIS deals mostly with representation, context, provenance, and fixity
information, in keeping with PREMIS definition of preservation metadata.
9. 9
The PREMIS data model: 5 interacting
entities
Intellectual Entity
Object
Event
Agent
Rights
identifier
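The five entities and the identifiers that link them could be sketched roughly as follows. This is only an illustration: the class and field names are assumptions made for this example, not semantic units taken from the Data Dictionary, and the identifier values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    identifier: str
    name: str


@dataclass
class Event:
    identifier: str
    event_type: str
    # Identifiers of the agents involved in this event.
    agent_ids: List[str] = field(default_factory=list)


@dataclass
class Rights:
    identifier: str
    basis: str  # illustrative free-text value


@dataclass
class PremisObject:
    identifier: str
    category: str  # "representation", "file" or "bitstream"
    event_ids: List[str] = field(default_factory=list)
    rights_ids: List[str] = field(default_factory=list)


@dataclass
class IntellectualEntity:
    identifier: str
    object_ids: List[str] = field(default_factory=list)


def events_for(obj: PremisObject, events: List[Event]) -> List[Event]:
    """Follow the identifier links from an object to its events."""
    return [event for event in events if event.identifier in obj.event_ids]


# Hypothetical sample data.
archivist = Agent("agent-1", "Archivist")
ingest = Event("event-1", "ingestion", agent_ids=[archivist.identifier])
master = PremisObject("file-1", "file", event_ids=[ingest.identifier])
film = IntellectualEntity("ie-1", object_ids=[master.identifier])
```

Every entity carries only an identifier for the entities it relates to, which is why the links can be stored in XML, in a database, or in both.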
11. 11
Sample Data Dictionary Entry

Semantic unit          size
Semantic components    None
Definition             The size in bytes of the file or bitstream stored in the
                       repository.
Rationale              Size is useful for ensuring the correct number of bytes from
                       storage have been retrieved and that an application has
                       enough room to move or process files. It might also be used
                       when billing for storage.
Data constraint        Integer

Object category        Representation    File              Bitstream
Applicability          Not applicable    Applicable        Applicable
Examples                                 2038927
Repeatability                            Not repeatable    Not repeatable
Obligation                               Optional          Optional

Creation/
Maintenance notes      Automatically obtained by the repository.
Usage notes            Defining this semantic unit as size in bytes makes it
                       unnecessary to record a unit of measurement. However, for
                       the purpose of data exchange the unit of measurement should
                       be stated or understood by both partners.
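A repository could obtain the size automatically and use it for the retrieval check mentioned in the rationale along these lines. The function names are hypothetical and chosen for this sketch only.

```python
import os


def size_in_bytes(path: str) -> int:
    """Return the file size in bytes, as the semantic unit 'size' defines it."""
    return os.path.getsize(path)


def retrieved_completely(path: str, recorded_size: int) -> bool:
    """Compare the size on disk with the size recorded in the metadata."""
    return size_in_bytes(path) == recorded_size
```

A matching size only shows that the right number of bytes came back from storage. It does not show that the bytes are unchanged, which is what fixity information is for.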
12. 12
• What PREMIS DD is:
– Common data model for organizing/thinking about
preservation metadata
– Implementable
– Standard for exchanging information packages between
repositories
– Technically neutral
– Core metadata
Scope
13. 13
• What PREMIS DD is not:
– Out-of-the-box solution
– All needed metadata
– Lifecycle management of objects outside repository
– Rights management
Scope
14. 14
Technology Dependence
[Figure: a row of files 0001.tiff, 0002.tiff, 0003.tiff, 0004.tiff, 0005.tiff, 0006.tiff, … 000156.tiff]
• No direct access
• Not self-descriptive
• Complex formats
• Complex environments
15. 15
Information packages
• Information about owner; what the package is and more
• The files, checksum, file name, use and more
• Technical information like Digital Preservation Metadata, what has happened to
the files and more
– need for detailed rendering information
» Software
» Hardware
» Other dependencies: schemas, style sheets, encodings, etc.
– need for format information
• Information about structure, how are the files related?
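The per-file part of such a package (file name, size, checksum) could be collected with a short script like the one below. It is a minimal sketch: the dictionary keys and function names are assumptions for illustration, and SHA-256 is only one possible choice of checksum algorithm.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def describe_file(path: Path, algorithm: str = "sha256") -> Dict[str, object]:
    """Collect file name, size in bytes and checksum for one file."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read in chunks so large audio and video files do not fill memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "file_name": path.name,
        "size": path.stat().st_size,
        "checksum_algorithm": algorithm,
        "checksum": digest.hexdigest(),
    }


def describe_package(directory: Path) -> List[Dict[str, object]]:
    """Describe every file below a package directory, sorted by path."""
    return [describe_file(p) for p in sorted(directory.rglob("*")) if p.is_file()]
```

Calling `describe_package(Path("my_package"))` on a hypothetical package directory returns one record per file, which can then be written into the package metadata.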
16. 16
Standards for Information Packages
• One commonly used standard is METS
Metadata Encoding and Transmission Standard
• PREMIS can be used together with METS
<mets>
  <metsHdr>       mets Header
  <dmdSec>        descriptive metadata Section
  <amdSec>        administrative metadata Section
  <fileSec>       file Section
  <structMap>     structural Map section
  <structLink>    structural Link section
  <behaviorSec>   behavior Section
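How PREMIS can sit inside a METS container might be sketched as below, using Python's standard XML library. This is a skeleton under stated assumptions, not a schema-valid document: it uses the PREMIS version 2 namespace, records only the size, omits required elements such as the object identifier, and the `TECH-` ID scheme and function names are invented for the example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # PREMIS version 2 namespace

ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def mets(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def premis(tag: str) -> str:
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def build_package(file_id: str, size_bytes: int) -> ET.Element:
    """Build a METS skeleton with one PREMIS object wrapped in the amdSec."""
    root = ET.Element(mets("mets"))
    ET.SubElement(root, mets("metsHdr"))

    # Administrative metadata: PREMIS goes into an mdWrap/xmlData wrapper.
    amd_sec = ET.SubElement(root, mets("amdSec"))
    tech_md = ET.SubElement(amd_sec, mets("techMD"), ID="TECH-" + file_id)
    md_wrap = ET.SubElement(tech_md, mets("mdWrap"), MDTYPE="PREMIS:OBJECT")
    xml_data = ET.SubElement(md_wrap, mets("xmlData"))
    obj = ET.SubElement(xml_data, premis("object"))
    characteristics = ET.SubElement(obj, premis("objectCharacteristics"))
    ET.SubElement(characteristics, premis("size")).text = str(size_bytes)

    # File section: the file points back to its metadata through ADMID.
    file_sec = ET.SubElement(root, mets("fileSec"))
    file_grp = ET.SubElement(file_sec, mets("fileGrp"))
    ET.SubElement(file_grp, mets("file"), ID=file_id, ADMID="TECH-" + file_id)

    # Structural map: says how the files relate to each other.
    struct_map = ET.SubElement(root, mets("structMap"))
    div = ET.SubElement(struct_map, mets("div"))
    ET.SubElement(div, mets("fptr"), FILEID=file_id)
    return root
```

`ET.tostring(build_package("FILE-1", 2038927), encoding="unicode")` serializes the skeleton. METS provides the packaging and structure that PREMIS deliberately leaves out, and PREMIS provides the preservation metadata inside it.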
17. 17
Technical metadata for audio and video
• A “new” need: objects are now created digitally and digitization has increased
• Has not developed as fast as other technical metadata schemes
• Complexities of file formats require expertise to develop and implement
these
• Few standards available for metadata about audio and video
– AES (will be briefly introduced)
– audioMD and videoMD (will be briefly introduced)
– Material Exchange Format (MXF)
– Technical metadata in EBUCore, PBCore
– In the US, the Federal Agencies Digitization Guidelines Initiative (FADGI)
– MPEG-7 and MPEG-21 for video
• Programs creating audio and/or video often can export metadata.
Question: Is this exported information sufficient?
Answer: Needs to be evaluated at the archives and a decision taken!
18. 18
AES
• Audio Engineering Society (http://www.aes.org/ )
• AES-X098B superseded by:
– AES57-2011-f (2011)
AES standard for audio metadata - Audio object structures for preservation and
restoration
– AES60-2011-f (2011)
AES standard for audio metadata - Core audio metadata
• Two XML schemas available
• According to earlier known information, 98C (video) was planned to be
made after 98B had been established
• Some educationally oriented presentations can be found.
19. 19
audioMD and videoMD (AMD and VMD)
• Hosted by Library of Congress
(http://www.loc.gov/standards/amdvmd/index.html )
• Simple schemas developed over 10 years
• Current version published during spring 2011
• Information about one use case together with METS
• Mailing list exists, but rarely used
• Of interest to archives that want schemas that are not too complex for
preservation purposes
20. 20
Tools
• PREMIS in METS toolbox
• The controlled vocabularies database
• Some institutions are making repository software available
that implements PREMIS
– DAITSS Digital Preservation Repository Software
– Archivematica
21. 21
The controlled vocabularies database
• Library of Congress is establishing databases with controlled vocabulary values
for standards that it maintains
• http://id.loc.gov
• Now also specific vocabularies for PREMIS semantic units:
preservationLevelRole, cryptographicHashAlgorithm, eventType
• Additional PREMIS controlled lists to be made available with the PREMIS OWL
ontology
22. 22
PREMIS Web Ontology Language (OWL) ontology
• Initiated by the Archipel project to use PREMIS in Open Archives Initiative
Object Reuse and Exchange (OAI-ORE)
(description/exchange of Web resources)
• Resource Description Framework (RDF) serialization of preservation
metadata as a data management function in a preservation repository
• Interoperate with other preservation Linked Data efforts such as UDFR
(Unified Digital Formats Registry)
• Interoperate with PREMIS controlled vocabularies at http://id.loc.gov
23. 23
PREMIS OWL ontology in a
nutshell
• Purpose
– Providing the community with an RDF serialization of
the PREMIS data model and dictionary
– While remaining as close as possible to the data
dictionary’s clearly defined semantics
RDF modelling in 3 words:
• Everything modelled under the form of
subject-verb-object
• But what subjects? what verbs? what
objects?
role of vocabularies & ontologies
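The subject-verb-object idea can be illustrated without any RDF library. In the sketch below every URI lives under `http://example.org/` and is hypothetical; the verb names are invented for the example and are not terms from the PREMIS OWL ontology.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    verb: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace for this sketch

graph = [
    Triple(EX + "file/0001.tiff", EX + "vocab/hasSize", "2038927"),
    Triple(EX + "file/0001.tiff", EX + "vocab/hasEvent", EX + "event/1"),
    Triple(EX + "event/1", EX + "vocab/hasEventType", "ingestion"),
]


def objects_of(subject: str, verb: str, triples: List[Triple]) -> List[str]:
    """Return every object stated for the given subject and verb."""
    return [t.obj for t in triples if t.subject == subject and t.verb == verb]
```

Asking `objects_of(EX + "file/0001.tiff", EX + "vocab/hasEvent", graph)` returns the event URI, which can in turn be used as a subject. Vocabularies and ontologies are what make the verbs and the allowed values shared rather than invented per repository, as they are here.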
24. 24
Implementation issues:
Conformance
• Conformant Implementation of the PREMIS Data Dictionary
http://www.loc.gov/standards/premis/premis-conformance-oct2010.pdf
• What does "being conformant to PREMIS"
mean?
• Conformant at which level?
– semantic unit: conformant implementation of the
information defined in a particular semantic unit
– data dictionary: conformant implementation of all
semantic units
• Conformant from what perspective?
– internal: conformant implementation at semantic units and
data dictionary levels
– external (exchanging PREMIS descriptions):
import = the repository can manage PREMIS conformant
information
export = the repository can provide others with PREMIS
25. 25
Implementation issues: Technical
• Which semantic units to use besides the
mandatory?
• Create own vocabularies?
• Where to store the metadata?
– In an XML-document?
– In one or more databases?
• Which events to store?
• How to store agents, rights management?
• In short:
A lot of decisions need to be made!
26. 26
Conclusion
• PREMIS is widely implemented as the basis for digital preservation
metadata
• Both IT and the archives need to work together.
Different kinds of expertise are needed.
• The complexity of audio and video increases the need for
technical and structural metadata
• Increasing use of digital preservation metadata for archiving audio
and video is expected
• Examples of use of PREMIS together with audio and video
metadata are needed
27. 27
Thank you!
Karin Bredenberg, The National Archives of Sweden
karin.bredenberg@riksarkivet.se
Presentation made with the help of:
Angela Dappert
Sébastien Peyrard
Rebecca Guenther