SCAPE
Johan van der Knijff
Koninklijke Bibliotheek – National Library of the Netherlands
DPC, PDF/A-3 Briefing, Leeds, 13.3.2013
PDF/A-3 for preservation
Notes on embedded files and JPEG 2000
Part 1: Embedded files
PDF/A-3: embedding of any file (type)
Key point: use of “embedded files” really means “embedded file streams”, a specific data structure in PDF!
File specification dictionary
31 0 obj
<</Type /Filespec /F (mysvg.svg) /EF <</F 32 0 R>> >>
endobj
The /EF key points to the embedded file stream (here: object 32).
Embedded file stream
32 0 obj
<</Type /EmbeddedFile /Subtype /image#2Fsvg+xml /Length 72>>
stream
…SVG Data…
endstream
endobj
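To make the structure above concrete, here is a minimal Python sketch that lists the subtypes of embedded file streams found in raw PDF bytes. It is only an illustration: the function names are hypothetical, and it assumes the objects are stored uncompressed in the file body (objects packed into compressed object streams would be missed). A real tool would use a proper PDF parser.

```python
import re

# Naive pattern: an /EmbeddedFile stream dictionary followed by its /Subtype name.
# Assumption: /Type precedes /Subtype, as in the example object above.
EMBEDDED = re.compile(
    rb"/Type\s*/EmbeddedFile\b[^>]*?/Subtype\s*/([^\s/<>\[\]()]+)"
)


def decode_name(raw):
    """Undo #xx escapes in a PDF name, e.g. image#2Fsvg+xml -> image/svg+xml."""
    unescaped = re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        raw,
    )
    return unescaped.decode("latin-1")


def embedded_file_subtypes(pdf_bytes):
    """Return the decoded /Subtype of every embedded file stream that is found."""
    return [decode_name(m.group(1)) for m in EMBEDDED.finditer(pdf_bytes)]


sample = b"""32 0 obj
<</Type /EmbeddedFile /Subtype /image#2Fsvg+xml /Length 72>>
stream
...
endstream
endobj"""

print(embedded_file_subtypes(sample))
```

The subtype is a PDF name, so the slash of the MIME type is written as #2F; the sketch decodes it back before reporting.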
Uses of embedded file streams
File attachments not meant to be rendered by viewer:
File attachment annotation
EmbeddedFiles entry in name dictionary
Allowed in PDF/A-3
Rendered in/by PDF viewer:
Rendition actions
Screen annotations
Not allowed in PDF/A-3
What about inline images?
Not based on “embedded file stream”, but on the “Image XObject” data structure (allows a limited set of pre-defined formats)
Embedded files wrap-up:
No impact on content that is meant to be rendered by the PDF viewer
But PDF/A-3 files may contain files of any possible format as attachments
Part 2: JPEG 2000
Supported since PDF/A-2
Image XObject
1614 0 obj
<</Subtype/Image/Width 615/Height 978/ColorSpace/DeviceRGB
/BitsPerComponent 8/Interpolate true/Length 5278
/Filter/JPXDecode>>
stream
… Image data …
endstream
endobj
The /Filter /JPXDecode entry identifies the object as a JPEG 2000 image.
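As a rough illustration of how such objects could be located, the following Python sketch scans raw PDF bytes for image dictionaries that use the JPXDecode filter. The function name is hypothetical and the scan is deliberately naive: it assumes uncompressed, non-nested dictionaries with a single filter, as in the example object, and would miss objects stored in compressed object streams.

```python
import re

# Matches the content of a flat << ... >> dictionary (no nested dictionaries).
DICT = re.compile(rb"<<(.*?)>>", re.S)


def jpx_images(pdf_bytes):
    """Return (width, height) for each image dictionary filtered with /JPXDecode."""
    found = []
    for match in DICT.finditer(pdf_bytes):
        body = match.group(1)
        if not re.search(rb"/Subtype\s*/Image\b", body):
            continue  # not an image dictionary
        if not re.search(rb"/Filter\s*/JPXDecode\b", body):
            continue  # image uses another filter (or a filter array)
        width = re.search(rb"/Width\s+(\d+)", body)
        height = re.search(rb"/Height\s+(\d+)", body)
        found.append((
            int(width.group(1)) if width else None,
            int(height.group(1)) if height else None,
        ))
    return found


sample = b"""1614 0 obj
<</Subtype/Image/Width 615/Height 978/ColorSpace/DeviceRGB
/BitsPerComponent 8/Interpolate true/Length 5278
/Filter/JPXDecode>>
stream
...
endstream
endobj"""

print(jpx_images(sample))
```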
ISO 19005-2 (PDF/A-2): JPEG 2000 support is based on a subset of JPEG 2000 Part 2 (JPX baseline)
Only Part 1 of the standard (JP2) is commonly used for archival applications!
JP2 vs JPX
JP2 = JPEG 2000 Part 1: basic still image format
JPX = JPEG 2000 Part 2: JP2 + assorted advanced stuff …
Fragmented codestreams
Allowed in JPX Baseline!
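The difference between JP2 and JPX, and the presence of a fragmented codestream, can be checked at the box level of the file. The Python sketch below walks the top-level boxes of a JP2-family file, reads the brand and compatibility list from the File Type box ('ftyp'), and flags a Fragment Table box ('ftbl'). The function names are hypothetical, and the sketch assumes the fragment table sits at the top level of the file; it is an illustration, not a validator.

```python
import struct

# The 12-byte JPEG 2000 signature box that opens every JP2-family file.
JP2_SIGNATURE = bytes.fromhex("0000000c6a5020200d0a870a")


def top_level_boxes(data):
    """Yield (box_type, payload) for each top-level box."""
    pos = 0
    while pos + 8 <= len(data):
        length, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if length == 1:
            # A 64-bit extended length follows the type field.
            if pos + 16 > len(data):
                break
            (length,) = struct.unpack(">Q", data[pos + 8:pos + 16])
            header = 16
        elif length == 0:
            # Box extends to the end of the file.
            length = len(data) - pos
        if length < header:
            break  # malformed box, stop walking
        yield box_type, data[pos + header:pos + length]
        pos += length


def describe(data):
    """Report brand, compatibility list and whether a fragment table is present."""
    if not data.startswith(JP2_SIGNATURE):
        raise ValueError("not a JP2-family file")
    info = {"brand": None, "compat": [], "fragmented": False}
    for box_type, payload in top_level_boxes(data):
        if box_type == b"ftyp" and len(payload) >= 8:
            # Payload layout: brand (4 bytes), minor version (4), then 4-byte entries.
            info["brand"] = payload[:4].decode("latin-1")
            info["compat"] = [
                payload[i:i + 4].decode("latin-1")
                for i in range(8, len(payload) - 3, 4)
            ]
        elif box_type == b"ftbl":
            info["fragmented"] = True
    return info


# Synthetic test data, not a real image: signature, 'ftyp' and an empty 'ftbl'.
ftyp = struct.pack(">I4s", 24, b"ftyp") + b"jpx " + b"\x00" * 4 + b"jpx " + b"jpxb"
ftbl = struct.pack(">I4s", 8, b"ftbl")
print(describe(JP2_SIGNATURE + ftyp + ftbl))
```

A plain JP2 file carries the brand 'jp2 ', while a JPX file carries 'jpx ' and lists 'jpxb' in its compatibility list when it conforms to JPX baseline.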
Open-source PDF viewers and their JPEG 2000 libraries
Ghostscript: OpenJPEG or JasPer
Evince: OpenJPEG
MuPDF: OpenJPEG
Firefox PDF viewer: built-in decoder
None of these libraries supports fragmented codestreams!
Is it really a problem?
Fragmented codestreams are extremely rare
But why is this feature even allowed in a long-term archival format?
Open-source support of JPEG 2000 in general remains problematic
Funding
This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137).
#SCAPEProject
http://www.scape-project.eu
