Beyond TIFF and JPEG2000: PDF/A as an OAIS submission
information package container
Presented by Massoud Mortazavi
MS student in Information Science
SBU University
Instructor: Mrs. Pakdaman
Han, Y. (2015). Beyond TIFF and JPEG2000: PDF/A as an OAIS submission information package
container. Library Hi Tech, 409 - 423.
HTTP://DX.DOI.ORG/10.1108/LHT-06-2015-0068
Abstract
• Purpose
Introduce PDF/A as a replacement for TIFF as the preferred file format for digitizing textual documents.
• Methodology
Reviews the current digitization guidelines and the OAIS model, and provides an overview of the development of PDF and PDF/A as international standards.
Abstract
• Findings
TIFF has been the preferred master file format for digitization.
PDF/A has been the preferred standard for encoding born-digital documents.
PDF/A can be used as an OAIS SIP container.
• Background
Libraries have been digitizing materials for more than 20 years.
The Digital Library Federation (DLF) has published several critical digitization guidelines.
Standardization of PDF as PDF/A Format
Standardization of PDF as the PDF/A format started in 2005:
PDF/A-1: based on PDF 1.4; ISO 19005-1:2005
PDF/A-2: based on ISO 32000-1 (PDF 1.7); ISO 19005-2:2011
PDF/A-3: based on ISO 32000-1 (PDF 1.7), with support for embedded files; ISO 19005-3:2012
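The relationship between the three PDF/A parts can be written down as a small lookup table. The sketch below is illustrative only: the class name `PdfaPart`, the field names and the helper `parts_for_sip` are hypothetical, and the only facts encoded are the ones listed on this slide (the base PDF version of each part, and that PDF/A-3 adds support for embedded files).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdfaPart:
    """One part of the PDF/A family (hypothetical helper type)."""
    part: int
    iso_standard: str
    base_pdf: str
    embeds_arbitrary_files: bool


# Values taken from the slide; only PDF/A-3 allows arbitrary embedded files.
PDFA_PARTS = {
    1: PdfaPart(1, "ISO 19005-1", "PDF 1.4", False),
    2: PdfaPart(2, "ISO 19005-2", "ISO 32000-1 (PDF 1.7)", False),
    3: PdfaPart(3, "ISO 19005-3", "ISO 32000-1 (PDF 1.7)", True),
}


def parts_for_sip(needs_embedded_files: bool) -> list[int]:
    """Return the PDF/A parts usable for a package.

    If the package must carry arbitrary extra files (for example a
    separate metadata file), only parts that allow that qualify.
    """
    return [
        p.part
        for p in PDFA_PARTS.values()
        if p.embeds_arbitrary_files or not needs_embedded_files
    ]


if __name__ == "__main__":
    print(parts_for_sip(needs_embedded_files=True))
    print(parts_for_sip(needs_embedded_files=False))
```

A table like this is mainly useful when an ingest workflow has to decide which PDF/A part to target for a given package.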
PDF Versions
PDF 1.4:
Version 1.4 was the basis for the first versions of the ISO standards PDF/X and PDF/A.
PDF 1.7:
The original version 1.7 of the PDF format was released in November 2006 and was associated with Acrobat and Adobe Reader 8.0. Version 1.7 was published as ISO 32000-1 in July 2008.
PDF/A as an OAIS SIP container
• The key requirement of PDF/A is that it is self-described and self-contained, so that it can be reproduced in exactly the same way with different software on various platforms.
• All of the information necessary for displaying the document is embedded in the PDF/A file.
  - This includes text, raster images, vector graphics, fonts and color profiles.
PDF/A as an OAIS SIP container
(1) Tagged PDF: embed structural metadata via pre-defined PDF tags, or create your own tags;
(2) Self-contained: embed the required color profiles, fonts and other related information; and
(3) Self-described using Extensible Metadata Platform (XMP) metadata: PDF/A can encode all the required information from an OAIS SIP through the standard and XMP.
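To make point (3) concrete, here is a minimal sketch of the kind of XMP packet a PDF/A file carries. It uses only the Python standard library. The function name `build_xmp` and the sample values are assumptions made for illustration. The `pdfaid:part` and `pdfaid:conformance` properties are the PDF/A identification fields, and `dc:title` is a Dublin Core field. A real packet is additionally wrapped in `<?xpacket ...?>` processing instructions and has to be written into the PDF with a PDF library, and neither step is shown here.

```python
import xml.etree.ElementTree as ET

# Namespaces used by XMP, Dublin Core and the PDF/A identification schema.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "pdfaid": "http://www.aiim.org/pdfa/ns/id/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix: str, name: str) -> str:
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{NS[prefix]}}}{name}"


def build_xmp(title: str, part: int, conformance: str) -> str:
    """Return an XMP metadata tree as a string (hypothetical helper)."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # PDF/A identification: which part and conformance level the file claims.
    ET.SubElement(desc, q("pdfaid", "part")).text = str(part)
    ET.SubElement(desc, q("pdfaid", "conformance")).text = conformance

    # Descriptive metadata: dc:title is stored as a language alternative.
    title_el = ET.SubElement(desc, q("dc", "title"))
    alt = ET.SubElement(title_el, q("rdf", "Alt"))
    li = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"})
    li.text = title

    return ET.tostring(xmpmeta, encoding="unicode")


if __name__ == "__main__":
    # Sample values are placeholders, not taken from the paper.
    print(build_xmp("Scanned annual report", part=3, conformance="B"))
```

Further descriptive or preservation fields for a SIP would be added as more properties inside the same `rdf:Description` element.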
TIFF as a Good Format
• For the past 20 years, TIFF 6.0 has been the preferred master file format for digitization, due to factors such as the availability of the technical specification and an easy-to-understand file structure.
  - TIFF is very simple
  - Easy to repair
  - Easy to migrate
The Problems of TIFF
• It cannot include layers and JPEG (this is not true)
• TIFF 6.0 is described as an open standard (but it must be used under a license, so it is not actually an open standard)
• Large file size
• Inflexible for web and mobile delivery
• Indexing is difficult
• OCR, XMP and ALTO XML are not supported
• METS is not supported (?)
• TIFF tags are difficult to work with
What about PDF?
• Open international standards
• Self-contained and self-described
• Flexible
• Space saving
• Better metadata support with XMP
• Other files or data: PDF/A-3 can embed any file or data
ALTO XML
Summary
• PDF/A has been an international standard since 2005, and PDF since 2008
• PDF/A has been widely accepted as the preferred master file format for born-digital documents, but it has not been recommended for digitization
• Each PDF/A version (1, 2, 3) can be used for some digitization purposes
• The author shows how PDF/A is a better file format than the currently preferred TIFF and JPEG2000