PDF made easy with iText 7
PDF is dead! Long live PDF!
Benoit Lagae, Developer, iText Software
Bruno Lowagie, Chief Strategy Officer, iText Group
Is PDF dead?
PDF specifications
Everybody uses HTML
Source:
http://duff-johnson.com/2014/03/10/98-percent-of-dot-com-is-html-but-38-percent-of-dot-gov-is-pdf/
But governments love PDF
Source:
http://duff-johnson.com/2014/03/10/98-percent-of-dot-com-is-html-but-38-percent-of-dot-gov-is-pdf/
Percentage of PDF files:
.org: 15%
.gov: 38%
.edu: 27%
Publication versus …
• No need to be self-contained
• May change over time
• Not all content produced by the author
– e.g. Advertisements
• Becoming more interactive
– e.g. Comments on a news article
… Document
• Self-contained
• Unchanging (non-dynamic)
• Able to be authenticated
• Able to be secured/protected
Not counting HTML, PDF is king
Source:
http://duff-johnson.com/2015/02/12/the-8-most-popular-document-formats-on-the-web-in-2015/
Publication:
HTML depends on context
Document:
PDF is forever
An umbrella of standards:
• PDF (Portable Document Format): first released by Adobe in 1993, ISO standard since 2008 (ISO 32000)
• PDF/X (graphic arts): since 2001, ISO 15930
• PDF/A (archive): since 2005, ISO 19005
• PDF/E (engineering): since 2008, ISO 24517
• PDF/VT (printing): since 2010, ISO 16612
• PDF/UA (accessibility): since 2012, ISO 14289
• Related: XFDF (ISO), ECMAScript (ISO), PRC (ISO), PAdES (ETSI), ZUGFeRD
iText 7: a PDF engine
Image example
Image fox = new Image(ImageFactory.getImage(FOX));
Image dog = new Image(ImageFactory.getImage(DOG));
Paragraph p = new Paragraph("The quick brown ").add(fox)
.add(" jumps over the lazy ").add(dog);
document.add(p);
On the importance of
making a document
accessible
Can everyone read this?
Some structure is helpful
title
list item
list item
list item
Label Content
Can everyone read this?
How do we read a spider chart?
(Spider chart; axes: Risk Management, Structured Finance, Mergers & Acquisitions, Governance & Internal Control, Accounting Operations, Treasury Operations, Management Information & Business Decision Support, Business Planning & Strategy, Finance Contribution to IT Management, Commercial Activities, Taxation, Functional Leadership)
Is this a better way to read data?
Adapting the
‘quick brown fox’
example for PDF/UA
PDF/UA (part 1)
PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
Document document = new Document(pdf);
//Setting some required parameters
pdf.setTagged();
pdf.getCatalog().setLang(new PdfString("en-US"));
pdf.getCatalog().setViewerPreferences(
new PdfViewerPreferences().setDisplayDocTitle(true));
PdfDocumentInfo info = pdf.getDocumentInfo();
info.setTitle("iText7 PDF/UA example");
//Create XMP meta data
pdf.createXmpMetadata();
PDF/UA (part 2)
//Fonts need to be embedded
PdfFont font = PdfFontFactory.createFont(FONT, PdfEncodings.WINANSI, true);
Paragraph p = new Paragraph();
p.setFont(font);
p.add(new Text("The quick brown "));
Image foxImage = new Image(ImageFactory.getImage(FOX));
//PDF/UA: Set alt text
foxImage.getAccessibilityProperties().setAlternateDescription("Fox");
p.add(foxImage);
p.add(" jumps over the lazy ");
Image dogImage = new Image(ImageFactory.getImage(DOG));
//PDF/UA: Set alt text
dogImage.getAccessibilityProperties().setAlternateDescription("Dog");
p.add(dogImage);
document.add(p);
document.close();
Result
On the importance of
making a document
archivable
PDF/A
• ISO-19005
– Long-term preservation of documents
– Approved parts will never become invalid
– Individual parts define new, useful features
• Obligations and restrictions
– Metadata: ISO 16684, eXtensible Metadata Platform (XMP)
– The document must be self-contained:
• All fonts need to be embedded
• No external movie, sound or other binary files
– No JavaScript allowed
– No encryption allowed
Three standards
• PDF/A-1 (2005)
– based on PDF 1.4
– Level B (“basic”): visual appearance
– Level A (“accessible”): visual appearance + structural and semantic properties (Tagged PDF)
• PDF/A-2 (2011)
– Based on ISO-32000-1
– Features introduced in PDF 1.5, 1.6, and 1.7:
• Added support for JPEG2000, Collections, object-level XMP, optional content
• Improved support for transparency, comment types and annotations, digital signatures
– Level U (“unicode”): visual appearance + all text is in Unicode
• PDF/A-3 (2012)
– Based on PDF/A-2 with only 1 difference: attachments do not need to be PDF/A
Adapting the
‘quick brown fox’
example for PDF/A
PDF/A-1b example
PdfADocument pdf = new PdfADocument(new PdfWriter(dest),
PdfAConformanceLevel.PDF_A_1B, new PdfOutputIntent("Custom", "",
"http://www.color.org", "sRGB IEC61966-2.1", new FileInputStream(INTENT)));
Document document = new Document(pdf);
//Create XMP meta data
pdf.createXmpMetadata();
//Fonts need to be embedded
PdfFont font = PdfFontFactory.createFont(FONT, PdfEncodings.WINANSI, true);
Paragraph p = new Paragraph();
p.setFont(font);
p.add(new Text("The quick brown "));
Image foxImage = new Image(ImageFactory.getImage(FOX));
p.add(foxImage);
p.add(" jumps over the lazy ");
Image dogImage = new Image(ImageFactory.getImage(DOG));
p.add(dogImage);
document.add(p);
document.close();
Resulting PDF/A-1b
PDF/A-1a example
PdfADocument pdf = new PdfADocument(new PdfWriter(dest),
PdfAConformanceLevel.PDF_A_1A, new PdfOutputIntent("Custom", "",
"http://www.color.org", "sRGB IEC61966-2.1", new FileInputStream(INTENT)));
Document document = new Document(pdf);
pdf.setTagged();
pdf.createXmpMetadata();
PdfFont font = PdfFontFactory.createFont(FONT, PdfEncodings.WINANSI, true);
Paragraph p = new Paragraph();
p.setFont(font);
p.add(new Text("The quick brown "));
Image foxImage = new Image(ImageFactory.getImage(FOX));
foxImage.getAccessibilityProperties().setAlternateDescription("Fox");
p.add(foxImage);
p.add(" jumps over the lazy ");
Image dogImage = new Image(ImageFactory.getImage(DOG));
dogImage.getAccessibilityProperties().setAlternateDescription("Dog");
p.add(dogImage);
document.add(p);
document.close();
Resulting PDF/A-1a
Real-world use:
publishing a CSV file as
PDF/A-3a and PDF/UA
United States database
United States example
part 1: initializations
PdfADocument pdf = new PdfADocument(
new PdfWriter(dest), PdfAConformanceLevel.PDF_A_3A,
new PdfOutputIntent("Custom", "", "http://www.color.org",
"sRGB IEC61966-2.1", new FileInputStream(INTENT)));
Document document = new Document(pdf, PageSize.A4.rotate());
//Setting some required parameters
pdf.setTagged(); // PDF/UA and PDF/A Level a
pdf.getCatalog().setLang(new PdfString("en-US")); // PDF/UA
pdf.getCatalog().setViewerPreferences( // PDF/UA
new PdfViewerPreferences().setDisplayDocTitle(true)); // PDF/UA
PdfDocumentInfo info = pdf.getDocumentInfo(); // PDF/UA
info.setTitle("iText7 PDF/A-3 example"); // PDF/UA
//Create XMP meta data
pdf.createXmpMetadata(); // PDF/UA and PDF/A Level a
United States example
part 2: add attachment
//Add attachment
PdfDictionary parameters = new PdfDictionary();
parameters.put(PdfName.ModDate, new PdfDate().getPdfObject());
PdfFileSpec fileSpec = PdfFileSpec.createEmbeddedFileSpec(
pdf, Files.readAllBytes(Paths.get(DATA)), "united_states.csv",
"united_states.csv", new PdfName("text/csv"), parameters,
PdfName.Data, false);
fileSpec.put(new PdfName("AFRelationship"), new PdfName("Data"));
pdf.addFileAttachment("united_states.csv", fileSpec);
PdfArray array = new PdfArray();
array.add(fileSpec.getPdfObject().getIndirectReference());
pdf.getCatalog().put(new PdfName("AF"), array);
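For context on the ModDate parameter used above: PDF stores dates as strings of the form D:YYYYMMDDHHmmSSOHH'mm', as defined in ISO 32000-1. The sketch below builds such a string in plain Java, purely as an illustration of the format. In the slide code this is presumably what PdfDate takes care of; the class name PdfDateFormat is hypothetical and not part of iText.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical illustration of the PDF date string format; not iText code. */
final class PdfDateFormat {
    private static final DateTimeFormatter LOCAL_PART =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss", Locale.ROOT);

    private PdfDateFormat() {}

    /** Formats a timestamp as D:YYYYMMDDHHmmSS followed by Z or a signed HH'mm' offset. */
    static String format(ZonedDateTime time) {
        String local = "D:" + time.format(LOCAL_PART);
        int offsetSeconds = time.getOffset().getTotalSeconds();
        if (offsetSeconds == 0) {
            return local + "Z"; // Z marks universal time
        }
        char sign = offsetSeconds < 0 ? '-' : '+';
        int abs = Math.abs(offsetSeconds);
        int hours = abs / 3600;
        int minutes = (abs % 3600) / 60;
        return String.format(Locale.ROOT, "%s%c%02d'%02d'", local, sign, hours, minutes);
    }

    public static void main(String[] args) {
        ZonedDateTime sample = ZonedDateTime.of(2016, 1, 2, 3, 4, 5, 0, ZoneOffset.ofHours(1));
        System.out.println(format(sample));
    }
}
```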
United States example
part 3: parse CSV file
PdfFont font = PdfFontFactory.createFont(FONT, true);
PdfFont bold = PdfFontFactory.createFont(BOLD_FONT, true);
// Parsing a CSV file and add data to a table
Table table = new Table(new float[]{4, 1, 3, 4, 3, 3, 3, 3, 1});
table.setWidthPercent(100);
BufferedReader br = new BufferedReader(new FileReader(DATA));
String line = br.readLine();
process(table, line, bold, true);
while ((line = br.readLine()) != null) {
process(table, line, font, false);
}
br.close();
document.add(table);
document.close();
United States example
part 4: process each line
public void process(Table table, String line,
PdfFont font, boolean isHeader) {
StringTokenizer tokenizer = new StringTokenizer(line, ";");
while (tokenizer.hasMoreTokens()) {
if (isHeader) {
table.addHeaderCell(
new Cell().add(
new Paragraph(tokenizer.nextToken()).setFont(font)));
} else {
table.addCell(
new Cell().add(
new Paragraph(tokenizer.nextToken()).setFont(font)));
}
}
}
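One caveat for the tokenizer-based loop above: java.util.StringTokenizer skips empty tokens, so a row with an empty field (for example "Alabama;;Montgomery") would produce fewer cells than the header and shift the columns that follow. Below is a minimal sketch of an alternative, assuming the same semicolon delimiter and no quoted fields. The names CsvLine and splitFields are hypothetical and not part of iText.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of iText: splits one semicolon-separated line. */
final class CsvLine {
    private CsvLine() {}

    /**
     * Splits a line on ';' and keeps empty fields, including trailing ones.
     * Assumes fields are not quoted and contain no embedded semicolons.
     */
    static List<String> splitFields(String line) {
        // A negative limit tells String.split to keep trailing empty strings.
        return Arrays.asList(line.split(";", -1));
    }

    public static void main(String[] args) {
        List<String> fields = splitFields("Alabama;;Montgomery;");
        System.out.println(fields.size() + " fields: " + fields);
    }
}
```

Each returned field could then be wrapped in a Paragraph and added to the table in place of tokenizer.nextToken(), so that every row contributes the same number of cells.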
United States example: result
Real-world use:
ZUGFeRD,
the future of invoicing
Invoices:
Need to be archived
Invoices:
Need to be accessible
Invoices:
Need to be machine-readable
iText 7 and its value add-ons
New in iText 7:
improved typography
and support for Indic
scripts
iText 5: missing links
Indic scripts:
• Only unsupported major script family
• Feature request #1
• Huge opportunity
– Limited support in most other PDF libraries
Other features:
• Optional ligatures in Latin script
• Vowel diacritics in Arabic
Indic scripts: problems
• Lack of expertise
• Unicode encodes 49 Indic scripts
• Complex scripts with unique features
– Glyph repositioning: ह + ि = हि
– Glyph substitution: ம + ு = மு
– Half-characters: त + ् + य = त्य
• Unsolvable issues for iText 5 font engine
– No dedicated Unicode points for half-characters
– No font lookups past ‘\uFFFF’
– Ligaturization is context-dependent (virama)
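Some background on the ‘\uFFFF’ limit: Java strings are sequences of 16-bit char values, so a code point above U+FFFF occupies two of them (a surrogate pair). This is presumably the limitation the slide refers to. The sketch below uses only the Java standard library to show why text handling has to iterate over code points rather than chars. The class name CodePoints is hypothetical, and the sample character is an arbitrary code point from the Brahmi block (U+11000 to U+1107F).

```java
/** Hypothetical illustration using only java.lang; not iText code. */
final class CodePoints {
    private CodePoints() {}

    /** Counts Unicode code points, treating a surrogate pair as one character. */
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Returns true if any code point lies above U+FFFF, outside the Basic Multilingual Plane. */
    static boolean hasSupplementary(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // A code point from the Brahmi block, stored as a surrogate pair in UTF-16.
        String brahmi = new String(Character.toChars(0x11005));
        System.out.println("chars: " + brahmi.length());
        System.out.println("code points: " + countCodePoints(brahmi));
        System.out.println("supplementary: " + hasSupplementary(brahmi));
    }
}
```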
Indic scripts: solutions
Writing a new font engine
• Automatic script recognition
– Based on Unicode ranges
• Flexibility = extensibility
– Generic Shaper class
– Separate module, only called when necessary
• Glyph replacement rules
– Different per writing system
– Alternate glyphs are font-dependent
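Script recognition based on Unicode ranges can be sketched with the Java standard library alone, which exposes the script of each code point through Character.UnicodeScript. The following illustrates the idea and is not iText's implementation; the names ScriptDetector and dominantScript are hypothetical.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical illustration of script recognition by code point; not iText code. */
final class ScriptDetector {
    private ScriptDetector() {}

    /**
     * Returns the script that most code points in the text belong to.
     * Code points without a script of their own (spaces, digits, punctuation,
     * inherited combining marks) are ignored; COMMON is returned if nothing else is found.
     */
    static UnicodeScript dominantScript(String text) {
        Map<UnicodeScript, Integer> counts = new EnumMap<>(UnicodeScript.class);
        text.codePoints().forEach(cp -> {
            UnicodeScript script = UnicodeScript.of(cp);
            if (script != UnicodeScript.COMMON
                    && script != UnicodeScript.INHERITED
                    && script != UnicodeScript.UNKNOWN) {
                counts.merge(script, 1, Integer::sum);
            }
        });
        UnicodeScript best = UnicodeScript.COMMON;
        int bestCount = 0;
        for (Map.Entry<UnicodeScript, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(dominantScript("\u0938\u093E\u0939\u093F\u0924\u094D\u092F\u0915\u093E\u0930"));
        System.out.println(dominantScript("writer"));
    }
}
```

A shaping module could use a result like this to decide whether any script-specific processing is needed at all, which matches the "only called when necessary" point above.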
Indic scripts: examples
PdfFont font = PdfFontFactory.createFont(arial, PdfEncodings.IDENTITY_H, true);
// Devanagari: saahityakaar
String txt = "\u0938\u093E\u0939\u093F\u0924\u094D\u092F\u0915\u093E\u0930";
document.add(new Paragraph(txt).setFont(font));
// Tamil: eluttaalar
txt = "\u0B8E\u0BB4\u0BC1\u0BA4\u0BCD\u0BA4\u0BBE\u0BB3\u0BB0\u0BCD";
document.add(new Paragraph(txt).setFont(font));
Other scripts: examples
PdfFont font = PdfFontFactory.createFont(arial, PdfEncodings.IDENTITY_H, true);
// Arabic: al-katibu
String txt = " \u0627\u0644\u0643\u0627\u062A\u0628";
document.add(new Paragraph(txt).setFont(font));
// Latin: optional ligatures applied to a glyph line
String latin = "writer";
GlyphLine glyphLine = font.createGlyphLine(latin);
Shaper.applyLigaFeature(foglihtenNo07, glyphLine, null);
canvas.showText(glyphLine);
Status of advanced
typography in iText 7
• Indic scripts
– We already support:
• Devanagari
• Tamil
– Coming soon:
• Telugu
• Others: based on customer demand
• Arabic
– Support for vocalized Arabic (diacritics) is in development
• Latin
– Optional ligatures are fully supported
PDF is dead. Long live PDF!

PDF is dead. Long live PDF!

  • 1. PDF made easy with iText 7 PDF is dead! Long live PDF! Benoit Lagae, Developer, iText Software Bruno Lowagie, Chief Strategy Officer, iText Group
  • 5. But governments love PDF Source: http://duff-johnson.com/2014/03/10/98-percent-of-dot-com-is-html-but-38-percent-of-dot-gov-is-pdf/ Percentage of PDF files: .org: 15% .gov: 38% .edu: 27%
  • 6. Publication versus … • No need to be self-contained • May change over time • Not all content produced by the author • e.g. Advertisements • Becoming more interactive • e.g. Comments on a news article
  • 7. … Document • Self-contained • Unchanging (non-dynamic) • Able to be authenticated • Able to be secured/protected
  • 8. Not counting HTML, PDF is king Source: http://duff-johnson.com/2015/02/12/the-8-most-popular-document-formats-on-the-web-in-2015/
  • 9. Publication: HTML depends on context Document: PDF is forever
  • 10. PDF/E engineering Since 2008 ISO 24517 PDF/VT printing Since 2010 ISO 16612 PDF/X graphic arts Since 2001 ISO 15930 PDF/A archive Since 2005 ISO 19005 PDF/UA accessibility Since 2012 ISO 14289 PDF Portable Document Format First released by Adobe in 1993 ISO Standard since 2008 ISO 32000 Related: XFDF (ISO), EcmaScript (ISO), PRC (ISO), PAdES (ETSI), ZUGFeRD An umbrella of standards:
  • 11. iText 7: a PDF engine
  • 12. Image example Image fox = new Image(ImageFactory.getImage(FOX)); Image dog = new Image(ImageFactory.getImage(DOG)); Paragraph p = new Paragraph("The quick brown ").add(fox) .add(" jumps over the lazy ").add(dog); document.add(p);
  • 13. On the importance of making a document accessible
  • 15. Some structure is helpful title list item list item list item Label Content
  • 17. How do we read a spider chart? RiskManagement StructuredFinance Mergers&acquisitions Governance&Internal Control AccountingOperations Treasuryoperations ManagementInformation &BusinessDecision Support BusinessPlanning& Strategy FinanceContributiontoIT Management CommercialActivities Taxation FunctionalLeadership
  • 18. Is this a better way to read data?
  • 19. Adapting the ‘quick brown fox’ example for PDF/UA
  • 20. PDF/UA (part 1) PdfDocument pdf = new PdfDocument(new PdfWriter(dest)); Document document = new Document(pdf); //Setting some required parameters Pdf.setTagged(); pdf.getCatalog().setLang(new PdfString("en-US")); pdf.getCatalog().setViewerPreferences( new PdfViewerPreferences().setDisplayDocTitle(true)); PdfDocumentInfo info = pdf.getDocumentInfo(); info.setTitle("iText7 PDF/UA example"); //Create XMP meta data pdf.createXmpMetadata();
  • 21. PDF/UA (part 2) //Fonts need to be embedded PdfFont font = PdfFontFactory.createFont(FONT, PdfEncodings.WINANSI, true); Paragraph p = new Paragraph(); p.setFont(font); p.add(new Text("The quick brown ")); Image foxImage = new Image(ImageFactory.getImage(FOX)); //PDF/UA: Set alt text foxImage.getAccessibilityProperties().setAlternateDescription("Fox"); p.add(foxImage); p.add(" jumps over the lazy "); Image dogImage = new Image(ImageFactory.getImage(DOG)); //PDF/UA: Set alt text dogImage.getAccessibilityProperties().setAlternateDescription("Dog"); p.add(dogImage); document.add(p); document.close();
  • 23. On the importance of making a document archivable
  • 24. PDF/A • ISO-19005 – Long-term preservation of documents – Approved parts will never become invalid – Individual parts define new, useful features • Obligations and restrictions – Metadata: ISO 16684, eXtensible Metadata Platform (XMP) – The document must be self-contained: • All fonts need to be embedded • No external movie, sound or other binary files – No JavaScript allowed – No encryption allowed
  • 25. Three standards • PDF/A-1 (2005) – based on PDF 1.4 – Level B (“basic”): visual appearance – Level A (“accessible”): visual appearance + structural and semantic properties (Tagged PDF) • PDF/A-2 (2011) – Based on ISO-32000-1 – Features introduced in PDF 1.5, 1.6, and 1.7: • Added support for JPEG2000, Collections, object-level XMP, optional content • Improved support for transparency, comment types and annotations, digital signatures – Level U (“unicode”): visual appearance + all text is in Unicode • PDF/A-3 (2012) – Based on PDF/A-2 with only 1 difference: attachments do not need to be PDF/A
  • 26. Adapting the ‘quick brown fox’ example for PDF/A
  • 27. PDF/A-1b example PdfADocument pdf = new PdfADocument(new PdfWriter(dest), PdfAConformanceLevel.PDF_A_1B, new PdfOutputIntent("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", new FileInputStream(INTENT))); Document document = new Document(pdf); //Create XMP meta data pdf.createXmpMetadata(); //Fonts need to be embedded PdfFont font = PdfFontFactory.createFont(FONT, PdfEncodings.WINANSI, true); Paragraph p = new Paragraph(); p.setFont(font); p.add(new Text("The quick brown ")); Image foxImage = new Image(ImageFactory.getImage(FOX)); p.add(foxImage); p.add(" jumps over the lazy "); Image dogImage = new Image(ImageFactory.getImage(DOG)); p.add(dogImage); document.add(p); document.close();
  • 29. PDF/A-1a example
      PdfADocument pdf = new PdfADocument(new PdfWriter(dest),
          PdfAConformanceLevel.PDF_A_1A,
          new PdfOutputIntent("Custom", "", "http://www.color.org",
              "sRGB IEC61966-2.1", new FileInputStream(INTENT)));
      Document document = new Document(pdf);
      pdf.setTagged();
      pdf.createXmpMetadata();
      PdfFont font = PdfFontFactory.createFont(FONT, PdfEncodings.WINANSI, true);
      Paragraph p = new Paragraph();
      p.setFont(font);
      p.add(new Text("The quick brown "));
      Image foxImage = new Image(ImageFactory.getImage(FOX));
      foxImage.getAccessibilityProperties().setAlternateDescription("Fox");
      p.add(foxImage);
      p.add(" jumps over the lazy ");
      Image dogImage = new Image(ImageFactory.getImage(DOG));
      dogImage.getAccessibilityProperties().setAlternateDescription("Dog");
      p.add(dogImage);
      document.add(p);
      document.close();
  • 31. Real-world use: publishing a CSV file as PDF/A-3a and PDF/UA
  • 33. United States example part 1: initializations
      PdfADocument pdf = new PdfADocument(
          new PdfWriter(dest), PdfAConformanceLevel.PDF_A_3A,
          new PdfOutputIntent("Custom", "", "http://www.color.org",
              "sRGB IEC61966-2.1", new FileInputStream(INTENT)));
      Document document = new Document(pdf, PageSize.A4.rotate());
      // Setting some required parameters
      pdf.setTagged();                                            // PDF/UA and PDF/A Level a
      pdf.getCatalog().setLang(new PdfString("en-US"));           // PDF/UA
      pdf.getCatalog().setViewerPreferences(                      // PDF/UA
          new PdfViewerPreferences().setDisplayDocTitle(true));   // PDF/UA
      PdfDocumentInfo info = pdf.getDocumentInfo();               // PDF/UA
      info.setTitle("iText7 PDF/A-3 example");                    // PDF/UA
      // Create XMP metadata
      pdf.createXmpMetadata();                                    // PDF/UA and PDF/A Level a
  • 34. United States example part 2: add attachment
      // Add attachment
      PdfDictionary parameters = new PdfDictionary();
      parameters.put(PdfName.ModDate, new PdfDate().getPdfObject());
      PdfFileSpec fileSpec = PdfFileSpec.createEmbeddedFileSpec(
          pdf, Files.readAllBytes(Paths.get(DATA)),
          "united_states.csv", "united_states.csv",
          new PdfName("text/csv"), parameters, PdfName.Data, false);
      fileSpec.put(new PdfName("AFRelationship"), new PdfName("Data"));
      pdf.addFileAttachment("united_states.csv", fileSpec);
      PdfArray array = new PdfArray();
      array.add(fileSpec.getPdfObject().getIndirectReference());
      pdf.getCatalog().put(new PdfName("AF"), array);
  • 35. United States example part 3: parse CSV file
      PdfFont font = PdfFontFactory.createFont(FONT, true);
      PdfFont bold = PdfFontFactory.createFont(BOLD_FONT, true);
      // Parse the CSV file and add its data to a table
      Table table = new Table(new float[]{4, 1, 3, 4, 3, 3, 3, 3, 1});
      table.setWidthPercent(100);
      // try-with-resources closes the reader even if parsing fails
      try (BufferedReader br = new BufferedReader(new FileReader(DATA))) {
          String line = br.readLine();   // first line holds the column headers
          process(table, line, bold, true);
          while ((line = br.readLine()) != null) {
              process(table, line, font, false);
          }
      }
      document.add(table);
      document.close();
  • 36. United States example part 4: process each line
      public void process(Table table, String line, PdfFont font, boolean isHeader) {
          StringTokenizer tokenizer = new StringTokenizer(line, ";");
          while (tokenizer.hasMoreTokens()) {
              if (isHeader) {
                  table.addHeaderCell(
                      new Cell().add(
                          new Paragraph(tokenizer.nextToken()).setFont(font)));
              } else {
                  table.addCell(
                      new Cell().add(
                          new Paragraph(tokenizer.nextToken()).setFont(font)));
              }
          }
      }
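One thing to watch with the tokenizer approach on this slide: `java.util.StringTokenizer` skips empty tokens, so a CSV row with an empty field would yield fewer cells and shift the rest of the table. The following is a minimal sketch of the difference, assuming unquoted, semicolon-separated fields; the class name `CsvFields` and the sample row are hypothetical and not part of the original example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

final class CsvFields {
    /** Splits like the slide does; consecutive delimiters collapse into one. */
    static List<String> withTokenizer(String line) {
        List<String> fields = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, ";");
        while (tokenizer.hasMoreTokens()) {
            fields.add(tokenizer.nextToken());
        }
        return fields;
    }

    /** Splits with a negative limit so that empty fields are preserved. */
    static List<String> withSplit(String line) {
        return Arrays.asList(line.split(";", -1));
    }

    public static void main(String[] args) {
        String row = "Alabama;AL;;Montgomery"; // hypothetical row, third field empty
        System.out.println(withTokenizer(row).size()); // 3
        System.out.println(withSplit(row).size());     // 4
    }
}
```

If the data file never contains empty fields, the tokenizer version behaves the same; the split-based variant only matters when the column count has to stay fixed.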
  • 41. Invoices: Need to be accessible
  • 42. Invoices: Need to be machine-readable
  • 43. Invoices: Need to be machine-readable
  • 44. iText 7 and its value add-ons
  • 45. New in iText 7: improved typography and support for Indic scripts
  • 46. iText 5: missing links
      Indic scripts:
      • Only unsupported major script family
      • Feature request #1
      • Huge opportunity
      • Limited support in most other PDF libraries
      Other features:
      • Optional ligatures in Latin script
      • Vowel diacritics in Arabic
  • 47. Indic scripts: problems
      • Lack of expertise
      • Unicode encodes 49 Indic scripts
      • Complex scripts with unique features
        • Glyph repositioning: ह + ि = हि
        • Glyph substitution: ம + ு = மு
        • Half-characters: त + ् + य = त्य
      • Unsolvable issues for iText 5 font engine
        • No dedicated Unicode points for half-characters
        • No font lookups past ‘\uFFFF’
        • Ligaturization is context-dependent (virama)
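The limit at U+FFFF mentioned on this slide is easier to see with plain Java strings: a code point above U+FFFF does not fit in one 16-bit `char` and is stored as a surrogate pair. The sketch below uses only the JDK and says nothing about iText internals; the class name `BeyondBmp` is hypothetical, and U+11005 is just a sample code point from the Brahmi block (U+11000 to U+1107F).

```java
final class BeyondBmp {
    /** Number of 16-bit Java chars needed to store one code point. */
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    /** True when the code point lies above U+FFFF. */
    static boolean isBeyondBmp(int codePoint) {
        return Character.isSupplementaryCodePoint(codePoint);
    }

    public static void main(String[] args) {
        int devanagariSa = 0x0938; // fits in a single char
        int brahmiSample = 0x11005; // needs a surrogate pair
        String s = new String(Character.toChars(brahmiSample));
        System.out.println(charsNeeded(devanagariSa));       // 1
        System.out.println(s.length());                      // 2
        System.out.println(s.codePointCount(0, s.length())); // 1
    }
}
```

Any lookup table keyed on a single `char` therefore cannot address such characters, which is one plausible reading of the limitation listed above.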
  • 48. Indic scripts: solutions
      Writing a new font engine
      • Automatic script recognition
        • Based on Unicode ranges
      • Flexibility = extensibility
        • Generic Shaper class
        • Separate module, only called when necessary
      • Glyph replacement rules
        • Different per writing system
        • Alternate glyphs are font-dependent
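Script recognition based on Unicode ranges can be illustrated with the JDK alone. This is a sketch of the general idea, not iText's Shaper: it relies on `java.lang.Character.UnicodeScript`, and the class name `ScriptDetector` and its "first real script wins" rule are assumptions made for the example.

```java
import java.lang.Character.UnicodeScript;

final class ScriptDetector {
    /**
     * Returns the script of the first code point that belongs to a real
     * script, ignoring punctuation, spaces and combining marks that are
     * classified as COMMON, INHERITED or UNKNOWN.
     */
    static UnicodeScript detect(String text) {
        return text.codePoints()
                .mapToObj(UnicodeScript::of)
                .filter(s -> s != UnicodeScript.COMMON
                        && s != UnicodeScript.INHERITED
                        && s != UnicodeScript.UNKNOWN)
                .findFirst()
                .orElse(UnicodeScript.COMMON);
    }

    public static void main(String[] args) {
        System.out.println(detect("\u0938\u093E\u0939\u093F")); // DEVANAGARI
        System.out.println(detect("\u0B8E\u0BB4\u0BC1"));       // TAMIL
        System.out.println(detect("writer"));                   // LATIN
    }
}
```

A dispatcher could use the detected script to decide whether a shaping step is needed at all, in the spirit of the "only called when necessary" point above.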
  • 49. Indic scripts: examples
      PdfFont font = PdfFontFactory.createFont(arial, PdfEncodings.IDENTITY_H, true);
      // Devanagari: saahityakaar
      String txt = "\u0938\u093E\u0939\u093F\u0924\u094D\u092F\u0915\u093E\u0930";
      document.add(new Paragraph(txt).setFont(font));
      // Tamil: eluttaalar
      txt = "\u0B8E\u0BB4\u0BC1\u0BA4\u0BCD\u0BA4\u0BBE\u0BB3\u0BB0\u0BCD";
      document.add(new Paragraph(txt).setFont(font));
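The escaped sample strings are hard to read on their own. As a small aid, the sketch below lists the code points of the Devanagari sample by name, using only the JDK; the class name `CodePointDump` is hypothetical. It makes the virama (U+094D) and the vowel sign I (U+093F), which are discussed on the earlier slides, visible in the logical character order.

```java
import java.util.List;
import java.util.stream.Collectors;

final class CodePointDump {
    /** Formats each code point as "U+XXXX NAME" in logical (storage) order. */
    static List<String> describe(String text) {
        return text.codePoints()
                .mapToObj(cp -> String.format("U+%04X %s", cp, Character.getName(cp)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String txt = "\u0938\u093E\u0939\u093F\u0924\u094D\u092F\u0915\u093E\u0930";
        describe(txt).forEach(System.out::println);
    }
}
```

The vowel sign I is stored after the letter HA even though it is drawn to its left, which is the repositioning a shaping step has to perform.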
  • 50. Other scripts: examples
      PdfFont font = PdfFontFactory.createFont(arial, PdfEncodings.IDENTITY_H, true);
      // Arabic: al-katibu
      String txt = " \u0627\u0644\u0643\u0627\u062A\u0628";
      document.add(new Paragraph(txt).setFont(font));
      // Latin: optional ligatures
      txt = "writer";
      GlyphLine glyphLine = font.createGlyphLine(txt);
      Shaper.applyLigaFeature(foglihtenNo07, glyphLine, null);
      canvas.showText(glyphLine);
  • 51. Status of advanced typography in iText 7
      • Indic scripts
        • We already support:
          • Devanagari
          • Tamil
        • Coming soon:
          • Telugu
          • Others: based on customer demand
      • Arabic
        • Support for vocalized Arabic (diacritics) is in development
      • Latin
        • Optional ligatures are fully supported