2. PDF: the rationale
• Portable: platform independent
• Document: rendered in a reliable way
• Format: page-based approach
• Consistent: predictable result
• Fast: no programming language
• Complete and compact
“Creating PDF is a One-Way Process”
3. XML Forms Architecture
• Document defined in XML
– Template: appearance of the form
– Datasets: data and data description
– Rendered on-the-fly in the viewer
• The Portable Document Format is used:
– as the container of the XML stream
– for the backgrounds of the form
“Data-based dynamic document”
4. PDF versus XFA
Pro
• XML based
– You can use your own schema
– Easy to extract/exchange data
• Dynamic document
– Data shapes document
– Variable number of pages
• Functionality (vs AcroForm)
– More flexibility
– More feature rich
Contra
• XML based
– Slow rendering for large docs
– XML manipulation
• Slow adoption by viewers
– Adobe Reader
– Preview
• Not many tools available
– Adobe LiveCycle
– Merging, splitting,...
– Problem: continuity?
5. Let’s build an XFA2PDF tool
• iText 5+: filling out XFA forms
• iText 5.2.1+: making XFA forms Read-Only
• XFA Worker: flattening a filled-out XFA form
• This is a huge amount of work
– Time + $$$
– Specialists needed
• Different approach
– Closed source
– Customers only
– Different subprojects
6. Let’s start with XML
• Extensible Markup Language (XML) is a
set of rules for encoding documents in
machine-readable form.
• Hundreds of XML-based languages have
been developed, including RSS, Atom,
SOAP, SVG, XHTML,...
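To illustrate what "machine-readable" means in practice, here is a minimal sketch that reads a small XML document with the DOM parser from the Java standard library. The class name, the method name and the feed-like sample document are made up for this example; nothing here is iText-specific.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class XmlReadSketch {

    // Returns the text content of every <title> element, in document order.
    static List<String> titles(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("title");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            // Wrapped so callers do not have to handle checked parser exceptions.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document, loosely modelled on a feed.
        String xml = "<feed><entry><title>PDF</title></entry>"
                   + "<entry><title>XFA</title></entry></feed>";
        System.out.println(titles(xml)); // prints [PDF, XFA]
    }
}
```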
7. Converting XML to PDF
• Either you use XSLT to transform one type
of XML to another one that can be parsed
to PDF (this is what is done with XSL-FO).
• Or you can program custom parsers for
your custom XML.
• Which approach is best depends on the
project.
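The XSLT route can be sketched with the transformation API that ships with Java. This is only an illustration under assumptions: the invoice vocabulary and the stylesheet are invented for the example, and the target is a tiny XHTML-like document rather than XSL-FO. Turning that result into PDF would be a separate step that is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {

    // Hypothetical stylesheet: maps a made-up <invoice> vocabulary
    // to a minimal XHTML-like document.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' "
        + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml'/>"
        + "<xsl:template match='/invoice'>"
        + "<html><body>"
        + "<h1><xsl:value-of select='customer'/></h1>"
        + "<xsl:for-each select='item'>"
        + "<p><xsl:value-of select='.'/></p>"
        + "</xsl:for-each>"
        + "</body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the given XSLT to the given XML and returns the result as text.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped so callers do not have to handle checked exceptions.
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<invoice><customer>Acme</customer>"
                   + "<item>Paper</item><item>Ink</item></invoice>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```

The custom-parser route would replace the stylesheet with handwritten code that walks the same input; which of the two is less work depends on how stable the input vocabulary is.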
8. iText before XML Worker
• XmlParser with a custom iText DTD.
– Why invent a new standard?
• XmlPeer classes for custom tags.
– Good idea, but nobody understood how it worked
• Writing your own DocumentHandler.
– Not for the faint of heart
• Using HTMLWorker.
– Organically grown functionality; ditto the frustration
• These are things of the past!
9. Understanding XML Worker
• Different pipelines
• In the case of XHTML:
– CSS pipeline → HTML pipeline → PDF pipeline
• In the case of custom XML:
– Custom pipeline → PDF pipeline
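The idea of chaining pipelines can be sketched in plain Java. This is not the XML Worker API: the Stage interface, the helper methods and the stage names are hypothetical and only mirror the two chains named on this slide. Each stage records that it saw a tag and then hands the tag to the next stage.

```java
import java.util.ArrayList;
import java.util.List;

class PipelineSketch {

    // Hypothetical stage contract: handle a tag, then pass it on.
    interface Stage {
        void process(String tag, List<String> log);
    }

    // Builds a stage that logs its name and forwards to the next stage, if any.
    static Stage stage(String name, Stage next) {
        return (tag, log) -> {
            log.add(name + " saw <" + tag + ">");
            if (next != null) {
                next.process(tag, log);
            }
        };
    }

    // Chain for XHTML input: CSS, then HTML, then PDF.
    static Stage xhtmlChain() {
        return stage("CSS", stage("HTML", stage("PDF", null)));
    }

    // Chain for custom XML input: a custom stage, then PDF.
    static Stage customChain() {
        return stage("Custom", stage("PDF", null));
    }

    // Sends one tag through the chain and returns what each stage logged.
    static List<String> run(Stage first, String tag) {
        List<String> log = new ArrayList<>();
        first.process(tag, log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(xhtmlChain(), "p"));
        // prints [CSS saw <p>, HTML saw <p>, PDF saw <p>]
        System.out.println(run(customChain(), "invoice"));
        // prints [Custom saw <invoice>, PDF saw <invoice>]
    }
}
```

Because every stage only knows its successor, swapping the front of the chain (CSS and HTML stages versus a custom stage) leaves the final PDF stage untouched.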