2. Protecting your PDF iText in Action, chapter 12 12.1: Adding Metadata 12.2: PDF and compression 12.3: Encrypting a PDF document 12.4: Digital signatures, OCSP, and timestamping
3. Structure of a PDF file %PDF-1.x %âãÏÓ 1 0 obj ... 2 0 obj ... (Hello World) Tj ... xref 0 81 0000000000 65535 f 0000000015 00000 n ... trailer << ... >> startxref 15787 %%EOF A PDF file consists of a collection of objects. A PDF file starts with %PDF-1.x and ends with %%EOF.
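The header, startxref, and %%EOF markers on this slide can be checked with a few lines of plain Java. This is a minimal sketch, not iText code: the class `PdfEnvelope` and its method names are hypothetical, and it only inspects the outer markers without parsing any objects.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of iText): inspects only the outer PDF markers.
final class PdfEnvelope {

    /** True if the bytes start with "%PDF-1." and end with "%%EOF" (trailing whitespace ignored). */
    static boolean looksLikePdf(byte[] data) {
        // ISO-8859-1 maps every byte to exactly one char, so offsets stay byte offsets.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        return text.startsWith("%PDF-1.") && text.trim().endsWith("%%EOF");
    }

    /** Returns the number that follows the last "startxref" keyword, or -1 if there is none. */
    static long startXrefOffset(byte[] data) {
        String text = new String(data, StandardCharsets.ISO_8859_1);
        int pos = text.lastIndexOf("startxref");
        if (pos < 0) {
            return -1;
        }
        String rest = text.substring(pos + "startxref".length()).trim();
        int end = 0;
        while (end < rest.length() && Character.isDigit(rest.charAt(end))) {
            end++;
        }
        return end == 0 ? -1 : Long.parseLong(rest.substring(0, end));
    }
}
```

The value returned by `startXrefOffset` is the byte position of the cross-reference table, which is where a reader starts when it opens the file.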
4. Changing the content of a PDF file %PDF-1.x %âãÏÓ 1 0 obj ... 2 0 obj ... (Hello People) Tj ... 121 0 obj ... xref 0 85 0000000000 65535 f 0000000015 00000 n ... trailer << ... >> startxref 16157 %%EOF You can use software to change the content of a PDF document: change a stream, add objects (e.g. annotations), and so on.
5. What are our concerns? Integrity: we want assurance that the document hasn’t been changed somewhere in the workflow. Authenticity: we want assurance that the author of the document is who we think it is (and not somebody else). Non-repudiation: we want assurance that the author can’t deny authorship.
6. Integrity A digest is computed over a range of bytes from the file (the ByteRange). This digest is signed using the private key of the sender. The signed digest and the sender’s certificate are embedded in the PDF. The receiver recomputes the digest of the content and compares it with the embedded digest.
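The digest-then-sign flow can be sketched with the standard `java.security` API. This is an illustration under assumptions, not what iText does internally: the class `IntegrityDemo` is hypothetical, the key pair is generated on the fly instead of coming from a keystore, and `SHA256withRSA` is just one possible algorithm choice.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical demo class: sign a byte range and verify it on the receiving side.
final class IntegrityDemo {

    /** Generates a throwaway RSA key pair (a real signer would load one from a keystore). */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Computes a SHA-256 digest of the content and signs it with the sender's private key. */
    static byte[] sign(byte[] content, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(content);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Recomputes the digest of the received content and checks it against the signature. */
    static boolean verify(byte[] content, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(content);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If a single byte of the content changes after signing, `verify` returns false, which is the integrity guarantee this slide describes.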
7. Digital Signature field %PDF-1.x %âãÏÓ 1 0 obj ... 2 0 obj << /Type/Sig /Contents/... >> ... xref 0 81 0000000000 65535 f ... trailer << ... >> startxref 15787 %%EOF A signed PDF file contains a signature dictionary. The binary value of the PDF signature is placed into the Contents entry of a signature dictionary.
8. Embedded Digital Signature %PDF-1.x %âãÏÓ ... 2 0 obj <<... /Type/Sig /Contents< DIGITAL SIGNATURE > ... >> xref 0 81 0000000000 65535 f ... trailer << ... >> startxref 15787 %%EOF The digital signature isn’t part of the ByteRange. There are no bytes in the PDF that aren’t covered, other than the PDF signature itself.
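To make the "hole" in the ByteRange concrete, the sketch below hashes everything except the signature value. It assumes the ByteRange is given as pairs of (offset, length), as in a PDF signature dictionary. The class `ByteRangeDigest` is hypothetical and SHA-256 is an assumed digest algorithm.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: digests only the bytes selected by the ByteRange.
final class ByteRangeDigest {

    /**
     * Hashes the slices described by byteRange = {offset1, length1, offset2, length2, ...}.
     * Bytes between the slices (the signature value itself) are skipped.
     */
    static byte[] digest(byte[] file, int[] byteRange) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (int i = 0; i + 1 < byteRange.length; i += 2) {
                md.update(file, byteRange[i], byteRange[i + 1]);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

Because the gap is excluded, the signature can be written into the Contents entry after the digest has been computed without changing that digest. Any change outside the gap produces a different digest.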
9. Cryptography Symmetric key algorithms: the same key is used to encrypt and decrypt content. Asymmetric key algorithms: a public key is used to encrypt, a private key is used to decrypt (for encryption purposes). Or, a private key is used to encrypt, a public key is used to decrypt (for digital signatures).
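The difference between the two key families can be shown for the encryption case with the standard JCA classes. This is only a sketch: the class `KeyKinds` is hypothetical, keys are generated on the fly, and the algorithm choices (AES-GCM, RSA with OAEP) are assumptions made for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical demo class contrasting symmetric and asymmetric encryption.
final class KeyKinds {

    /** Symmetric: the same AES key encrypts and decrypts. */
    static byte[] symmetricRoundTrip(byte[] plain) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey key = keyGen.generateKey();
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);

            Cipher encrypt = Cipher.getInstance("AES/GCM/NoPadding");
            encrypt.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] cipherText = encrypt.doFinal(plain);

            Cipher decrypt = Cipher.getInstance("AES/GCM/NoPadding");
            decrypt.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return decrypt.doFinal(cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Asymmetric: the public key encrypts, only the matching private key decrypts. */
    static byte[] asymmetricRoundTrip(byte[] plain) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            KeyPair pair = generator.generateKeyPair();

            Cipher encrypt = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
            encrypt.init(Cipher.ENCRYPT_MODE, pair.getPublic());
            byte[] cipherText = encrypt.doFinal(plain);

            Cipher decrypt = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
            decrypt.init(Cipher.DECRYPT_MODE, pair.getPrivate());
            return decrypt.doFinal(cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

RSA can only encrypt short inputs directly (under 200 bytes with this key size and padding), which is one reason signatures operate on a digest instead of on the whole document.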
10. Obtain a public/private key Create your own keystore (with the private key) and self-signed certificate (with the public key), e.g. using keytool. Ask a Certificate Authority (CA) to sign your certificate to prove your identity. A certificate signed with a CA’s private key can be verified with the public key in the CA’s root certificate (stored in Adobe Reader).
16. Timestamp... [Diagram labels: %%EOF; Existing PDF document; Created by PDF producer; Fill out signature field; Using iText; Externally sign digest created with iText]
17. Displaying digital signatures Digital signatures are part of the file structure: it isn’t mandatory for a digital signature to be displayed on a page. Digital signatures are listed in the signature panel. A digital signature can be visualized as a field widget (this widget can consist of graphics, text,...).
22. Important note A signature signs the complete document. The concept of signing separate pages in a document (“to initial a document”) doesn’t exist in PDF. Legal issue: how to prove that a person who signed for approval has read the complete document?
23. Serial signatures %PDF-1.x % Original document DIGITAL SIGNATURE 1 ... %%EOF Rev1 % Additional content 1 ... DIGITAL SIGNATURE 2 ... %%EOF Rev2 % Additional content 2 ... DIGITAL SIGNATURE 3 ... %%EOF Rev3 A PDF document can be signed more than once, but parallel signatures aren’t supported, only serial signatures: each additional signature also covers all previous signatures.
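Since every revision ends with its own %%EOF marker, the revisions of a serially signed file can be located by scanning for that marker. The sketch below is deliberately naive. The class `Revisions` is hypothetical, and it assumes the marker never occurs inside a stream or string, which a real parser can't assume.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds where each revision of an incrementally updated PDF ends.
final class Revisions {

    private static final String EOF_MARKER = "%%EOF";

    /** Returns the end offset (exclusive) of every "%%EOF" marker, one entry per revision. */
    static List<Integer> revisionEnds(byte[] file) {
        // ISO-8859-1 keeps a one-to-one mapping between bytes and chars.
        String text = new String(file, StandardCharsets.ISO_8859_1);
        List<Integer> ends = new ArrayList<>();
        int pos = text.indexOf(EOF_MARKER);
        while (pos >= 0) {
            ends.add(pos + EOF_MARKER.length());
            pos = text.indexOf(EOF_MARKER, pos + EOF_MARKER.length());
        }
        return ends;
    }
}
```

The bytes from offset 0 up to the n-th entry form revision n. That is the part of the file the n-th signature can cover, and later revisions only append to it.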
25. Types of signatures Certification (aka author) signature: only possible for the first revision; involves modification detection permissions. Approval (aka recipient) signature: workflow with subsequent signers. Usage Rights signature: involves Adobe’s private key to Reader-enable a PDF (off-topic here).
26. Problems solved? Integrity: the signature is invalidated if bytes are changed. Authenticity: a Certificate Authority verifies the identity of the owner of the private key. Non-repudiation: the author is the only one who has access to the private key.
27. What if? What if the author’s private key is compromised? What if the author falsifies the creation date of the document? What if the certificate expires too soon?
28. Revocation checking Certificate Revocation List (CRL): the certificate is checked against a list of revoked certificates. Online Certificate Status Protocol (OCSP): the revocation status is obtained from a server. If the certificate was revoked, the signature is invalid.
30. Timestamping The timestamp of a signature can be based on the signer’s local machine time, or the signer can involve a Time Stamp Authority (TSA). The message digest is sent to a trusted timestamp server. This server adds a timestamp and signs the resulting hash using the TSA’s private key. The signer can’t forge the time anymore.
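The idea behind a TSA can be illustrated with a toy model: the authority binds the digest to a time value and signs the pair with its own key. This is only a sketch of the principle. The class `ToyTimestampAuthority` and its token layout are hypothetical, and real TSAs exchange RFC 3161 requests and responses, which this code doesn't implement.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

// Toy model of a Time Stamp Authority (hypothetical; not the RFC 3161 format).
final class ToyTimestampAuthority {

    private final KeyPair keys = generateKeys();

    private static KeyPair generateKeys() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    PublicKey publicKey() {
        return keys.getPublic();
    }

    /** Signs (digest || time) with the TSA's private key; the signer never sees that key. */
    byte[] stamp(byte[] digest, long epochSeconds) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(keys.getPrivate());
            signer.update(bind(digest, epochSeconds));
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks that the token was issued by the TSA for exactly this digest and time. */
    static boolean verify(byte[] digest, long epochSeconds, byte[] token, PublicKey tsaKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(tsaKey);
            verifier.update(bind(digest, epochSeconds));
            return verifier.verify(token);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Concatenates the digest and the 8-byte time value. */
    private static byte[] bind(byte[] digest, long epochSeconds) {
        return ByteBuffer.allocate(digest.length + Long.BYTES)
                .put(digest)
                .putLong(epochSeconds)
                .array();
    }
}
```

A signer who later claims a different time can't produce a matching token, because only the TSA holds the private key that signed the digest and time together.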
32. PAdES - LTV PAdES: PDF Advanced Electronic Signatures. LTV: Long Term Validation. Requires extensions to ISO 32000-1. Described by ETSI in TS 102 778 part 4. Requires a Document Security Store (DSS) and a Document Timestamp. A new DSS and timestamp are added before expiration of the last document timestamp.