An overview of
potential leaks via PDF
Ange Albertini
Ange Albertini
reverse engineering
visual documentations
@angealbertini
ange@corkami.com
http://www.corkami.com
Yet another talk on PDF from me?
● this one is high-level
○ awareness without the hardcore details
● a new kind of leak happened ITW recently
⇒ it’s still worth spreading the knowledge!
http://download.repubblica.it/pdf/rapportousacalipari.pdf
It really happens
in the wild!
potential leaks
via the standard page elements
text, image, drawing
Pages are made of 3 kinds of ‘visual’ elements (that can look identical).
1: Text
‘string of the text in the document’
Text
● explicitly spelled in the data
● can be
○ invisible
■ white, invisible style, covered
○ forbidden to copy/paste
■ but this can be disabled instantly
○ mapped to some weird Unicode
but still technically there!
⇒ it can still be extracted, often automatically
pdftotext -layout ...
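To illustrate why invisible text is "still technically there", here is a minimal Python sketch. It assumes an uncompressed content stream that uses literal strings with the `Tj` operator; real files usually compress their streams and use font encodings, which is why a tool such as pdftotext is the practical choice. The `CONTENT` bytes and the function name are made up for the example.

```python
import re

# Hypothetical page content stream: the fill colour is set to white
# ("1 1 1 rg") before the text is drawn, so it is invisible on a white page.
CONTENT = b"BT /F1 12 Tf 1 1 1 rg 72 720 Td (secret: hunter2) Tj ET"

# Literal strings followed by the Tj (show text) operator.
# Backslash escapes inside the string are tolerated but not decoded.
TJ_LITERAL = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")

def shown_strings(stream: bytes) -> list[str]:
    """Return the literal strings that the stream draws with Tj."""
    return [m.group(1).decode("latin-1") for m in TJ_LITERAL.finditer(stream)]

print(shown_strings(CONTENT))
```

The colour operator has no effect on extraction: the string is read straight from the data, whatever it looks like on screen.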
2: Images
Stored as-is, then referenced, then displayed in the page contents.
Even if the image is not used (displayed),
the image object (and content) may still be present.
Images
● embedded as a dedicated object
○ can be automatically extracted
○ pdfimages -j ...
● then referenced in pages’ contents
○ useful for multiple uses
⇒ images can be present (and extracted)
even if not used
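A rough Python sketch of the point above: an image object can be carved out of the file whether or not any page displays it. It assumes stream data that starts with the JPEG start-of-image marker and a file whose objects are not packed into compressed object streams. The tiny `PDF` below is a hand-made fragment for illustration, not a complete document.

```python
import re

# Any "stream ... endstream" body; the lookbehind avoids matching the
# "stream" that is part of "endstream". Naive: ignores /Length.
STREAM = re.compile(rb"(?<!end)stream\r?\n(.*?)\r?\nendstream", re.DOTALL)

def carve_jpegs(pdf: bytes) -> list[bytes]:
    """Return every stream body that starts with the JPEG SOI marker."""
    bodies = (m.group(1) for m in STREAM.finditer(pdf))
    return [body for body in bodies if body.startswith(b"\xff\xd8\xff")]

def stream_obj(num: int, entries: bytes, data: bytes) -> bytes:
    """Build a minimal stream object (helper for the demo only)."""
    return b"%d 0 obj\n<< %s /Length %d >>\nstream\n%s\nendstream\nendobj\n" % (
        num, entries, len(data), data)

FAKE_JPEG = b"\xff\xd8\xff\xe0" + b"pixels" + b"\xff\xd9"
PDF = (
    b"%PDF-1.4\n"
    # Page contents: draws text only, never references the image.
    + stream_obj(4, b"", b"BT (hi) Tj ET")
    # Image object: present in the file although nothing displays it.
    + stream_obj(5, b"/Type /XObject /Subtype /Image /Filter /DCTDecode", FAKE_JPEG)
)

print(len(carve_jpegs(PDF)), "JPEG stream(s) found")
```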
Images
● JPEGs are stored as-is (the complete file)
Extra risk: leak via thumbnail, EXIF, RDF
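Because the complete JPEG file is kept, its metadata segments come along with it. The sketch below walks the JPEG segment headers and reports APP1 segments carrying EXIF (which is where an embedded thumbnail normally lives) or XMP. Reading the slide's "RDF" as XMP, which is serialised as RDF/XML, is an assumption. `SAMPLE` is a synthetic byte string, not a valid image.

```python
import struct

XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"

def metadata_segments(jpeg: bytes) -> list[str]:
    """List the metadata-bearing APP1 segments found before the image data."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    found, pos = [], 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # SOS: compressed image data starts here
            break
        # The 16-bit big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2 : pos + 4])
        payload = jpeg[pos + 4 : pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.append("EXIF")
        elif marker == 0xE1 and payload.startswith(XMP_HEADER):
            found.append("XMP (RDF/XML)")
        pos += 2 + length
    return found

def segment(marker: int, payload: bytes) -> bytes:
    """Build one JPEG segment (helper for the demo only)."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload

SAMPLE = (
    b"\xff\xd8"
    + segment(0xE0, b"JFIF\x00")
    + segment(0xE1, b"Exif\x00\x00" + b"<tiff data>")
    + segment(0xE1, XMP_HEADER + b"<rdf:RDF/>")
    + b"\xff\xda\x00\x02\xff\xd9"
)

print(metadata_segments(SAMPLE))
```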
3: drawings
sequences of graphical operators
Drawings
(rectangles, lines…)
● the information is not trivial to extract
● can still be modified without any problem
○ remove covering layers (censorship)
Importing a specific part of a confidential PDF
With OSX Preview: select area, then paste in a new document...
So you get a new document, showing only what you wanted

(cropme.pdf is much smaller because it was hand-written, while cropped.pdf is bloated)
$ du -b cropme.pdf cropped.pdf
595 cropme.pdf
10203 cropped.pdf
Risk: it’s actually the same content with an extra ‘limiting view’!
If you remove the “CropBox”, you get back the original content.
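One way to "remove" the CropBox without a PDF library, sketched in Python: rename the key to something readers do not recognise, so that they fall back to the MediaBox. The replacement has the same length, so the byte offsets recorded in the cross-reference table stay valid. It assumes the page dictionary is stored uncompressed; `/CropBoz` is an arbitrary made-up key, and `PAGE` is an illustrative fragment.

```python
def neutralise_cropbox(pdf: bytes) -> bytes:
    """Rename every /CropBox key so that readers fall back to /MediaBox."""
    # Same-length replacement: no object moves, so the xref offsets still match.
    return pdf.replace(b"/CropBox", b"/CropBoz")

# Hypothetical page dictionary: a full-size page limited to a small view.
PAGE = b"<< /Type /Page /MediaBox [0 0 612 792] /CropBox [100 600 300 700] >>"

restored = neutralise_cropbox(PAGE)
print(restored.decode("latin-1"))
```

On a real file this would look like `open("uncropped.pdf", "wb").write(neutralise_cropbox(open("cropped.pdf", "rb").read()))`, with the output name being a placeholder.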
Importing
● Copy/paste from OSX preview
● Import via LaTeX
● …?
What it actually does:
1/ imports the whole doc
(to prevent incompatibilities)
2/ adds a limiting view
Risk: the original content is still there!
Incremental updates
updates (even deletions) are appended,
like in Microsoft Office, etc.
⇒ “save as…” a new document to prevent it
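Since each update is appended after the previous end-of-file marker, earlier revisions can often be recovered by truncating the file. A minimal Python sketch, assuming every `%%EOF` closes one revision (linearized PDFs also contain two markers without being updated, so treat the result as candidates only). `UPDATED` is a schematic byte string, not a valid PDF.

```python
EOF_MARKER = b"%%EOF"

def revisions(pdf: bytes) -> list[bytes]:
    """Return one candidate file per %%EOF marker, oldest first."""
    cuts, start = [], 0
    while (idx := pdf.find(EOF_MARKER, start)) != -1:
        start = idx + len(EOF_MARKER)
        # Everything up to and including this marker is a candidate revision.
        cuts.append(pdf[:start] + b"\n")
    return cuts

# Schematic file: an original body, then an appended update that "deletes" text.
UPDATED = (
    b"%PDF-1.4\n(original body with the confidential paragraph)\n%%EOF\n"
    b"(update: paragraph removed)\n%%EOF\n"
)

revs = revisions(UPDATED)
print(len(revs), "revision(s); the oldest one is", len(revs[0]), "bytes long")
```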
Forms
Forms
● Time saver:
○ type (copy/paste) your info in the doc, then print!
○ you can even save the info in the doc
■ this info is not stored like standard text
Risk:
you spread an updated document
containing private info!
Some readers may not show the saved information!
Forms
● Forms are not always supported
○ you won’t even get a warning!
● Content is not stored like standard text
○ not as easy to extract, but still there!
Bigger risk:
Just opening the file to double-check
may not be enough!
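Saved form data sits in field dictionaries, under the `/V` key, rather than in the page contents, which is why a reader without form support shows nothing. A rough Python sketch that looks for it in the raw bytes, assuming uncompressed objects and literal-string names and values without nested parentheses. The `FILLED` fragment and its field name are invented for the example.

```python
import re

OBJECT = re.compile(rb"\d+ \d+ obj(.*?)endobj", re.DOTALL)
NAME = re.compile(rb"/T\s*\(([^()]*)\)")   # partial field name
VALUE = re.compile(rb"/V\s*\(([^()]*)\)")  # saved field value

def field_values(pdf: bytes) -> dict[str, str]:
    """Map field names to saved values for objects that carry both keys."""
    values = {}
    for obj in OBJECT.finditer(pdf):
        name, value = NAME.search(obj.group(1)), VALUE.search(obj.group(1))
        if name and value:
            key = name.group(1).decode("latin-1")
            values[key] = value.group(1).decode("latin-1")
    return values

# Hypothetical text field that was filled in and saved.
FILLED = (
    b"%PDF-1.4\n"
    b"7 0 obj\n<< /FT /Tx /T (iban) /V (DE00 1234 5678) /Rect [0 0 1 1] >>\nendobj\n"
)

print(field_values(FILLED))
```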
The only fully reliable way?
(the one that *NSA* uses…)
Convert pages to pictures!
Just use ImageMagick convert
then import to a new PDF
Damn ugly, but fully reliable.
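The two conversion steps could be scripted roughly as follows. This is a sketch under assumptions: ImageMagick is installed with its `convert` command on the PATH (version 7 installs name it `magick`), it is allowed to read PDFs (this usually needs Ghostscript and a permissive security policy), and the file names, the `page-` prefix and the 150 dpi default are placeholders.

```python
import glob
import subprocess

def to_png_cmd(src: str, pattern: str, dpi: int = 150) -> list[str]:
    """Command that renders each page of src to a numbered PNG."""
    return ["convert", "-density", str(dpi), src, pattern]

def to_pdf_cmd(pages: list[str], dst: str) -> list[str]:
    """Command that packs the rendered pages into a new, image-only PDF."""
    return ["convert", *pages, dst]

def flatten(src: str, dst: str, workdir: str, dpi: int = 150) -> None:
    """Rasterise src page by page, then rebuild dst from the pictures."""
    subprocess.run(to_png_cmd(src, f"{workdir}/page-%03d.png", dpi), check=True)
    # Zero-padded numbering keeps the pages in order when sorted by name.
    pages = sorted(glob.glob(f"{workdir}/page-*.png"))
    subprocess.run(to_pdf_cmd(pages, dst), check=True)
```

A call such as `flatten("confidential.pdf", "flat.pdf", "/tmp/pages")` would leave only pixels in the output: no text objects, no hidden images, no form data, no earlier revisions.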
Conclusion
PDF sucks at preventing leaks
PDF is a monster for attack surface
(and metadata embedding)
No free PDF ‘dissector’
because we only focus on malware
⇒ No solution anytime soon
(Btw, how much is the map of a petroleum reservoir worth?)
Questions?
Those were just ITW examples of leaks;
other kinds of leaks may be possible.
@angealbertini
Hail to the king, baby!
Note:
this PDF is also a ZIP,
containing the PoCs
shown in the document.
