pdf.js
Julian Viereck
  @jviereck
Overview
• What is pdf.js
• How PDF is structured
• Processing in pdf.js
• Images & Fonts
• Problems
• Todo
• Demo
What is pdf.js

•   building faithful & efficient PDF renderer
•   HTML5 technology experiment
•   no native code
•   secure (web sandbox)
•   Mozilla Labs Project - Open Source
How PDF is structured
 Header      PDF version

 Body        sequence of objects
 [Objects]   fonts, drawing cmds, images,
             words, bookmarks, form fields

 xRef Table  mapping objID ➟ byte offset
 Trailer     root objID, xRef byte offset

 PDF file: root obj = ref to pages catalog
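
The lookup path described on this slide (trailer ➟ xRef table ➟ object byte offset) can be sketched roughly as follows. This is a minimal illustration, not pdf.js's own code: the helper names (`stringToBytes`, `findXRefOffset`, `readXRefTable`) are hypothetical, and it assumes a classic uncompressed xRef table with a single subsection.

```javascript
// Hypothetical helpers for illustration only; not the pdf.js API.
// Assumes a classic (uncompressed) xRef table with one subsection.

// Turn a Latin-1 string into bytes so the sketch runs without a real file.
function stringToBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) bytes[i] = str.charCodeAt(i) & 0xff;
  return bytes;
}

function bytesToString(bytes, start, end) {
  let s = '';
  for (let i = start; i < end; i++) s += String.fromCharCode(bytes[i]);
  return s;
}

// The file ends with "startxref <offset> %%EOF"; look for it in the tail.
function findXRefOffset(bytes) {
  const tail = bytesToString(bytes, Math.max(0, bytes.length - 1024), bytes.length);
  const match = /startxref\s+(\d+)\s+%%EOF\s*$/.exec(tail);
  if (!match) throw new Error('startxref not found');
  return parseInt(match[1], 10);
}

// Parse "xref <first> <count>" followed by entries "oooooooooo ggggg n|f"
// and return a Map from objID to byte offset (in-use entries only).
function readXRefTable(bytes, offset) {
  const text = bytesToString(bytes, offset, bytes.length);
  const header = /^xref\s+(\d+)\s+(\d+)\s+/.exec(text);
  if (!header) throw new Error('xref table not found at offset ' + offset);
  const first = parseInt(header[1], 10);
  const count = parseInt(header[2], 10);
  const entryRe = /(\d{10}) (\d{5}) ([nf])/g;
  entryRe.lastIndex = header[0].length;
  const table = new Map();
  for (let i = 0; i < count; i++) {
    const entry = entryRe.exec(text);
    if (!entry) throw new Error('truncated xref table');
    if (entry[3] === 'n') table.set(first + i, parseInt(entry[1], 10));
  }
  return table;
}

// Build a tiny made-up file whose offsets are computed, not hard-coded.
const body = '%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n';
const objOffset = body.indexOf('1 0 obj');
const xrefOffset = body.length;
const pdf = body +
  'xref\n0 2\n' +
  '0000000000 65535 f \n' +
  String(objOffset).padStart(10, '0') + ' 00000 n \n' +
  'trailer\n<< /Size 2 /Root 1 0 R >>\n' +
  'startxref\n' + xrefOffset + '\n%%EOF\n';
const demoBytes = stringToBytes(pdf);
const demoTable = readXRefTable(demoBytes, findXRefOffset(demoBytes));
```

Reading from the end of the file means a viewer only needs the trailer and the xRef table before it can jump straight to any object.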
Processing in pdf.js
• get plain Uint8Array via XHR2, build Stream
• new PDFDoc(stream): read xRef, root object
• page = PDFDoc.getPage(N)
• page.startRendering(graphics)
 • read & convert all PDF cmds ➟ IR (Internal Representation)   [PartialEvaluator]
 • load required objects (fonts, images)
 • graphics.executeIR(IR)   [CanvasGraphics]
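
The first step in the list above (fetch a `Uint8Array` via XHR2, then wrap it in a stream) might look roughly like this. It is a sketch under assumptions: `fetchPdfBytes` and `ByteStream` are hypothetical names, and the real Stream class in pdf.js may expose a different interface.

```javascript
// Browser-only: fetch the raw bytes with XMLHttpRequest Level 2.
// It is defined but not invoked here, so the rest of the sketch also
// runs outside a browser.
function fetchPdfBytes(url, onLoad, onError) {
  const xhr = new XMLHttpRequest();
  xhr.open('GET', url);
  xhr.responseType = 'arraybuffer'; // XHR2: binary response, no text decoding
  xhr.onload = function () {
    if (xhr.status === 200) onLoad(new Uint8Array(xhr.response));
    else onError(new Error('HTTP ' + xhr.status));
  };
  xhr.onerror = function () { onError(new Error('network error')); };
  xhr.send();
}

// Hypothetical minimal byte stream over a Uint8Array.
class ByteStream {
  constructor(bytes) {
    this.bytes = bytes;
    this.pos = 0;
  }
  get length() {
    return this.bytes.length;
  }
  // Return the next byte, or -1 once the data is exhausted.
  getByte() {
    return this.pos < this.bytes.length ? this.bytes[this.pos++] : -1;
  }
  // Return up to n bytes as a view on the same buffer (no copy).
  getBytes(n) {
    const end = Math.min(this.pos + n, this.bytes.length);
    const chunk = this.bytes.subarray(this.pos, end);
    this.pos = end;
    return chunk;
  }
  skip(n) {
    this.pos = Math.min(this.pos + n, this.bytes.length);
  }
  reset() {
    this.pos = 0;
  }
}

// Read the "%PDF-" magic bytes that open the header.
const headerStream = new ByteStream(new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d]));
const magic = String.fromCharCode(...headerStream.getBytes(5));
```

In a browser, the `onLoad` callback would hand the bytes to `new ByteStream(bytes)`, which in the slide's terms is what gets passed to `new PDFDoc(stream)`.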
1. page = PDFDoc.getPage(2)
   ➟ obj#3
2. page.startRendering(...)
   ➟ obj#4, obj#5

 3 0 obj
 <<
  /Type /Page
  /MediaBox [0 0 612 792]
  /Resources 4 0 R
  /Parent 2 0 R
  /Contents 5 0 R
 >>
 endobj

 5 0 obj
 <<
  /Length 8 0 R
 >>
 stream                      (stream may be encoded!)
   /GS1 gs
   /F0 12 Tf
   BT
     100 700 Td
     (Hello World!) Tj
   ET
   50 600 m
   400 600 l
   S
 endstream
 endobj
5 0 obj + xRef, catalog, resources ➟ PartialEvaluator ➟ IR form ➟ CanvasGraphics

 PDF content stream (5 0 obj)       IR
 /GS1 gs                            setGState:  [ LW: 10 ]
                                    dependency: [ font0 ]
 /F0 12 Tf                          setFont:    font0, 12
 BT                                 beginText
 100 700 Td                         moveText:   100, 700
 (Hello World!) Tj                  showText:   “Hello World!”
 ET                                 endText
 50 600 m                           moveTo:     50, 600
 400 600 l                          lineTo:     400, 600
 S                                  stroke
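
The translation from content-stream operators to IR entries, and their later replay on a graphics object, could be sketched as below. This is not the actual PartialEvaluator or CanvasGraphics code: `tokenize`, `evaluate`, `executeIR`, and the shape of `pageResources` are assumptions made for illustration, and only the operators shown on the slide are handled.

```javascript
// Hypothetical sketch; covers only the operators that appear on the slide.
const OPERATORS = {
  gs: 'setGState', Tf: 'setFont', BT: 'beginText', Td: 'moveText',
  Tj: 'showText', ET: 'endText', m: 'moveTo', l: 'lineTo', S: 'stroke'
};

// Split a content stream into operands and operators.
// Simplification: strings contain no escapes or nested parentheses.
function tokenize(src) {
  const re = /\(([^)]*)\)|\/([^\s()\/]+)|(-?\d*\.?\d+)|([A-Za-z*'"]+)/g;
  const tokens = [];
  let m;
  while ((m = re.exec(src)) !== null) {
    if (m[1] !== undefined) tokens.push({ type: 'string', value: m[1] });
    else if (m[2] !== undefined) tokens.push({ type: 'name', value: m[2] });
    else if (m[3] !== undefined) tokens.push({ type: 'number', value: parseFloat(m[3]) });
    else tokens.push({ type: 'op', value: m[4] });
  }
  return tokens;
}

// PDF is postfix: operands accumulate until an operator consumes them.
function evaluate(src, resources) {
  const ir = [];
  let operands = [];
  for (const token of tokenize(src)) {
    if (token.type !== 'op') {
      operands.push(token.value);
      continue;
    }
    const fn = OPERATORS[token.value];
    if (!fn) throw new Error('unsupported operator: ' + token.value);
    let args = operands;
    if (fn === 'setGState') {
      // Resolve the resource name to the graphics state it stands for.
      args = [resources.ExtGState[operands[0]]];
    } else if (fn === 'setFont') {
      const font = resources.Font[operands[0]];
      ir.push({ fn: 'dependency', args: [font] }); // must be loaded before drawing
      args = [font, operands[1]];
    }
    ir.push({ fn, args });
    operands = [];
  }
  return ir;
}

// Replay the IR on any object exposing one method per IR function name.
function executeIR(ir, graphics) {
  for (const { fn, args } of ir) {
    if (fn === 'dependency') continue; // assumed resolved before replay
    graphics[fn](...args);
  }
}

const contentStream =
  '/GS1 gs /F0 12 Tf BT 100 700 Td (Hello World!) Tj ET 50 600 m 400 600 l S';
// Resource values are taken from the slide: GS1 sets LW to 10, F0 is font0.
const pageResources = { ExtGState: { GS1: [['LW', 10]] }, Font: { F0: 'font0' } };
const pageIR = evaluate(contentStream, pageResources);

// A recording stand-in for CanvasGraphics: every method call is logged.
const calls = [];
const recorder = new Proxy({}, {
  get: (_, name) => (...args) => calls.push([name, ...args])
});
executeIR(pageIR, recorder);
```

Because the IR is plain data (function names plus arguments), it can be produced in one place and replayed in another, which is what makes the separation between evaluator and graphics backend useful.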
Images
• JPEG streams:
 • DOMImg.src = 'data:image/jpeg;base64,'
    + window.btoa(bytesToString(bytes));
• If not JPEG stream:
 • read bytes, convert to colorspace
 • imgData = canvas.getImageData()
 • fillWithPixelData(bytes, imgData)
 • canvas.putImageData(imgData)
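
Both image paths above can be sketched in a few lines. The function bodies are assumptions, not the pdf.js implementation: `jpegDataUrl` is a hypothetical name, and this `fillWithPixelData` only handles tightly packed 8-bit RGB input. In a browser, `getImageData` and `putImageData` are methods of the canvas 2D context, so a plain object stands in for `ImageData` here.

```javascript
// Convert bytes to a binary string (one char per byte) so btoa can encode it.
function bytesToString(bytes) {
  let s = '';
  for (let i = 0; i < bytes.length; i++) s += String.fromCharCode(bytes[i]);
  return s;
}

// JPEG path: hand the undecoded stream to the browser's own decoder.
function jpegDataUrl(bytes) {
  return 'data:image/jpeg;base64,' + btoa(bytesToString(bytes));
}

// Non-JPEG path: copy tightly packed RGB bytes into RGBA storage,
// which is the layout ImageData.data uses. Alpha is set to opaque.
function fillWithPixelData(rgb, imgData) {
  const out = imgData.data;
  const pixels = imgData.width * imgData.height;
  if (rgb.length < pixels * 3) throw new Error('not enough pixel data');
  for (let p = 0, i = 0, j = 0; p < pixels; p++, i += 3, j += 4) {
    out[j] = rgb[i];
    out[j + 1] = rgb[i + 1];
    out[j + 2] = rgb[i + 2];
    out[j + 3] = 255;
  }
  return imgData;
}

// Stand-in for an ImageData object: a 2x1 image, one red and one blue pixel.
// In a browser it would come from ctx.createImageData(2, 1) or
// ctx.getImageData(0, 0, 2, 1) and be drawn with ctx.putImageData(imgData, 0, 0).
const fakeImageData = { width: 2, height: 1, data: new Uint8ClampedArray(8) };
fillWithPixelData(new Uint8Array([255, 0, 0, 0, 0, 255]), fakeImageData);

// The first three bytes of a JPEG file (SOI marker plus the next marker prefix).
const jpegUrl = jpegDataUrl(new Uint8Array([0xff, 0xd8, 0xff]));
```

In a browser, the data URL would be assigned to the `src` of an image element, as the slide shows with `DOMImg.src`.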
Fonts
• There are lots of different font formats!
 • fonts are converted to OpenType
 • use CSS:
      @font-face { font-family: 'font0';
        src: url(data:font/opentype;base64, ...); }
• some fonts can’t be converted :(
 • use drawing commands?
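
Building the `@font-face` rule from converted font bytes might look like the sketch below. `fontFaceRule` and `installFont` are hypothetical names, and the conversion to OpenType itself is assumed to have happened already; the four bytes used in the example are only the signature of a CFF-flavoured OpenType file, not a usable font.

```javascript
// Build a CSS rule that embeds the converted font as a base64 data URL.
function fontFaceRule(fontName, openTypeBytes) {
  let binary = '';
  for (let i = 0; i < openTypeBytes.length; i++) {
    binary += String.fromCharCode(openTypeBytes[i]);
  }
  const url = 'data:font/opentype;base64,' + btoa(binary);
  return "@font-face { font-family: '" + fontName + "'; src: url(" + url + "); }";
}

// Browser-only: append the rule to the document.
// Defined but not invoked here, so the sketch also runs outside a browser.
function installFont(rule) {
  const style = document.createElement('style');
  style.textContent = rule;
  document.head.appendChild(style);
}

// "OTTO" is the signature that opens a CFF-flavoured OpenType file.
const demoRule = fontFaceRule('font0', new Uint8Array([0x4f, 0x54, 0x54, 0x4f]));
```

Once the rule is installed, canvas text can refer to the font by the generated family name (`font0` on the slide).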
Problems
(platform = browser + OS)

• No way to detect that a font is loaded (hacks)
• Font width (wrong on some platforms)
• Subpixel font size depending on platform
• Text selection
• Printing
• Speed
 • use workers (objects sent via postMessage lose their shape)
 • partial rendering
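
One reading of the worker remark above, offered here as an assumption rather than the speaker's stated meaning, is that objects copied by `postMessage` arrive as plain data: the structured clone algorithm keeps own properties but drops the prototype, so methods are gone on the receiving side. The sketch below simulates that copy without a real worker; `Glyph`, `toData`, `fromData`, and `simulatePostMessage` are hypothetical names.

```javascript
class Glyph {
  constructor(code, width) {
    this.code = code;
    this.width = width;
  }
  // Width in user units for a given font size (width is in 1/1000 em).
  scaledWidth(size) {
    return this.width * size / 1000;
  }
  // Plain-data form that survives the copy unchanged.
  toData() {
    return { code: this.code, width: this.width };
  }
  static fromData(data) {
    return new Glyph(data.code, data.width);
  }
}

// Stand-in for worker.postMessage: messages are copied, not shared.
// Falls back to a JSON round trip where structuredClone is unavailable.
function simulatePostMessage(message) {
  return typeof structuredClone === 'function'
    ? structuredClone(message)
    : JSON.parse(JSON.stringify(message));
}

const sent = new Glyph(65, 722);

// Sending the instance directly: the data arrives, the methods do not.
const received = simulatePostMessage(sent);

// Sending plain data and rebuilding the object on the other side.
const rebuilt = Glyph.fromData(simulatePostMessage(sent.toData()));
```

Under this reading, anything crossing the worker boundary has to be designed as plain data and rebuilt on arrival.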
Todo
• more font work, printing, speed
• support more of the rendering spec
• explore using SVG
• PDF forms, “advanced PDF features”
• infrastructure: automated testing, requireJS
• test more PDFs (need your help!)
Demo
Contact
Github:
 https://github.com/andreasgal/pdf.js
Mailing list:
 https://groups.google.com/group/mozilla.dev.pdf-js/topics
IRC:
 irc.mozilla.org #pdfjs

2011 09-pdfjs
