Julian Viereck

  @jviereck
+julian.viereck
Overview
5    • What is PDF.JS about
10   • How PDF is structured & processing in PDF.JS
15   • “Why are you doing this?”
5    • Firefox Integration
5    • What’s next?
15   • Demo
5    • Q&A
About me
• Bespin / Skywriter / Ace
• Firefox DevTools
• ETH Zurich (Physics)
• PDF.JS
• ?
PDF Viewer using Open Web Standards
What is PDF.JS

•   building faithful & efficient PDF viewer
•   HTML5 technology experiment
•   no native code
•   secure (web sandbox)
•   Mozilla Labs Project - Open Source (Github)
What is PDF.JS

• Not Firefox-Specific - all modern browsers
• 1.3 MB uncompressed JS
• ~33,000 lines of code
• viewer in different languages
• async API
How PDF is structured
PDF file:
 Header            PDF version
 Body [Objects]    sequence of objects: fonts, drawing cmds, images,
                   words, bookmarks, form fields
 xRef Table        mapping objID ➟ byte offset
 Trailer           root objID, xRef byte offset
root obj = ref to pages catalog
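A minimal sketch of reading that layout back to front (startxref, then xRef table, then trailer, then root object). The sample file and the helpers `buildSamplePdf` and `readXref` are hypothetical and written for illustration only; this is not PDF.JS code. It assumes an uncompressed, single-section xRef table and ASCII-only content, so string indices equal byte offsets.

```javascript
// Build a tiny PDF-like file so that byte offsets are computed, not hand-typed.
function buildSamplePdf() {
  const objects = [
    '1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n',
    '2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n',
  ];
  let body = '%PDF-1.4\n';                      // Header: PDF version
  const offsets = [];
  for (const obj of objects) {
    offsets.push(body.length);                  // remember where each object starts
    body += obj;
  }
  const xrefOffset = body.length;
  let xref = `xref\n0 ${objects.length + 1}\n0000000000 65535 f \n`;
  for (const off of offsets) {
    xref += `${String(off).padStart(10, '0')} 00000 n \n`;
  }
  const trailer = `trailer\n<< /Size ${objects.length + 1} /Root 1 0 R >>\n`;
  return body + xref + trailer + `startxref\n${xrefOffset}\n%%EOF\n`;
}

// Trailer -> xRef table (objID -> byte offset) -> root objID.
function readXref(pdf) {
  const startPos = pdf.lastIndexOf('startxref');
  const xrefOffset = parseInt(pdf.slice(startPos + 'startxref'.length), 10);
  const lines = pdf.slice(xrefOffset).split('\n');
  // lines[0] is "xref", lines[1] is "<first objID> <entry count>"
  const [first, count] = lines[1].split(' ').map(Number);
  const table = new Map();
  for (let i = 0; i < count; i++) {
    const [offset, , kind] = lines[2 + i].trim().split(' ');
    if (kind === 'n') table.set(first + i, Number(offset)); // "n" = in use
  }
  const rootMatch = /\/Root (\d+) \d+ R/.exec(pdf.slice(xrefOffset));
  return { table, rootId: Number(rootMatch[1]) };
}

const samplePdf = buildSamplePdf();
const { table, rootId } = readXref(samplePdf);
console.log(rootId, table.get(rootId)); // prints: 1 9
```

The point of the table is random access: a viewer can jump straight to the root object without scanning the body.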
Let’s look at it
Processing in PDF.JS
• get plain Uint8Array via XHR2, build Stream
• new PDFDoc(stream): read xRef, root object
• page = PDFDoc.getPage(N)
• page.startRendering(graphics)
 • PartialEvaluator: read & convert all PDF cmds ➟ Operation List (OL)
 • load required objects (fonts, images)
 • CanvasGraphics: graphics.executeOperatorList(OL)
Execution Example
• “get page 2” ➟ PartialEvaluator
• PartialEvaluator builds drawing cmds: draw(obj#3, dict.x, dict.y)
• Graphics executes the drawing cmds and asks Data: obj#3? dict.x, .y?
• Data answers: obj#3 = “foo”, x = 20, y = 30
• Graphics draws on canvas
Problem Processing
• Extracting data slow (compressed)
• Transform data (images) slow
• Sometimes a lot of objects on page
➡ Freezes UI
➡ Use WebWorker
➡ :( no direct memory access, only postMessage
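Main Thread ⇄ Web Worker
• Main Thread sends data and “get page 2” to the Web Worker
• Web Worker: PartialEvaluator (with its own Data) builds the Op List
• references are sent together with their values:
  draw(obj#3, dict.x, dict.y) ➟ draw(“foo”, 20, 30)
• Web Worker sends Operation List + Data back to the Main Thread
• Main Thread: Graphics draws on canvas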
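The worker hand-off can be sketched as follows. Because the main thread cannot reach into the worker's memory, references in the operation list are replaced by their values before the list is sent. The names `workerData`, `inlineData` and `simulatePostMessage` are hypothetical, and a JSON round trip stands in for the copy that `postMessage` performs (the real mechanism is the structured clone algorithm).

```javascript
// Hypothetical document data that only exists on the worker side.
const workerData = { 'obj#3': 'foo', 'dict.x': 20, 'dict.y': 30 };

// Replace every { ref } argument by its value so the list is self-contained.
function inlineData(operatorList, data) {
  return operatorList.map(({ fn, args }) => ({
    fn,
    args: args.map((arg) =>
      arg !== null && typeof arg === 'object' && 'ref' in arg ? data[arg.ref] : arg
    ),
  }));
}

// Stand-in for postMessage: the receiver gets a copy, never shared memory.
function simulatePostMessage(payload) {
  return JSON.parse(JSON.stringify(payload));
}

// What the evaluator builds on the worker: a draw call that still holds references.
const built = [
  { fn: 'draw', args: [{ ref: 'obj#3' }, { ref: 'dict.x' }, { ref: 'dict.y' }] },
];

// What arrives on the main thread: the same call with plain values.
const received = simulatePostMessage(inlineData(built, workerData));
console.log(JSON.stringify(received)); // prints: [{"fn":"draw","args":["foo",20,30]}]
```

The trade-off is that every value is copied once, in exchange for keeping the slow parsing work off the UI thread.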
xRef, catalog, + resources ➟ PartialEvaluator ➟ OL ➟ Graphics

PDF object                     Operator list (OL)
5 0 obj
<<
 /Length 8 0 R
>>
stream
   /GS1 gs                     setGState: [ LW: 10 ]
                               dependency: [ font0 ]
   /F0 12 Tf                   setFont: font0, 12
   BT                          beginText
     100 700 Td                moveText: 100, 700
     (Hello World!) Tj         showText: “Hello World!”
   ET                          endText
   50 600 m                    moveTo: 50, 600
   400 600 l                   lineTo: 400, 600
   S                           stroke
endstream
endobj
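The conversion from content stream to operator list can be sketched like this. It is a simplified illustration, not the actual PartialEvaluator: `tokenize`, `buildOperatorList` and `executeOperatorList` are hypothetical helpers, only the operators from the slide are known, strings may not contain nested or escaped parentheses, and resources are not resolved (so `/GS1` stays the name `GS1` and no dependency entry is produced).

```javascript
// Operator names as shown on the slide; only these nine are supported here.
const OPERATOR_NAMES = {
  gs: 'setGState', Tf: 'setFont', BT: 'beginText', Td: 'moveText',
  Tj: 'showText', ET: 'endText', m: 'moveTo', l: 'lineTo', S: 'stroke',
};

// Split the stream into (strings), /names, numbers and operators.
function tokenize(stream) {
  return stream.match(/\([^)]*\)|\/[^\s/()]+|[^\s/()]+/g) || [];
}

// PDF uses postfix notation: operands come first, the operator closes the command.
function buildOperatorList(stream) {
  const operatorList = [];
  let args = [];
  for (const token of tokenize(stream)) {
    if (token.startsWith('(')) {
      args.push(token.slice(1, -1));            // string operand
    } else if (token.startsWith('/')) {
      args.push(token.slice(1));                // name operand, e.g. F0
    } else if (/^[+-]?(\d+\.?\d*|\.\d+)$/.test(token)) {
      args.push(Number(token));                 // numeric operand
    } else {
      const fn = OPERATOR_NAMES[token];
      if (!fn) throw new Error(`Unsupported operator: ${token}`);
      operatorList.push({ fn, args });
      args = [];
    }
  }
  return operatorList;
}

// Replay the list on any object that offers matching methods; others are skipped.
function executeOperatorList(operatorList, graphics) {
  for (const { fn, args } of operatorList) {
    if (typeof graphics[fn] === 'function') graphics[fn](...args);
  }
}

const contentStream =
  '/GS1 gs /F0 12 Tf BT 100 700 Td (Hello World!) Tj ET 50 600 m 400 600 l S';
const operatorList = buildOperatorList(contentStream);
console.log(operatorList.map((op) => op.fn).join(' '));
```

Separating "build the list" from "execute the list" is what later allows the two halves to run on different threads.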
Images
• JPEG streams:
 • DOMImg.src = 'data:image/jpeg;base64,'
    + window.btoa(bytesToString(bytes));
• If not JPEG stream:
 • read bytes, convert to colorspace
 • imgData = canvas.getImageData()
 • fillWithPixelData(bytes, imgData)
 • canvas.putImageData(imgData)
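Both image paths can be sketched without a browser. `bytesToString` and `fillWithPixelData` are named on the slide, but the bodies below are assumptions, as is the `jpegDataUrl` helper; the pixel routine assumes 8-bit RGB input. In a browser, `getImageData` and `putImageData` are methods of the canvas 2D context, and the plain object in the usage lines stands in for an `ImageData`.

```javascript
// Turn raw bytes into a "binary string" (one char per byte), as btoa expects.
function bytesToString(bytes) {
  let str = '';
  for (const b of bytes) str += String.fromCharCode(b);
  return str;
}

// btoa exists in browsers and recent Node versions; fall back to Buffer otherwise.
const toBase64 = typeof btoa === 'function'
  ? (s) => btoa(s)
  : (s) => Buffer.from(s, 'binary').toString('base64');

// JPEG path: let the browser decode the stream via a data URL.
function jpegDataUrl(bytes) {
  return 'data:image/jpeg;base64,' + toBase64(bytesToString(bytes));
}

// Non-JPEG path: copy RGB triples into the RGBA layout of ImageData.data.
function fillWithPixelData(rgbBytes, imgData) {
  const out = imgData.data;
  for (let src = 0, dst = 0; src + 2 < rgbBytes.length; src += 3, dst += 4) {
    out[dst] = rgbBytes[src];
    out[dst + 1] = rgbBytes[src + 1];
    out[dst + 2] = rgbBytes[src + 2];
    out[dst + 3] = 255;                         // fully opaque
  }
}

// Usage: a JPEG header fragment and a 2x1 image (one red pixel, one blue pixel).
const jpegUrl = jpegDataUrl(new Uint8Array([0xff, 0xd8, 0xff]));
const fakeImageData = { width: 2, height: 1, data: new Uint8ClampedArray(8) };
fillWithPixelData(new Uint8Array([255, 0, 0, 0, 0, 255]), fakeImageData);
console.log(jpegUrl); // prints: data:image/jpeg;base64,/9j/
```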
Jpeg, but...

• no native support for JPEG 2000, CMYK
 ➡ use JS implementation
‣ works, not that performant but good enough
Fonts
• There are lots of different font formats!
 • fonts are converted to OpenType
 • use CSS for loading:
      @font-face { font-family:'font0';
    src:url(data:font/opentype;base64, ...); }
• Fonts are sanitized by browser
 • Need to rebuild malformed fonts :/
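A sketch of building such a rule from font bytes. `fontFaceRule` is a hypothetical helper, and the four sample bytes are only the `OTTO` signature that starts a CFF-based OpenType file, not a usable font.

```javascript
// Build an @font-face rule that embeds the converted font as a data URL.
function fontFaceRule(fontName, fontBytes) {
  let binary = '';
  for (const b of fontBytes) binary += String.fromCharCode(b);
  const base64 = typeof btoa === 'function'
    ? btoa(binary)
    : Buffer.from(binary, 'binary').toString('base64');
  return `@font-face { font-family:'${fontName}'; ` +
    `src:url(data:font/opentype;base64,${base64}); }`;
}

const sampleRule = fontFaceRule('font0', new Uint8Array([0x4f, 0x54, 0x54, 0x4f]));
console.log(sampleRule);
```

In a page, the resulting string would be added to a stylesheet; the browser then sanitizes the font data before using it, which is why malformed fonts have to be rebuilt first.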
“Why are you doing this?”
            aka.
    ∃ C/C++ libraries
    = isn’t that faster?
“Performance is not
 the only measure”
1. Security
Most vulnerable programs




 Source: http://www.csis.dk/en/csis/news/3321
~ 25% crashes in Firefox
   are Plugin related
2. WebSpecific Viewer
3. Drive Innovation
4. Speed
• Rendering slower than C/C++
• BUT
 • Partial downloading
 • Render page in background
 • Make slow parts faster
 • Mostly: Good enough
5. Can do better
6. Push WebPlatform
B2G aka. Boot2Gecko
New API: Printing
• Printing very limited on the web right now
• no way to achieve native printing experience
• NEED: New API for printing
 • mozPrintCallback
 • define canvas content during printing
 • send drawing commands directly to printer
Print
WebPage ➠ Single Pages (Page 1, Page 2)
For each page:
• Find print canvas on page
• Execute printCallback
• All canvas done ➠ print page
canvas.mozPrintCallback
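A minimal sketch of hooking into that flow. `mozPrintCallback` is a non-standard, Firefox-only property; the callback receives an object with a `context` to draw on and a `done()` method to signal completion. `attachPrintCallback` and `drawPage` are hypothetical names, and the fake canvas in the usage lines only stands in for a real `<canvas>` element.

```javascript
// Register a print callback; drawPage paints one page onto the given context.
function attachPrintCallback(canvas, drawPage) {
  canvas.mozPrintCallback = function (printState) {
    const ctx = printState.context;             // context that targets the printer
    drawPage(ctx);
    printState.done();                          // tell the browser this canvas is finished
  };
}

// Usage with stand-ins: the browser would call mozPrintCallback during printing.
const printLog = [];
const fakePrintCanvas = {};
attachPrintCallback(fakePrintCanvas, (ctx) => printLog.push(`drew on ${ctx.name}`));
fakePrintCanvas.mozPrintCallback({
  context: { name: 'printer-ctx' },
  done: () => printLog.push('done'),
});
console.log(printLog.join(', ')); // prints: drew on printer-ctx, done
```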
Firefox Integration
• PDF.JS as bundled Addon in Firefox Nightly
• Getting in Release Channel is hard
 • 400M users have expectations
 • more testing coverage
 • accessibility
 • match UX expectation
 • fallback if something is not working
Firefox Integration

• Goal: make it in time for the Aurora merge (6/5)
• Firefox Specific, BUT
 • quality improvements are browser-independent
 • only small parts Firefox specific
What’s next
• Fix broken PDFs
• Improve performance
• Improve Text selection
• Text search
• Form support
• Printing support
Demo
Contributing

• Lots of areas
 • Translation
 • Writing Code (embeddable viewer?)
 • Testing (Firefox Auto-Update Addon)
Github:
 https://github.com/mozilla/pdf.js (Readme, Issues, Wiki)
Twitter:
 @pdfjs
Mailing List:
 https://groups.google.com/group/mozilla.dev.pdf-js/topics
IRC:
 irc.mozilla.org #pdfjs
Engineering Weekly Call:
 Thursday - 10:00am PDT
Q&A

Viewing PDFs with Open Web Standards
