SlideShare ist ein Scribd-Unternehmen logo
1 von 128
Downloaden Sie, um offline zu lesen
High Performance JavaScript: JITs in Firefox
Power of the Web
How did we get here? ,[object Object]
Adding new technologies, like video, audio, integration and 2D+3D drawing,[object Object]
Getting a Webpage www.intel.com 192.198.164.158
Getting a Webpage www.intel.com 192.198.164.158 GET / HTTP/1.0 (HTTP Request)
Getting a Webpage <html><head><title>Laptop, Notebooks, ...</title></head><script language="javascript">functiondoStuff() {...}</script><body>... (HTML, CSS, JavaScript)
Core Web Components ,[object Object]
CSS provides styling
JavaScript provides a programming interface,[object Object]
Video, Audio tags
WebGL (OpenGL + JavaScript),[object Object]
CSS Cascading Style Sheets Separates presentation from content <b class=”warning”>This is red</b> .warning {color: red;text-decoration: underline;}
What is JavaScript? ,[object Object]
De facto language of the web
Interacts with the DOM and the browser,[object Object]
Displaying a Webpage Layout engine applies CSS to DOM Computes geometry of elements JavaScript engine executes code This can change DOM, triggers “reflow” Finished layout is sent to graphics layer
The Result
JavaScript ,[object Object]
Initial version written in 10 days
Anything from small browser interactions, complex apps (Gmail) to intense graphics (WebGL),[object Object]
Think LISP or Self, not Java Untyped - no type declarations Multi-Paradigm – objects, closures, first-class functions Highly dynamic - objects are dictionaries
No Type Declarations ,[object Object],functionf(a) {var x =72.3;if (a)        x = a +"string";return x;}
No Type Declarations ,[object Object],if a is:    number: add;    string: concat; functionf(a) {var x =“hi”;if (a)        x = a +33.2;return x;}
Functional ,[object Object],functionf(a) {returnfunction () {return a;    }}var m = f(5);print(m());
Objects ,[object Object]
Properties may be deleted or added at any time!var point = { x : 5, y : 10 };deletepoint.x;point.z=12;
Prototypes Every object can have a prototype object If a property is not found on an object, its prototype is searched instead … And the prototype’s prototype, etc..
Numbers ,[object Object]
Engines use 32-bit integers to optimize
Must preserve semantics: integers overflow to doubles,[object Object]
JavaScript Implementations ,[object Object]
Garbage collector
Runtime library
Execution engine (VM),[object Object]
Values ,[object Object]
Unboxed values required for computation,[object Object]
Easy idea: 96-bit struct
32-bits for type
64-bits for double, pointer, or integer
Too big! We have to pack better.,[object Object]
[object Object]
Doubles are normal IEEE-754
How to pack non-doubles?
51 bits of NaN space!IA32, ARM Nunboxing
Nunboxing IA32, ARM Type Payload 0x400c0000 0x00000000 63 0 ,[object Object]
Type is double because it’s not a NaN
Encodes: (Double, 3.5),[object Object]
Value is in NaN space
Type is 0xFFFF0001(Int32)
Value is (Int32, 0x00000040),[object Object]
Value is in NaN space
Bits >> 47 == 0x1FFFF == INT32
Bits 47-63 masked off to retrieve payload
Value(Int32, 0x00000040),[object Object]
Garbage Collection Need to reclaim memory without pausing user workflow, animations, etc Need very fast object allocation Consider lots of small objects like points, vectors Reading and writing to the heap must be fast
Mark and Sweep All live objects are found via a recursive traversal of the root value set All dead objects are added to a free list Very slow to traverse entire heap Building free lists can bring “dead” memory into cache
Improvements ,[object Object]
Marking can be incremental, interleaved with program execution,[object Object]
Generational GC Inactive Active Newborn Space 8MB 8MB Tenured Space
After Minor GC Active Inactive Newborn Space 8MB 8MB Tenured Space
Barriers Incremental marking, generational GC need read or write barriers Read barriers are much slower Azul Systems has hardware read barrier Write barrier seems to be well-predicted?
Unknowns How much do cache effects matter? Memory GC has to touch Locality of objects What do we need to consider to make GC fast?
Running JavaScript ,[object Object]
Basic JIT – Simple, untyped compiler
Advanced JIT – Typed compiler, lightspeed,[object Object]
Giant switch loop
Handles all edge cases of JS semantics,[object Object]
Interpreter case OP_ADD: {        Value lhs = POP();        Value rhs = POP();        Value result;if (lhs.isInt32() && rhs.isInt32()) {int left = rhs.toInt32();int right = rhs.toInt32();if (AddOverflows(left, right, left + right))                result.setInt32(left + right);elseresult.setNumber(double(left) + double(right));        } elseif (lhs.isString() || rhs.isString()) {            String *left = ValueToString(lhs);            String *right = ValueToString(rhs);            String *r = Concatenate(left, right);result.setString(r);        } else {double left = ValueToNumber(lhs);double right = ValueToNumber(rhs);result.setDouble(left + right);        }PUSH(result);break;
Interpreter Very slow! Lots of opcode and type dispatch Lots of interpreter stack traffic Just-in-time compilation solves both
Just In Time Compilation must be very fast Can’t introduce noticeable pauses People care about memory use Can’t JIT everything Can’t create bloated code May have to discard code at any time
Basic JIT JägerMonkey in Firefox 4 Every opcode has a hand-coded template of assembly (registers left blank) Method at a time: Single pass through bytecode stream! Compiler uses assembly templates corresponding to each opcode
Bytecode GETARG 0 ; fetch x GETARG 1 ; fetch y ADD RETURN functionAdd(x, y) {return x + y;}
Interpreter case OP_ADD: {        Value lhs = POP();        Value rhs = POP();        Value result;if (lhs.isInt32() && rhs.isInt32()) {int left = rhs.toInt32();int right = rhs.toInt32();if (AddOverflows(left, right, left + right))                result.setInt32(left + right);elseresult.setNumber(double(left) + double(right));        } elseif (lhs.isString() || rhs.isString()) {            String *left = ValueToString(lhs);            String *right = ValueToString(rhs);            String *r = Concatenate(left, right);result.setString(r);        } else {double left = ValueToNumber(lhs);double right = ValueToNumber(rhs);result.setDouble(left + right);        }PUSH(result);break;
Assembling ADD ,[object Object]
Observation:
Some input types much more common than others,[object Object]
Assembling ADD ,[object Object]
Large design space
Can consider integers common, or
Integers and doubles, or
Anything!,[object Object]
Basic JIT - ADD if (arg0.type != INT32) gotoslow_add;if (arg1.type != INT32)gotoslow_add; R0 = arg0.data R1 = arg1.data Greedy Register Allocator
Basic JIT - ADD if (arg0.type != INT32) gotoslow_add;if (arg1.type != INT32)gotoslow_add; R0 = arg0.data;R1 = arg1.data; R2 = R0 + R1; if (OVERFLOWED)gotoslow_add;
Slow Paths slow_add:    Value result = runtime::Add(arg0, arg1);
Inline Out-of-line if (arg0.type != INT32) gotoslow_add;if (arg1.type != INT32)gotoslow_add;R0 = arg0.data;R1 = arg1.data;R2 = R0 + R1;  if (OVERFLOWED)gotoslow_add;rejoin: slow_add:    Value result = Interpreter::Add(arg0, arg1); R2 = result.data; goto rejoin;
Rejoining Inline Out-of-line   if (arg0.type != INT32) gotoslow_add;  if (arg1.type != INT32)gotoslow_add;  R0 = arg0.data;  R1 = arg1.data;  R2 = R0 + R1;  if (OVERFLOWED)gotoslow_add; R3 = TYPE_INT32;rejoin: slow_add:    Value result = runtime::Add(arg0, arg1);     R2 = result.data;     R3 = result.type; goto rejoin;
Final Code Inline Out-of-line if (arg0.type != INT32) goto slow_add; if (arg1.type != INT32)goto slow_add;R0 = arg0.data;R1 = arg1.data;R2 = R0 + R1;  if (OVERFLOWED)goto slow_add; R3 = TYPE_INT32;rejoin: return Value(R3, R2); slow_add:    Value result = Interpreter::Add(arg0, arg1); R2 = result.data; R3 = result.type; goto rejoin;
Observations Even though we have fast paths, no type information can flow in between opcodes Cannot assume result of ADD is integer Means more branches Types increase register pressure
Basic JIT – Object Access ,[object Object]
How can we create a fast-path?
We could try to cache the lookup, but
We want it to work for >1 objectsfunctionf(x) {returnx.y;}
Object Access Observation Usually, objects flowing through an access site look the same If we knew an object’s layout, we could bypass the dictionary lookup
Object Layout var obj = { x: 50,              y: 100,            z: 20     };
Object Layout
Object Layout varobj = { x: 50,              y: 100,            z: 20     }; … varobj2 = { x: 78,              y: 93,              z: 600     };
Object Layout
Familiar Problems We don’t know an object’s shape during JIT compilation ... Or even that a property access receives an object! Solution: leave code “blank,” lazily generate it later
Inline Caches functionf(x) {returnx.y;}
Inline Caches ,[object Object],if (arg0.type != OBJECT)gotoslow_property; JSObject *obj = arg0.data;
[object Object], if (arg0.type != OBJECT)gotoslow_property; JSObject *obj = arg0.data; gotoproperty_ic; Inline Caches
Property ICs functionf(x) {returnx.y;} f({x: 5, y: 10, z: 30});
Property ICs
Property ICs
Property ICs
Inline Caches ,[object Object], if (arg0.type != OBJECT)gotoslow_property; JSObject *obj = arg0.data; gotoproperty_ic;
Inline Caches ,[object Object], if (arg0.type != OBJECT)gotoslow_property; JSObject *obj = arg0.data; gotoproperty_ic;
Inline Caches ,[object Object],  if (arg0.type != OBJECT)gotoslow_property; JSObject *obj = arg0.data;   if (obj->shape != SHAPE1) gotoproperty_ic; R0 = obj->slots[1].type; R1 = obj->slots[1].data;
Polymorphism ,[object Object]
Add another fast-path...functionf(x) {returnx.y;}f({ x: 5, y: 10, z: 30});f({ y: 40});
Polymorphism Main Method  if (arg0.type != OBJECT)gotoslow_path;JSObject *obj = arg0.data;if (obj->shape != SHAPE1)gotoproperty_ic;R0 = obj->slots[1].type;R1 = obj->slots[1].data;rejoin:
Polymorphism Main Method Generated Stub   if (arg0.type != OBJECT)gotoslow_path;JSObject *obj = arg0.data;  if (obj->shape != SHAPE0)gotoproperty_ic;  R0 = obj->slots[1].type;  R1 = obj->slots[1].data;rejoin:   stub: if (obj->shape != SHAPE2)gotoproperty_ic; R0 = obj->slots[0].type;R1 = obj->slots[0].data; goto rejoin;
Polymorphism Main Method Generated Stub if (arg0.type != OBJECT)    goto slow_path;  JSObject *obj = arg0.data;if (obj->shape != SHAPE1)goto stub;  R0 = obj->slots[1].type;  R1 = obj->slots[1].data;rejoin:   stub: if (obj->shape != SHAPE2)goto property_ic; R0 = obj->slots[0].type;R1 = obj->slots[0].data; goto rejoin;
Chain of Stubs if(obj->shape == S1) {     result = obj->slots[0]; } else { if (obj->shape == S2) {         result = obj->slots[1];     } else { if (obj->shape == S3) {              result = obj->slots[2];         } else { if (obj->shape == S4) {                 result = obj->slots[3];             } else { goto property_ic;             }         }     } }
Code Memory ,[object Object]
Self-modifying, but single threaded
Code memory is always rwx
Concerned about protection-flipping expense,[object Object]
Inline caches adapt code based on runtime observations
Lots of guards, poor type information,[object Object]
We could generate an IR, but without type information...
Slow paths prevent most optimizations.,[object Object]
Code Motion? functionAdd(x, n) { varsum = 0; vartemp0 = x + n;for (vari = 0; i < n; i++)         sum += temp0;     return sum;} Let’s try it.
Wrong Result var global =0;varobj= {valueOf: function () {return++global;    }}Add(obj, 10); ,[object Object]
Hoisted version returns 110!,[object Object]
Ideal Scenario ,[object Object]
Remove slow paths

Weitere ähnliche Inhalte

Was ist angesagt?

Fun with Lambdas: C++14 Style (part 2)
Fun with Lambdas: C++14 Style (part 2)Fun with Lambdas: C++14 Style (part 2)
Fun with Lambdas: C++14 Style (part 2)Sumant Tambe
 
Introduction to Objective - C
Introduction to Objective - CIntroduction to Objective - C
Introduction to Objective - CJussi Pohjolainen
 
Fun with Lambdas: C++14 Style (part 1)
Fun with Lambdas: C++14 Style (part 1)Fun with Lambdas: C++14 Style (part 1)
Fun with Lambdas: C++14 Style (part 1)Sumant Tambe
 
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...Positive Hack Days
 
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizationsEgor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizationsEgor Bogatov
 
BayFP: Concurrent and Multicore Haskell
BayFP: Concurrent and Multicore HaskellBayFP: Concurrent and Multicore Haskell
BayFP: Concurrent and Multicore HaskellBryan O'Sullivan
 
Cocoaheads Meetup / Alex Zimin / Swift magic
Cocoaheads Meetup / Alex Zimin / Swift magicCocoaheads Meetup / Alex Zimin / Swift magic
Cocoaheads Meetup / Alex Zimin / Swift magicBadoo Development
 
A look into the sanitizer family (ASAN & UBSAN) by Akul Pillai
A look into the sanitizer family (ASAN & UBSAN) by Akul PillaiA look into the sanitizer family (ASAN & UBSAN) by Akul Pillai
A look into the sanitizer family (ASAN & UBSAN) by Akul PillaiCysinfo Cyber Security Community
 
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]Chris Adamson
 
DEFUN 2008 - Real World Haskell
DEFUN 2008 - Real World HaskellDEFUN 2008 - Real World Haskell
DEFUN 2008 - Real World HaskellBryan O'Sullivan
 
Memory efficient pytorch
Memory efficient pytorchMemory efficient pytorch
Memory efficient pytorchHyungjoo Cho
 
Software Design in Practice (with Java examples)
Software Design in Practice (with Java examples)Software Design in Practice (with Java examples)
Software Design in Practice (with Java examples)Ganesh Samarthyam
 

Was ist angesagt? (20)

Fun with Lambdas: C++14 Style (part 2)
Fun with Lambdas: C++14 Style (part 2)Fun with Lambdas: C++14 Style (part 2)
Fun with Lambdas: C++14 Style (part 2)
 
The Style of C++ 11
The Style of C++ 11The Style of C++ 11
The Style of C++ 11
 
Python Scipy Numpy
Python Scipy NumpyPython Scipy Numpy
Python Scipy Numpy
 
Introduction to Objective - C
Introduction to Objective - CIntroduction to Objective - C
Introduction to Objective - C
 
Fun with Lambdas: C++14 Style (part 1)
Fun with Lambdas: C++14 Style (part 1)Fun with Lambdas: C++14 Style (part 1)
Fun with Lambdas: C++14 Style (part 1)
 
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
 
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizationsEgor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
 
BayFP: Concurrent and Multicore Haskell
BayFP: Concurrent and Multicore HaskellBayFP: Concurrent and Multicore Haskell
BayFP: Concurrent and Multicore Haskell
 
Scientific Python
Scientific PythonScientific Python
Scientific Python
 
Java Script Workshop
Java Script WorkshopJava Script Workshop
Java Script Workshop
 
Cocoaheads Meetup / Alex Zimin / Swift magic
Cocoaheads Meetup / Alex Zimin / Swift magicCocoaheads Meetup / Alex Zimin / Swift magic
Cocoaheads Meetup / Alex Zimin / Swift magic
 
A look into the sanitizer family (ASAN & UBSAN) by Akul Pillai
A look into the sanitizer family (ASAN & UBSAN) by Akul PillaiA look into the sanitizer family (ASAN & UBSAN) by Akul Pillai
A look into the sanitizer family (ASAN & UBSAN) by Akul Pillai
 
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
 
What's New in C++ 11?
What's New in C++ 11?What's New in C++ 11?
What's New in C++ 11?
 
DEFUN 2008 - Real World Haskell
DEFUN 2008 - Real World HaskellDEFUN 2008 - Real World Haskell
DEFUN 2008 - Real World Haskell
 
Functional programming in C++
Functional programming in C++Functional programming in C++
Functional programming in C++
 
Memory efficient pytorch
Memory efficient pytorchMemory efficient pytorch
Memory efficient pytorch
 
Using Parallel Computing Platform - NHDNUG
Using Parallel Computing Platform - NHDNUGUsing Parallel Computing Platform - NHDNUG
Using Parallel Computing Platform - NHDNUG
 
Software Design in Practice (with Java examples)
Software Design in Practice (with Java examples)Software Design in Practice (with Java examples)
Software Design in Practice (with Java examples)
 
C language
C languageC language
C language
 

Ähnlich wie High Performance JavaScript: JITs in Firefox Optimize Integer Math

Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#Juan Pablo
 
Too Much Data? - Just Sample, Just Hash, ...
Too Much Data? - Just Sample, Just Hash, ...Too Much Data? - Just Sample, Just Hash, ...
Too Much Data? - Just Sample, Just Hash, ...Andrii Gakhov
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Romain Francois
 
Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CSteffen Wenz
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerAndrey Karpov
 
C Programming Language
C Programming LanguageC Programming Language
C Programming LanguageRTS Tech
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Romain Francois
 
Beyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic AnalysisBeyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic AnalysisFastly
 
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemWprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemSages
 
devLink - What's New in C# 4?
devLink - What's New in C# 4?devLink - What's New in C# 4?
devLink - What's New in C# 4?Kevin Pilch
 
VHDL Packages, Coding Styles for Arithmetic Operations and VHDL-200x Additions
VHDL Packages, Coding Styles for Arithmetic Operations and VHDL-200x AdditionsVHDL Packages, Coding Styles for Arithmetic Operations and VHDL-200x Additions
VHDL Packages, Coding Styles for Arithmetic Operations and VHDL-200x AdditionsAmal Khailtash
 
How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...Codemotion
 
掀起 Swift 的面紗
掀起 Swift 的面紗掀起 Swift 的面紗
掀起 Swift 的面紗Pofat Tseng
 
Category theory, Monads, and Duality in the world of (BIG) Data
Category theory, Monads, and Duality in the world of (BIG) DataCategory theory, Monads, and Duality in the world of (BIG) Data
Category theory, Monads, and Duality in the world of (BIG) Datagreenwop
 

Ähnlich wie High Performance JavaScript: JITs in Firefox Optimize Integer Math (20)

Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#
 
Code optimization
Code optimization Code optimization
Code optimization
 
Code optimization
Code optimization Code optimization
Code optimization
 
Too Much Data? - Just Sample, Just Hash, ...
Too Much Data? - Just Sample, Just Hash, ...Too Much Data? - Just Sample, Just Hash, ...
Too Much Data? - Just Sample, Just Hash, ...
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++
 
Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in C
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
 
C Programming Language
C Programming LanguageC Programming Language
C Programming Language
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++
 
Beyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic AnalysisBeyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic Analysis
 
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemWprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
 
devLink - What's New in C# 4?
devLink - What's New in C# 4?devLink - What's New in C# 4?
devLink - What's New in C# 4?
 
C tutorial
C tutorialC tutorial
C tutorial
 
C tutorial
C tutorialC tutorial
C tutorial
 
C tutorial
C tutorialC tutorial
C tutorial
 
Monads in Swift
Monads in SwiftMonads in Swift
Monads in Swift
 
VHDL Packages, Coding Styles for Arithmetic Operations and VHDL-200x Additions
VHDL Packages, Coding Styles for Arithmetic Operations and VHDL-200x AdditionsVHDL Packages, Coding Styles for Arithmetic Operations and VHDL-200x Additions
VHDL Packages, Coding Styles for Arithmetic Operations and VHDL-200x Additions
 
How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...
 
掀起 Swift 的面紗
掀起 Swift 的面紗掀起 Swift 的面紗
掀起 Swift 的面紗
 
Category theory, Monads, and Duality in the world of (BIG) Data
Category theory, Monads, and Duality in the world of (BIG) DataCategory theory, Monads, and Duality in the world of (BIG) Data
Category theory, Monads, and Duality in the world of (BIG) Data
 

Kürzlich hochgeladen

Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessWSO2
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...itnewsafrica
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Mark Simos
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 

Kürzlich hochgeladen (20)

Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 

High Performance JavaScript: JITs in Firefox Optimize Integer Math

  • 3.
  • 4.
  • 5. Getting a Webpage www.intel.com 192.198.164.158
  • 6. Getting a Webpage www.intel.com 192.198.164.158 GET / HTTP/1.0 (HTTP Request)
  • 7. Getting a Webpage <html><head><title>Laptop, Notebooks, ...</title></head><script language="javascript">functiondoStuff() {...}</script><body>... (HTML, CSS, JavaScript)
  • 8.
  • 10.
  • 12.
  • 13. CSS Cascading Style Sheets Separates presentation from content <b class=”warning”>This is red</b> .warning {color: red;text-decoration: underline;}
  • 14.
  • 15. De facto language of the web
  • 16.
  • 17. Displaying a Webpage Layout engine applies CSS to DOM Computes geometry of elements JavaScript engine executes code This can change DOM, triggers “reflow” Finished layout is sent to graphics layer
  • 19.
  • 21.
  • 22. Think LISP or Self, not Java Untyped - no type declarations Multi-Paradigm – objects, closures, first-class functions Highly dynamic - objects are dictionaries
  • 23.
  • 24.
  • 25.
  • 26.
  • 27. Properties may be deleted or added at any time!var point = { x : 5, y : 10 };deletepoint.x;point.z=12;
  • 28. Prototypes Every object can have a prototype object If a property is not found on an object, its prototype is searched instead … And the prototype’s prototype, etc..
  • 29.
  • 30. Engines use 32-bit integers to optimize
  • 31.
  • 32.
  • 35.
  • 36.
  • 37.
  • 40. 64-bits for double, pointer, or integer
  • 41.
  • 42.
  • 44. How to pack non-doubles?
  • 45. 51 bits of NaN space!IA32, ARM Nunboxing
  • 46.
  • 47. Type is double because it’s not a NaN
  • 48.
  • 49. Value is in NaN space
  • 51.
  • 52. Value is in NaN space
  • 53. Bits >> 47 == 0x1FFFF == INT32
  • 54. Bits 47-63 masked off to retrieve payload
  • 55.
  • 56. Garbage Collection Need to reclaim memory without pausing user workflow, animations, etc Need very fast object allocation Consider lots of small objects like points, vectors Reading and writing to the heap must be fast
  • 57. Mark and Sweep All live objects are found via a recursive traversal of the root value set All dead objects are added to a free list Very slow to traverse entire heap Building free lists can bring “dead” memory into cache
  • 58.
  • 59.
  • 60. Generational GC Inactive Active Newborn Space 8MB 8MB Tenured Space
  • 61. After Minor GC Active Inactive Newborn Space 8MB 8MB Tenured Space
  • 62. Barriers Incremental marking, generational GC need read or write barriers Read barriers are much slower Azul Systems has hardware read barrier Write barrier seems to be well-predicted?
  • 63. Unknowns How much do cache effects matter? Memory GC has to touch Locality of objects What do we need to consider to make GC fast?
  • 64.
  • 65. Basic JIT – Simple, untyped compiler
  • 66.
  • 68.
  • 69. Interpreter case OP_ADD: { Value lhs = POP(); Value rhs = POP(); Value result;if (lhs.isInt32() && rhs.isInt32()) {int left = rhs.toInt32();int right = rhs.toInt32();if (AddOverflows(left, right, left + right)) result.setInt32(left + right);elseresult.setNumber(double(left) + double(right)); } elseif (lhs.isString() || rhs.isString()) { String *left = ValueToString(lhs); String *right = ValueToString(rhs); String *r = Concatenate(left, right);result.setString(r); } else {double left = ValueToNumber(lhs);double right = ValueToNumber(rhs);result.setDouble(left + right); }PUSH(result);break;
  • 70. Interpreter Very slow! Lots of opcode and type dispatch Lots of interpreter stack traffic Just-in-time compilation solves both
  • 71. Just In Time Compilation must be very fast Can’t introduce noticeable pauses People care about memory use Can’t JIT everything Can’t create bloated code May have to discard code at any time
  • 72. Basic JIT JägerMonkey in Firefox 4 Every opcode has a hand-coded template of assembly (registers left blank) Method at a time: Single pass through bytecode stream! Compiler uses assembly templates corresponding to each opcode
  • 73. Bytecode GETARG 0 ; fetch x GETARG 1 ; fetch y ADD RETURN functionAdd(x, y) {return x + y;}
  • 74. Interpreter case OP_ADD: { Value lhs = POP(); Value rhs = POP(); Value result;if (lhs.isInt32() && rhs.isInt32()) {int left = rhs.toInt32();int right = rhs.toInt32();if (AddOverflows(left, right, left + right)) result.setInt32(left + right);elseresult.setNumber(double(left) + double(right)); } elseif (lhs.isString() || rhs.isString()) { String *left = ValueToString(lhs); String *right = ValueToString(rhs); String *r = Concatenate(left, right);result.setString(r); } else {double left = ValueToNumber(lhs);double right = ValueToNumber(rhs);result.setDouble(left + right); }PUSH(result);break;
  • 75.
  • 77.
  • 78.
  • 82.
  • 83. Basic JIT - ADD if (arg0.type != INT32) gotoslow_add;if (arg1.type != INT32)gotoslow_add; R0 = arg0.data R1 = arg1.data Greedy Register Allocator
  • 84. Basic JIT - ADD if (arg0.type != INT32) gotoslow_add;if (arg1.type != INT32)gotoslow_add; R0 = arg0.data;R1 = arg1.data; R2 = R0 + R1; if (OVERFLOWED)gotoslow_add;
  • 85. Slow Paths slow_add: Value result = runtime::Add(arg0, arg1);
  • 86. Inline Out-of-line if (arg0.type != INT32) gotoslow_add;if (arg1.type != INT32)gotoslow_add;R0 = arg0.data;R1 = arg1.data;R2 = R0 + R1; if (OVERFLOWED)gotoslow_add;rejoin: slow_add: Value result = Interpreter::Add(arg0, arg1); R2 = result.data; goto rejoin;
  • 87. Rejoining Inline Out-of-line if (arg0.type != INT32) gotoslow_add; if (arg1.type != INT32)gotoslow_add; R0 = arg0.data; R1 = arg1.data; R2 = R0 + R1; if (OVERFLOWED)gotoslow_add; R3 = TYPE_INT32;rejoin: slow_add: Value result = runtime::Add(arg0, arg1); R2 = result.data; R3 = result.type; goto rejoin;
  • 88. Final Code Inline Out-of-line if (arg0.type != INT32) goto slow_add; if (arg1.type != INT32)goto slow_add;R0 = arg0.data;R1 = arg1.data;R2 = R0 + R1; if (OVERFLOWED)goto slow_add; R3 = TYPE_INT32;rejoin: return Value(R3, R2); slow_add: Value result = Interpreter::Add(arg0, arg1); R2 = result.data; R3 = result.type; goto rejoin;
  • 89. Observations Even though we have fast paths, no type information can flow in between opcodes Cannot assume result of ADD is integer Means more branches Types increase register pressure
  • 90.
  • 91. How can we create a fast-path?
  • 92. We could try to cache the lookup, but
  • 93. We want it to work for >1 objectsfunctionf(x) {returnx.y;}
  • 94. Object Access Observation Usually, objects flowing through an access site look the same If we knew an object’s layout, we could bypass the dictionary lookup
  • 95. Object Layout var obj = { x: 50, y: 100, z: 20 };
  • 97. Object Layout varobj = { x: 50, y: 100, z: 20 }; … varobj2 = { x: 78, y: 93, z: 600 };
  • 99. Familiar Problems We don’t know an object’s shape during JIT compilation ... Or even that a property access receives an object! Solution: leave code “blank,” lazily generate it later
  • 100. Inline Caches functionf(x) {returnx.y;}
  • 101.
  • 102.
  • 103. Property ICs functionf(x) {returnx.y;} f({x: 5, y: 10, z: 30});
  • 107.
  • 108.
  • 109.
  • 110.
  • 111. Add another fast-path...functionf(x) {returnx.y;}f({ x: 5, y: 10, z: 30});f({ y: 40});
  • 112. Polymorphism Main Method if (arg0.type != OBJECT)gotoslow_path;JSObject *obj = arg0.data;if (obj->shape != SHAPE1)gotoproperty_ic;R0 = obj->slots[1].type;R1 = obj->slots[1].data;rejoin:
  • 113. Polymorphism Main Method Generated Stub if (arg0.type != OBJECT)gotoslow_path;JSObject *obj = arg0.data; if (obj->shape != SHAPE0)gotoproperty_ic; R0 = obj->slots[1].type; R1 = obj->slots[1].data;rejoin: stub: if (obj->shape != SHAPE2)gotoproperty_ic; R0 = obj->slots[0].type;R1 = obj->slots[0].data; goto rejoin;
  • 114. Polymorphism Main Method Generated Stub if (arg0.type != OBJECT) goto slow_path; JSObject *obj = arg0.data;if (obj->shape != SHAPE1)goto stub; R0 = obj->slots[1].type; R1 = obj->slots[1].data;rejoin: stub: if (obj->shape != SHAPE2)goto property_ic; R0 = obj->slots[0].type;R1 = obj->slots[0].data; goto rejoin;
  • 115. Chain of Stubs if(obj->shape == S1) { result = obj->slots[0]; } else { if (obj->shape == S2) { result = obj->slots[1]; } else { if (obj->shape == S3) { result = obj->slots[2]; } else { if (obj->shape == S4) { result = obj->slots[3]; } else { goto property_ic; } } } }
  • 116.
  • 118. Code memory is always rwx
  • 119.
  • 120. Inline caches adapt code based on runtime observations
  • 121.
  • 122. We could generate an IR, but without type information...
  • 123.
  • 124. Code Motion? functionAdd(x, n) { varsum = 0; vartemp0 = x + n;for (vari = 0; i < n; i++) sum += temp0; return sum;} Let’s try it.
  • 125.
  • 126.
  • 127.
  • 129.
  • 130. Guesses must be informed
  • 131. Don’t want to waste time compiling code that can’t or won’t run
  • 132. Using type inference or runtime profiling
  • 133.
  • 134.
  • 135. Constructs high and low-level IRs
  • 136. Applies type information using runtime feedback
  • 137.
  • 138. Build SSA GETARG 0 ; fetch x GETARG 1 ; fetch y ADD GETARG 0 ; fetch x ADD RETURN v0 = arg0 v1 = arg1
  • 139. Build SSA GETARG 0 ; fetch x GETARG 1 ; fetch y ADD GETARG 0 ; fetch x ADD RETURN v0 = arg0 v1 = arg1
  • 140. Type Oracle Need a mechanism to inform compiler about likely types Type Oracle: given program counter, returns types for inputs and outputs May use any data source – runtime profiling, type inference, etc
  • 141. Type Oracle For this example, assume the type oracle returns “integer” for the output and both inputs to all ADD operations
  • 142. Build SSA GETARG 0 ; fetch x GETARG 1 ; fetch y ADD GETARG 0 ; fetch x ADD RETURN v0 = arg0 v1 = arg1 v2 = iadd(v0, v1)
  • 143. Build SSA GETARG 0 ; fetch x GETARG 1 ; fetch y ADD GETARG 0 ; fetch x ADD RETURN v0 = arg0 v1 = arg1 v2 = iadd(v0, v1) v3 = iadd(v2, v0)
  • 144. Build SSA GETARG 0 ; fetch x GETARG 1 ; fetch y ADD GETARG 0 ; fetch x ADD RETURN v0 = arg0 v1 = arg1 v2 = iadd(v0, v1) v3 = iadd(v2, v0) v4 = return(v3)
  • 145. SSA (Intermediate) v0 = arg0 v1 = arg1 v2 = iadd(v0, v1) v3 = iadd(v2, v0) v4 = return(v3) Untyped Typed
  • 146.
  • 148. Makes sure SSA inputs have correct type
  • 149.
  • 150. Optimization v0 = arg0 v1 = arg1 v2 = unbox(v0, INT32) v3 = unbox(v1, INT32) v4 = iadd(v2, v3) v5 = unbox(v0, INT32) v6 = iadd(v4, v5) v7 = return(v6)
  • 151. Optimization v0 = arg0 v1 = arg1 v2 = unbox(v0, INT32) v3 = unbox(v1, INT32) v4 = iadd(v2, v3) v5 = unbox(v0, INT32) v6 = iadd(v4, v5) v7 = return(v6)
  • 152. Optimized SSA v0 = arg0 v1 = arg1 v2 = unbox(v0, INT32) v3 = unbox(v1, INT32) v4 = iadd(v2, v3) v5 = iadd(v4, v2) v6 = return(v5)
  • 153. Register Allocation Linear Scan Register Allocation Based on Christian Wimmer’s work Linear Scan Register Allocation for the Java HotSpot™ Client Compiler. Master's thesis, Institute for System Software, Johannes Kepler University Linz, 2004 Linear Scan Register Allocation on SSA Form. In Proceedings of the International Symposium on Code Generation and Optimization, pages 170–179. ACM Press, 2010. Same algorithm used in Hotspot JVM
  • 154. Allocate Registers 0: stack arg0 1: stack arg1 2: r1 unbox(0, INT32) 3: r0 unbox(1, INT32) 4: r0 iadd(2, 3) 5: r0 iadd(4, 2) 6: r0 return(5)
  • 155. Code Generation SSA 0: stack arg0 1: stack arg1 2: r1 unbox(0, INT32) 3: r0 unbox(1, INT32) 4: r0 iadd(2, 3) 5: r0 iadd(4, 2) 6: r0 return(5)
  • 156. Code Generation SSA Native Code 0: stack arg0 1: stack arg1 2: r1unbox(0, INT32) 3: r0 unbox(1, INT32) 4: r0 iadd(2, 3) 5: r0 iadd(4, 2) 6: r0 return(5) if (arg0.type != INT32)goto BAILOUT; r1 = arg0.data;
  • 157. Code Generation SSA Native Code 0: stack arg0 1: stack arg1 2: r1 unbox(0, INT32) 3: r0unbox(1, INT32) 4: r0 iadd(2, 3) 5: r0 iadd(4, 2) 6: r0 return(5) if (arg0.type != INT32) goto BAILOUT; r1 = arg0.data; if (arg1.type != INT32)goto BAILOUT; r0 = arg1.data;
  • 158. Code Generation SSA Native Code 0: stack arg0 1: stack arg1 2: r0 unbox(0, INT32) 3: r1 unbox(1, INT32) 4: r0iadd(2, 3) 5: r0 iadd(4, 2) 6: r0 return(5) if (arg0.type != INT32) goto BAILOUT; r0 = arg0.data; if (arg1.type != INT32) goto BAILOUT; r1 = arg1.data; r0 = r0 + r1; if (OVERFLOWED) goto BAILOUT;
  • 159. Code Generation SSA Native Code 0: stack arg0 1: stack arg1 2: r0 unbox(0, INT32) 3: r1 unbox(1, INT32) 4: r0 iadd(2, 3) 5: r0iadd(4, 2) 6: r0 return(5) if (arg0.type != INT32) goto BAILOUT; r0 = arg0.data; if (arg1.type != INT32) goto BAILOUT; r1 = arg1.data; r0 = r0 + r1; if (OVERFLOWED) goto BAILOUT; r0 = r0 + r1; if (OVERFLOWED) goto BAILOUT;
  • 160. Code Generation SSA Native Code 0: stack arg0 1: stack arg1 2: r0 unbox(0, INT32) 3: r1 unbox(1, INT32) 4: r0 iadd(2, 3) 5: r0 iadd(4, 2) 6: r0 return(4) if (arg0.type != INT32) goto BAILOUT; r0 = arg0.data; if (arg1.type != INT32) goto BAILOUT; r1 = arg1.data; r0 = r0 + r1; if (OVERFLOWED) goto BAILOUT; r0 = r0 + r1; if (OVERFLOWED) goto BAILOUT; return r0;
  • 161. Generated Code if (arg0.type != INT32)goto BAILOUT; r0 = arg0.data; if (arg1.type != INT32)goto BAILOUT; r1 = arg1.data; r0 = r0 + r1; if (OVERFLOWED)goto BAILOUT; r0 = r0 + r1; if (OVERFLOWED)goto BAILOUT;return r0;
  • 162.
  • 163. May recompile function if (arg0.type != INT32)goto BAILOUT; r0 = arg0.data; if (arg1.type != INT32)goto BAILOUT; r1 = arg1.data; r0 = r0 + r1; if (OVERFLOWED)goto BAILOUT; r0 = r0 + r1; if (OVERFLOWED)goto BAILOUT;return r0;
  • 164. Guards Still Needed Object shapes for reading properties Types of… Arguments Values read from the heap Values returned from C++
  • 165. Guards Still Needed Math Integer Overflow Divide by Zero Multiply by -0 NaN is falseyin JS, truthy in ISAs
  • 166.
  • 167. Can move and eliminate code
  • 168. If speculation fails, method is recompiled or deoptimized
  • 169.
  • 172. Better than Basic JIT, but we can do better
  • 173. Interval analysis can remove overflow checks
  • 174.
  • 175. Hybrid: both static and dynamic
  • 176. Replaces checks in JIT with checks in virtual machine
  • 177.
  • 178. Conclusions JITs getting more powerful Type specialization is key Still need checks and special cases on many operations

Hinweis der Redaktion

  1. I’m going to start off by showing my favorite demo, that I think really captures what we’re enabling by making the browser as fast possible.
  2. So this is Doom, translated - actually compiled - from C++ to JavaScript, being played in Firefox. In addition to JavaScript it’s using the new 2D drawing element called Canvas. This demo is completely native to the browser. There’s no Flash, it’s all open web technologies. And it’s managing to get a pretty good framerate, 30fps, despite being mostly unoptimized. This is a really cool demo for me because, three years ago, this kind of demo on the web was just impossible without proprietary extensions. Some of the technology just wasn’t there, and the rest of it was anywhere from ten to a hundred times slower. So we’ve come a long way.… Unfortunately all I have of the demo is a recording, it was playable online for about a day but iD Software requested that we take it down.
  3. To step back for a minute, I’m going to briefly overview how the browser and the major web technologies interact, and then jump into JavaScript which is the heart of this talk.
  4. This is a much less gory example, a major company’s website in a popular web browser. The process of getting from the intel.com URL to this display is actually pretty complicated.
  5. The browser then sends intel.com this short plaintext message which says, “give me the webpage corresponding to the top of your website’s hierarchy.”
  6. How does the browser get from all of this to the a fully rendered website?
  7. Now that we’re familiar with the language’s main features, let’s talk about implementing a JavaScript engine. A JavaScript implementation has four main pieces
  8. One of the challenges of a JavaScript engine is dealing with the lack of type declarations. In order to dispatch the correct computation, the runtime must be able to query a variable or field’s current type.
  9. We’re done with data representation and garbage collection, so it’s time for the really fun stuff: running JavaScript code, and making it run fast. One way to run code, but not make it run fast, is to use an interpreter.