SlideShare ist ein Scribd-Unternehmen logo
1 von 47
Downloaden Sie, um offline zu lesen
VMs,
Interpreters,
  JIT & Co

   A First
Introduction

Marcus Denker
Overview

 •   Virtual Machines: How and Why
     – Bytecode
     – Garbage Collection (GC)

 •   Code Execution
     –   Interpretation
     –   Just-In-Time
     –   Optimizing method lookup
     –   Dynamic optimization




Marcus Denker
Caveat


 •   Virtual Machines are not trivial!

 •   This lecture can only give a very high-level overview

 •   You will not be a VM Hacker afterwards ;-)

 •   Many slides of this talk should be complete lectures
     (or even courses)



Marcus Denker
Virtual Machine: What’s that?

 •   Software implementation of a machine

 •   Process VMs
     – Processor emulation (e.g. run PowerPC on Intel)
         • FX32!, MacOS 68k, powerPC
     – Optimization: HP Dynamo
     – High level language VMs

 •   System Virtual Machines
     – IBM z/OS
     – Virtual PC



Marcus Denker
High Level Language VM

 •   We focus on HLL VMs
 •   Examples: Java, Smalltalk, Python, Perl....

 •   Three parts:
     – The Processor
         • Simulated Instruction Set Architecture (ISA)
     – The Memory: Garbage Collection
     – Abstraction for other OS / Hardware (e.g, I/O)

 •   Very near to the implemented language
 •   Provides abstraction from OS


Marcus Denker
The Processor

 •   VM has an instruction set. (virtual ISA)
     – Stack machine
     – Bytecode
     – High level: models the language quite directly
         • e.g, “Send” bytecode in Smalltalk

 •   VM needs to run this Code
 •   Many designs possible:
     –   Interpreter (simple)
     –   Threaded code
     –   Simple translation to machine code
     –   Dynamic optimization (“HotSpot”)


Marcus Denker
Garbage Collection

 •   Provides a high level model for memory
 •   No need to explicitly free memory
 •   Different implementations:
     – Reference counting
     – Mark and sweep
     – Generational GC

 •   A modern GC is *very* efficient
 •   It’s hard to do better




Marcus Denker
Other Hardware Abstraction
 •   We need a way to access Operating System APIs
     – Graphics
     – Networking (TCP/IP)
     – Disc / Keyboard....

 •   Simple solution:
     – Define an API for this Hardware
     – Implement this as part of the VM (“Primitives”)
     – Call OS library functions directly




Marcus Denker
Virtual Machine: Lessons learned
•   VM: Simulated Machine
•   Consists of
    – Virtual ISA
    – Memory Management
    – API for Hardware/OS


•   Next: Bytecode
Bytecode

 •   Byte-encoded instruction set
 •   This means: 256 main instructions
 •   Stack based

 •   Positive:
     – Very compact
     – Easy to interpret

 •   Important: The ISA of a VM does not need to be
     Bytecode.


Marcus Denker
Compiling to Bytecode



           Program Text            1 + 2




                     Compiler



                Bytecode        76 77 B0 7C




Marcus Denker
Example: Number>>asInteger

 •   Smalltalk code:
      	

 ^1 + 2




 •   Symbolic Bytecode
      <76>      pushConstant: 1
      <77>      pushConstant: 2
      <B0>      send: +
      <7C>      returnTop




Marcus Denker
Example: Java Bytecode

 •   From class Rectangle
public int perimeter()

0: iconst_2
1: aload_0         “push this”
2: getfield#2      “value of sides”
5: iconst_0
6: iaload
7: aload_0                        2*(sides[0]+sides[1])
8: getfield#2
11: iconst_1
12: iaload
13: iadd
14: imul
15: ireturn


Marcus Denker
Difference Java and Squeak

 •   Instruction for arithmetic
     – Just method calls in Smalltalk
 •   Typed: special bytecode for int, long, float
 •   Bytecode for array access


 •   Not shown:
     – Jumps for loops and control structures
     – Exceptions (java)
     – ....



Marcus Denker
Systems Using Bytecode

 •   USCD Pascal (first, ~1970)
 •   Smalltalk
 •   Java
 •   PHP
 •   Python


 •   No bytecode:
     – Ruby: Interprets the AST




Marcus Denker
Bytecode: Lessons Learned
•   Instruction set of a VM
•   Stack based
•   Fairly simple

•   Next: Bytecode Execution
Running Bytecode

 •   Invented for Interpretation
 •   But there are faster ways to do it

 •   We will see
     – Interpreter
     – Threaded code
     – Translation to machine code (JIT, Hotspot)




Marcus Denker
Interpreter

 •   Just a big loop with a case statement
                                      while (1) {
 •   Positive:                        	

 bc = *ip++;
     – Memory efficient                	

 switch (bc) {
     – Very simple                    	

 	

 ...
                                      	

 	

 case 0x76:
     – Easy to port to new machines   	

 	

 	

 *++sp = ConstOne;
                                      	

 	

 	

 break;
 •   Negative:                        	

 	

 ...
                                      	

 }
     – Slooooooow...                  }




Marcus Denker
Faster: Threaded Code

 •    Idea: Bytecode implementation in memory has
      Address
                                         	

     Bytecode                Addresses   push1:
                                         	

 *++sp = ConstOne
                 translate
                                         	

 goto *ip++;
                                         push2:
 •    Pro:                               	

 *++sp = ConstTwo
       –   Faster                        	

 goto *ip++;

       –   Still fairly simple
       –   Still portable
       –   Used (and pioneered) in Forth.
       –   Threaded code can be the ISA of the VM

Marcus Denker
Next Step: Translation

 •   Idea: Translate bytecode to machine code

                        Bytecode                Binary Code

 •   Pro:                           translate

     – Faster execution
     – Possible to do optimizations
     – Interesting tricks possible: Inline Caching (later)

 •   Negative
     – Complex
     – Not portable across processors
     – Memory

Marcus Denker
Simple JIT Compiler

 •   JIT = “Just in Time”
 •   On first execution: Generate Code
 •   Need to be very fast
     – No time for many code optimizations
 •   Code is cached (LRU)

 •   Memory:
     – Cache + implementation JIT in RAM.
     – We trade memory for performance.




Marcus Denker
Bytecode Execution: Lessons Learned
•   We have seen
    – Interpreter
    – Threaded Code
    – Simple JIT

•   Next: Optimizations
Optimizing Method Lookup

 •   What is slow in OO virtual machines?

 •   Overhead of Bytecode Interpretation
     – Solved with JIT

 •   Method Lookup
     – Java: Polymorph
     – Smalltalk: Polymorph and dynamically typed
     – Method to call can only be looked up at runtime




Marcus Denker
Example Method Lookup

 •   A simple example:

      array := #(0 1 2 2.5).

      array collect: [:each | each + 1]



 •   The method “+” executed depends on the receiving
     object




Marcus Denker
Method Lookup

 •   Need to look this up at runtime:
     – Get class of the receiver
     – if method not found, look in superclass



                                Code for lookup
                                (in the VM)
     ....
     lookup(obj, sel)
     ....
                                Code of Method
                                Found



      Binary Code
      (generated by JIT)
Marcus Denker
Observation: Yesterday’s Weather

 •   Predict the weather of tomorrow: Same as today
 •   Is right in over 80%

 •   Similar for method lookup:
     – Look up method on first execution of a send
     – The next lookup would likely get the same method

 •   True polymorphic sends are seldom




Marcus Denker
Inline Cache

 •   Goal: Make sends faster in many cases
 •   Solution: remember the last lookup

 •   Trick:
     – Inline cache (IC)
     – In the binary code of the sender




Marcus Denker
Inline Cache: First Execution

 First Call:

                                     1) lookup the method



                ....
                lookup(obj, sel)
                ....




                Binary Code
                (generated by JIT)




Marcus Denker
Inline Cache: First Execution

 First Call:

                                       1) lookup the method



                ....
                lookup(obj, sel)
                ....




                                     Code of Method
                Binary Code
                (generated by JIT)




Marcus Denker
Inline Cache: First Execution

 First Call:

                                       1) lookup the method
                                       2) generate test preamble


                ....
                lookup(obj, sel)
                ....

                                     check receiver
                                     type

                                     Code of Method
                Binary Code
                (generated by JIT)




Marcus Denker
Inline Cache: First Execution

 First Call:

                                       1) lookup the method
                                       2) generate test preamble
                                       3) patch calling method

                ....
                call preamble
                ....

                                     check receiver
                                     type

                                     Code of Method
                Binary Code
                (generated by JIT)




Marcus Denker
Inline Cache: First Execution

 First Call:

                                       1) lookup the method
                                       2) generate test preamble
                                       3) patch calling method
                                       4) execute method and return
                ....
                call preamble
                ....

                                     check receiver
                                     type

                                     Code of Method
                Binary Code
                (generated by JIT)




Marcus Denker
Inline Cache: Second Execution

 Second Call:
                                     - we jump directly to the test
                                     - If test is ok, execute the method
                                     - If not, do normal lookup


                ....
                call preamble
                ....
                                                                 wrong
                                            check receiver                 normal
                                            type                           lookup

                                            Code of Method
                Binary Code
                (generated by JIT)




Marcus Denker
Limitation of Simple Inline Caches

 •   Works nicely at places were only one method is
     called. “Monomorphic sends”. >80%

 •   How to solve it for the rest?

 •   Polymorphic sends (<15%)
     – <10 different classes

 •   Megamophic sends (<5%)
     – >10 different classes



Marcus Denker
Example Polymorphic Send

 •   This example is Polymorphic.

      array := #(1 1.5 2 2.5 3 3.5).

      array collect: [:each | each + 1]

 •   Two classes: Integer and Float

 •   Inline cache will fail at every send
 •   It will be slower than doing just a lookup!



Marcus Denker
Polymorphic Inline Caches

 •   Solution: When inline cache fails, build up a PIC

 •   Basic idea:
     – For each receiver, remember the method found
     – Generate stub to call the correct method




Marcus Denker
PIC


                                                check receiver
                                                type

                                                Code of Method
                            If type = Integer   (Integer)
      ....                  
    jump
      call pic              If type = Float
      ....                  
    jump
                            else
                            
    call lookup
                                                check receiver
                                                type

       Binary Code                              Code of Method
       (generated by JIT)                       (Float)




Marcus Denker
PIC

 •   PICs solve the problem of Polymorphic sends
 •   Three steps:
     – Normal Lookup / generate IC
     – IC lookup, if fail: build PIC
     – PIC lookup
 •   PICs grow to a fixed size (~10)
 •   After that: replace entries

 •   Megamorphic sends:
     – Will be slow
     – Some systems detect them and disable caching


Marcus Denker
Optimizations: Lessons Learned
•   We have seen
    – Inline Caches
    – Polymorphic Inline Caches

•   Next: Beyond Just-In-Time
Beyond JIT: Dynamic Optimization

 •   Just-In-Time == Not-Enough-Time
     – No complex optimization possible
     – No whole-program-optimization

 •   We want to do real optimizations!




Marcus Denker
Excursion: Optimizations

 •   Or: Why is a modern compiler so slow?

 •   There is a lot to do to generate good code!
     – Transformation in a good intermediate form (SSA)
     – Many different optimization passes
         • Constant subexpression elimination (CSE)
         • Dead code elimination
         • Inlining
     – Register allocation
     – Instruction selection




Marcus Denker
State of the Art: HotSpot et. al.

 •   Pioneered in Self
 •   Use multiple compilers
     – Fast but bad code
     – Slow but good code

 •   Only invest time were it really pays off
 •   Here we can invest some more

 •   Problem:Very complicated, huge warmup, lots of
     Memory
 •   Examples: Self, Java Hotspot, Jikes

Marcus Denker
PIC Data for Inlining

•   Use type information from PIC for specialisation
•   Example: Class Point

                               CarthesianPoint>>x
                               	

 ^x

         ...
         lookup(rec, sel)
         ....
                               PolarPoint>>x
                               	

 “compute x from rho and theta”


          Binary Code
          (generated by JIT)



Marcus Denker
Example Inlining

   •   The PIC will be build



                               If type = CarthesianPoint
                               
    jump                   The PIC contains
         ...                   If type = PolarPoint
         call pic              
    jump                   type Information!
         ....                  else
                               
    call lookup




          Binary Code
          (generated by JIT)



Marcus Denker
Example Inlining

 •   We can inline code for all known cases



       ...
       if type = cartesian point
       
 result ← receiver.x
       else if type = polar point
       
 result ← receiver.rho * cos(receiver.theta)
       else call lookup
       ...
                         Binary Code
                         (generated by JIT)



Marcus Denker
End


•   What is a VM
•   Overview about bytecode interpreter / compiler
•   Inline Caching / PICs
•   Dynamic Optimization


•   Questions?
Literature
 •   Smith/Nair:Virtual Machines (Morgan Kaufman
     August 2005). Looks quite good!

 •   For PICs:
     – Urs Hölzle, Craig Chambers, David Ungar: Optimizing
       Dynamically-Typed Object-Oriented Languages With
       Polymorphic Inline Caches
     – Urs Hoelzle. "Adaptive Optimization for Self: Reconciling
       High Performance with Exploratory Programming." Ph.D.
       thesis, Computer Science Department, Stanford
       University, August 1994.



Marcus Denker

Weitere ähnliche Inhalte

Was ist angesagt?

Kernel Recipes 2015: Anatomy of an atomic KMS driver
Kernel Recipes 2015: Anatomy of an atomic KMS driverKernel Recipes 2015: Anatomy of an atomic KMS driver
Kernel Recipes 2015: Anatomy of an atomic KMS driverAnne Nicolas
 
Использование KASan для автономного гипервизора
Использование KASan для автономного гипервизораИспользование KASan для автономного гипервизора
Использование KASan для автономного гипервизораPositive Hack Days
 
Possibility of arbitrary code execution by Step-Oriented Programming
Possibility of arbitrary code execution by Step-Oriented ProgrammingPossibility of arbitrary code execution by Step-Oriented Programming
Possibility of arbitrary code execution by Step-Oriented Programmingkozossakai
 
Overflowing attack potential, scoring defence in-depth
Overflowing attack potential, scoring defence in-depthOverflowing attack potential, scoring defence in-depth
Overflowing attack potential, scoring defence in-depthJavier Tallón
 
Linux: the first second
Linux: the first secondLinux: the first second
Linux: the first secondAlison Chaiken
 
Infecting the Embedded Supply Chain
 Infecting the Embedded Supply Chain Infecting the Embedded Supply Chain
Infecting the Embedded Supply ChainPriyanka Aash
 
Coding for multiple cores
Coding for multiple coresCoding for multiple cores
Coding for multiple coresLee Hanxue
 
Lockless Programming GDC 09
Lockless Programming GDC 09Lockless Programming GDC 09
Lockless Programming GDC 09Lee Hanxue
 
Transactional Memory
Transactional MemoryTransactional Memory
Transactional MemoryYuuki Takano
 
Solving Real-Time Scheduling Problems With RT_PREEMPT and Deadline-Based Sche...
Solving Real-Time Scheduling Problems With RT_PREEMPT and Deadline-Based Sche...Solving Real-Time Scheduling Problems With RT_PREEMPT and Deadline-Based Sche...
Solving Real-Time Scheduling Problems With RT_PREEMPT and Deadline-Based Sche...peknap
 
Sysdig Tokyo Meetup 2018 02-27
Sysdig Tokyo Meetup 2018 02-27Sysdig Tokyo Meetup 2018 02-27
Sysdig Tokyo Meetup 2018 02-27Michael Ducy
 
Captain Hook: Pirating AVs to Bypass Exploit Mitigations
Captain Hook: Pirating AVs to Bypass Exploit MitigationsCaptain Hook: Pirating AVs to Bypass Exploit Mitigations
Captain Hook: Pirating AVs to Bypass Exploit MitigationsenSilo
 
”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016Kuniyasu Suzaki
 
Kernel Recipes 2014 - Writing Code: Keep It Short, Stupid!
Kernel Recipes 2014 - Writing Code: Keep It Short, Stupid!Kernel Recipes 2014 - Writing Code: Keep It Short, Stupid!
Kernel Recipes 2014 - Writing Code: Keep It Short, Stupid!Anne Nicolas
 
QEMU and Raspberry Pi. Instant Embedded Development
QEMU and Raspberry Pi. Instant Embedded DevelopmentQEMU and Raspberry Pi. Instant Embedded Development
QEMU and Raspberry Pi. Instant Embedded DevelopmentGlobalLogic Ukraine
 
Embedded device hacking Session i
Embedded device hacking Session iEmbedded device hacking Session i
Embedded device hacking Session iMalachi Jones
 

Was ist angesagt? (20)

Kernel Recipes 2015: Anatomy of an atomic KMS driver
Kernel Recipes 2015: Anatomy of an atomic KMS driverKernel Recipes 2015: Anatomy of an atomic KMS driver
Kernel Recipes 2015: Anatomy of an atomic KMS driver
 
Использование KASan для автономного гипервизора
Использование KASan для автономного гипервизораИспользование KASan для автономного гипервизора
Использование KASan для автономного гипервизора
 
Possibility of arbitrary code execution by Step-Oriented Programming
Possibility of arbitrary code execution by Step-Oriented ProgrammingPossibility of arbitrary code execution by Step-Oriented Programming
Possibility of arbitrary code execution by Step-Oriented Programming
 
Overflowing attack potential, scoring defence in-depth
Overflowing attack potential, scoring defence in-depthOverflowing attack potential, scoring defence in-depth
Overflowing attack potential, scoring defence in-depth
 
Linux: the first second
Linux: the first secondLinux: the first second
Linux: the first second
 
Xvisor: embedded and lightweight hypervisor
Xvisor: embedded and lightweight hypervisorXvisor: embedded and lightweight hypervisor
Xvisor: embedded and lightweight hypervisor
 
Infecting the Embedded Supply Chain
 Infecting the Embedded Supply Chain Infecting the Embedded Supply Chain
Infecting the Embedded Supply Chain
 
Understanding the Dalvik Virtual Machine
Understanding the Dalvik Virtual MachineUnderstanding the Dalvik Virtual Machine
Understanding the Dalvik Virtual Machine
 
Coding for multiple cores
Coding for multiple coresCoding for multiple cores
Coding for multiple cores
 
Lockless Programming GDC 09
Lockless Programming GDC 09Lockless Programming GDC 09
Lockless Programming GDC 09
 
Transactional Memory
Transactional MemoryTransactional Memory
Transactional Memory
 
Solving Real-Time Scheduling Problems With RT_PREEMPT and Deadline-Based Sche...
Solving Real-Time Scheduling Problems With RT_PREEMPT and Deadline-Based Sche...Solving Real-Time Scheduling Problems With RT_PREEMPT and Deadline-Based Sche...
Solving Real-Time Scheduling Problems With RT_PREEMPT and Deadline-Based Sche...
 
Sysdig Tokyo Meetup 2018 02-27
Sysdig Tokyo Meetup 2018 02-27Sysdig Tokyo Meetup 2018 02-27
Sysdig Tokyo Meetup 2018 02-27
 
Captain Hook: Pirating AVs to Bypass Exploit Mitigations
Captain Hook: Pirating AVs to Bypass Exploit MitigationsCaptain Hook: Pirating AVs to Bypass Exploit Mitigations
Captain Hook: Pirating AVs to Bypass Exploit Mitigations
 
”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016
 
Kernel Recipes 2014 - Writing Code: Keep It Short, Stupid!
Kernel Recipes 2014 - Writing Code: Keep It Short, Stupid!Kernel Recipes 2014 - Writing Code: Keep It Short, Stupid!
Kernel Recipes 2014 - Writing Code: Keep It Short, Stupid!
 
QEMU and Raspberry Pi. Instant Embedded Development
QEMU and Raspberry Pi. Instant Embedded DevelopmentQEMU and Raspberry Pi. Instant Embedded Development
QEMU and Raspberry Pi. Instant Embedded Development
 
Transactional Memory
Transactional MemoryTransactional Memory
Transactional Memory
 
Multithreading Design Patterns
Multithreading Design PatternsMultithreading Design Patterns
Multithreading Design Patterns
 
Embedded device hacking Session i
Embedded device hacking Session iEmbedded device hacking Session i
Embedded device hacking Session i
 

Ähnlich wie VMs, Interpreters, JIT

Efficient Bytecode Analysis: Linespeed Shellcode Detection
Efficient Bytecode Analysis: Linespeed Shellcode DetectionEfficient Bytecode Analysis: Linespeed Shellcode Detection
Efficient Bytecode Analysis: Linespeed Shellcode DetectionGeorg Wicherski
 
Patching Windows Executables with the Backdoor Factory | DerbyCon 2013
Patching Windows Executables with the Backdoor Factory | DerbyCon 2013Patching Windows Executables with the Backdoor Factory | DerbyCon 2013
Patching Windows Executables with the Backdoor Factory | DerbyCon 2013midnite_runr
 
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...Alexandre Moneger
 
Bytecode Manipulation with a Java Agent and Byte Buddy
Bytecode Manipulation with a Java Agent and Byte BuddyBytecode Manipulation with a Java Agent and Byte Buddy
Bytecode Manipulation with a Java Agent and Byte BuddyKoichi Sakata
 
You didnt see it’s coming? "Dawn of hardened Windows Kernel"
You didnt see it’s coming? "Dawn of hardened Windows Kernel" You didnt see it’s coming? "Dawn of hardened Windows Kernel"
You didnt see it’s coming? "Dawn of hardened Windows Kernel" Peter Hlavaty
 
RBuilder and ByteSurgeon
RBuilder and ByteSurgeonRBuilder and ByteSurgeon
RBuilder and ByteSurgeonMarcus Denker
 
Ice Age melting down: Intel features considered usefull!
Ice Age melting down: Intel features considered usefull!Ice Age melting down: Intel features considered usefull!
Ice Age melting down: Intel features considered usefull!Peter Hlavaty
 
2013 syscan360 yuki_chen_syscan360_exploit your java native vulnerabilities o...
2013 syscan360 yuki_chen_syscan360_exploit your java native vulnerabilities o...2013 syscan360 yuki_chen_syscan360_exploit your java native vulnerabilities o...
2013 syscan360 yuki_chen_syscan360_exploit your java native vulnerabilities o...chen yuki
 
Reverse Engineering the TomTom Runner pt. 2
Reverse Engineering the TomTom Runner pt. 2Reverse Engineering the TomTom Runner pt. 2
Reverse Engineering the TomTom Runner pt. 2Luis Grangeia
 
Baab (Bug as a Backdoor) through automatic exploit generation (CRAX)
Baab (Bug as a Backdoor) through automatic exploit generation (CRAX)Baab (Bug as a Backdoor) through automatic exploit generation (CRAX)
Baab (Bug as a Backdoor) through automatic exploit generation (CRAX)Shih-Kun Huang
 
JVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, WixJVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, WixCodemotion Tel Aviv
 
Embedded system custom single purpose processors
Embedded system custom single  purpose processorsEmbedded system custom single  purpose processors
Embedded system custom single purpose processorsAiswaryadevi Jaganmohan
 
Clr jvm implementation differences
Clr jvm implementation differencesClr jvm implementation differences
Clr jvm implementation differencesJean-Philippe BEMPEL
 
IoT exploitation: from memory corruption to code execution by Marco Romano
IoT exploitation: from memory corruption to code execution by Marco RomanoIoT exploitation: from memory corruption to code execution by Marco Romano
IoT exploitation: from memory corruption to code execution by Marco RomanoCodemotion
 
IoT exploitation: from memory corruption to code execution - Marco Romano - C...
IoT exploitation: from memory corruption to code execution - Marco Romano - C...IoT exploitation: from memory corruption to code execution - Marco Romano - C...
IoT exploitation: from memory corruption to code execution - Marco Romano - C...Codemotion
 

Ähnlich wie VMs, Interpreters, JIT (20)

Efficient Bytecode Analysis: Linespeed Shellcode Detection
Efficient Bytecode Analysis: Linespeed Shellcode DetectionEfficient Bytecode Analysis: Linespeed Shellcode Detection
Efficient Bytecode Analysis: Linespeed Shellcode Detection
 
Patching Windows Executables with the Backdoor Factory | DerbyCon 2013
Patching Windows Executables with the Backdoor Factory | DerbyCon 2013Patching Windows Executables with the Backdoor Factory | DerbyCon 2013
Patching Windows Executables with the Backdoor Factory | DerbyCon 2013
 
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
 
Bytecode Manipulation with a Java Agent and Byte Buddy
Bytecode Manipulation with a Java Agent and Byte BuddyBytecode Manipulation with a Java Agent and Byte Buddy
Bytecode Manipulation with a Java Agent and Byte Buddy
 
You didnt see it’s coming? "Dawn of hardened Windows Kernel"
You didnt see it’s coming? "Dawn of hardened Windows Kernel" You didnt see it’s coming? "Dawn of hardened Windows Kernel"
You didnt see it’s coming? "Dawn of hardened Windows Kernel"
 
RBuilder and ByteSurgeon
RBuilder and ByteSurgeonRBuilder and ByteSurgeon
RBuilder and ByteSurgeon
 
Ice Age melting down: Intel features considered usefull!
Ice Age melting down: Intel features considered usefull!Ice Age melting down: Intel features considered usefull!
Ice Age melting down: Intel features considered usefull!
 
You suck at Memory Analysis
You suck at Memory AnalysisYou suck at Memory Analysis
You suck at Memory Analysis
 
05 defense
05 defense05 defense
05 defense
 
2013 syscan360 yuki_chen_syscan360_exploit your java native vulnerabilities o...
2013 syscan360 yuki_chen_syscan360_exploit your java native vulnerabilities o...2013 syscan360 yuki_chen_syscan360_exploit your java native vulnerabilities o...
2013 syscan360 yuki_chen_syscan360_exploit your java native vulnerabilities o...
 
Memory, IPC and L4Re
Memory, IPC and L4ReMemory, IPC and L4Re
Memory, IPC and L4Re
 
Reverse Engineering the TomTom Runner pt. 2
Reverse Engineering the TomTom Runner pt. 2Reverse Engineering the TomTom Runner pt. 2
Reverse Engineering the TomTom Runner pt. 2
 
Baab (Bug as a Backdoor) through automatic exploit generation (CRAX)
Baab (Bug as a Backdoor) through automatic exploit generation (CRAX)Baab (Bug as a Backdoor) through automatic exploit generation (CRAX)
Baab (Bug as a Backdoor) through automatic exploit generation (CRAX)
 
JVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, WixJVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, Wix
 
Embedded system custom single purpose processors
Embedded system custom single  purpose processorsEmbedded system custom single  purpose processors
Embedded system custom single purpose processors
 
Clr jvm implementation differences
Clr jvm implementation differencesClr jvm implementation differences
Clr jvm implementation differences
 
Deep hooks
Deep hooksDeep hooks
Deep hooks
 
Effective code reviews
Effective code reviewsEffective code reviews
Effective code reviews
 
IoT exploitation: from memory corruption to code execution by Marco Romano
IoT exploitation: from memory corruption to code execution by Marco RomanoIoT exploitation: from memory corruption to code execution by Marco Romano
IoT exploitation: from memory corruption to code execution by Marco Romano
 
IoT exploitation: from memory corruption to code execution - Marco Romano - C...
IoT exploitation: from memory corruption to code execution - Marco Romano - C...IoT exploitation: from memory corruption to code execution - Marco Romano - C...
IoT exploitation: from memory corruption to code execution - Marco Romano - C...
 

Mehr von Marcus Denker

ConstantBlocks in Pharo11
ConstantBlocks in Pharo11ConstantBlocks in Pharo11
ConstantBlocks in Pharo11Marcus Denker
 
First Class Variables as AST Annotations
First Class Variables as AST AnnotationsFirst Class Variables as AST Annotations
First Class Variables as AST AnnotationsMarcus Denker
 
Supporting Pharo / Getting Pharo Support
Supporting Pharo / Getting Pharo SupportSupporting Pharo / Getting Pharo Support
Supporting Pharo / Getting Pharo SupportMarcus Denker
 
Lecture: "Advanced Reflection: MetaLinks"
Lecture: "Advanced Reflection: MetaLinks"Lecture: "Advanced Reflection: MetaLinks"
Lecture: "Advanced Reflection: MetaLinks"Marcus Denker
 
thisContext in the Debugger
thisContext in the DebuggerthisContext in the Debugger
thisContext in the DebuggerMarcus Denker
 
Lecture. Advanced Reflection: MetaLinks
Lecture. Advanced Reflection: MetaLinksLecture. Advanced Reflection: MetaLinks
Lecture. Advanced Reflection: MetaLinksMarcus Denker
 
Improving code completion for Pharo
Improving code completion for PharoImproving code completion for Pharo
Improving code completion for PharoMarcus Denker
 
VUB Brussels Lecture 2019: Advanced Reflection: MetaLinks
VUB Brussels Lecture 2019: Advanced Reflection: MetaLinksVUB Brussels Lecture 2019: Advanced Reflection: MetaLinks
VUB Brussels Lecture 2019: Advanced Reflection: MetaLinksMarcus Denker
 
Lecture: Advanced Reflection. MetaLinks
Lecture: Advanced Reflection. MetaLinksLecture: Advanced Reflection. MetaLinks
Lecture: Advanced Reflection. MetaLinksMarcus Denker
 
Open-Source: An Infinite Game
Open-Source: An Infinite GameOpen-Source: An Infinite Game
Open-Source: An Infinite GameMarcus Denker
 
PharoTechTalk: Contributing to Pharo
PharoTechTalk: Contributing to PharoPharoTechTalk: Contributing to Pharo
PharoTechTalk: Contributing to PharoMarcus Denker
 
Feedback Loops in Practice
Feedback Loops in PracticeFeedback Loops in Practice
Feedback Loops in PracticeMarcus Denker
 

Mehr von Marcus Denker (20)

Soil And Pharo
Soil And PharoSoil And Pharo
Soil And Pharo
 
ConstantBlocks in Pharo11
ConstantBlocks in Pharo11ConstantBlocks in Pharo11
ConstantBlocks in Pharo11
 
Demo: Improved DoIt
Demo: Improved DoItDemo: Improved DoIt
Demo: Improved DoIt
 
First Class Variables as AST Annotations
First Class Variables as AST AnnotationsFirst Class Variables as AST Annotations
First Class Variables as AST Annotations
 
Supporting Pharo / Getting Pharo Support
Supporting Pharo / Getting Pharo SupportSupporting Pharo / Getting Pharo Support
Supporting Pharo / Getting Pharo Support
 
Lecture: "Advanced Reflection: MetaLinks"
Lecture: "Advanced Reflection: MetaLinks"Lecture: "Advanced Reflection: MetaLinks"
Lecture: "Advanced Reflection: MetaLinks"
 
thisContext in the Debugger
thisContext in the DebuggerthisContext in the Debugger
thisContext in the Debugger
 
Variables in Pharo
Variables in PharoVariables in Pharo
Variables in Pharo
 
Lecture. Advanced Reflection: MetaLinks
Lecture. Advanced Reflection: MetaLinksLecture. Advanced Reflection: MetaLinks
Lecture. Advanced Reflection: MetaLinks
 
Improving code completion for Pharo
Improving code completion for PharoImproving code completion for Pharo
Improving code completion for Pharo
 
VUB Brussels Lecture 2019: Advanced Reflection: MetaLinks
VUB Brussels Lecture 2019: Advanced Reflection: MetaLinksVUB Brussels Lecture 2019: Advanced Reflection: MetaLinks
VUB Brussels Lecture 2019: Advanced Reflection: MetaLinks
 
Slot Composition
Slot CompositionSlot Composition
Slot Composition
 
Lecture: Advanced Reflection. MetaLinks
Lecture: Advanced Reflection. MetaLinksLecture: Advanced Reflection. MetaLinks
Lecture: Advanced Reflection. MetaLinks
 
PHARO IOT
PHARO IOTPHARO IOT
PHARO IOT
 
Open-Source: An Infinite Game
Open-Source: An Infinite GameOpen-Source: An Infinite Game
Open-Source: An Infinite Game
 
Lecture: MetaLinks
Lecture: MetaLinksLecture: MetaLinks
Lecture: MetaLinks
 
PharoTechTalk: Contributing to Pharo
PharoTechTalk: Contributing to PharoPharoTechTalk: Contributing to Pharo
PharoTechTalk: Contributing to Pharo
 
Feedback Loops in Practice
Feedback Loops in PracticeFeedback Loops in Practice
Feedback Loops in Practice
 
Pharo6 - ESUG17
Pharo6 - ESUG17Pharo6 - ESUG17
Pharo6 - ESUG17
 
Pharo6
Pharo6Pharo6
Pharo6
 

Kürzlich hochgeladen

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 

Kürzlich hochgeladen (20)

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 

VMs, Interpreters, JIT

  • 1. VMs, Interpreters, JIT & Co A First Introduction Marcus Denker
  • 2. Overview • Virtual Machines: How and Why – Bytecode – Garbage Collection (GC) • Code Execution – Interpretation – Just-In-Time – Optimizing method lookup – Dynamic optimization Marcus Denker
  • 3. Caveat • Virtual Machines are not trivial! • This lecture can only give a very high-level overview • You will not be a VM Hacker afterwards ;-) • Many slides of this talk should be complete lectures (or even courses) Marcus Denker
  • 4. Virtual Machine: What’s that? • Software implementation of a machine • Process VMs – Processor emulation (e.g. run PowerPC on Intel) • FX32!, MacOS 68k, powerPC – Optimization: HP Dynamo – High level language VMs • System Virtual Machines – IBM z/OS – Virtual PC Marcus Denker
  • 5. High Level Language VM • We focus on HLL VMs • Examples: Java, Smalltalk, Python, Perl.... • Three parts: – The Processor • Simulated Instruction Set Architecture (ISA) – The Memory: Garbage Collection – Abstraction for other OS / Hardware (e.g, I/O) • Very near to the implemented language • Provides abstraction from OS Marcus Denker
  • 6. The Processor • VM has an instruction set. (virtual ISA) – Stack machine – Bytecode – High level: models the language quite directly • e.g, “Send” bytecode in Smalltalk • VM needs to run this Code • Many designs possible: – Interpreter (simple) – Threaded code – Simple translation to machine code – Dynamic optimization (“HotSpot”) Marcus Denker
  • 7. Garbage Collection • Provides a high level model for memory • No need to explicitly free memory • Different implementations: – Reference counting – Mark and sweep – Generational GC • A modern GC is *very* efficient • It’s hard to do better Marcus Denker
  • 8. Other Hardware Abstraction • We need a way to access Operating System APIs – Graphics – Networking (TCP/IP) – Disc / Keyboard.... • Simple solution: – Define an API for this Hardware – Implement this as part of the VM (“Primitives”) – Call OS library functions directly Marcus Denker
  • 9. Virtual Machine: Lessons learned • VM: Simulated Machine • Consists of – Virtual ISA – Memory Management – API for Hardware/OS • Next: Bytecode
  • 10. Bytecode • Byte-encoded instruction set • This means: 256 main instructions • Stack based • Positive: – Very compact – Easy to interpret • Important: The ISA of a VM does not need to be Bytecode. Marcus Denker
  • 11. Compiling to Bytecode Program Text 1 + 2 Compiler Bytecode 76 77 B0 7C Marcus Denker
  • 12. Example: Number>>asInteger • Smalltalk code: ^1 + 2 • Symbolic Bytecode <76> pushConstant: 1 <77> pushConstant: 2 <B0> send: + <7C> returnTop Marcus Denker
  • 13. Example: Java Bytecode • From class Rectangle public int perimeter() 0: iconst_2 1: aload_0 “push this” 2: getfield#2 “value of sides” 5: iconst_0 6: iaload 7: aload_0 2*(sides[0]+sides[1]) 8: getfield#2 11: iconst_1 12: iaload 13: iadd 14: imul 15: ireturn Marcus Denker
  • 14. Difference Java and Squeak • Instruction for arithmetic – Just method calls in Smalltalk • Typed: special bytecode for int, long, float • Bytecode for array access • Not shown: – Jumps for loops and control structures – Exceptions (java) – .... Marcus Denker
  • 15. Systems Using Bytecode • USCD Pascal (first, ~1970) • Smalltalk • Java • PHP • Python • No bytecode: – Ruby: Interprets the AST Marcus Denker
  • 16. Bytecode: Lessons Learned • Instruction set of a VM • Stack based • Fairly simple • Next: Bytecode Execution
  • 17. Running Bytecode • Invented for Interpretation • But there are faster ways to do it • We will see – Interpreter – Threaded code – Translation to machine code (JIT, Hotspot) Marcus Denker
  • 18. Interpreter • Just a big loop with a case statement while (1) { • Positive: bc = *ip++; – Memory efficient switch (bc) { – Very simple ... case 0x76: – Easy to port to new machines *++sp = ConstOne; break; • Negative: ... } – Slooooooow... } Marcus Denker
  • 19. Faster: Threaded Code • Idea: Bytecode implementation in memory has Address Bytecode Addresses push1: *++sp = ConstOne translate goto *ip++; push2: • Pro: *++sp = ConstTwo – Faster goto *ip++; – Still fairly simple – Still portable – Used (and pioneered) in Forth. – Threaded code can be the ISA of the VM Marcus Denker
  • 20. Next Step: Translation • Idea: Translate bytecode to machine code Bytecode Binary Code • Pro: translate – Faster execution – Possible to do optimizations – Interesting tricks possible: Inline Caching (later) • Negative – Complex – Not portable across processors – Memory Marcus Denker
  • 21. Simple JIT Compiler • JIT = “Just in Time” • On first execution: Generate Code • Need to be very fast – No time for many code optimizations • Code is cached (LRU) • Memory: – Cache + implementation JIT in RAM. – We trade memory for performance. Marcus Denker
  • 22. Bytecode Execution: Lessons Learned • We have seen – Interpreter – Threaded Code – Simple JIT • Next: Optimizations
  • 23. Optimizing Method Lookup • What is slow in OO virtual machines? • Overhead of Bytecode Interpretation – Solved with JIT • Method Lookup – Java: Polymorph – Smalltalk: Polymorph and dynamically typed – Method to call can only be looked up at runtime Marcus Denker
  • 24. Example Method Lookup • A simple example: array := #(0 1 2 2.5). array collect: [:each | each + 1] • The method “+” executed depends on the receiving object Marcus Denker
  • 25. Method Lookup • Need to look this up at runtime: – Get class of the receiver – if method not found, look in superclass Code for lookup (in the VM) .... lookup(obj, sel) .... Code of Method Found Binary Code (generated by JIT) Marcus Denker
  • 26. Observation: Yesterday’s Weather • Predict the weather of tomorrow: Same as today • Is right in over 80% • Similar for method lookup: – Look up method on first execution of a send – The next lookup would likely get the same method • True polymorphic sends are seldom Marcus Denker
  • 27. Inline Cache • Goal: Make sends faster in many cases • Solution: remember the last lookup • Trick: – Inline cache (IC) – In the binary code of the sender Marcus Denker
  • 28. Inline Cache: First Execution First Call: 1) lookup the method .... lookup(obj, sel) .... Binary Code (generated by JIT) Marcus Denker
  • 29. Inline Cache: First Execution First Call: 1) lookup the method .... lookup(obj, sel) .... Code of Method Binary Code (generated by JIT) Marcus Denker
  • 30. Inline Cache: First Execution First Call: 1) lookup the method 2) generate test preamble .... lookup(obj, sel) .... check receiver type Code of Method Binary Code (generated by JIT) Marcus Denker
  • 31. Inline Cache: First Execution First Call: 1) lookup the method 2) generate test preamble 3) patch calling method .... call preamble .... check receiver type Code of Method Binary Code (generated by JIT) Marcus Denker
  • 32. Inline Cache: First Execution First Call: 1) lookup the method 2) generate test preamble 3) patch calling method 4) execute method and return .... call preamble .... check receiver type Code of Method Binary Code (generated by JIT) Marcus Denker
  • 33. Inline Cache: Second Execution Second Call: - we jump directly to the test - If test is ok, execute the method - If not, do normal lookup .... call preamble .... wrong check receiver normal type lookup Code of Method Binary Code (generated by JIT) Marcus Denker
  • 34. Limitation of Simple Inline Caches • Works nicely at places were only one method is called. “Monomorphic sends”. >80% • How to solve it for the rest? • Polymorphic sends (<15%) – <10 different classes • Megamophic sends (<5%) – >10 different classes Marcus Denker
  • 35. Example Polymorphic Send • This example is Polymorphic. array := #(1 1.5 2 2.5 3 3.5). array collect: [:each | each + 1] • Two classes: Integer and Float • Inline cache will fail at every send • It will be slower than doing just a lookup! Marcus Denker
  • 36. Polymorphic Inline Caches • Solution: When inline cache fails, build up a PIC • Basic idea: – For each receiver, remember the method found – Generate stub to call the correct method Marcus Denker
  • 37. PIC check receiver type Code of Method If type = Integer (Integer) .... jump call pic If type = Float .... jump else call lookup check receiver type Binary Code Code of Method (generated by JIT) (Float) Marcus Denker
  • 38. PIC • PICs solve the problem of Polymorphic sends • Three steps: – Normal Lookup / generate IC – IC lookup, if fail: build PIC – PIC lookup • PICs grow to a fixed size (~10) • After that: replace entries • Megamorphic sends: – Will be slow – Some systems detect them and disable caching Marcus Denker
  • 39. Optimizations: Lessons Learned • We have seen – Inline Caches – Polymorphic Inline Caches • Next: Beyond Just-In-Time
  • 40. Beyond JIT: Dynamic Optimization • Just-In-Time == Not-Enough-Time – No complex optimization possible – No whole-program-optimization • We want to do real optimizations! Marcus Denker
  • 41. Excursion: Optimizations • Or: Why is a modern compiler so slow? • There is a lot to do to generate good code! – Transformation in a good intermediate form (SSA) – Many different optimization passes • Constant subexpression elimination (CSE) • Dead code elimination • Inlining – Register allocation – Instruction selection Marcus Denker
  • 42. State of the Art: HotSpot et. al. • Pioneered in Self • Use multiple compilers – Fast but bad code – Slow but good code • Only invest time were it really pays off • Here we can invest some more • Problem:Very complicated, huge warmup, lots of Memory • Examples: Self, Java Hotspot, Jikes Marcus Denker
  • 43. PIC Data for Inlining • Use type information from PIC for specialisation • Example: Class Point CarthesianPoint>>x ^x ... lookup(rec, sel) .... PolarPoint>>x “compute x from rho and theta” Binary Code (generated by JIT) Marcus Denker
  • 44. Example Inlining • The PIC will be build If type = CarthesianPoint jump The PIC contains ... If type = PolarPoint call pic jump type Information! .... else call lookup Binary Code (generated by JIT) Marcus Denker
  • 45. Example Inlining • We can inline code for all known cases ... if type = cartesian point result ← receiver.x else if type = polar point result ← receiver.rho * cos(receiver.theta) else call lookup ... Binary Code (generated by JIT) Marcus Denker
  • 46. End • What is a VM • Overview about bytecode interpreter / compiler • Inline Caching / PICs • Dynamic Optimization • Questions?
  • 47. Literature • Smith/Nair:Virtual Machines (Morgan Kaufman August 2005). Looks quite good! • For PICs: – Urs Hölzle, Craig Chambers, David Ungar: Optimizing Dynamically-Typed Object-Oriented Languages With Polymorphic Inline Caches – Urs Hoelzle. "Adaptive Optimization for Self: Reconciling High Performance with Exploratory Programming." Ph.D. thesis, Computer Science Department, Stanford University, August 1994. Marcus Denker