Retargetable Binary Translators, 10 Sep 2007

3. Emulation Summary
                   Decode-dispatch   Indirect Threaded   Direct Threaded   Binary Translation
Memory             Low               Low                 High              High
Start-up           Fast              Fast                Slow              Slow
Steady-state       Slow              Slow                Medium            Fast
Code Portability   Excellent         Excellent           Medium            Poor
4. Binary Translators: Example I
• Transmeta’s Code Morphing
  – Intel IA-32 binary to run on VLIW Crusoe processor
• UQDBT System
  – Intel IA-32 binary to run on SPARC-based processor
• FX!32
  – x86 binary to run on Alpha processor
5. Binary Translators: Example II
• Shade
  – Implement high-performance instruction set simulators
• Embra
  – Implement a high-performance operating system emulator
• Dynamo and Mojo
  – Improve the performance of native binaries
6. Binary Translator Weakness
• Code Portability - Poor
  – Typically written for a single application and/or platform
  – Specialized for the target ISA
  – Single-target, single-purpose approach
  – Reinvent the wheel!
    • Have to develop a new system (Binary Translator) from scratch!
7. Retargetable Binary Translator Framework
• Strata
  – A cross-platform infrastructure for building software binary dynamic translators
• Walkabout
  – A retargetable binary translation framework for experimenting with dynamic translation of binary code
8. Strata - Background
• Strata
  – Binary translator implementation infrastructure
  – Provides a common framework built on the software engineering principle of code reuse
    • Provides a simple binary translator for a variety of architectures
    • Allows code reuse through composition
10. Strata-SPARC
• First Strata software dynamic translator
• For the SPARC V8/V9 instruction set architecture and the Solaris operating system
• A variety of target-independent techniques
  – Reduce the number of context switches
11. Strata-SPARC Performance
• Strata-SPARC relative to native execution
  – From 1.02x to 1.8x, average 1.32x
12. Strata-MIPS
• For the MIPS IV instruction set architecture and the IRIX 6.5.10 operating system
• In porting Strata to MIPS
  – Found Strata’s structure to be flexible and relatively easy to retarget
• Initial version
  – One person, less than three weeks
13. Strata-MIPS Performance
• Strata-MIPS relative to native execution
  – From 1.09x to 3.0x, average 1.8x
14. Strata-X86
• For the Intel 80x86 instruction set architecture
• Difference between CISC and RISC
  – Still used Strata-SPARC as the basis for retargeting to the x86
• Focus on implementing the instruction fetch and decode function
  – To do some amount of decoding of an instruction to determine its length
15. Strata-X86 Performance
• Strata-X86 relative to native execution
  – From 1.0x to 1.8x, average 1.35x
16. Walkabout - Background
• Walkabout
  – Retargetable binary translation framework for experimenting with dynamic translation of binary code
  – How to instrument interpreters in a retargetable way
  – Inspiration
    • University of Queensland Binary Translator (UQBT) framework
      – Enables static translation of binary code
18. Walkabout - Retargetability
• Supporting binaries for different input and output machines
• Users could instantiate new translators out of the framework
  – Source and target machines of choice
• Supported through the specifications
  – Machine descriptions
  – Hot path selection method specifications
19. Walkabout - Interpreter
• Automatically generated from
  – Specifications of syntax and semantics of machine instruction set
• Interpreter Generator
  – SLED describes the instruction syntax
  – SSL describes the instruction semantics
20. Walkabout - Performance
Performance results for an automatically generated C-language interpreter for the SPARC architecture (static size in bytes, interpreter running time in seconds, Path-Finder running time with dynamic optimization)
21. References
• K. Scott, N. Kumar, S. Velusamy, B. R. Childers, J. W. Davidson, and M. L. Soffa. Retargetable and Reconfigurable Software Dynamic Translation. In International Symposium on Code Generation and Optimization, pp. 36–47, March 2003.
• C. Cifuentes, B. Lewis, and D. Ung. Walkabout – A Retargetable Dynamic Binary Translation Framework. In Proceedings of the 2002 Workshop on Binary Translation, 2002.
• C. Cifuentes, M. Van Emmerik, N. Ramsey, and B. Lewis. Experience in the Design, Implementation and Use of a Retargetable Static Binary Translation Framework. Technical Report TR-2002-105, Sun Microsystems Laboratories, Palo Alto, CA 94303, January 2002.
• C. Cifuentes and B. Lewis. Walkabout – A Framework for Experimental Dynamic Binary Translation. Sun Microsystems Inc., Palo Alto, CA 94303, January 2002.
Good afternoon, everyone!
Today I want to talk about retargetable binary translators.
Here is the agenda for this presentation.
Let’s start with the emulation summary from the last lecture.
After that, we will look at examples of binary translators and at their main weakness.
Lastly, I will introduce two retargetable binary translator frameworks: Strata and Walkabout.
Here is a summary of the four emulation techniques we studied in the last lecture.
Today I will focus on binary translation.
As you know, binary translation has a high memory requirement and slow start-up performance.
However, its steady-state performance is fast.
Unfortunately, binary translation has poor code portability.
Keeping these characteristics in mind, let’s go to the next slide.
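To see where these trade-offs come from, here is a minimal sketch in Python. The toy stack ISA, the sample program, and the function names are all assumptions made for illustration; they do not come from any of the systems in this talk. The interpreter decodes every instruction each time it runs, while the translator pays an up-front cost and extra memory to produce code that no longer needs decoding.

```python
# Toy stack ISA (hypothetical): each instruction is (opcode, operand).
PROGRAM = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]


def interpret(program):
    """Decode-and-dispatch: every instruction is decoded each time it executes."""
    stack = []
    for opcode, operand in program:
        if opcode == "push":
            stack.append(operand)
        elif opcode == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif opcode == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack.pop()


def translate(program):
    """Translate the whole program once into host (Python) code.

    The work done here is the start-up cost; the generated function is the
    extra memory; later calls skip decoding entirely.
    """
    lines = ["def translated():", "    stack = []"]
    for opcode, operand in program:
        if opcode == "push":
            lines.append(f"    stack.append({operand!r})")
        elif opcode in ("add", "mul"):
            symbol = "+" if opcode == "add" else "*"
            lines.append("    b = stack.pop(); a = stack.pop()")
            lines.append(f"    stack.append(a {symbol} b)")
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    lines.append("    return stack.pop()")
    namespace = {}
    exec("\n".join(lines), namespace)  # compile the generated source
    return namespace["translated"]


run_translated = translate(PROGRAM)  # pay the translation cost once
print(interpret(PROGRAM), run_translated())
```

Both paths compute the same result; only the point at which the decoding work is done differs.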
I want to share some examples of binary translators.
Because their steady-state performance is high, binary translators have been widely used.
The first is Transmeta’s Code Morphing. This technology allows unmodified Intel IA-32 binaries to run on the low-power VLIW Crusoe processor.
Similarly, UQDBT (University of Queensland Dynamic Binary Translator) dynamically translates Intel IA-32 binaries to run on SPARC-based processors, and FX!32 dynamically translates x86 binaries to run on Alpha processors.
These systems were built to overcome the barriers to entry associated with introducing a new CPU architecture.
Moreover, binary translators have proven useful for a variety of other purposes.
Shade uses binary translation to implement high-performance instruction set simulators.
Embra uses binary translation to implement a high-performance operating system emulator.
Dynamo and Mojo use binary translation to improve the performance of native binaries.
In this last case, the source ISA and the target ISA are the same.
Although binary translators have been widely used, they have a weakness in code portability.
This is because a binary translator is typically written for a single application or platform.
Moreover, it is specialized for one target ISA, following a single-target, single-purpose approach.
"Reinventing the wheel" means ignoring a generally accepted technique or solution in favor of a locally invented one, that is, duplicating a basic method that has long been accepted and even taken for granted. Here it means that each new binary translator has to be developed from scratch.
Because of this weakness, retargetable binary translator frameworks have been developed.
In this presentation, I will introduce two such frameworks, focusing on their retargetability.
The first is Strata, a cross-platform infrastructure for building software binary dynamic translators.
The other is Walkabout, a retargetable binary translation framework for experimenting with dynamic translation of binary code.
Let’s start with the Strata framework.
Strata is an infrastructure for implementing binary translators; it provides a common framework built on the software engineering principle of code reuse.
As you know, to remove duplicated code you first have to identify what is common. In Strata, the common part is a simple binary translator that is shared across a variety of architectures.
Users can modify these translators to suit their specific needs without having to build an entire translator from scratch.
Users can also build on the work of others to enhance their binary translators. For example, by composing a Strata-based dynamic optimizer with a Strata-based simulator, an optimizing simulator can be realized.
Implementing a new software dynamic translator often requires only a small amount of coding and a simple reconfiguration of the target interface.
When retargeting the VM to a new platform, the programmer is only obligated to implement the target-specific functions required by the target interface; common services should never have to be reimplemented or modified.
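The split between a target interface and common services can be pictured with a small Python sketch. Everything in it is an assumption for illustration: the class names, the method names `fetch`, `decode`, and `translate`, and the made-up two-cell instruction format are hypothetical and are not Strata’s actual API.

```python
from abc import ABC, abstractmethod


class TargetInterface(ABC):
    """Target-specific functions a port must supply (hypothetical names)."""

    @abstractmethod
    def fetch(self, memory, pc):
        """Return (raw_instruction, length) for the instruction at pc."""

    @abstractmethod
    def decode(self, raw):
        """Return a (mnemonic, operand) tuple."""

    @abstractmethod
    def translate(self, insn):
        """Return the translated form of one decoded instruction."""


class Translator:
    """Common services: fragment building and caching, shared by all targets."""

    def __init__(self, target):
        self.target = target
        self.cache = {}

    def build_fragment(self, memory, pc):
        if pc in self.cache:  # reuse an earlier translation
            return self.cache[pc]
        start, fragment = pc, []
        while pc < len(memory):
            raw, length = self.target.fetch(memory, pc)
            insn = self.target.decode(raw)
            fragment.append(self.target.translate(insn))
            pc += length
            if insn[0] == "halt":  # a fragment ends at a halt
                break
        self.cache[start] = fragment
        return fragment


class ToyFixedTarget(TargetInterface):
    """A made-up target in which every instruction occupies two memory cells."""

    NAMES = {0: "halt", 1: "load", 2: "add"}

    def fetch(self, memory, pc):
        return tuple(memory[pc:pc + 2]), 2

    def decode(self, raw):
        return (self.NAMES[raw[0]], raw[1])

    def translate(self, insn):
        return f"{insn[0].upper()} {insn[1]}"


translator = Translator(ToyFixedTarget())
fragment = translator.build_fragment([1, 7, 2, 5, 0, 0], 0)
print(fragment)
```

Porting to another machine in this sketch means writing one more `TargetInterface` subclass; `Translator` is left untouched, which is the point of the design described above.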
Next, let’s look at the Strata implementation for the SPARC architecture.
It is the first Strata software dynamic translator, built for the SPARC V8/V9 ISA and the Solaris OS.
The authors put a variety of target-independent techniques for reducing the number of context switches into Strata’s common services.
Here are the performance results for Strata-SPARC.
The second Strata project ported Strata to MIPS.
In this project, the authors found Strata’s structure to be flexible and relatively easy to retarget.
As evidence, the initial version took one developer less than three weeks.
Here are the performance results for Strata-MIPS.
The third Strata project is Strata-X86.
The 80x86 CISC architecture is quite different from the previous two RISC architectures.
However, the authors still used Strata-SPARC as the basis for retargeting to the x86.
They focused on implementing the instruction fetch and decode functions, which must partially decode an instruction to determine its length.
A RISC architecture has fixed-length instructions, but a CISC architecture has variable-length instructions.
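The difference shows up as soon as a translator tries to find instruction boundaries. The Python sketch below is only an illustration: the function names are assumptions, and `x86_length` covers just a handful of prefix-free 32-bit x86 opcodes, nowhere near a full decoder.

```python
def risc_length(code, offset):
    """Fixed-width ISAs such as SPARC and MIPS: every instruction is 4 bytes."""
    return 4


def x86_length(code, offset):
    """Length of a few 32-bit x86 instructions (no prefixes); a tiny subset only."""
    opcode = code[offset]
    if opcode in (0x90, 0xC3):      # NOP, RET
        return 1
    if 0x50 <= opcode <= 0x57:      # PUSH r32
        return 1
    if opcode == 0xEB:              # JMP rel8
        return 2
    if 0xB8 <= opcode <= 0xBF:      # MOV r32, imm32
        return 5
    if opcode in (0xE8, 0xE9):      # CALL rel32, JMP rel32
        return 5
    raise NotImplementedError(f"opcode {opcode:#04x} is outside this sketch")


def split_instructions(code, length_of):
    """Walk the byte stream, asking length_of where each instruction ends."""
    instructions, offset = [], 0
    while offset < len(code):
        size = length_of(code, offset)
        instructions.append(code[offset:offset + size])
        offset += size
    return instructions


# push ebp; mov eax, 42; nop; ret
X86_CODE = bytes.fromhex("55 b8 2a 00 00 00 90 c3")
x86_sizes = [len(i) for i in split_instructions(X86_CODE, x86_length)]
risc_sizes = [len(i) for i in split_instructions(bytes(8), risc_length)]
print(x86_sizes, risc_sizes)
```

For the fixed-width case the fetch step needs no decoding at all, whereas the x86 fetch step cannot even advance to the next instruction without inspecting the opcode first.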
Here are the performance results for Strata-X86.
Walkabout is a retargetable binary translation framework for experimenting with dynamic translation of binary code.
It also addresses how to instrument interpreters in a retargetable way.
It was inspired by the UQBT framework, which enables static translation of binary code.
Here is the architecture of Walkabout, which was designed with retargetability in mind.
1. The source binary program is loaded into virtual memory and initially interpreted until a hot path is found.
(A hot path is a frequently executed path in a program.)
2. Code is generated for that hot path and placed into a translated instruction cache (called the fragment cache or F$).
3. During code generation, simple optimizations are applied to obtain better code locality.
4. Once the generated code is executed, control transfers to the dispatcher, which decides whether to interpret more code or transfer control to code in the fragment cache.
5. If interpreted, the process repeats.
6. Reoptimization of translated code occurs when a piece of translated code in the fragment cache is executed too often.
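Steps 1, 2, 4, and 5 can be sketched as a small dispatcher loop in Python. This is a toy under stated assumptions, not Walkabout’s implementation: the block format, the names `interpret_block`, `translate_block`, and `run`, and the threshold `HOT_THRESHOLD = 3` are all hypothetical, a "path" is simplified to a single block, and the optimizations of step 3 and the reoptimization of step 6 are left out.

```python
HOT_THRESHOLD = 3  # assumed value: executions before a block counts as hot

# Toy program (hypothetical format): label -> (register updates, branch rule).
BLOCKS = {
    "loop": (
        [("acc", "+", "i"), ("i", "-", 1)],
        lambda regs: "loop" if regs["i"] > 0 else "exit",
    ),
}


def interpret_block(block, regs):
    """Interpret one block, decoding each update as it goes."""
    updates, branch = block
    for dst, op, src in updates:
        value = regs[src] if isinstance(src, str) else src
        regs[dst] = regs[dst] + value if op == "+" else regs[dst] - value
    return branch(regs)


def translate_block(block):
    """Generate host (Python) code for a hot block."""
    updates, branch = block
    lines = ["def fragment(regs):"]
    for dst, op, src in updates:
        rhs = f"regs[{src!r}]" if isinstance(src, str) else repr(src)
        lines.append(f"    regs[{dst!r}] = regs[{dst!r}] {op} {rhs}")
    lines.append("    return branch(regs)")
    namespace = {"branch": branch}
    exec("\n".join(lines), namespace)
    return namespace["fragment"]


def run(blocks, entry, regs):
    """Dispatcher: run from the fragment cache when possible, else interpret."""
    counts, fragment_cache = {}, {}
    stats = {"interpreted": 0, "translated": 0}
    label = entry
    while label != "exit":
        if label in fragment_cache:
            label = fragment_cache[label](regs)
            stats["translated"] += 1
            continue
        counts[label] = counts.get(label, 0) + 1
        if counts[label] >= HOT_THRESHOLD:  # hot: translate for later visits
            fragment_cache[label] = translate_block(blocks[label])
        label = interpret_block(blocks[label], regs)
        stats["interpreted"] += 1
    return stats


registers = {"acc": 0, "i": 10}
stats = run(BLOCKS, "loop", registers)
print(registers, stats)
```

With a ten-iteration loop, the first three iterations are interpreted and the remaining seven run from the fragment cache.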
Now let’s focus on the retargetability of the Walkabout framework.
The goal is to support binaries for different input and output machines.
The framework was designed so that users could instantiate new translators from it for the source and target machines of their choice, to run on a host machine.
Retargetability is supported in the Walkabout framework through the use of specifications: machine descriptions and specifications of the hot path selection methods.
Interpreters in the Walkabout framework are automatically generated from specifications of syntax and semantics of machine instruction sets.
SLED (Specification Language for Encoding and Decoding) describes the instruction syntax, while SSL (Semantic Specification Language) describes the instruction semantics.
The interpreter generator takes as input a SLED and an SSL description for a given machine and generates source code for an interpreter for that machine in either C or Java.
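The idea of generating an interpreter from two separate descriptions can be sketched in Python. The dictionaries below are hypothetical stand-ins and do not use real SLED or SSL syntax; the toy byte-coded ISA and the name `generate_interpreter` are likewise assumptions. The sketch also emits Python source, whereas the generator described above emits C or Java.

```python
# Hypothetical stand-in for a syntax description: opcode byte -> (mnemonic, operands).
SYNTAX = {
    0x00: ("HALT", []),
    0x01: ("LOADI", ["rd", "imm"]),
    0x02: ("ADD", ["rd", "rs"]),
}

# Hypothetical stand-in for a semantics description: mnemonic -> effect on regs.
SEMANTICS = {
    "HALT": "return regs",
    "LOADI": "regs[rd] = imm",
    "ADD": "regs[rd] = regs[rd] + regs[rs]",
}


def generate_interpreter(syntax, semantics):
    """Emit Python source for an interpreter built from the two descriptions."""
    lines = [
        "def interpreter(code, regs):",
        "    pc = 0",
        "    while pc < len(code):",
        "        opcode = code[pc]",
    ]
    for index, (opcode, (mnemonic, operands)) in enumerate(sorted(syntax.items())):
        keyword = "if" if index == 0 else "elif"
        lines.append(f"        {keyword} opcode == {opcode}:  # {mnemonic}")
        for position, name in enumerate(operands, start=1):
            lines.append(f"            {name} = code[pc + {position}]")
        lines.append(f"            {semantics[mnemonic]}")
        lines.append(f"            pc += {1 + len(operands)}")
    lines.append("        else:")
    lines.append("            raise ValueError(f'bad opcode {opcode}')")
    lines.append("    return regs")
    return "\n".join(lines)


source = generate_interpreter(SYNTAX, SEMANTICS)
namespace = {}
exec(source, namespace)  # compile the generated interpreter
interpreter = namespace["interpreter"]

# LOADI r0, 5; LOADI r1, 7; ADD r0, r1; HALT
result = interpreter(bytes([0x01, 0, 5, 0x01, 1, 7, 0x02, 0, 1, 0x00]), [0, 0])
print(result)
```

Supporting another machine in this sketch means supplying two new dictionaries; the generator itself does not change.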
Here are the performance results for an automatically generated C-language interpreter for the SPARC architecture.
Because the Walkabout framework is very general, supporting different input and output machines, the generated interpreter without dynamic optimization runs around 200 times slower than the native program.
With dynamic optimization, however, the Path-Finder running time decreases remarkably.