4. Highly Dynamic
• Very high level operations
• New code can be introduced at any time
• Dynamic typing
• Exclusively late-bound method calls
• Easier to implement as an interpreter
Wednesday, September 16, 2009
7. Prior Work
• Smalltalk
• 1980–1994: Extensive work to make it fast
• Self
• 1992–1996: A primary research vehicle for making dynamic languages fast
• Java / HotSpot
• 1996–present: A battle-hardened engine for (limited) dynamic dispatch
8. What Can We Learn From Them?
9. What Can We Learn From Them?
• Compiled code is faster than interpreted code
• It's very hard (almost impossible) to figure things out statically
• The type profile of a program is stable over time
• Therefore:
• Learn what a program does and optimize based on that
• This is called Type Feedback
10. Code Generation (JIT)
• Eliminating the overhead of the interpreter instantly increases performance by a fixed percentage
• Naive code generation results in only a small improvement over the interpreter
• Method calling continues to dominate time
• Need a way to generate better code
• Combine it with program type information!
11. Type Profile
• As the program executes, it's possible to see how one method calls other methods
• The relationship between a method and all the methods it calls is the type profile of the method
• Just because you CAN use dynamic dispatch doesn't mean you always do
• It's common that a call site calls the same method every time it's run
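A hypothetical sketch of such a monomorphic call site (the method and data are invented for illustration): `n.to_s` could dispatch to any class's `to_s`, but at this particular site only Integers ever arrive, so the same method is called every time.

```ruby
# Hypothetical example of a call site that is monomorphic in practice.
# `map` could yield any object, but here it only ever yields Integers,
# so the `n.to_s` call site always resolves to Integer#to_s at runtime.
def render(numbers)
  numbers.map { |n| n.to_s } # call site: same target method every run
end

render([1, 2, 3]) # => ["1", "2", "3"]
```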
13. Type Profiling (Cont.)
• 98% of all method calls are to the same method every time
• In other words, 98% of all method calls are statically bound
14. Type Feedback
• Optimize a semi-static relationship to generate faster code
• Semi-static relationships are found by profiling all call sites
• Allows the JIT to make vastly better decisions
• Most common optimization: Method Inlining
15. Method Inlining
• Rather than emit a call to a target method, copy its body into the call site
• Eliminates the code to look up and begin execution of the target method
• Simplifies (or eliminates) setup for the target method
• Allows for type propagation, as well as providing a wider horizon for optimization
• A wider horizon means better generated code, which means less work per method == faster execution
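A before/after sketch of the transformation, written out as Ruby source for clarity (the JIT performs this on generated machine code, not on source; the method names are invented):

```ruby
# Before inlining: every call to `area` dispatches to `square`.
def square(x)
  x * x
end

def area(side)
  square(side) # dynamic dispatch: look up `square`, set up a frame, call
end

# After inlining (what the JIT effectively produces): the body of
# `square` is copied into the call site, eliminating the lookup and
# the frame setup, and exposing `side * side` to further optimization.
def area_inlined(side)
  side * side
end

area(4)         # => 16
area_inlined(4) # => 16
```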
17. Code Generation (JIT)
• Early experimentation with a custom JIT
• Realized we weren't experts
• It would take years to get good code generated
• Switched to LLVM
18. LLVM
• Provides an internal AST (LLVM IR) for describing the work to be done
• Text representation of the AST allows for easy debugging
• Provides the ability to compile the AST to machine code in memory
• Contains thousands of optimizations
• Competitive with GCC
19. Type Profiling
• All call sites use a class called InlineCache, one per call site
• InlineCache accelerates method dispatch by caching the previously used method
• In addition, it tracks a fixed number of receiver classes seen when there is a cache miss
• When compiling a method with LLVM, all InlineCaches for that method can be read
• InlineCaches with good information can be used to accurately find a method to inline
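A simplified, hypothetical Ruby model of what an inline cache does (all names here are invented for illustration; Rubinius implements this inside the VM, not in Ruby):

```ruby
# Hypothetical model of a per-call-site inline cache.
class InlineCache
  attr_reader :seen_classes

  def initialize(method_name)
    @method_name   = method_name
    @cached_class  = nil
    @cached_method = nil
    @seen_classes  = [] # receiver classes recorded on cache misses
  end

  # Dispatch through the cache: fast path when the receiver's class
  # matches the cached one, slow path (full method lookup) otherwise.
  def call(receiver, *args)
    klass = receiver.class
    unless klass == @cached_class
      @seen_classes << klass unless @seen_classes.include?(klass)
      @cached_class  = klass
      @cached_method = klass.instance_method(@method_name)
    end
    @cached_method.bind(receiver).call(*args)
  end

  # A cache that has only ever seen one receiver class is the kind of
  # "good information" that makes a call site a safe inlining candidate.
  def monomorphic?
    @seen_classes.size <= 1
  end
end

cache = InlineCache.new(:to_s)
cache.call(1)      # => "1" (miss: caches Integer#to_s)
cache.call(2)      # fast path: same receiver class
cache.monomorphic? # => true
```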
20. When To Compile
• It takes time for a method's type information to settle down
• Compiling too early means not having enough type info
• Compiling too late means lost performance
• Use simple call counters to allow a method to "heat up"
• Each invocation of a method increments its counter
• When the counter reaches a certain value, the method is queued for compilation
• Threshold value is tunable: -Xjit.call_til_compile
• Still experimenting with good default values
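A minimal sketch of the counter scheme described above (class and method names are hypothetical; only the `-Xjit.call_til_compile` flag name comes from the slides):

```ruby
# Hypothetical sketch: count invocations and report when a method
# crosses a tunable threshold and should be queued for compilation.
class MethodProfile
  def initialize(name, threshold)
    @name      = name
    @threshold = threshold # tunable, like -Xjit.call_til_compile
    @calls     = 0
    @queued    = false
  end

  # Called on every invocation; returns true exactly once, when the
  # method just "heated up" and should be queued for the JIT.
  def record_call!
    @calls += 1
    if !@queued && @calls >= @threshold
      @queued = true
      return true
    end
    false
  end
end

profile = MethodProfile.new(:foo, 3)
profile.record_call! # => false (1 call)
profile.record_call! # => false (2 calls)
profile.record_call! # => true  (threshold reached: queue for JIT)
profile.record_call! # => false (already queued)
```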
21. How to Compile
• To impact runtime as little as possible, all JIT compilation happens in a background OS thread
• Methods are queued, and the background thread reads the queue to find methods to compile
• After compiling, function pointers to the JIT-generated code are installed in the methods
• All future invocations of the method use the JIT code
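The queue-plus-background-thread shape can be sketched in a few lines of Ruby (the "compilation" here is a stand-in string; the real JIT runs LLVM and installs a function pointer):

```ruby
# Hypothetical sketch: hot methods are pushed onto a queue, and a
# background thread drains the queue without blocking execution.
queue    = Queue.new # methods waiting for compilation
compiled = []        # names of methods the "JIT" has finished

compiler = Thread.new do
  # Pop until the nil sentinel arrives, compiling each method.
  while (name = queue.pop)
    compiled << "#{name}_jit" # stand-in for LLVM codegen
  end
end

queue << :foo
queue << :bar
queue << nil # sentinel: shut the compiler thread down
compiler.join

compiled # => ["foo_jit", "bar_jit"]
```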
22. Benchmarks
• Benchmark (run 300,000 times):
  def foo()
    ary = []
    100.times { |i| ary << i }
  end
• Results in seconds (lower is better):
  1.8:             8.02
  1.9:             5.30
  rbx:             5.90
  rbx jit:         3.60
  rbx jit +blocks: 2.59
26. Conclusion
• Ruby is a wonderful language because it is organized for humans
• By gathering and using information about a running program, it's possible to make that program much faster without impacting flexibility
• Thank You!