Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
The light-weight-jit-compiler-project-for-c ruby
1. A Light Weight JIT Compiler Project For CRuby
Vladimir Makarov
Red Hat
Apr 19, 2019
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 1 / 1
2. About Myself
Red Hat, Toronto office, Canada
Tools group (GCC, GLIBC, LLVM, Rust, GO, OpenMP, ...)
a part of bigger platform engineering team (RHEL)
>20 years of work on GCC
>3 years of work on CRuby
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 2 / 1
3. GCC/LLVM As a JIT Compiler
The most popular approach to implement a JIT
Possible APIs: LLVM ORC or GCC LibGCCJIT
Getting GCC/LLVM optimizations for free
The fastest way to implement CRuby JIT
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 3 / 1
4. CRuby MJIT
MJIT also uses GCC or LLVM
Unique feature: C as an interface language
Simpler implementation, maintenance and debugging
The same compilation speed as in other API cases achieved by
Precompiled header usage
Memory FS
Parallel compiling
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 4 / 1
5. GCC/LLVM based JIT disadvantages
Big comparing to CRuby
Slow compilation speed for some cases
Difficult for optimizing on borders of code written on different
programming languages
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 5 / 1
6. CRuby/GCC/LLVM SLOC
GCC-8
CRuby-2.6
LLVM-8 + Clang +
Libcxx + Compiler-RT
5.49 M Lines
1.66 M Lines
4.13 M Lines
Scaled to the corresponding source sizes
GCC and LLVM sources are ~3 times bigger
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 6 / 1
7. CRuby/GCC/LLVM Binary Size
GCC-8 x86-64
cc1
CRuby-2.6
ruby
LLVM-8 clang
x86/x86-64 only25.2 MB
3.5 MB
63.4 MB
Scaled to the corresponding binary sizes
GCC and LLVM binaries are ~7-18 times bigger
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 7 / 1
8. GCC/LLVM Compilation Speed
~20ms for a small method compilation by GCC/LLVM (and MJIT)
on modern Intel CPUs
~0.5s for latest Raspberry PI 3 B+ on ARM64 Linux
SPEC2000 Est 176.gcc: 320 (PI 3 B+) vs 8520 (i7-9700K)
Slow environments for GCC/LLVM based JITs
MingW, CygWin, environments w/o memory FS
Example of JIT compilation speed difference: Java implementation
by Azul Systems
100ms for a typical Java method compiled with aggressive inlinining by
Falcon, a tier 2 JIT compiler implemented with LLVM
1ms for the method compiled by a tier 1 JIT compiler
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 8 / 1
10. GCC/LLVM startup
GCC -O0
GCC -O2
Clang -O0
Clang -02
0
2
4
6
8
10
12
CPUtime(ms) 7.95 8.38
7.21 7.70
8.71
10.70
7.29
9.11
Empty file vs 30 Line Preprocessed File Compilation
empty file
30 lines file
x86 64 GCC-8/LLVM-8, Intel i7-9700K, FC29
Most time is spent in compiler (and assembler) data initialization
builtins descriptions, different optimization data, etc
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 10 / 1
11. Inlining C and Ruby code
Inlining is the most important JIT optimization
Many Ruby standard methods are written on C
We need inlining the C code and machine code generated from
Ruby code
Example: x=2; 10.times {x*=2}
x = 2 times x *= 2
Ruby C Ruby
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 11 / 1
12. Inlining C and Ruby code in MJIT
Environment
header
CRuby building phase
MJIT
CRuby execution run
CRuby
Precompiled header
C code .so file
CC
CC
loading
Generated
Machine code
Adding the C code to the precompiled header
Slower startup, slower compilation
Rewriting the C code on Ruby
Slower generated code as a rule
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 12 / 1
13. Light-Weight JIT Compiler
One possible solution is a light-weight JIT compiler in addition to
existing MJIT one:
The light-weight JIT compiler as a tier 1 JIT compiler
Existing MJIT generating C as a tier 2 JIT compiler for more frequently
running code
Or only the light-weight JIT compiler for environments where the
current MJIT compiler does not work
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 13 / 1
14. MIR for Light-Weight JIT compiler
My spare-time project started last year:
Universal light-weight JIT compiler based on MIR
MIR is Medium Internal Representation
MIR means peace and world in Russian
MIR is strongly typed
MIR can represent machine insns of different architectures
Plans to try the light-weight JIT compiler first for CRuby or/and
MRuby
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 14 / 1
15. Example: C Prime Sieve
#define Size 819000
int sieve (int iter) {
int i, k, prime, count, n; char flags[Size];
for (n = 0; n < iter; n++) {
count = 0;
for (i = 0; i < Size; i++)
flags[i] = 1;
for (i = 0; i < Size; i++)
if (flags[i]) {
prime = i + i + 3;
for (k = i + prime; k < Size; k += prime)
flags[k] = 0;
count++;
}
}
return count;
}
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 15 / 1
16. Example: MIR Prime Sieve
m_sieve: module
export sieve
sieve: func 819000, i32, i32:iter
local i64:flags, i64:count, i64:prime, i64:n, i64:i, i64:k, i64:temp
mov flags, fp; mov n, 0
loop: bge fin, n, iter
mov count, 0; mov i, 0
loop2: bgt fin2, i, 819000
mov ui8:(flags, i), 1; add i, i, 1; jmp loop2
fin2: mov i, 0
loop3: bgt fin3, i, 819000
beq cont3, ui8:(flags,i), 0
add temp, i, i; add prime, temp, 3; add k, i, prime
loop4: bgt fin4, k, 819000
mov ui8:(flags, k), 0; add k, k, prime; jmp loop4
fin4: add count, count, 1
cont3: add i, i, 1; jmp loop3
fin3: add n, n, 1; jmp loop
fin: ret count
endfunc
endmodule
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 16 / 1
17. The Light-Weight JIT Compiler Goals
Comparing to GCC -O2
70% of generated code speed
100 times faster compilation speed
100 times faster start-up
100 times smaller code size
less 10K C LOC
No external dependencies – only standard C (no YACC, LEX, etc)
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 17 / 1
18. How to achieve the performance goals?
Use few most valuable optimizations
Optimize only frequent cases
Use algorithms with the best combination of simplicity (code size)
and performance
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 18 / 1
19. How to achieve the performance goals?
What are the most valuable GCC optimizations for x86-64?
a decent RA
code selection
GCC-9.0, i7-9700K under FC29
SPECInt2000 Est. GCC -O2 GCC -O0 + simple RA + combiner
-fno-inline 5458 4342 (80%)
-finline 6141 4339 (71%)
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 19 / 1
20. The current state of MIR project
MIRAPI
C
MIR
binary
MIR
text
C
x86-64
MIR
binary
MIR
text
Interpreter
Generator
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 20 / 1
21. Possible future directions of MIR project
MIRAPI
C LLVM IRWASM
C++Rust
MIR
binary
MIR
text
C
x86-64 aarch64PPC64 MIPS64
MIR
binary
MIR
text
...
Interpreter
Generator
Java
bytecode
Java
bytecode
...
WASM
CIL
CIL
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 21 / 1
23. MIR Generator
Simplify
Machinize
Build
CFG
Build Live
Info
Build Live
Ranges
Assign
Registers
MIR
Global
Common
Sub-Expr
Elimination
Dead Code
Elimination
Sparse
Conditional
Constant
Propagation
Inline
SCCP: constant propagation/folding and removing unreachable BBs
Machinize: MIR for calls ABI, 2-op insns, etc
Building Live Info: Calculating BB live in/out sets
Build Live Ranges: Calculating live ranges for regs
Assign: Priority-based RA
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 22 / 1
24. MIR Generator
Simplify
Machinize
Build
CFG
Build Live
Info
Build Live
Ranges
Assign
Registers
Rewrite
Combine
Insns
Dead Code
Elimination
Generate
Machine Code
MIR
Machine
Code
Global
Common
Sub-Expr
Elimination
Dead Code
Elimination
Sparse
Conditional
Constant
Propagation
Inline
Rewrite: Transform MIR according to RA
Combine: Merging data-dependent insns into one
The 2nd Dead Code Elimination
Generate machine insns (machine-dependent code)
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 22 / 1
25. Some MIR Generator Features
No Static Single Assignment Form
In and Out SSA passes are expensive, especially for short MIR-generator
pass pipeline
SSA absence complicates conditional constant propagation and global
common sub-expression elimination
No Position Independent Code
It speeds up the generated code a bit
It simplifies the code generation
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 23 / 1
26. Possible ways to compile C to MIR
LLVM IR to MIR or GCC Port
Dependence to a particular external project
Big efforts to implement
Maintenance burden
Own C compiler
Practically the same efforts to implement
Examples: tiny CC, 8cc, 9cc
No dependency to any external project
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 24 / 1
27. C to MIR compiler
No any tools, like YACC or LEX
PEG (parsing expression grammar) parser
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 25 / 1
28. Current MJIT
Environment
header
CRuby building phase
MJIT
CRuby execution run
CRuby
Precompiled header
C code .so file
CC
CC
loading
Generated
Machine code
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 26 / 1
29. MIR as a tier1 JIT compiler in CRuby
Environment
header
CRuby building phase
MJIT
CRuby execution run
CRuby
Precompiled header
C code .so file
CC
CC
loading
Generated
Machine code
Ruby methods
written on C
API
MIR
Binaries
MIR Generator
MIR-to-C
C-to-MIR
MIR API,
MIR Generator,
MIR-to-C
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 27 / 1
30. MIR as a tier1 JIT compiler in CRuby – 2nd Option
Environment
header
CRuby building phase
MJIT
CRuby execution run
CRuby
C code .so file
CC
loading
Generated
Machine code
Ruby methods
written on C
API
MIR
Binaries
MIR Generator
MIR-to-C
C-to-MIR
MIR API,
MIR Generator,
MIR-to-C
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 28 / 1
31. MIR as single JIT compiler in CRuby
Environment
header
CRuby building phase
MJIT
CRuby execution run
CRuby
Generated
Machine code
Ruby methods
written on C
API
MIR
Binaries
MIR Generator
MIR-to-C
C-to-MIR
MIR API,
MIR Generator,
MIR-to-C
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 29 / 1
32. Current MIR Performance Results
Intel i7-9700K under FC29 with GCC-8.2.1:
MIR-gen MIR-interp gcc -O2 gcc -O0
compilation1
1.0 (0.075ms) 0.16 (0.012ms) 178 (13.35ms) 171 (12.8ms)
execution1
1.0 (3.1s) 5.9 (18.3s) 0.94 (2.9s) 2.05 (6.34s)
code size2
1.0 (175KB) 0.65 (114KB) 144 (25.2MB) 144 (25.2MB)
startup3
1.0 (1.3us) 1.0 (1.3us) 9310 (12.1ms) 9580 (12.8ms)
LOC4
1.0 (9.5K) 0.58 (5.5K) 155 (1480) 155 (1480K)
Table: Sieve5
: MIR vs GCC
1
Best wall time of 10 runs
2
Stripped size of cc1 and minimal program running MIR code
3
Wall time to generate code for running empty C or MIR program
4
Size of minimal files to create and run MIR code or build x86-64 GCC compiler
5
30 lines of preprocessed C code, MIR is created through API
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 30 / 1
33. Current MIR SLOC distribution
9.0K
C2MIR
3.2K Generator: Core
1.2K Generator: x86-64 Target
4.0K MIR API
1.2K
ADT
1K
Interpr.
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 31 / 1
34. MIR Project Competitors
LibJIT started as a part of DotGNU Project
80K SLOC, GPL/LGPL License
Only register allocation and primitive copy propagation
RyuJIT, a part of runtime for .NET Core
360K SLOC, MIT License
MIR-generator optimizations plus loop invariant motion minus SCCP
SSA
Other candidates:
QBE: standalone+, small+ (10K LOC), SSA-, ASM generation-, MIT
License
LIBFirm: less standalone-, big- (140K LOC), SSA-, ASM generation-,
LGPL2
CraneLift: less standalone-, big- (70K LOC of Rust-), SSA-, Apache License
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 32 / 1
35. MIR Project Plans
MIR-based CRuby JIT compiler is beyond Ruby 3 time-frame
Short term plans:
Finish C to MIR compiler
Prototype of MIR based JIT compiler in CRuby
Porting MIR Generator to AARCH64
https://github.com/vnmakarov/mir
Vladimir Makarov (Red Hat) A Light Weight JIT Compiler Project For CRuby Apr 19, 2019 33 / 1