SlideShare ist ein Scribd-Unternehmen logo
1 von 28
Downloaden Sie, um offline zu lesen
Introduction to LLVM
Projects, Components, Integration, Internals
Renato Golin
ENGINEERS AND DEVICES
WORKING TOGETHER
Overview
LLVM is not a toolchain, but a number of sub-projects that can behave like one.
● Front-ends:
○ Clang (C/C++/ObjC/OpenCL/OpenMP), flang (Fortran), LDC (D), PGI’s Fortran, etc
● Front-end plugins:
○ Static analyser, clang-tidy, clang-format, clang-complete, etc
● Middle-end:
○ Optimization and Analysis passes, integration with Polly, etc.
● Back-end:
○ JIT (MC and ORC), targets: ARM, AArch64, MIPS, PPC, x86, GPUs, BPF, WebAsm, etc.
● Libraries:
○ Compiler-RT, libc++/abi, libunwind, OpenMP, libCL, etc.
● Tools:
○ LLD, LLDB, LNT, readobj, llc, lli, bugpoint, objdump, lto, etc.
ENGINEERS AND DEVICES
WORKING TOGETHER
LLVM / GNU comparison
LLVM component / tools
Front-end: Clang
Middle-end: LLVM
Back-end: LLVM
Assembler: LLVM (MC)
Linker: LLD
Libraries: Compiler-RT, libc++ (no libc)
Debugger: LLDB / LLDBserver
GNU component / tool
Front-end: CC1 / CPP
Middle-end: GCC
Back-end: GCC
Assembler: GAS
Linker: GNU-LD
Libraries: libgcc, stdlibc++, glibc
Debugger: GDB / GDBserver
ENGINEERS AND DEVICES
WORKING TOGETHER
Source Paths
Direct route ... … Individual steps
clang clang -S -emit-llvmclang -Sclang -c
EXE EXE EXE EXE
opt
IR
IR
llc
ASM
OBJ
clang
lld
ASM
OBJ
clang
lld
OBJ
lld
cc1
lld
Basically, only
two forks...
Modulesusedbytools(clang,opt,llc)
ENGINEERS AND DEVICES
WORKING TOGETHER
Perils of multiple paths
● Not all paths are created equal…
○ Core LLVM classes have options that significantly change code-gen
○ Target interpretation (triple, options) are somewhat independent
○ Default pass structure can be different
● Not all tools can pass all arguments…
○ Clang’s driver can’t handle some -Wl, and -Wa, options
○ Include paths, library paths, tools paths can be different depending on distro
○ GCC has build-time options (--with-*), LLVM doesn’t (new flags are needed)
● Different order produces different results…
○ “opt -O0” + “opt -O2” != “opt -O2”
○ Assembly notation, parsing and disassembling not entirely unique / bijective
○ So, (clang -emit-llvm)+(llc -S)+(clang -c) != (clang -c)
○ Not guaranteed distributive, associative or commutative properties
ENGINEERS AND DEVICES
WORKING TOGETHER
C to IR
● IR is not target independent
○ Clang produces reasonably independent IR
○ Though, data sizes, casts, C++ structure layout, ABI, PCS are all taken into account
● clang -target <triple> -O2 -S -emit-llvm file.c
C x86_64 ARM
#include <stdio.h>
int foo(int j, int d) {
return j / d ;
}
int main( void ) {
int r = foo(5, 0);
printf(" r = %dn", r);
}
@.str = private unnamed_addr constant [9 x i8] c" r =
%d0A00", align 1
; Function Attrs: norecurse nounwind readnone
define arm_aapcscc i32 @foo(i32 %j, i32 %d) #0 {
%1 = sdiv i32 %j, %d
ret i32 %1
}
; Function Attrs: nounwind
define arm_aapcscc i32 @main() #1 {
%1 = tail call arm_aapcscc i32 (i8*, ...) @printf(i8*
getelementptr inbounds ([9 x i8], [9 x i8]* @.str, i32 0, i32
0), i32 undef)
ret i32 0
}
@.str = private unnamed_addr constant [9 x i8] c" r =
%d0A00", align 1
; Function Attrs: norecurse nounwind readnone uwtable
define i32 @foo(i32 %j, i32 %d) #0 {
%1 = sdiv i32 %j, %d
ret i32 %1
}
; Function Attrs: nounwind uwtable
define i32 @main() #1 {
%1 = tail call i32 (i8*, ...) @printf(i8* getelementptr
inbounds ([9 x i8], [9 x i8]* @.str, i64 0, i64 0), i32 undef)
ret i32 0
}
ENGINEERS AND DEVICES
WORKING TOGETHER
ABI differences in IR
target triple = "armv7--linux-gnueabihf"
define i32 @_Z5valueicd(i32 %i, i8 zeroext %c, double %d) {
%7 = call %class.Baz* @_ZN3BazC2Eicd(...)
ret i32 %8
}
define linkonce_odr %class.Baz* @_ZN3BazC2Eicd(...) {
(...)
%1 = alloca %class.Baz*, align 4
(...)
store %class.Baz* %this, %class.Baz** %1, align 4
(...)
%5 = load %class.Baz*, %class.Baz** %1, align 4
(...)
ret %class.Baz* %5
}
target triple = "x86_64-unknown-linux-gnu"
define i32 @_Z5valueicd(i32 %i, i8 signext %c, double %d) {
(...)
call void @_ZN3BazC2Eicd(...)
(...)
ret i32 %7
}
define linkonce_odr void @_ZN3BazC2Eicd(...) {
(...)
%1 = alloca %class.Baz*, align 8
(...)
store %class.Baz* %this, %class.Baz** %1, align 8
(...)
%5 = load %class.Baz*, %class.Baz** %1, align 8
(...)
ret void
}
ARM ABI defined unsigned char
Pointer alignment
CTor return values (tail call)
ENGINEERS AND DEVICES
WORKING TOGETHER
Optimization Passes & Pass Manager
● Types of passes
○ Analysis: Gathers information about code, can annotate (metadata)
○ Transform: Can change instructions, entire blocks, usually rely on analysis passes
○ Scope: Module, Function, Loop, Region, BBlock, etc.
● Registration
○ Static, via INITIALIZE_PASS_BEGIN / INITIALIZE_PASS_DEPENDENCY macros
○ Implements getAnalysisUsage() by registering required / preserved passes
○ The PassManager is used by tools (clang, llc, opt) to add passes in specific order
● Execution
○ Registration order pass: Module, Function, …
○ Push dependencies to queue before next, unless it was preserved by previous passes
○ Create a new { module, function, basic block } → change → validate → replace all uses
ENGINEERS AND DEVICES
WORKING TOGETHER
IR transformations
● opt is a developer tool, to help test and debug passes
○ Clang, llc, lli use the same infrastructure (not necessarily in the same way)
○ opt -S -sroa file.ll -o opt.ll
O0 +SROA -print-before|after-all
define i32 @foo(i32 %j, i32 %d) #0 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
store i32 %j, i32* %1, align 4
store i32 %d, i32* %2, align 4
%3 = load i32, i32* %1, align 4
%4 = load i32, i32* %2, align 4
%5 = sdiv i32 %3, %4
ret i32 %5
}
define i32 @foo(i32 %j, i32 %d) #0 {
%1 = sdiv i32 %j, %d
ret i32 %1
}
*** IR Dump After SROA ***
; Function Attrs: nounwind uwtable
define i32 @foo(i32 %j, i32 %d) #0 {
%1 = sdiv i32 %j, %d
ret i32 %1
}
*** IR Dump Before SROA ***
; Function Attrs: nounwind uwtable
define i32 @foo(i32 %j, i32 %d) #0 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
store i32 %j, i32* %1, align 4
store i32 %d, i32* %2, align 4
%3 = load i32, i32* %1, align 4
%4 = load i32, i32* %2, align 4
%5 = sdiv i32 %3, %4
ret i32 %5
}
bool SROA::runOnFunction(Function &F) {
if (skipFunction(F))
return false;
bool Changed = performPromotion(F);
(...)
bool SROA::performPromotion(Function &F) {
(...)
if (HasDomTree)
PromoteMemToReg(Allocas, *DT, nullptr, &AC);
Nothing to do with SROA… :)
ENGINEERS AND DEVICES
WORKING TOGETHER
IR Lowering
● SelectionDAGISel
○ IR to DAG is target Independent (with some target-dependent hooks)
○ runOnMachineFunction(MF) → For each Block → SelectBasicBlock()
○ Multi-step legalization/combining because of type differences (patterns don’t match)
foreach(Inst in Block) SelectionDAGBuilder.visit()
CodeGenAndEmitDAG()
CodeGenAndEmitDAG() Combine()
LegalizeTypes(
)
Legalize()
DoInstructionSelection() Scheduler->Run(DAG)
ENGINEERS AND DEVICES
WORKING TOGETHER
DAG Transformation
Before
Legalize
Types
Before
Legalize
Before
ISel
“Glue” means nodes that “belong together”
“Chain” is “program order”
AAPCS
R0
R1
i64 “add”
“addc+adde” ARMISD
32-bit registers, from front-end lowering
ENGINEERS AND DEVICES
WORKING TOGETHER
Legalize Types & DAG Combining
● LegalizeTypes
○ for(Node in Block) { Target.getTypeAction(Node.Type);
○ If type is not Legal, TargetLowering::Type<action><type>, ex:
■ TypeExpandInteger
■ TypePromoteFloat
■ TypeScalarizeVector
■ etc.
○ An ugly chain of GOTOs and switches with the same overall idea (switch(Type):TypeOpTy)
● DAGCombine
○ Clean up dead nodes
○ Uses TargetLowering to combine DAG nodes, bulk of it C++ methods combine<Opcode>()
○ Promotes types after combining, to help next cycle’s type legalization
ENGINEERS AND DEVICES
WORKING TOGETHER
DAG Legalization
● LegalizeDAG
○ for(Node in Block) { LegalizeOp(Node); }
○ Action = TargetLowering.getOperationAction(Opcode, Type)
Legal
Expand
Custom
LibCall
Promote while(TargetLowering.isOperationLegalOrCustom(TypeSize)) TypeSize << 1
continue
generic DAG expansions
TargetLowering.LowerOp()
Add a new Call() from TargetLowering.getLibCallName(Opcode)
ENGINEERS AND DEVICES
WORKING TOGETHER
Instruction Selection & Scheduler
● Instruction Selection
○ <Target>ISelLowering: From SDNode (ISD::) to (ARMISD::)
○ <Target>ISelDAGToDAG: From SDNode (ARMISD::) to MachineSDNode (ARM::)
○ ABI/PCS registers, builtins, intrinsics
○ Still, some type legalization (for new nodes)
○ Inline assembly is still text (will be expanded in the MC layer)
● Scheduler
○ Sorts DAG in topological order
○ Inserts / removes edges, updates costs based on TargetInformation
○ Glue keeps paired / dependent instructions together
○ Target’s schedule is in TableGen (most inherit from basic description + specific rules)
○ Produces MachineBasicBlocks and MachineInstructions (MI)
○ Still in SSA form (virtual registers)
ENGINEERS AND DEVICES
WORKING TOGETHER
Register Allocation & Serialization
● Register allocators
○ Fast: Linear scan, multi-pass (define ranges, allocate, collect dead, coalesce)
○ Greedy: default on optimised builds (live ranges, interference graph / colouring)
○ PBQP: Partitioned Boolean Quadratic Programming (constraint solver, useful for DSP)
● MachineFunction passes
○ Before/after register allocation
○ Frame lowering (prologue/epilogue), EH tables, constant pools, late opts.
● Machine Code (MC) Layer
○ Can emit both assembly (<Target>InstPrinter) and object (<Target>ELFStreamer)
○ Most MCInst objects can be constructed from TableGen, some need custom lowering
○ Parses inline assembly and inserts instructions in the MC stream, matches registers, etc
○ Inline Asm local registers are reserved in the register allocator and linked here
○ Also used by assembler (<Target>AsmParser) and disassembler (<Target>Disassembler)
ENGINEERS AND DEVICES
WORKING TOGETHER
Assembler / Disassembler
● AsmParser
○ Used for both asm files and inline asm
○ Uses mostly TableGen instruction definitions (Inst, InstAlias, PseudoInst)
○ Single pass assembler with a few hard-coded transformations (which makes it messy)
○ Connects into MC layer and can output text (ex. llvm-mc) or object code (ex. clang)
● MCDisassembler
○ Iteration of trial and fail (ARM, Thumb, VFP, NEON, etc)
○ Most of it relies on TableGen encodings, but there’s a lot of hard-coded stuff
○ Doesn’t know much about object formats (ELF/COFF/MachO)
○ Used by llvm-objdump, llvm-mc, connects back to MC layer
ENGINEERS AND DEVICES
WORKING TOGETHER
TableGen
● Parse hardware description and generates code and tables to describe them
○ Common parser (same language), multiple back-ends (different outputs)
○ Templated descriptive language, good for composition and pattern matching
○ Back-ends generate multiple tables/enums with header guards + supporting code
● Back-ends describe their registers, instructions, schedules, patterns, etc.
○ Definition files generated at compile time, included in CPP files using define-include trick
○ Most matching, cost and code generating patterns are done via TableGen
● Clang also uses it for diagnostics and command line options
● Examples:
○ Syntax
○ Define-include trick
○ Language introduction and formal definition
ENGINEERS AND DEVICES
WORKING TOGETHER
Libraries
● LibC++
○ Complete Standard C++ library with native C++11/14 compatibility (no abi_tag necessary)
○ Production in FreeBSD, Darwin (MacOS)
● LibC++abi(similar to libgcc_eh)
○ Exception handling (cxa_*)
● Libunwind(similar to libgcc_s)
○ Stack unwinding (Dwarf, SjLj, EHABI)
● Compiler-RT(similar to libgcc + “stuff”)
○ Builtins + sanitizers + profile + CFI + etc.
○ Some inter/intra-dependencies (with clang, libc++abi, libunwind) being resolved
○ Generic C implementation + some Arch-specific optimized versions (build dep.)
ENGINEERS AND DEVICES
WORKING TOGETHER
Sanitizers
● Not static analysis
○ The code needs to be compiled with instrumentation (-fsanitize=address)
○ And executed, preferably with production workloads
● Not Valgrind
○ The instrumentation is embedded in the code (orders of magnitude faster)
○ But needs to re-compile code, work around bugs in compilation, etc.
● Compiler instrumentation
○ In Clang and GCC
○ Add calls to instrumentation before load/stores, malloc/free, etc.
● Run-time libraries
○ Arch-specific instrumentation on how memory is laid out, etc.
○ Maps loads/stores, allocations, etc. into a shadow memory for tagging
○ Later calls do sanity checks on shadow tags and assert on errors
ENGINEERS AND DEVICES
WORKING TOGETHER
● ASAN: Address Sanitizer (~2x slower)
○ Out-of-bounds (heap, stack, BSS), use-after-free, double-free, etc.
● MSAN: Memory Sanitizer (no noticeable penalty)
○ Uninitialised memory usage (suggestions to merge into ASAN)
● LSAN: Leak Sanitizer (no noticeable penalty)
○ Memory leaks (heap objects losing scope)
● TSAN: Thread Sanitizer (5~10x slower on x86_64, more on AArch64)
○ Detects data races
○ Needs 64-bit pointers, to use the most-significant bits as tags
○ Due to multiple VMA configurations in AArch64, additional run-time checks are needed
● UBSAN: Undefined Behaviour Sanitizer (no noticeable penalty)
○ Integer overflow, null pointer use, misaligned reads
Sanitizers: Examples
ENGINEERS AND DEVICES
WORKING TOGETHER
LLD the llvm linker
● Since May 2015, 3 separate linkers in one project
○ ELF, COFF and the Atom based linker (Mach-O)
○ ELF and COFF have a similar design but don’t share code
○ Primarily designed to be system linkers
■ ELF Linker a drop in replacement for GNU ld
■ COFF linker a drop in replacement for link.exe
○ Atom based linker is a more abstract set of linker tools
■ Only supports Mach-O output
○ Uses llvm object reading libraries and core data structures
● Key design choices
○ Do not abstract file formats (c.f. BFD)
○ Emphasis on performance at the high-level, do minimal amount as late as possible.
○ Have a similar interface to existing system linkers but simplify where possible
ENGINEERS AND DEVICES
WORKING TOGETHER
LLD ELF
● Support for AArch64, amd64, ARM (sort of), Mips, Power, X86 targets
● Focused on Linux and BSD like ELF files suitable for demand paging
● FreeBSD team aiming at using it for AARCH64 lld
● Many ld command line options supported
● Linker script support in active development
● Not ready for being a callable library yet
ENGINEERS AND DEVICES
WORKING TOGETHER
LLD key data structure relationship
InputSection
OutputSection
Contains
InputSections
InputFile
Defines and
references
Symbol bodies
Contains
InputSections
Symbol
Best
SymbolBody
SymbolBody
SymbolTable
Global
Symbols
ENGINEERS AND DEVICES
WORKING TOGETHER
LLD control flow
Driver.cpp
1. Process command line
options
2. Create data structures
3. For each input file
a. Create InputFile
b. Read symbols into
symbol table
4. Optimizations such as GC
5. Create and call writer Writer.cpp
1. Create OutputSections
2. Create Thunks
3. Create PLT and GOT
4. Relax TLS
5. Assign addresses
6. Perform relocation
7. Write file
InputFiles.cpp
● Read symbols
LinkerScript.cpp
Can override default behaviour
● InputFiles
● Ordering of Sections
● DefineSymbols
SymbolTable.cpp
● Add files from archive to
resolve undefined symbols
ENGINEERS AND DEVICES
WORKING TOGETHER
LLDB
● A modern, high-performance source-level debugger written in C++
● Extensively under development for various use-cases.
● Default debugger for OSX, Xcode IDE, Android Studio.
● Re-uses LLVM/Clang code JIT/IR for expression evaluation, disassembly etc.
● Provides a C++ Debugger API which can be used by various clients
● Supported Host Platforms
○ OS X, Linux/Android, FreeBSD, NetBSD, and Windows
● Supported Target Architectures
○ i386/x86_64, Arm/AArch64, MIPS/MIPS64, IBM s390
● Supported Languages
○ Fully support C, C++ and Objective-C while SWIFT and GoLang (under development)
ENGINEERS AND DEVICES
WORKING TOGETHER
LLDB Architecture
LLDB API
LLDB Command line Executable LLDB MI Interface LLDB Python Module
Process Plugin
Process
Thread
Registers
Memory
pTrace
Interface
L
L
D
B
S
E
R
V
E
R
LLDB HOST ABSTRACTION LAYER
Linux
Android
gdb-server
MacOSX
NetBSD
FreeBSD
Windows
Platform
ELF
JIT
MACH-O
PECOFF
DWARF
Object File
Symbols
Target
Breakpoint
LLDB Core
LLDB Utility
Expressions
.
Other Plugins
ABI
Disassembler
Expressions Parser
Unwinder
Instruction Emulation
ENGINEERS AND DEVICES
WORKING TOGETHER
References
● Official docs
○ LLVM docs (LangRef, Passes, CodeGen, BackEnds, TableGen, Vectorizer, Doxygen)
○ Clang docs (LangExt, SafeStack, LTO, AST)
○ LLDB (Architecture, GDB to LLDB commands, Doxygen)
○ LLD (New ELF/COFF backend)
○ Sanitizers (ASAN, TSAN, MSAN, LSAN, UBSAN, DFSAN)
○ Compiler-RT / LibC++ (docs)
● Blogs
○ LLVM Blog
○ LLVM Weekly
○ Planet Clang
○ Eli Bendersky’s excellent blog post: Life of an instruction in LLVM
○ Old and high level, bug good overall post by Chris Lattner
○ Not that old, but great HowTo adding a new back-end
○ libunwind is not easy!
Thank You
#LAS16
For further information: www.linaro.org
LAS16 keynotes and videos on: connect.linaro.org

Weitere ähnliche Inhalte

Was ist angesagt?

Las16 309 - lua jit arm64 port - status
Las16 309 - lua jit arm64 port - statusLas16 309 - lua jit arm64 port - status
Las16 309 - lua jit arm64 port - status
Linaro
 
BUD17-405: Building a reference IoT product with Zephyr
BUD17-405: Building a reference IoT product with Zephyr BUD17-405: Building a reference IoT product with Zephyr
BUD17-405: Building a reference IoT product with Zephyr
Linaro
 
Linux-wpan: IEEE 802.15.4 and 6LoWPAN in the Linux Kernel - BUD17-120
Linux-wpan: IEEE 802.15.4 and 6LoWPAN in the Linux Kernel - BUD17-120Linux-wpan: IEEE 802.15.4 and 6LoWPAN in the Linux Kernel - BUD17-120
Linux-wpan: IEEE 802.15.4 and 6LoWPAN in the Linux Kernel - BUD17-120
Linaro
 
HKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overviewHKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overview
Linaro
 

Was ist angesagt? (20)

LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101
 
LAS16-TR03: Upstreaming 201
LAS16-TR03: Upstreaming 201LAS16-TR03: Upstreaming 201
LAS16-TR03: Upstreaming 201
 
LAS16-201: ART JIT in Android N
LAS16-201: ART JIT in Android NLAS16-201: ART JIT in Android N
LAS16-201: ART JIT in Android N
 
LAS16-207: Bus scaling QoS
LAS16-207: Bus scaling QoSLAS16-207: Bus scaling QoS
LAS16-207: Bus scaling QoS
 
LAS16-507: LXC support in LAVA
LAS16-507: LXC support in LAVALAS16-507: LXC support in LAVA
LAS16-507: LXC support in LAVA
 
BKK16-304 The State of GDB on AArch64
BKK16-304 The State of GDB on AArch64BKK16-304 The State of GDB on AArch64
BKK16-304 The State of GDB on AArch64
 
Las16 309 - lua jit arm64 port - status
Las16 309 - lua jit arm64 port - statusLas16 309 - lua jit arm64 port - status
Las16 309 - lua jit arm64 port - status
 
LAS16-200: SCMI - System Management and Control Interface
LAS16-200:  SCMI - System Management and Control InterfaceLAS16-200:  SCMI - System Management and Control Interface
LAS16-200: SCMI - System Management and Control Interface
 
BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!
 
LAS16-TR06: Remoteproc & rpmsg development
LAS16-TR06: Remoteproc & rpmsg developmentLAS16-TR06: Remoteproc & rpmsg development
LAS16-TR06: Remoteproc & rpmsg development
 
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
 
LAS16-305: Smart City Big Data Visualization on 96Boards
LAS16-305: Smart City Big Data Visualization on 96BoardsLAS16-305: Smart City Big Data Visualization on 96Boards
LAS16-305: Smart City Big Data Visualization on 96Boards
 
MazuV-Debug-System
MazuV-Debug-SystemMazuV-Debug-System
MazuV-Debug-System
 
ONNC - 0.9.1 release
ONNC - 0.9.1 releaseONNC - 0.9.1 release
ONNC - 0.9.1 release
 
BUD17-405: Building a reference IoT product with Zephyr
BUD17-405: Building a reference IoT product with Zephyr BUD17-405: Building a reference IoT product with Zephyr
BUD17-405: Building a reference IoT product with Zephyr
 
BKK16-106 ODP Project Update
BKK16-106 ODP Project UpdateBKK16-106 ODP Project Update
BKK16-106 ODP Project Update
 
Online test program generator for RISC-V processors
Online test program generator for RISC-V processorsOnline test program generator for RISC-V processors
Online test program generator for RISC-V processors
 
Linux-wpan: IEEE 802.15.4 and 6LoWPAN in the Linux Kernel - BUD17-120
Linux-wpan: IEEE 802.15.4 and 6LoWPAN in the Linux Kernel - BUD17-120Linux-wpan: IEEE 802.15.4 and 6LoWPAN in the Linux Kernel - BUD17-120
Linux-wpan: IEEE 802.15.4 and 6LoWPAN in the Linux Kernel - BUD17-120
 
HKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overviewHKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overview
 
An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire
 An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire
An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire
 

Andere mochten auch

LAS16-500: The Rise and Fall of Assembler and the VGIC from Hell
LAS16-500: The Rise and Fall of Assembler and the VGIC from HellLAS16-500: The Rise and Fall of Assembler and the VGIC from Hell
LAS16-500: The Rise and Fall of Assembler and the VGIC from Hell
Linaro
 
LAS16-400K2: TianoCore – Open Source UEFI Community Update
LAS16-400K2: TianoCore – Open Source UEFI Community UpdateLAS16-400K2: TianoCore – Open Source UEFI Community Update
LAS16-400K2: TianoCore – Open Source UEFI Community Update
Linaro
 

Andere mochten auch (11)

Как приручить дракона: введение в LLVM
Как приручить дракона: введение в LLVMКак приручить дракона: введение в LLVM
Как приручить дракона: введение в LLVM
 
LLVM Internal Architecture par Michel Guillet
LLVM Internal Architecture par Michel GuilletLLVM Internal Architecture par Michel Guillet
LLVM Internal Architecture par Michel Guillet
 
LLVM introduction
LLVM introductionLLVM introduction
LLVM introduction
 
LAS16-101: Efficient kernel backporting
LAS16-101: Efficient kernel backportingLAS16-101: Efficient kernel backporting
LAS16-101: Efficient kernel backporting
 
LAS16-106: GNU Toolchain Development Lifecycle
LAS16-106: GNU Toolchain Development LifecycleLAS16-106: GNU Toolchain Development Lifecycle
LAS16-106: GNU Toolchain Development Lifecycle
 
LAS16-403: GDB Linux Kernel Awareness
LAS16-403: GDB Linux Kernel AwarenessLAS16-403: GDB Linux Kernel Awareness
LAS16-403: GDB Linux Kernel Awareness
 
LAS16-500: The Rise and Fall of Assembler and the VGIC from Hell
LAS16-500: The Rise and Fall of Assembler and the VGIC from HellLAS16-500: The Rise and Fall of Assembler and the VGIC from Hell
LAS16-500: The Rise and Fall of Assembler and the VGIC from Hell
 
LAS16-203: Platform security architecture for embedded devices
LAS16-203: Platform security architecture for embedded devicesLAS16-203: Platform security architecture for embedded devices
LAS16-203: Platform security architecture for embedded devices
 
LAS16-400K2: TianoCore – Open Source UEFI Community Update
LAS16-400K2: TianoCore – Open Source UEFI Community UpdateLAS16-400K2: TianoCore – Open Source UEFI Community Update
LAS16-400K2: TianoCore – Open Source UEFI Community Update
 
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
 
BUD17-416: Benchmark and profiling in OP-TEE
BUD17-416: Benchmark and profiling in OP-TEE BUD17-416: Benchmark and profiling in OP-TEE
BUD17-416: Benchmark and profiling in OP-TEE
 

Ähnlich wie LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals

Stranger in These Parts. A Hired Gun in the JS Corral (JSConf US 2012)
Stranger in These Parts. A Hired Gun in the JS Corral (JSConf US 2012)Stranger in These Parts. A Hired Gun in the JS Corral (JSConf US 2012)
Stranger in These Parts. A Hired Gun in the JS Corral (JSConf US 2012)
Igalia
 
Embedded c programming22 for fdp
Embedded c programming22 for fdpEmbedded c programming22 for fdp
Embedded c programming22 for fdp
Pradeep Kumar TS
 

Ähnlich wie LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals (20)

A taste of GlobalISel
A taste of GlobalISelA taste of GlobalISel
A taste of GlobalISel
 
Linux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudLinux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloud
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
 
Part II: LLVM Intermediate Representation
Part II: LLVM Intermediate RepresentationPart II: LLVM Intermediate Representation
Part II: LLVM Intermediate Representation
 
Boosting Developer Productivity with Clang
Boosting Developer Productivity with ClangBoosting Developer Productivity with Clang
Boosting Developer Productivity with Clang
 
Stranger in These Parts. A Hired Gun in the JS Corral (JSConf US 2012)
Stranger in These Parts. A Hired Gun in the JS Corral (JSConf US 2012)Stranger in These Parts. A Hired Gun in the JS Corral (JSConf US 2012)
Stranger in These Parts. A Hired Gun in the JS Corral (JSConf US 2012)
 
C++ amp on linux
C++ amp on linuxC++ amp on linux
C++ amp on linux
 
BKK16-302: Android Optimizing Compiler: New Member Assimilation Guide
BKK16-302: Android Optimizing Compiler: New Member Assimilation GuideBKK16-302: Android Optimizing Compiler: New Member Assimilation Guide
BKK16-302: Android Optimizing Compiler: New Member Assimilation Guide
 
Embedded c programming22 for fdp
Embedded c programming22 for fdpEmbedded c programming22 for fdp
Embedded c programming22 for fdp
 
Lecture 04
Lecture 04Lecture 04
Lecture 04
 
Verilog presentation final
Verilog presentation finalVerilog presentation final
Verilog presentation final
 
Auto Tuning
Auto TuningAuto Tuning
Auto Tuning
 
ERTS UNIT 3.ppt
ERTS UNIT 3.pptERTS UNIT 3.ppt
ERTS UNIT 3.ppt
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilers
 
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
 
Joel Falcou, Boost.SIMD
Joel Falcou, Boost.SIMDJoel Falcou, Boost.SIMD
Joel Falcou, Boost.SIMD
 
Csdfsadf
CsdfsadfCsdfsadf
Csdfsadf
 
C
CC
C
 
C
CC
C
 
Address/Thread/Memory Sanitizer
Address/Thread/Memory SanitizerAddress/Thread/Memory Sanitizer
Address/Thread/Memory Sanitizer
 

Mehr von Linaro

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Linaro
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
Linaro
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Linaro
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
Linaro
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
Linaro
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMU
Linaro
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
Linaro
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
Linaro
 

Mehr von Linaro (20)

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qa
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening Keynote
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP Workshop
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMU
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8M
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Kürzlich hochgeladen (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals

  • 1. Introduction to LLVM Projects, Components, Integration, Internals Renato Golin
  • 2. ENGINEERS AND DEVICES WORKING TOGETHER Overview LLVM is not a toolchain, but a number of sub-projects that can behave like one. ● Front-ends: ○ Clang (C/C++/ObjC/OpenCL/OpenMP), flang (Fortran), LDC (D), PGI’s Fortran, etc ● Front-end plugins: ○ Static analyser, clang-tidy, clang-format, clang-complete, etc ● Middle-end: ○ Optimization and Analysis passes, integration with Polly, etc. ● Back-end: ○ JIT (MC and ORC), targets: ARM, AArch64, MIPS, PPC, x86, GPUs, BPF, WebAsm, etc. ● Libraries: ○ Compiler-RT, libc++/abi, libunwind, OpenMP, libCL, etc. ● Tools: ○ LLD, LLDB, LNT, readobj, llc, lli, bugpoint, objdump, lto, etc.
  • 3. ENGINEERS AND DEVICES WORKING TOGETHER LLVM / GNU comparison LLVM component / tools Front-end: Clang Middle-end: LLVM Back-end: LLVM Assembler: LLVM (MC) Linker: LLD Libraries: Compiler-RT, libc++ (no libc) Debugger: LLDB / LLDBserver GNU component / tool Front-end: CC1 / CPP Middle-end: GCC Back-end: GCC Assembler: GAS Linker: GNU-LD Libraries: libgcc, stdlibc++, glibc Debugger: GDB / GDBserver
  • 4. ENGINEERS AND DEVICES WORKING TOGETHER Source Paths Direct route ... … Individual steps clang clang -S -emit-llvmclang -Sclang -c EXE EXE EXE EXE opt IR IR llc ASM OBJ clang lld ASM OBJ clang lld OBJ lld cc1 lld Basically, only two forks... Modulesusedbytools(clang,opt,llc)
  • 5. ENGINEERS AND DEVICES WORKING TOGETHER Perils of multiple paths ● Not all paths are created equal… ○ Core LLVM classes have options that significantly change code-gen ○ Target interpretation (triple, options) are somewhat independent ○ Default pass structure can be different ● Not all tools can pass all arguments… ○ Clang’s driver can’t handle some -Wl, and -Wa, options ○ Include paths, library paths, tools paths can be different depending on distro ○ GCC has build-time options (--with-*), LLVM doesn’t (new flags are needed) ● Different order produces different results… ○ “opt -O0” + “opt -O2” != “opt -O2” ○ Assembly notation, parsing and disassembling not entirely unique / bijective ○ So, (clang -emit-llvm)+(llc -S)+(clang -c) != (clang -c) ○ Not guaranteed distributive, associative or commutative properties
  • 6. ENGINEERS AND DEVICES WORKING TOGETHER C to IR ● IR is not target independent ○ Clang produces reasonably independent IR ○ Though, data sizes, casts, C++ structure layout, ABI, PCS are all taken into account ● clang -target <triple> -O2 -S -emit-llvm file.c C x86_64 ARM #include <stdio.h> int foo(int j, int d) { return j / d ; } int main( void ) { int r = foo(5, 0); printf(" r = %dn", r); } @.str = private unnamed_addr constant [9 x i8] c" r = %d0A00", align 1 ; Function Attrs: norecurse nounwind readnone define arm_aapcscc i32 @foo(i32 %j, i32 %d) #0 { %1 = sdiv i32 %j, %d ret i32 %1 } ; Function Attrs: nounwind define arm_aapcscc i32 @main() #1 { %1 = tail call arm_aapcscc i32 (i8*, ...) @printf(i8* getelementptr inbounds ([9 x i8], [9 x i8]* @.str, i32 0, i32 0), i32 undef) ret i32 0 } @.str = private unnamed_addr constant [9 x i8] c" r = %d0A00", align 1 ; Function Attrs: norecurse nounwind readnone uwtable define i32 @foo(i32 %j, i32 %d) #0 { %1 = sdiv i32 %j, %d ret i32 %1 } ; Function Attrs: nounwind uwtable define i32 @main() #1 { %1 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([9 x i8], [9 x i8]* @.str, i64 0, i64 0), i32 undef) ret i32 0 }
  • 7. ENGINEERS AND DEVICES WORKING TOGETHER ABI differences in IR target triple = "armv7--linux-gnueabihf" define i32 @_Z5valueicd(i32 %i, i8 zeroext %c, double %d) { %7 = call %class.Baz* @_ZN3BazC2Eicd(...) ret i32 %8 } define linkonce_odr %class.Baz* @_ZN3BazC2Eicd(...) { (...) %1 = alloca %class.Baz*, align 4 (...) store %class.Baz* %this, %class.Baz** %1, align 4 (...) %5 = load %class.Baz*, %class.Baz** %1, align 4 (...) ret %class.Baz* %5 } target triple = "x86_64-unknown-linux-gnu" define i32 @_Z5valueicd(i32 %i, i8 signext %c, double %d) { (...) call void @_ZN3BazC2Eicd(...) (...) ret i32 %7 } define linkonce_odr void @_ZN3BazC2Eicd(...) { (...) %1 = alloca %class.Baz*, align 8 (...) store %class.Baz* %this, %class.Baz** %1, align 8 (...) %5 = load %class.Baz*, %class.Baz** %1, align 8 (...) ret void } ARM ABI defined unsigned char Pointer alignment CTor return values (tail call)
  • 8. ENGINEERS AND DEVICES WORKING TOGETHER Optimization Passes & Pass Manager ● Types of passes ○ Analysis: Gathers information about code, can annotate (metadata) ○ Transform: Can change instructions, entire blocks, usually rely on analysis passes ○ Scope: Module, Function, Loop, Region, BBlock, etc. ● Registration ○ Static, via INITIALIZE_PASS_BEGIN / INITIALIZE_PASS_DEPENDENCY macros ○ Implements getAnalysisUsage() by registering required / preserved passes ○ The PassManager is used by tools (clang, llc, opt) to add passes in specific order ● Execution ○ Registration order pass: Module, Function, … ○ Push dependencies to queue before next, unless it was preserved by previous passes ○ Create a new { module, function, basic block } → change → validate → replace all uses
  • 9. ENGINEERS AND DEVICES WORKING TOGETHER IR transformations ● opt is a developer tool, to help test and debug passes ○ Clang, llc, lli use the same infrastructure (not necessarily in the same way) ○ opt -S -sroa file.ll -o opt.ll O0 +SROA -print-before|after-all define i32 @foo(i32 %j, i32 %d) #0 { %1 = alloca i32, align 4 %2 = alloca i32, align 4 store i32 %j, i32* %1, align 4 store i32 %d, i32* %2, align 4 %3 = load i32, i32* %1, align 4 %4 = load i32, i32* %2, align 4 %5 = sdiv i32 %3, %4 ret i32 %5 } define i32 @foo(i32 %j, i32 %d) #0 { %1 = sdiv i32 %j, %d ret i32 %1 } *** IR Dump After SROA *** ; Function Attrs: nounwind uwtable define i32 @foo(i32 %j, i32 %d) #0 { %1 = sdiv i32 %j, %d ret i32 %1 } *** IR Dump Before SROA *** ; Function Attrs: nounwind uwtable define i32 @foo(i32 %j, i32 %d) #0 { %1 = alloca i32, align 4 %2 = alloca i32, align 4 store i32 %j, i32* %1, align 4 store i32 %d, i32* %2, align 4 %3 = load i32, i32* %1, align 4 %4 = load i32, i32* %2, align 4 %5 = sdiv i32 %3, %4 ret i32 %5 } bool SROA::runOnFunction(Function &F) { if (skipFunction(F)) return false; bool Changed = performPromotion(F); (...) bool SROA::performPromotion(Function &F) { (...) if (HasDomTree) PromoteMemToReg(Allocas, *DT, nullptr, &AC); Nothing to do with SROA… :)
  • 10. ENGINEERS AND DEVICES WORKING TOGETHER IR Lowering ● SelectionDAGISel ○ IR to DAG is target Independent (with some target-dependent hooks) ○ runOnMachineFunction(MF) → For each Block → SelectBasicBlock() ○ Multi-step legalization/combining because of type differences (patterns don’t match) foreach(Inst in Block) SelectionDAGBuilder.visit() CodeGenAndEmitDAG() CodeGenAndEmitDAG() Combine() LegalizeTypes( ) Legalize() DoInstructionSelection() Scheduler->Run(DAG)
  • 11. ENGINEERS AND DEVICES WORKING TOGETHER DAG Transformation Before Legalize Types Before Legalize Before ISel “Glue” means nodes that “belong together” “Chain” is “program order” AAPCS R0 R1 i64 “add” “addc+adde” ARMISD 32-bit registers, from front-end lowering
  • 12. ENGINEERS AND DEVICES WORKING TOGETHER Legalize Types & DAG Combining ● LegalizeTypes ○ for(Node in Block) { Target.getTypeAction(Node.Type); ○ If type is not Legal, TargetLowering::Type<action><type>, ex: ■ TypeExpandInteger ■ TypePromoteFloat ■ TypeScalarizeVector ■ etc. ○ An ugly chain of GOTOs and switches with the same overall idea (switch(Type):TypeOpTy) ● DAGCombine ○ Clean up dead nodes ○ Uses TargetLowering to combine DAG nodes, bulk of it C++ methods combine<Opcode>() ○ Promotes types after combining, to help next cycle’s type legalization
  • 13. ENGINEERS AND DEVICES WORKING TOGETHER DAG Legalization ● LegalizeDAG ○ for(Node in Block) { LegalizeOp(Node); } ○ Action = TargetLowering.getOperationAction(Opcode, Type) Legal Expand Custom LibCall Promote while(TargetLowering.isOperationLegalOrCustom(TypeSize)) TypeSize << 1 continue generic DAG expansions TargetLowering.LowerOp() Add a new Call() from TargetLowering.getLibCallName(Opcode)
  • 14. ENGINEERS AND DEVICES WORKING TOGETHER Instruction Selection & Scheduler ● Instruction Selection ○ <Target>ISelLowering: From SDNode (ISD::) to (ARMISD::) ○ <Target>ISelDAGToDAG: From SDNode (ARMISD::) to MachineSDNode (ARM::) ○ ABI/PCS registers, builtins, intrinsics ○ Still, some type legalization (for new nodes) ○ Inline assembly is still text (will be expanded in the MC layer) ● Scheduler ○ Sorts DAG in topological order ○ Inserts / removes edges, updates costs based on TargetInformation ○ Glue keeps paired / dependent instructions together ○ Target’s schedule is in TableGen (most inherit from basic description + specific rules) ○ Produces MachineBasicBlocks and MachineInstructions (MI) ○ Still in SSA form (virtual registers)
  • 15. ENGINEERS AND DEVICES WORKING TOGETHER Register Allocation & Serialization ● Register allocators ○ Fast: Linear scan, multi-pass (define ranges, allocate, collect dead, coalesce) ○ Greedy: default on optimised builds (live ranges, interference graph / colouring) ○ PBQP: Partitioned Boolean Quadratic Programming (constraint solver, useful for DSP) ● MachineFunction passes ○ Before/after register allocation ○ Frame lowering (prologue/epilogue), EH tables, constant pools, late opts. ● Machine Code (MC) Layer ○ Can emit both assembly (<Target>InstPrinter) and object (<Target>ELFStreamer) ○ Most MCInst objects can be constructed from TableGen, some need custom lowering ○ Parses inline assembly and inserts instructions in the MC stream, matches registers, etc ○ Inline Asm local registers are reserved in the register allocator and linked here ○ Also used by assembler (<Target>AsmParser) and disassembler (<Target>Disassembler)
  • 16. ENGINEERS AND DEVICES WORKING TOGETHER Assembler / Disassembler ● AsmParser ○ Used for both asm files and inline asm ○ Uses mostly TableGen instruction definitions (Inst, InstAlias, PseudoInst) ○ Single pass assembler with a few hard-coded transformations (which makes it messy) ○ Connects into MC layer and can output text (ex. llvm-mc) or object code (ex. clang) ● MCDisassembler ○ Iteration of trial and fail (ARM, Thumb, VFP, NEON, etc) ○ Most of it relies on TableGen encodings, but there’s a lot of hard-coded stuff ○ Doesn’t know much about object formats (ELF/COFF/MachO) ○ Used by llvm-objdump, llvm-mc, connects back to MC layer
  • 17. ENGINEERS AND DEVICES WORKING TOGETHER TableGen ● Parse hardware description and generates code and tables to describe them ○ Common parser (same language), multiple back-ends (different outputs) ○ Templated descriptive language, good for composition and pattern matching ○ Back-ends generate multiple tables/enums with header guards + supporting code ● Back-ends describe their registers, instructions, schedules, patterns, etc. ○ Definition files generated at compile time, included in CPP files using define-include trick ○ Most matching, cost and code generating patterns are done via TableGen ● Clang also uses it for diagnostics and command line options ● Examples: ○ Syntax ○ Define-include trick ○ Language introduction and formal definition
  • 18. ENGINEERS AND DEVICES WORKING TOGETHER Libraries ● LibC++ ○ Complete Standard C++ library with native C++11/14 compatibility (no abi_tag necessary) ○ Production in FreeBSD, Darwin (MacOS) ● LibC++abi(similar to libgcc_eh) ○ Exception handling (cxa_*) ● Libunwind(similar to libgcc_s) ○ Stack unwinding (Dwarf, SjLj, EHABI) ● Compiler-RT(similar to libgcc + “stuff”) ○ Builtins + sanitizers + profile + CFI + etc. ○ Some inter/intra-dependencies (with clang, libc++abi, libunwind) being resolved ○ Generic C implementation + some Arch-specific optimized versions (build dep.)
  • 19. ENGINEERS AND DEVICES WORKING TOGETHER Sanitizers ● Not static analysis ○ The code needs to be compiled with instrumentation (-fsanitize=address) ○ And executed, preferably with production workloads ● Not Valgrind ○ The instrumentation is embedded in the code (orders of magnitude faster) ○ But needs to re-compile code, work around bugs in compilation, etc. ● Compiler instrumentation ○ In Clang and GCC ○ Add calls to instrumentation before load/stores, malloc/free, etc. ● Run-time libraries ○ Arch-specific instrumentation on how memory is laid out, etc. ○ Maps loads/stores, allocations, etc. into a shadow memory for tagging ○ Later calls do sanity checks on shadow tags and assert on errors
  • 20. ENGINEERS AND DEVICES WORKING TOGETHER ● ASAN: Address Sanitizer (~2x slower) ○ Out-of-bounds (heap, stack, BSS), use-after-free, double-free, etc. ● MSAN: Memory Sanitizer (no noticeable penalty) ○ Uninitialised memory usage (suggestions to merge into ASAN) ● LSAN: Leak Sanitizer (no noticeable penalty) ○ Memory leaks (heap objects losing scope) ● TSAN: Thread Sanitizer (5~10x slower on x86_64, more on AArch64) ○ Detects data races ○ Needs 64-bit pointers, to use the most-significant bits as tags ○ Due to multiple VMA configurations in AArch64, additional run-time checks are needed ● UBSAN: Undefined Behaviour Sanitizer (no noticeable penalty) ○ Integer overflow, null pointer use, misaligned reads Sanitizers: Examples
  • 21. ENGINEERS AND DEVICES WORKING TOGETHER LLD the llvm linker ● Since May 2015, 3 separate linkers in one project ○ ELF, COFF and the Atom based linker (Mach-O) ○ ELF and COFF have a similar design but don’t share code ○ Primarily designed to be system linkers ■ ELF Linker a drop in replacement for GNU ld ■ COFF linker a drop in replacement for link.exe ○ Atom based linker is a more abstract set of linker tools ■ Only supports Mach-O output ○ Uses llvm object reading libraries and core data structures ● Key design choices ○ Do not abstract file formats (c.f. BFD) ○ Emphasis on performance at the high-level, do minimal amount as late as possible. ○ Have a similar interface to existing system linkers but simplify where possible
  • 22. ENGINEERS AND DEVICES WORKING TOGETHER LLD ELF ● Support for AArch64, amd64, ARM (sort of), Mips, Power, X86 targets ● Focused on Linux and BSD like ELF files suitable for demand paging ● FreeBSD team aiming at using it for AARCH64 lld ● Many ld command line options supported ● Linker script support in active development ● Not ready for being a callable library yet
  • 23. ENGINEERS AND DEVICES WORKING TOGETHER LLD key data structure relationship InputSection OutputSection Contains InputSections InputFile Defines and references Symbol bodies Contains InputSections Symbol Best SymbolBody SymbolBody SymbolTable Global Symbols
  • 24. ENGINEERS AND DEVICES WORKING TOGETHER LLD control flow Driver.cpp 1. Process command line options 2. Create data structures 3. For each input file a. Create InputFile b. Read symbols into symbol table 4. Optimizations such as GC 5. Create and call writer Writer.cpp 1. Create OutputSections 2. Create Thunks 3. Create PLT and GOT 4. Relax TLS 5. Assign addresses 6. Perform relocation 7. Write file InputFiles.cpp ● Read symbols LinkerScript.cpp Can override default behaviour ● InputFiles ● Ordering of Sections ● DefineSymbols SymbolTable.cpp ● Add files from archive to resolve undefined symbols
  • 25. ENGINEERS AND DEVICES WORKING TOGETHER LLDB ● A modern, high-performance source-level debugger written in C++ ● Extensively under development for various use-cases. ● Default debugger for OSX, Xcode IDE, Android Studio. ● Re-uses LLVM/Clang code JIT/IR for expression evaluation, disassembly etc. ● Provides a C++ Debugger API which can be used by various clients ● Supported Host Platforms ○ OS X, Linux/Android, FreeBSD, NetBSD, and Windows ● Supported Target Architectures ○ i386/x86_64, Arm/AArch64, MIPS/MIPS64, IBM s390 ● Supported Languages ○ Fully support C, C++ and Objective-C while SWIFT and GoLang (under development)
  • 26. ENGINEERS AND DEVICES WORKING TOGETHER LLDB Architecture LLDB API LLDB Command line Executable LLDB MI Interface LLDB Python Module Process Plugin Process Thread Registers Memory pTrace Interface L L D B S E R V E R LLDB HOST ABSTRACTION LAYER Linux Android gdb-server MacOSX NetBSD FreeBSD Windows Platform ELF JIT MACH-O PECOFF DWARF Object File Symbols Target Breakpoint LLDB Core LLDB Utility Expressions . Other Plugins ABI Disassembler Expressions Parser Unwinder Instruction Emulation
  • 27. ENGINEERS AND DEVICES WORKING TOGETHER References ● Official docs ○ LLVM docs (LangRef, Passes, CodeGen, BackEnds, TableGen, Vectorizer, Doxygen) ○ Clang docs (LangExt, SafeStack, LTO, AST) ○ LLDB (Architecture, GDB to LLDB commands, Doxygen) ○ LLD (New ELF/COFF backend) ○ Sanitizers (ASAN, TSAN, MSAN, LSAN, UBSAN, DFSAN) ○ Compiler-RT / LibC++ (docs) ● Blogs ○ LLVM Blog ○ LLVM Weekly ○ Planet Clang ○ Eli Bendersky’s excellent blog post: Life of an instruction in LLVM ○ Old and high level, bug good overall post by Chris Lattner ○ Not that old, but great HowTo adding a new back-end ○ libunwind is not easy!
  • 28. Thank You #LAS16 For further information: www.linaro.org LAS16 keynotes and videos on: connect.linaro.org