Compiling to Execution
C.K. Chen 2015.11.18
Compiling Flow
• What happens between compiling the source code and executing the program
– Source code (.c) + header files (.h) → preprocessing (cpp) → preprocessed source (.i)
– Preprocessed source (.i) → compilation (gcc) → assembly (.s)
– Assembly (.s) → assembling (as) → object file (.o)
– Object file (.o) + static library (.a) → linking (ld) → executable (a.out)
Preprocessing
• Preprocessing
– Expand macros and remove the #define directives
– Evaluate conditional directives: #if, #ifdef, #elif, ...
$ gcc -E hello.c -o hello.i
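As a rough illustration of what the preprocessor does, the sketch below uses one directive of each kind. The names BUFFER_SIZE, SQUARE, LOG, VERBOSE and buffer_area are made up for this example and do not come from the slides.

```c
#include <assert.h>
#include <stdio.h>

#define BUFFER_SIZE 16            /* object-like macro: replaced by 16 */
#define SQUARE(x) ((x) * (x))     /* function-like macro: expanded inline */

#ifdef VERBOSE                    /* branch kept only when built with -DVERBOSE */
#define LOG(msg) puts(msg)
#else
#define LOG(msg) ((void)0)        /* otherwise the logging call disappears */
#endif

/* After preprocessing, the return statement reads: return ((16) * (16)); */
int buffer_area(void)
{
    /* assert is itself a macro; it expands to nothing when NDEBUG is defined */
    assert(BUFFER_SIZE > 0);
    LOG("computing area");
    return SQUARE(BUFFER_SIZE);
}
```

Running gcc -E on such a file should show the macros already expanded and only one of the two LOG definitions surviving.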
Compilation
• Translate high-level source code into assembly code
$ gcc -S hello.i -o hello.s
Assembly
• Translate the assembly code into machine code
• The result of assembling is an object file in ELF format
$ gcc -c hello.s -o hello.o
ELF FILE FORMAT
ELF Format
• Executable file format
– Derived from COFF(Common Object File Format)
• Windows : PE (Portable Executable)
• Linux: ELF (Executable and Linkable Format)
– Dynamic Linking Library (DLL)
• Windows (.dll); Linux (.so)
– Static Linking Library
• Windows (.lib); Linux (.a)
– Object file
• Windows (.obj); Linux (.o)
• Uses the same format as an executable file
• Intermediate file between compilation and linking
File Content
• Machine code, data, symbol table, string table
• File header
– Basic file information
• The file is divided into sections
– Code Section (.code, .text)
– Data Section (.data)
– Special Section (.symtab, .strtab)
File Header
• The file header contains the following information
– Whether the file is executable
– Whether it is statically or dynamically linked
– Entry address
– Target hardware / OS
– Location of the section table
File Header Structure
• The structure of the ELF header is defined as Elf64_Ehdr
• e_ident
– The first 4 bytes are '\x7f', 'E', 'L', 'F'
– File signature
• e_type
typedef struct
{
    unsigned char e_ident[16];   /* ELF identification */
    Elf64_Half    e_type;        /* Object file type */
    Elf64_Half    e_machine;     /* Machine type */
    Elf64_Word    e_version;     /* Object file version */
    Elf64_Addr    e_entry;       /* Entry point address */
    Elf64_Off     e_phoff;       /* Program header offset */
    Elf64_Off     e_shoff;       /* Section header offset */
    Elf64_Word    e_flags;       /* Processor-specific flags */
    Elf64_Half    e_ehsize;      /* ELF header size */
    Elf64_Half    e_phentsize;   /* Size of program header entry */
    Elf64_Half    e_phnum;       /* Number of program header entries */
    Elf64_Half    e_shentsize;   /* Size of section header entry */
    Elf64_Half    e_shnum;       /* Number of section header entries */
    Elf64_Half    e_shstrndx;    /* Section name string table index */
} Elf64_Ehdr;
File Header Structure
• e_machine
– 62 for the AMD x86-64 architecture
– 243 for RISC-V (HITCON 2015)
• e_entry
– Entry point address
• e_shoff
– Following this offset leads to the section table
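A minimal sketch of how the signature and e_machine fields might be checked on a raw header buffer. The function and constant names are hypothetical, and the code assumes a little-endian ELF file; a real reader would consult e_ident to learn the byte order first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* e_machine values quoted on the slide. */
enum { MACHINE_X86_64 = 62, MACHINE_RISCV = 243 };

/* Returns 1 if buf starts with the 4-byte signature 0x7f 'E' 'L' 'F'. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    assert(buf != NULL);
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Reads the 16-bit e_machine field, which sits at byte offset 18:
   16 bytes of e_ident followed by 2 bytes of e_type.
   Assumes little-endian encoding. Returns -1 for short or non-ELF input. */
int read_e_machine_le(const unsigned char *buf, size_t len)
{
    if (len < 20 || !has_elf_magic(buf, len))
        return -1;
    return buf[18] | (buf[19] << 8);
}
```

Passing the first bytes of an x86-64 object file to read_e_machine_le would be expected to give 62.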
Sections Table
• The section table stores the array of section headers
– Each entry has type Elf64_Shdr
– Each entry is 64 bytes long
typedef struct
{
    Elf64_Word  sh_name;        /* Section name */
    Elf64_Word  sh_type;        /* Section type */
    Elf64_Xword sh_flags;       /* Section attributes */
    Elf64_Addr  sh_addr;        /* Virtual address in memory */
    Elf64_Off   sh_offset;      /* Offset in file */
    Elf64_Xword sh_size;        /* Size of section */
    Elf64_Word  sh_link;        /* Link to other section */
    Elf64_Word  sh_info;        /* Miscellaneous information */
    Elf64_Xword sh_addralign;   /* Address alignment boundary */
    Elf64_Xword sh_entsize;     /* Size of entries, if section has table */
} Elf64_Shdr;
[Figure labels: e_shoff, .shstrtab, .shstrtab + 0x20]
Section Table
• Traverse the table to find all sections
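Traversing the table comes down to simple offset arithmetic. The sketch below restates the section header layout with fixed-width types and computes where the i-th header starts. SectionHeader64 and section_header_offset are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Same field layout as Elf64_Shdr, spelled with <stdint.h> types. */
typedef struct {
    uint32_t sh_name;        /* Section name (index into the name string table) */
    uint32_t sh_type;        /* Section type */
    uint64_t sh_flags;       /* Section attributes */
    uint64_t sh_addr;        /* Virtual address in memory */
    uint64_t sh_offset;      /* Offset in file */
    uint64_t sh_size;        /* Size of section */
    uint32_t sh_link;        /* Link to other section */
    uint32_t sh_info;        /* Miscellaneous information */
    uint64_t sh_addralign;   /* Address alignment boundary */
    uint64_t sh_entsize;     /* Size of entries, if section has table */
} SectionHeader64;

/* File offset of the section header with the given index:
   start of the table (e_shoff) plus index times the entry size. */
uint64_t section_header_offset(uint64_t e_shoff, uint16_t e_shentsize,
                               uint16_t e_shnum, uint16_t index)
{
    assert(index < e_shnum);   /* only e_shnum headers exist */
    return e_shoff + (uint64_t)index * e_shentsize;
}
```

Looping index from 0 to e_shnum - 1 visits every section header in turn.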
Code Section
• The code section, in most cases .text, holds the program's machine code
• objdump -s
– Display the full contents of all sections
• objdump -d
– Display assembler contents of executable sections
Data Section
• Several sections store the program's data
– .data → initialized global and static variables
– .rodata → constant values used in the program
– .bss → uninitialized variables
Bss section
• The BSS section holds uninitialized data and data filled with zeros
• This section does not occupy space in the ELF file
– But it is given space when the program is loaded into memory
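A small sketch of where typical C definitions usually end up. The variable names are invented for this example, and the exact placement can vary with compiler version and flags, so the comments describe the common case only.

```c
#include <assert.h>

int initialized_global = 42;        /* typically .data: the value 42 is stored in the file */
int zero_global;                    /* typically .bss: only the size is recorded */
static int static_counter = 7;      /* typically .data: initialized static variable */
static int static_scratch[1024];    /* typically .bss: zero-filled in memory, no bytes in the file */
const char banner[] = "hello";      /* typically .rodata: read-only constant */

int sum_of_globals(void)
{
    /* Objects in .bss read as zero once the program is loaded. */
    assert(static_scratch[0] == 0);
    return initialized_global + zero_global + static_counter + static_scratch[1023];
}
```

Compiling this with gcc -c and inspecting the object file with objdump -s is one way to see that the array contributes no bytes to the file.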
Other
• .rela.text, .symtab
– Used in symbol resolution
• .strtab
– Stores the strings (such as symbol names) used in symbol resolution
Summary of ELF format
STATIC LINKING
Static Linking
• Static linking combines several object files into the final executable
$ gcc hello.o -o hello.out
Two-pass Linking
• Linking is done in two passes
• Pass 1: Space & Address Allocation
– Collect the length, attributes and position of every section
– Collect symbols (definitions and references) into a global symbol table
• Pass 2: Symbol Resolution & Relocation
– Apply the relocation entries
Space & Address Allocation
• Defined symbols
– Variables
– Functions
• Symbols receive their virtual addresses only after linking
[Figures: symbol table before linking; symbol table after linking]
Symbol Table
typedef struct elf64_sym {
    Elf64_Word    st_name;    /* Symbol name, index in string tbl */
    unsigned char st_info;    /* Type and binding attributes */
    unsigned char st_other;   /* No defined meaning, 0 */
    Elf64_Half    st_shndx;   /* Associated section index */
    Elf64_Addr    st_value;   /* Value of the symbol */
    Elf64_Xword   st_size;    /* Associated symbol size */
} Elf64_Sym;
Size: 24 bytes
First symbol
.strtab section at 0x00000570
0x570 + 0xe (st_name offset) = 0x57e
The string at 0x57e is the symbol's name: shared
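The lookup above is a single addition, and st_info packs two small fields into one byte. The sketch below shows both; the function names and the strtab_size parameter are hypothetical additions for this example.

```c
#include <assert.h>
#include <stdint.h>

/* st_info keeps the binding in the high 4 bits and the type in the low 4 bits. */
unsigned symbol_binding(unsigned char st_info) { return st_info >> 4; }
unsigned symbol_type(unsigned char st_info)    { return st_info & 0x0f; }

/* Position of a symbol's name: start of .strtab plus the st_name index. */
uint64_t symbol_name_offset(uint64_t strtab_offset, uint32_t st_name,
                            uint64_t strtab_size)
{
    assert(st_name < strtab_size);   /* the index must fall inside the string table */
    return strtab_offset + st_name;
}
```

With the values from the slide, symbol_name_offset(0x570, 0xe, ...) gives 0x57e.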
Symbol Resolution & Relocation
• Resolve each symbol's address in the final executable
– The addresses of external symbols are unknown before linking
– Before linking, a temporary placeholder is stored in the object file
– The linker automatically patches it with the correct address
Relocation Table
• The relocation table records the locations that must be patched with symbol addresses
typedef struct elf64_rela {
    Elf64_Addr   r_offset;
    Elf64_Xword  r_info;
    Elf64_Sxword r_addend;
} Elf64_Rela;
symbol   r_offset                  r_info[1]     r_info[2]     r_addend
shared   14 00 00 00 00 00 00 00   0a 00 00 00   09 00 00 00   00 00 00 00 00 00 00 0
swap     21 00 00 00 00 00 00 00   02 00 00 00   0a 00 00 00   fc ff ff ff ff ff ff ff
0x0a -> R_X86_64_32/R_AMD64_32
0x02 -> R_X86_64_PC32/R_AMD64_PC32
The two types patch the address in different ways
Relocation Type and Patch Calculation
• A: the addend value of the relocatable field
• S: the value of the symbol
• P: the section offset or address of the storage unit being relocated, computed using r_offset
• GOT: the address of the global offset table
https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-54839.html
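To make the A, S and P terms concrete, here is a sketch of how the two relocation types on the previous slide might be evaluated. It assumes the usual x86-64 formulas, S + A for R_X86_64_32 and S + A - P for R_X86_64_PC32, and the usual split of r_info into a symbol index (high 32 bits) and a type (low 32 bits). All function and enum names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Type codes as decoded on the relocation table slide. */
enum { R_TYPE_PC32 = 2, R_TYPE_32 = 10 };

/* r_info holds the symbol index in its high half and the type in its low half. */
uint32_t reloc_symbol_index(uint64_t r_info) { return (uint32_t)(r_info >> 32); }
uint32_t reloc_type(uint64_t r_info)         { return (uint32_t)(r_info & 0xffffffffu); }

/* Computes the 32-bit value written at the patched location.
   S = symbol value, A = addend, P = address of the field being patched. */
uint32_t reloc_value(uint32_t type, uint64_t S, int64_t A, uint64_t P)
{
    assert(type == R_TYPE_32 || type == R_TYPE_PC32);
    if (type == R_TYPE_32)
        return (uint32_t)(S + (uint64_t)A);        /* absolute: S + A */
    return (uint32_t)(S + (uint64_t)A - P);        /* PC-relative: S + A - P */
}
```

The addend of -4 in the swap entry fits the PC-relative case: the processor adds the displacement to the address of the next instruction, which lies 4 bytes past the patched field.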
Program Execution
• After static linking, we have the executable file
• The loader is what makes the program run
– It loads the executable into memory
– It resolves symbols dynamically
Creation of Process
• Create an independent virtual address space (AS)
– Page directory (Linux)
• Read the executable file header and create the mapping between the virtual AS and the executable file
– VMA: Virtual Memory Area
• Set the program counter (PC) to the entry address
– Switch between kernel stack and process stack
– CPU access attributes
Section to Segment Mapping
• Several sections are merged into one segment
– Grouped according to their permissions
• Read
• Write
• Execute
Load Executable into Mem
• The program header table describes the segments
– The program is loaded into memory one segment at a time
– readelf can display the program header table of an ELF file
Advantages and Disadvantages of Static Linking
• Advantages
– Independent development
– Individual modules can be tested separately
• Disadvantages
– Wastes memory and disk space
• Every program carries its own copy of the runtime library (printf, scanf, strlen, ...)
– Modules are difficult to update
• The program must be re-linked and published to users whenever a module is updated
Dynamic Linking
• Delay linking until execution
– Load-time relocation
• Example:
– Program1.o, Program2.o, Lib.o
– Execute Program1 → load Program1.o
– Program1 uses Lib → load Lib.o
– Execute Program2 → load Program2.o
– Program2 uses Lib → Lib.o has already been loaded into physical memory
• Advantages
– Saves space
– Makes modules easier to update
Lazy Binding
• Bind a function (relocation, symbol lookup) the first time it is used
• More efficient
– In most executions, only a small number of functions are called
– There is no need to resolve all the symbols up front
Global Offset Table
• Divided into 2 parts
– .got
• Stores the addresses used to reference global variables
– .got.plt
• Stores the addresses used to reference global functions
.GOT.PLT
• The first 3 items are fixed and have special purposes
– Address of the .dynamic section
– link_map
– dl_runtime_resolve
• The following items are the functions in the shared libraries
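[Figure: layout of .dynamic, .got, .got.plt and .data; the .got.plt slots hold, in order, the address of .dynamic, link_map, dl_resolve, then one entry per library function (print, ...)]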
Lazy binding
[Figure, animated over several slides; its labels are listed once here]
.text
call foo@plt
foo@plt
jmp *(bar@GOT)
push index
jmp PLT0
PLT0
push *(GOT + 4)
jmp *(GOT + 8)
.got.plt
printf
bar@plt+6
foo
…
dl_resolve
call _fix_up
ret 0xc
Note: GOT here means .got.plt
• The code in .text calls the function through its PLT stub (call foo@plt)
• The stub's first instruction, jmp *(bar@GOT), jumps through the function's .got.plt entry
– Because bar has not been called yet, its .got.plt entry holds the address of the next instruction in .plt (bar@plt+6), so it looks as if the jmp never happened
• push index pushes the function's index, then jmp PLT0 enters the common stub
• In PLT0, push *(GOT + 4) pushes link_map
• In PLT0, jmp *(GOT + 8) jumps to dl_runtime_resolve, which runs as dl_runtime_resolve(link_map, index)
• dl_resolve calls _fix_up and finishes with ret 0xc
– Once bar's location in the library has been found, it is written back into .got.plt
• Execution returns to bar
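The patch-on-first-call idea can be imitated in plain C with a function pointer standing in for the .got.plt slot. This is only a simulation under that assumption: the real mechanism works on machine-code stubs and the dynamic linker, and every name below is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_ptr)(int);

static int resolve_count;            /* how many times the resolver has run */
static int bar_stub(int x);          /* plays the role of "push index; jmp PLT0" */

/* Simulated .got.plt slot for bar: it starts out pointing back at the stub. */
static func_ptr got_plt_bar = bar_stub;

/* The real bar, as if it lived in a shared library. */
static int real_bar(int x) { return x + 1; }

/* Stand-in for dl_runtime_resolve: find the symbol and patch the slot. */
static func_ptr resolve_bar(void)
{
    resolve_count++;
    got_plt_bar = real_bar;
    return got_plt_bar;
}

/* Reached only while the slot is unresolved. */
static int bar_stub(int x)
{
    func_ptr target = resolve_bar();
    assert(target != NULL);
    return target(x);                /* continue into bar after resolving */
}

/* Simulated bar@plt: always jumps through the .got.plt slot. */
int call_bar_via_plt(int x) { return got_plt_bar(x); }

/* Lets a caller observe how often resolution happened. */
int resolver_runs(void) { return resolve_count; }
```

The first call to call_bar_via_plt goes through the stub and the resolver; later calls reach real_bar directly, so the resolver count stays at one.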
Program Memory Layout
• Flat memory model
– Default regions:
• stack
• heap
• mapping of executable file
• dynamic libraries
• Stack frame (activation record)
– Return address, arguments
– Temporary variables
– Context
• Frame pointer (ebp on i386)
• Stack pointer (esp on i386)
Calling Convention
• Consistency between caller and callee
• Argument passing order and method
– Stack, registers (eax holds the return value on i386)
• Stack maintenance
– Keep the stack consistent before and after a function call
– Responsibility of either the caller or the callee
• Name mangling
• The default calling convention in the C language is “cdecl”
Calling Convention Example
LD_PRELOAD
• Ordinarily the dynamic linker loads shared libs in whatever order it needs them
• $LD_PRELOAD is an environment variable containing a colon (or space) separated list of libraries that the dynamic linker loads before any others
LD_PRELOAD
• Preloading a library means that its functions will be used before others of the same name in later libraries
• Allows functions to be overridden, replaced, or intercepted
• Program behaviour can be modified “non-invasively”
– i.e. no recompile or relink necessary
– Especially useful for closed-source programs
– And when the modifications don’t belong in the program or the library
Example
• We want to intercept
Implement Shared Lib
• Write our own version of the function
• Compile it into a shared library
$ gcc -Wall -fpic -shared -o libmylib.so mylibc.c
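The function being intercepted is not recoverable from this transcript, so the sketch below picks libc's rand() purely as an assumption. It shows the general shape such a mylibc.c could take: a definition with the same name and signature as the library function.

```c
/* mylibc.c: a possible body for the file named in the gcc command above */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical replacement values; a real interposer would choose its own. */
static const int fixed_values[] = { 4, 8, 15 };
static size_t next_index;

/* Same name and signature as libc's rand(), so callers bind to this
   definition when the library is listed in LD_PRELOAD. */
int rand(void)
{
    int value = fixed_values[next_index];
    next_index = (next_index + 1) % (sizeof fixed_values / sizeof fixed_values[0]);
    assert(value >= 0 && value <= RAND_MAX);   /* stay within rand()'s documented range */
    return value;
}
```

Once built, a program could be started as LD_PRELOAD=./libmylib.so ./program (the program name is a placeholder), and its calls to rand() would then cycle through the fixed values instead of producing pseudo-random numbers.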
Result
System Call
• User processes cannot perform privileged operations themselves
• They must ask the OS to do so on their behalf by issuing system calls
• A system call switches the CPU into privileged (kernel) mode for the duration of the request
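As a small illustration, the sketch below issues a system call directly through the syscall() wrapper that glibc provides on Linux, instead of going through the usual library function. Choosing getpid for the demonstration is an assumption made here, as is the name raw_getpid.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Asks the kernel for the process ID directly, bypassing the getpid() wrapper. */
long raw_getpid(void)
{
    long pid = syscall(SYS_getpid);   /* traps into the kernel */
    assert(pid > 0);                  /* getpid is documented as always successful */
    return pid;
}
```

The value returned should match what getpid() reports, since both end up in the same kernel service.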
strace
• The strace command traces system calls on Linux
• Output is printed for each system call as it is executed, including parameters and return codes
• strace is implemented with the ptrace() system call
– ptrace() is also used by debuggers (breakpoints, single-stepping, etc.)
– It may also be used for anti-debugging
– How to solve?
Summary
• ELF file format
• Section
• Static Link
• Dynamic Link and Lazy Binding
• LD_PRELOAD
• strace

Compilation and Execution

  • 1. Compiling to Execution C.K. Chen 2015.11.18
  • 2. Compiling Flow • What happened from compiling the source code to executing Source Code(.c) Header Files(.h) Preprocessing cpp Preprocessed (.i) Compilation gcc Static Library (.a) Object File (.o) Linking (ld) Assemble as Executable a.out Assembly (.s)
  • 3. Preproccessing • Preproccessing – Extend and remove #define – #if, #ifdef, #elif, ….. $gcc -E hello.c -o hello.i
  • 4. Compilation • Translate high-level source code into assembly code $gcc -S hello.i -o hello.s
  • 5. Assembly • Assemble assembly into machine code • After assembly, we will have the ELF format file $gcc -c hello.s -o hello.o
  • 7. ELF Format • Executable file format – Derived from COFF(Common Object File Format) • Windows : PE (Portable Executable) • Linux: ELF (Executable Linkable Format) – Dynamic Linking Library (DLL) • Windows (.dll); Linux (.so) – Static Linking Library • Windows (.lib); Linux (.a) – Object file • Windows (.obj); Linux (.o) • Like executable file format • Intermediate file between compilation and linking
  • 8. File Content • Machine code, data, symbol table, string table • File header – Basic file information • File divided by sections – Code Section (.code, .text) – Data Section (.data) – Special Section (.symtab, .strtab)
  • 9. File Header • File header contains following information – Is executable – Static Link or Dynamic Link – Entry address – Target hardware / OS – Section Table
  • 10. File Header Structure • The structure of ELF header is defined as Elf_Ehdr • e_ident – The first 4 byte is ‘x7f’, ‘E’,’L’,’F’ – File signature • e_type typedef struct { unsigned char e_ident[16]; /* ELF identification */ Elf64_Half e_type; /* Object file type */ Elf64_Half e_machine; /* Machine type */ Elf64_Word e_version; /* Object file version */ Elf64_Addr e_entry; /* Entry point address */ Elf64_Off e_phoff; /* Program header offset */ Elf64_Off e_shoff; /* Section header offset */ Elf64_Word e_flags; /* Processor-specific flags */ Elf64_Half e_ehsize; /* ELF header size */ Elf64_Half e_phentsize; /* Size of program header entry */ Elf64_Half e_phnum; /* Number of program header entries */ Elf64_Half e_shentsize; /* Size of section header entry */ Elf64_Half e_shnum; /* Number of section header entries */ Elf64_Half e_shstrndx; /* Section name string table index */ } Elf64_Ehdr;
  • 11. File Header Structure • e_machine – 62 for AMD x86-64 architecture – 243 for RISC-V (HITCON 2015) • e_entry • e_shoff – Follow this member can find the section table typedef struct { unsigned char e_ident[16]; /* ELF identification */ Elf64_Half e_type; /* Object file type */ Elf64_Half e_machine; /* Machine type */ Elf64_Word e_version; /* Object file version */ Elf64_Addr e_entry; /* Entry point address */ Elf64_Off e_phoff; /* Program header offset */ Elf64_Off e_shoff; /* Section header offset */ Elf64_Word e_flags; /* Processor-specific flags */ Elf64_Half e_ehsize; /* ELF header size */ Elf64_Half e_phentsize; /* Size of program header entry */ Elf64_Half e_phnum; /* Number of program header entries */ Elf64_Half e_shentsize; /* Size of section header entry */ Elf64_Half e_shnum; /* Number of section header entries */ Elf64_Half e_shstrndx; /* Section name string table index */ } Elf64_Ehdr;
  • 12. Section Table • The section table stores an array of section headers – Each header has type Elf64_Shdr – Each element is 64 bytes
    typedef struct {
        Elf64_Word sh_name;       /* Section name */
        Elf64_Word sh_type;       /* Section type */
        Elf64_Xword sh_flags;     /* Section attributes */
        Elf64_Addr sh_addr;       /* Virtual address in memory */
        Elf64_Off sh_offset;      /* Offset in file */
        Elf64_Xword sh_size;      /* Size of section */
        Elf64_Word sh_link;       /* Link to other section */
        Elf64_Word sh_info;       /* Miscellaneous information */
        Elf64_Xword sh_addralign; /* Address alignment boundary */
        Elf64_Xword sh_entsize;   /* Size of entries, if section has table */
    } Elf64_Shdr;
    [Figure annotations: e_shoff, .sh.strtab, .sh.strtab + 0x20]
  • 13. Section Table • Traverse the table to find all sections
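The traversal can be sketched as follows: start at `e_shoff`, step through `e_shnum` entries, and resolve each `sh_name` against the name string table selected by `e_shstrndx`. This is an illustrative sketch, not the slides' code. `find_section` is a hypothetical helper, the image is assumed to be a host-endian 64-bit ELF already in memory, and extended section numbering is ignored.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the section named `wanted`, or -1 if it is not
 * found or the section table does not fit inside the buffer. */
int find_section(const unsigned char *img, size_t len, const char *wanted)
{
    Elf64_Ehdr eh;
    Elf64_Shdr names, sh;

    if (len < sizeof eh) return -1;
    memcpy(&eh, img, sizeof eh);
    if (eh.e_shentsize != sizeof sh || eh.e_shstrndx >= eh.e_shnum) return -1;
    if (eh.e_shoff > len || (len - eh.e_shoff) / sizeof sh < eh.e_shnum) return -1;

    /* e_shstrndx selects the section that stores the section names. */
    memcpy(&names, img + eh.e_shoff + eh.e_shstrndx * sizeof sh, sizeof sh);
    if (names.sh_size == 0 || names.sh_offset > len ||
        names.sh_size > len - names.sh_offset ||
        img[names.sh_offset + names.sh_size - 1] != '\0')
        return -1;

    /* Walk the array of section headers that starts at e_shoff. */
    for (unsigned i = 0; i < eh.e_shnum; i++) {
        memcpy(&sh, img + eh.e_shoff + i * sizeof sh, sizeof sh);
        if (sh.sh_name < names.sh_size &&
            strcmp((const char *)img + names.sh_offset + sh.sh_name, wanted) == 0)
            return (int)i;
    }
    return -1;
}
```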
  • 14. Code Section • The code section, in most cases .text, holds the program's machine code • objdump -s – Display the full contents of all sections • objdump -d – Display assembler contents of executable sections
  • 15. Data Section • Several sections store the program's data – .data → initialized global and static variables – .rodata → read-only constants used by the program – .bss → uninitialized variables
  • 16. BSS Section • The BSS section holds data that is uninitialized or initialized to zero • This section does not occupy space in the ELF file – But it is allocated space when the program is loaded into memory
  • 17. Other • .rela.text, .symtab – Used in symbol resolution • .strtab – Stores the strings used for symbol names
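As a rough illustration of where variables usually end up, the sketch below annotates a few declarations with the section a typical gcc build on Linux would place them in. The variable and function names are invented, and the exact placement can vary with compiler and flags.

```c
int initialized_global = 42;        /* .data: initial value stored in the file */
static int initialized_static = 7;  /* .data */
const char banner[] = "hello";      /* .rodata: read-only constant */
int zeroed_buffer[4096];            /* .bss: no bytes stored in the file */
static long call_count;             /* .bss: zero when the program starts */

/* Small accessors so the variables are actually used. */
long bump_call_count(void) { return ++call_count; }
int sum_initialized(void) { return initialized_global + initialized_static; }
```

Compiling this with `gcc -c` and inspecting the object with `objdump -h` should show a `.bss` size of roughly 16 KB for `zeroed_buffer` while the object file itself stays small.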
  • 18. Summary of ELF format
  • 20. Static Linking • The static linker combines several object files into the final executable $gcc hello.o -o hello.out
  • 21. Two-pass Linking • Pass 1: Space & Address Allocation – Gather each section's length, attributes and position – Collect symbols (definitions and references) into a global table • Pass 2: Symbol Resolution & Relocation – Apply the relocation entries
  • 22. Space & Address Allocation • Defined symbols – Variables – Functions • Symbols receive their virtual addresses only after linking [Figure: symbol table before linking vs. symbol table after linking]
  • 23. Symbol Table
    typedef struct elf64_sym {
        Elf64_Word st_name;     /* Symbol name, index in string tbl */
        unsigned char st_info;  /* Type and binding attributes */
        unsigned char st_other; /* No defined meaning, 0 */
        Elf64_Half st_shndx;    /* Associated section index */
        Elf64_Addr st_value;    /* Value of the symbol */
        Elf64_Xword st_size;    /* Associated symbol size */
    } Elf64_Sym;
    • Size: 24 bytes • First symbol: the .strtab section is at 0x00000570, so 0x570 + 0xe (st_name offset) = 0x57e • This symbol is named shared
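The name lookup shown on this slide (string table base plus `st_name`) can be written as a small helper. This is a sketch assuming `<elf.h>` from a Linux system; `symbol_name` and `is_global_object` are hypothetical names.

```c
#include <elf.h>
#include <stddef.h>

/* Returns the symbol's name, i.e. strtab + st_name, or NULL if the
 * offset lies outside the string table. */
const char *symbol_name(const Elf64_Sym *sym, const char *strtab, size_t strtab_size)
{
    if (sym->st_name >= strtab_size)
        return NULL;
    return strtab + sym->st_name;
}

/* st_info packs the binding (high 4 bits) and the type (low 4 bits). */
int is_global_object(const Elf64_Sym *sym)
{
    return ELF64_ST_BIND(sym->st_info) == STB_GLOBAL &&
           ELF64_ST_TYPE(sym->st_info) == STT_OBJECT;
}
```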
  • 24. Symbol Resolution & Relocation • Resolve each symbol's address in the final executable – Addresses of external symbols are unknown before linking – Before linking, object files hold a temporary placeholder address – The linker automatically patches it with the correct address
  • 25. Relocation Table • The relocation table records the locations that must be patched with a symbol's address
    typedef struct elf64_rela {
        Elf64_Addr r_offset;
        Elf64_Xword r_info;
        Elf64_Sxword r_addend;
    } Elf64_Rela;
    symbol   r_offset                 r_info[1]    r_info[2]    r_addend
    shared   14 00 00 00 00 00 00 00  0a 00 00 00  09 00 00 00  00 00 00 00 00 00 00 00
    swap     21 00 00 00 00 00 00 00  02 00 00 00  0a 00 00 00  fc ff ff ff ff ff ff ff
    • 0x0a -> R_X86_64_32 / R_AMD64_32 • 0x02 -> R_X86_64_PC32 / R_AMD64_PC32 • The two types patch the address in different ways
  • 26. Relocation Type and Patch Calculation • A - The addend value of the relocatable field • S - The value of the symbol • P - The section offset or address of the storage unit being relocated, computed using r_offset • GOT - The address of the global offset table • https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-54839.html
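For the two relocation types that appear in the table, the patch value follows the usual x86-64 formulas: S + A for R_X86_64_32 and S + A - P for R_X86_64_PC32. The sketch below assumes `<elf.h>` from Linux; `relocation_value` is a hypothetical helper, and it skips the overflow checks a real linker performs.

```c
#include <elf.h>
#include <stdint.h>

/* Computes the 32-bit value to store at the relocated location.
 * S = symbol value, A = r_addend, P = address of the field being patched.
 * Returns 0 on success, -1 for relocation types this sketch ignores. */
int relocation_value(const Elf64_Rela *rel, uint64_t S, uint64_t P, uint32_t *out)
{
    uint64_t A = (uint64_t)rel->r_addend;

    switch (ELF64_R_TYPE(rel->r_info)) {
    case R_X86_64_32:    /* absolute address: S + A */
        *out = (uint32_t)(S + A);
        return 0;
    case R_X86_64_PC32:  /* PC-relative: S + A - P */
        *out = (uint32_t)(S + A - P);
        return 0;
    default:
        return -1;
    }
}
```

With the `swap` entry from the table (addend -4), the result is the distance from the end of the 4-byte field to the symbol, which is what a relative `call` instruction expects.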
  • 27. Program Execution • After static linking, we have the executable file • The loader is what makes the program run – Loads the executable into memory – Resolves symbols dynamically
  • 28. Creation of Process • Create an independent virtual address space (AS) – Page directory (Linux) • Read the executable file header and create the mapping between the virtual AS and the executable file – VMA, Virtual Memory Area • Set the program counter (PC) to the entry address – Switch between kernel stack and process stack – CPU access attributes
  • 29. Section to Segment Mapping • Several sections are merged into one segment – Grouped by their permissions • Read • Write • Execute
  • 30. Load Executable into Memory • The program header table describes the segments – The program is loaded into memory segment by segment – readelf -l displays the program header table of an ELF file
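Each program header entry carries the segment's type and permission flags. The sketch below decodes them in the style of `readelf` output; it assumes `<elf.h>` from Linux, and the helper names are invented for illustration.

```c
#include <elf.h>

/* Writes an "rwx"-style permission string for p_flags into out[4]. */
void segment_perms(Elf64_Word p_flags, char out[4])
{
    out[0] = (p_flags & PF_R) ? 'r' : '-';
    out[1] = (p_flags & PF_W) ? 'w' : '-';
    out[2] = (p_flags & PF_X) ? 'x' : '-';
    out[3] = '\0';
}

/* Only PT_LOAD segments are mapped into memory by the loader. */
int is_loadable(const Elf64_Phdr *ph)
{
    return ph->p_type == PT_LOAD;
}
```

A typical code segment would decode to `r-x` and a data segment to `rw-`, which is why sections with the same permissions are grouped together.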
  • 31. Advantages and Disadvantages of Static Linking • Advantages – Independent development – Modules can be tested individually • Disadvantages – Wastes memory and disk space • Every program carries its own copy of the runtime library (printf, scanf, strlen, ...) • Modules are difficult to update – The program must be re-linked and redistributed to users whenever a module is updated
  • 32. Dynamic Linking • Delay linking until execution – Load-time relocation • Example: – Program1.o, Program2.o, Lib.o – Execute Program1 → load Program1.o – Program1 uses Lib → load Lib.o – Execute Program2 → load Program2.o – Program2 uses Lib → Lib.o is already loaded in physical memory • Advantages – Saves space – Modules are easier to update
  • 33. Lazy Binding • Bind a function (relocation, symbol lookup) the first time it is used • More efficient – In most executions, only a small number of functions are called – There is no need to resolve all symbols up front
  • 34. Global Offset Table • Divided into 2 parts – .got • Stores the addresses of referenced global variables – .got.plt • Stores the addresses of referenced global functions
  • 35. .got.plt • The first 3 entries are fixed and have special purposes – Address of the .dynamic section – link_map – dl_runtime_resolve • The following entries are functions in shared libraries [Figure: layout of .dynamic, .got, .got.plt and .data; the .got.plt entries are the address of .dynamic, link_map, dl_resolve, then function entries (print, print)]
  • 36-42. Lazy Binding (animated diagram; GOT here means .got.plt) • .text: call foo@plt • foo@plt: jmp *(bar@GOT); push index; jmp PLT0 • PLT0: push *(GOT + 4); jmp *(GOT + 8) • .got.plt entries: printf, bar@plt+6, foo • Because bar has not been called yet, the value stored for bar in .got.plt is the address of the next instruction in .plt, so the jmp appears to have had no effect • PLT0 then pushes link_map and jumps to dl_runtime_resolve, which amounts to calling dl_runtime_resolve(link_map, index)
  • 44-45. Lazy Binding (continued) • .got.plt entries: printf, foo@plt+6, bar • dl_resolve: ... call _fix_up ... ret 0xc • After bar's location in the library is found, it is written back into .got.plt • dl_resolve then returns to bar
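The mechanism in these diagrams can be imitated in plain C with a function pointer that initially points at a resolver stub. This is only a simulation of the idea, not how the dynamic linker is implemented; every name in it (`got_bar`, `bar_resolver_stub`, `call_bar`, and so on) is invented for illustration.

```c
typedef int (*func_t)(int);

/* Stands in for the real function inside a shared library. */
static int real_bar(int x) { return x * 2; }

static int bar_resolver_stub(int x);
static func_t got_bar = bar_resolver_stub;  /* slot starts at the stub */
static int resolve_count;                   /* how often resolution ran */

/* Plays the role of PLT0 plus dl_runtime_resolve: look up the target,
 * write it back into the slot, then continue into the real function. */
static int bar_resolver_stub(int x)
{
    resolve_count++;
    got_bar = real_bar;
    return real_bar(x);
}

/* Plays the role of bar@plt: an indirect jump through the slot. */
int call_bar(int x) { return got_bar(x); }

int times_resolved(void) { return resolve_count; }
```

The first `call_bar` goes through the stub and patches the slot; later calls jump straight to `real_bar`, so `times_resolved()` stays at 1.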
  • 46. Program Memory Layout • Flat memory model – Default regions: • stack • heap • mapping of executable file • dynamic libraries
  • 47. Stack Frame (Activation Record) – Return address, arguments – Temporary variables – Saved context • Frame pointer (ebp on i386) • Stack pointer (esp on i386)
  • 48. Calling Convention • Consistency between caller and callee • Argument passing order and method – Stack, registers (eax holds the return value on i386) • Stack maintenance – Keep the stack consistent before and after a function call – Responsibility of the caller or the callee • Name mangling • The default calling convention for C on i386 is “cdecl”
  • 50. LD_PRELOAD • Ordinarily the dynamic linker loads shared libraries in whatever order it needs them • $LD_PRELOAD is an environment variable containing a colon- (or space-) separated list of libraries that the dynamic linker loads before any others
  • 51. LD_PRELOAD • Preloading a library means that its functions are used in preference to functions of the same name in later libraries • Allows functions to be overridden/replaced/intercepted • Program behaviour can be modified “non-invasively” – i.e. no recompile/relink necessary – Especially useful for closed-source programs – And when the modifications don’t belong in the program or the library
  • 52. Example • We want to intercept
  • 53. Implement Shared Lib • Write our own function • Compile it into a shared library gcc -Wall -fpic -shared -o libmylib.so mylibc.c
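The transcript does not show which function the slides intercept, so the sketch below uses `rand()` purely as an assumed example. Saved as `mylibc.c` and built with the gcc command above, it yields a library whose `rand` takes precedence over the C library's when preloaded.

```c
#include <stdlib.h>

/* Hypothetical interposer: replaces the C library's rand() so that
 * every "random" number the target program sees is the same value. */
int rand(void)
{
    return 42;
}
```

Running a program as `LD_PRELOAD=./libmylib.so ./target` (where `./target` is a placeholder for any dynamically linked program that calls `rand`) should then make it receive 42 on every call. An interposer that also wants to call the original function would typically look it up with `dlsym(RTLD_NEXT, ...)`.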
  • 55. System Call • User processes cannot perform privileged operations themselves • They must request the OS to do so on their behalf by issuing system calls • During a system call, the kernel runs with elevated privilege on behalf of the user process
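As a small illustration, a system call can be issued directly with the `syscall()` function that Linux's C library provides, bypassing the usual wrapper. This sketch assumes Linux; `raw_getpid` is a made-up name.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Asks the kernel for the process ID by issuing the getpid system call
 * directly instead of going through the getpid() library wrapper. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Running a program that calls this under `strace` should show a `getpid()` line in the trace.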
  • 56. Strace • Tracing system calls in Linux – strace command • Output is printed for each system call as it is executed, including parameters and return codes • The ptrace() system call is used to implement strace – Also used by debuggers (breakpoints, single-stepping, etc.) – Can also be used for anti-debugging – How to defeat it?
  • 57. Summary • ELF file format • Section • Static Link • Dynamic Link and Lazy Binding • LD_Preload • strace