Secure Architectures and Systems - Exploiting Vulnerabilities Part 1: Example of Attacks

class: center, middle

### Secure Computer Architecture and Systems
***
# Exploiting Vulnerabilities Part 1: Example of Attacks

???

- Hi everyone
- In this video we are going to see how the memory safety violations we discussed can represent security vulnerabilities that attackers can exploit

---
# Memory Unsafety in C/C++

- C/C++ are not memory safe
- Programmer may introduce memory errors, **hard to detect and to debug**
  - No warning/error at compile-time
  - Undefined behaviour: symptoms *may* manifest at runtime... or not

???

- As we have seen, C and C++ are not memory safe
- Programming mistakes may introduce memory errors and other bugs that are hard to detect and debug
- When these bugs are present there is often no error or warning at compile time, and the issue may also stay completely silent at runtime

.large[.center[**⚠️ These bugs have huge security implications! ⚠️**]]

Exploiting these bugs an attacker can break CIA:
- Leak/tamper with sensitive data (e.g. passwords, crypto keys)
- Escalate privileges
- Take over the entire program (have it execute arbitrary code)
- Trigger various forms of denial of service

???

- Beyond leading to program crashes or misbehaviour, these bugs can also represent security vulnerabilities
- That can be exploited by attackers to break all aspects of the confidentiality/integrity/availability triad
- Exploiting these bugs lets attackers leak and tamper with sensitive data, escalate privileges, take over the execution flow of programs, and disturb or crash applications and systems.

---
<div style="text-align:center"><img src="include/motivation.png" width=830 /></div>

.small[Sources: https://zd.net/3q8axgo, https://zd.net/3wfFHU7]

???

- This is a big problem
- In fact, a few years ago both Microsoft and Google reported that about 70% of their security bugs were due to memory safety violations

---
class: inverse, middle, center

# Example 1: Infoleak

???

- Let's have a look at a first example of a vulnerable program

---
# Example 1: Infoleak

Assume the following scenario:
- A security-sensitive program is distributed in binary-only form
- It contains sensitive data: a password
- An attacker has access to the binary only and aims to figure out the password

???

- We assume the following scenario
- We have a program that is distributed in binary-only form (that's how most proprietary Windows applications are shipped)
- It contains some sensitive data: a password
- An attacker has access to the binary only, not the source, and aims to figure out the password

---
name: infoleak
# Example 1: Infoleak

Original code:
```c
char *welcome_message = "Hi there! How is it going?\n"; // 27 char
char *password = "secret";
char entered_password[128];

int main(int argc, char **argv) {
    for(int i=0; i<27; i++)  // Print welcome message character by character
        printf("%c", welcome_message[i]);

    printf("Please input the password:\n");
    scanf("%s", entered_password);

    if(!strcmp(entered_password, password)) { printf("Password ok!\n"); /* ... */ }
    else { printf("Wrong password! aborting\n"); }

    return 0;
}
```
.codelink[<a href="src/infoleak-orig.c" download>`08-exploiting-vulnerabilities-1/infoleak-orig.c`</a>  <a href="https://github.com/olivierpierre/comp60261-devcontainer" target="_blank" style="text-decoration: none"><img src="include/gh-logo.svg" style="height: 1em"></a>]

???

- This is the code of the program
- It prints a welcome message that says "Hi there! how is it going"
- The message is printed character by character, it's a bit silly but we need that for the sake of the demonstration
- Then it prompts the user for the password
- If the password is correct it goes on to execute more code, and if not it prints an error message and exits
- You may already notice a very bad security practice: the password is hardcoded in clear text in the binary

---
# Example 1: Infoleak

Updated code (`welcome_message` shortened):
```c
*char *welcome_message = "Hi there!\n"; // shortened welcome message, only 10 chars now
char *password = "secret";
char entered_password[128];

int main(int argc, char **argv) {
    for(int i=0; i<27; i++)
        printf("%c", welcome_message[i]);

    printf("Please input the password:\n");
    scanf("%s", entered_password);

    if(!strcmp(entered_password, password)) { printf("Password ok!\n"); /* ... */ }
    else { printf("Wrong password! aborting\n"); }

    return 0;
}
```
.codelink[<a href="src/infoleak-updated.c" download>`08-exploiting-vulnerabilities-1/infoleak-updated.c`</a>  <a href="https://github.com/olivierpierre/comp60261-devcontainer" target="_blank" style="text-decoration: none"><img src="include/gh-logo.svg" style="height: 1em"></a>]

???

- Now imagine that the company making the program updates the code and shortens the welcome message
- But the programmer forgot to update the number of iterations of the loop printing that welcome message character by character
- On a large and complex code base, that's something that could happen

---
name: infoleak
# Example 1: Infoleak

Updated code (`welcome_message` shortened):
```c
*char *welcome_message = "Hi there!\n"; // shortened welcome message, only 10 chars now
char *password = "secret";
char entered_password[128];

int main(int argc, char **argv) {
*   for(int i=0; i<27; i++)  // Oopsie! forgot to update that bit of the code
        printf("%c", welcome_message[i]);

    printf("Please input the password:\n");
    scanf("%s", entered_password);

    if(!strcmp(entered_password, password)) { printf("Password ok!\n"); /* ... */ }
    else { printf("Wrong password! aborting\n"); }
```

???

- So now the printing loop is going to read past the end of the welcome message and print on the standard output whatever is located right after it in memory

---
template: infoleak
<div style="text-align:center"><img src="include/overflow1-1.svg" width=600 /></div>

???

- What's unfortunate here is that, based on how the compiler lays out global variables in memory, it's very likely that the password is located just past the welcome message

---
template: infoleak
<div style="text-align:center"><img src="include/overflow1-2.svg" width=600 /></div>

???

- And when the print loop overflows the welcome message

---
template: infoleak
<div style="text-align:center"><img src="include/overflow1-3.svg" width=600 /></div>

???

-  it's going to leak the value of the password to the attacker on the standard output

---
class: inverse, middle, center

# Example 2: Sensitive Data Tampering

???

- Now let's see a second example in which the attacker takes a more active role

---
# Example 2: Sensitive Data Tampering

Assume the following scenario:

- Same type of program performing a password check, distributed as binary only
- Attacker aims to bypass the password check

???

- We assume a scenario with a similar program performing a password check, distributed as a binary only so the attacker does not have access to the sources
- The attacker does not know the password and wants to pass the password check

---
name: tampering
# Example 2: Sensitive Data Tampering

```c
char user_input[32] = "00000000000";
char password[32] = "secret";

int main(int argc, char **argv) {
    if(argc != 2) { printf("Usage: %s <password>\n", argv[0]); return 0; }

    strcpy(user_input, argv[1]);
    if(!strncmp(password, user_input, strlen(password))) {
        printf("login success!\n");
        /* do important stuff  ... */
    } else {
        printf("wrong password!\n");
    }

    return 0;
}
```
.codelink[<a href="src/tampering.c" download>`08-exploiting-vulnerabilities-1/tampering.c`</a>  <a href="https://github.com/olivierpierre/comp60261-devcontainer" target="_blank" style="text-decoration: none"><img src="include/gh-logo.svg" style="height: 1em"></a>]

???

- This is our vulnerable program
- This time the user passes the password attempt as a command line argument to the program
- That password attempt is copied into the user_input buffer with strcpy
- The content of that buffer is compared with strncmp to the correct password, and if they match the authentication succeeds
- So can you spot the vulnerability in this program?
- Look at the strcpy: as we saw previously, that function copies the entirety of the source string regardless of its size
- So if the user passes as password attempt a string that is larger than the destination buffer, whose size is 32 bytes, strcpy will overflow user_input and start writing past that buffer in memory

---
template: tampering
<div style="text-align:center"><img src="include/overflow2-1.svg" width=800 /></div>

???

- Same as for the previous program, because of how variables are declared, it is likely that the compiler will place the correct password right after the user_input buffer

---
template: tampering
<div style="text-align:center"><img src="include/overflow2-2.svg" width=800 /></div>

???

- So the attacker has the ability to overwrite the correct password, by passing a string that is long enough

---
template: tampering
<div style="text-align:center"><img src="include/overflow2-3.svg" width=800 /></div>

???

- By making sure that what is now the correct password matches what is in user_input

---
template: tampering
<div style="text-align:center"><img src="include/overflow2-4.svg" width=800 /></div>

???

- The authentication will succeed

---
class: inverse, center, middle

# Example 3: Stack Smashing

???

- Let us see a third example
- This time of a classic attack called stack smashing

---
# Example 3: Stack Smashing

- Classic control flow hijacking attack:
  - Attacker has external (e.g. command line) access to the program
  - Attacker exploits a bug to make the program execute code it's not supposed to
- Originally proposed in 1996 here: http://www.phrack.org/archives/issues/49/14.txt
- First let's see how the CPU manages function calls/returns

???

- Stack smashing lets the attacker divert the normal control flow of a program
- That means the attacker can have the executed code follow paths that were not intended by the programmer
- This attack was first described in 1996 in an article that has become quite famous since then
- To understand stack smashing, we first need to understand how function calls and returns are implemented at the machine code level

---
# Example 3: Stack Smashing
<div style="text-align:center"><img src="include/stack-smashing-0.svg" width=900 /></div>

???

- Assume we have the following scenario, a C program with a function `f` calling another function `g`

---
# Example 3: Stack Smashing
<div style="text-align:center"><img src="include/stack-smashing-1.svg" width=900 /></div>

???

- In memory, when f runs, there is an area of the address space that holds f's local variables and parameters
- It is called the stack

---
# Example 3: Stack Smashing
<div style="text-align:center"><img src="include/stack-smashing-2.svg" width=900 /></div>

???

- When f calls g, this is achieved at the level of the machine code with a callq instruction on x86-64
- Callq pushes on the stack what is called the return address
- It is the address, in the code segment of the program, of the instruction that should be executed next when we return from the call to g
- The push goes in the direction of lower addresses because, by convention, the stack grows down on x86-64

---
# Example 3: Stack Smashing
<div style="text-align:center"><img src="include/stack-smashing-3.svg" width=900 /></div>

???

- After the call g starts to run, and there is some space allocated on the stack for its own local variables and parameters

---
# Example 3: Stack Smashing
<div style="text-align:center"><img src="include/stack-smashing-4.svg" width=900 /></div>

???

- When g returns, the CPU executes a RET instruction
- RET will pop the return address from the stack and jump to it
- This way the execution in f resumes right after the call to g

---
# Example 3: Stack Smashing

```c
char *password = "secret";

void security_critical_function() { printf("launching nukes!!\n"); }

void preprocess_input(char *string) {
    char local_buffer[16];
    strcpy(local_buffer, string);
    /* work on local buffer ... */
    return;
}

int main(int argc, char **argv) {
    if (argc != 2) { printf("usage: %s <password>\n", argv[0]); return -1; }

    preprocess_input(argv[1]);

    if(!strncmp(password, argv[1], strlen(password)))
        security_critical_function();
    else
        printf("Unauthorized user!\n");

    return 0;
}
```
.codelink[<a href="src/stack-smashing.c" download>`08-exploiting-vulnerabilities-1/stack-smashing.c`</a>  <a href="https://github.com/olivierpierre/comp60261-devcontainer" target="_blank" style="text-decoration: none"><img src="include/gh-logo.svg" style="height: 1em"></a>]

???

- Let's now take a look at our vulnerable program
- It's the same type of password checking application we have seen previously
- It takes the password attempt from the command line, and before performing the check it passes it to a function named preprocess_input
- preprocess_input copies the attempt into a local_buffer before working on it
- In the code you can also see the function executed when the authentication succeeds, it's called security_critical_function
- Here the goal of the attacker is going to be to run this function without going through the password check
- As you can see the strcpy in preprocess_input takes as source something coming from the command line, so similarly to the previous example we have the capacity to overflow local_buffer
- How can we exploit that to bypass the password check?

---
name: smashing
# Example 3: Stack Smashing

.leftcol[
```c
char *password = "secret";

void security_critical_function() {
    printf("launching nukes!!\n");
}

void preprocess_input(char *string) {
    char local_buffer[16];
    strcpy(local_buffer, string);
    /* work on local buffer ... */
}

int main(int argc, char **argv) {
    if (argc != 2) { /* ... */ }

*   preprocess_input(argv[1]);

    if(!strncmp(password, argv[1],
            strlen(password)))
        security_critical_function();
    else printf("Unauthorised user!\n");
    return 0;
}
```
.codelink[<a href="src/stack-smashing.c" download>`08-exploiting-vulnerabilities-1/stack-smashing.c`</a>  <a href="https://github.com/olivierpierre/comp60261-devcontainer" target="_blank" style="text-decoration: none"><img src="include/gh-logo.svg" style="height: 1em"></a>]

]

???

- So here on the left what I have is just a condensed version of the vulnerable program

---
template: smashing
.rightcol[
<div style="text-align:center"><img src="include/stack-smashing2-1.svg" width=400 /></div>
]

???

- When preprocess_input is called, the stack looks like this
- We have an area reserved for main's local variables and parameters
- Then we have the return address where we should jump in main when preprocess_input returns
- Then we have an area for preprocess_input's locals and parameters

---
template: smashing
.rightcol[
<div style="text-align:center"><img src="include/stack-smashing2-2.svg" width=400 /></div>
]

???

- local_buffer is somewhere in that area

---
template: smashing
.rightcol[
<div style="text-align:center"><img src="include/stack-smashing2-3.svg" width=400 /></div>
]

???

- So when we overflow local_buffer with strcpy, we have the ability to overflow upwards in the stack, towards the high addresses

---
template: smashing
.rightcol[
<div style="text-align:center"><img src="include/stack-smashing2-4.svg" width=400 /></div>
]

???

- If we carefully craft the data we overflow local_buffer with, we can overwrite the return address with the address of our target, which is security_critical_function

---
template: smashing
.rightcol[
<div style="text-align:center"><img src="include/stack-smashing2-5.svg" width=400 /></div>
]

???

- Doing so, when preprocess_input returns, the CPU will pop the overwritten address from the stack and jump to it
- Executing security_critical_function without going through the password check

---
class: inverse, center, middle

# Example 4: Use-After-Free

???

- We previously talked about the use after free being a common temporal safety violation
- Let's quickly see here how it can be exploited

---
name: uaf
# Example 4: Use-After-Free

.leftcol[
```c
typedef struct {
    double member1; double member2;
    void (*member3)(int);
} my_struct;

void print_hello(int x) {
    printf("Hello, parameter: %d\n", x);
}

void security_critical_function() {
    printf("Launching nukes!\n");
    /* ... */
}

//
```
]

.rightcol[
```c
int main(int argc, char **argv) {
    /* allocate and init ms */
    my_struct *ms = malloc(sizeof(my_struct));
    ms->member1 = 42.0; ms->member2 = 42.0;
    ms->member3 = &print_hello;
    /* call the function pointer */
    ms->member3(12);

    free(ms);
    char *buffer = malloc(12);
    strcpy(buffer, argv[1]);

    ms->member3(12);
    /* check a password, runs sec_crit_fn */
}
```
.codelink[<a href="src/use-after-free.c" download>`08-exploiting-vulnerabilities-1/use-after-free.c`</a>  <a href="https://github.com/olivierpierre/comp60261-devcontainer" target="_blank" style="text-decoration: none"><img src="include/gh-logo.svg" style="height: 1em"></a>]

]

???

- This is our vulnerable program
- It declares a data structure my_struct for which one of the members, member3, is a function pointer
- In main a data structure object ms is allocated with malloc, and initialised, with the function pointer set to point to a benign function that prints hello
- The object is then freed
- Then there is another call to malloc and the buffer in question is filled with data from the command line arguments, with strcpy
- And then finally we have our use after free: the object ms, previously freed, is mistakenly accessed: the function pointer is invoked
- We also have a security critical function that we would like to redirect the execution to without going through a password check

---
template: uaf
<div style="text-align:center"><img src="include/use-after-free-1.svg" width=550 /></div>

???

- We can exploit this program as follows
- When the object ms is initialised, things look like this
- The 3 members of the object are laid out contiguously in memory
- And the function pointer points to the first byte of code of the print_hello function in the code segment

---
template: uaf
<div style="text-align:center"><img src="include/use-after-free-2.svg" width=550 /></div>

???

- When free is called that memory is discarded

---
template: uaf
<div style="text-align:center"><img src="include/use-after-free-3.svg" width=550 /></div>

???

- Due to the way malloc is implemented, it will try to reuse freed memory for future allocations as much as possible
- So it is likely that the space that previously held the ms data structure will be reused for the next allocation, which is filled with data coming from the command line parameter
- With strcpy the attacker can write in that space, and overflow the 12 bytes of buffer to overwrite the space that previously held the function pointer with the address of security_critical_function

---
template: uaf
<div style="text-align:center"><img src="include/use-after-free-4.svg" width=550 /></div>

???

- Later when the use after free happens, this in effect invokes the security_critical_function

---
# Advanced Control Flow Hijacking

- The last 2 examples are **control flow hijack attacks**
  - Divert the program's control flow to different CFG paths than what the programmer intended

???

- The two last examples of attacks we have seen are named control flow hijacking attacks
- The attacker diverts the control flow of the program and has the CPU run code paths that are different from what the programmer originally intended

- Attacker can jump/return to program's functions, but also:
  - Libc functions, e.g. a remote attacker jumping to `exec` with `"/bin/sh"` as argument to get a shell
  - Small snippets of code ending with `ret`: **return oriented programming** (ROP)

???

- Our examples showed how the attacker can rewrite return addresses and function pointers to return and jump to security critical pieces of code
- Other control flow hijacking attacks can attempt to jump to libc functions
- For example, jumping to the exec function with the string "/bin/sh" in the register that holds the first function parameter according to the ABI can give a remote attacker a shell on the victim machine
- Another relatively advanced attack is named return oriented programming

---
# Advanced Control Flow Hijacking (2)

- With ROP an attacker puts on the stack a series of addresses pointing to **gadgets**, small snippets of code (just a few instructions) ending in `ret`
  - CPU `ret`s from gadget to gadget
  - Stack can also contain data for use by the gadgets, e.g. loading a value from the stack into a register

.leftcol[
<div style="text-align:center"><img src="include/rop.svg" width=330 /></div>
]

???

- With ROP, the attacker has full control over what is on the stack, for example through an overflow
- The attacker places on the stack a series of code addresses that point to small snippets of machine code
- These are called gadgets, and represent sequences of just a few instructions ending with ret
- This way the CPU executes the first sequence of instructions, returns to the second, executes it, then returns to the third, and so on
--

.rightcol[
- Tons of gadgets in modern programs
- Easy for an attacker to achieve **arbitrary Turing-complete computation** with ROP
]

???

- In modern programs there is a very high number of gadgets
- In fact on medium to large size programs, the attacker can generally achieve Turing complete computations through ROP, which makes it a particularly concerning attack vector

---
# Summary

- C is **not memory safe**
- Memory issues that look benign at first glance can have **huge security consequences**
- Exploiting them allows attackers to break any aspect of confidentiality/integrity/availability

???

- To conclude, the lack of memory safety in languages such as C and C++ opens the door to a vast array of security vulnerabilities
- They can be exploited by attackers to break the confidentiality, integrity, and availability properties of systems software