class: center, middle

### COMP26020 Programming Languages and Paradigms Part 1: C Programming

***

# Secure Coding Practices, Detecting Bugs

???

- Hi everyone
- We previously saw how to adopt good practices when writing code, to avoid introducing bugs that could translate into security issues
- Here we are going to cover a complementary approach: the use of automated tools to detect programming mistakes and bugs in existing code bases

---

# Detecting Coding Mistakes

- Certain tools can help detect the coding errors leading to vulnerabilities
- Cannot run in production (e.g. due to high overhead), executed in the build and testing phases
- Often integrated within the CI/CD pipeline

???

- Here we will cover techniques that are slow to execute, or that make the application slow
- As a result they cannot run in production, and are rather used during development

--

- **2 main categories:**
    1. Static analysis approaches
    2. Dynamic analysis approaches

???

- These techniques fall within 2 main categories: static and dynamic analysis

---

# Static Analysis

- **Static analysis** searches for issues by **analysing the source code**, without running the program
    - Pros: good coverage, lends itself well to automation
    - Cons: false positives, limited amount of context available, scalability on large programs

???
- Static analysis tools scan the source code of the program for possible bugs without actually running the program
- The benefit of this approach is its good coverage: it goes over the entirety of the program's code, and it lends itself well to automation
- In terms of downsides, static analysis generally suffers from false positives: it may flag code that does not actually contain a programming mistake or a security vulnerability
- Because it does not run the program, the effectiveness of static analysis suffers from a limited amount of context: for example, most of the memory content is not determined until runtime
- Finally, some static analysis techniques are quite slow and do not scale well to the large code bases of certain systems software

--

- **Enable extra warnings** with compiler flags:
    - Additional warnings vs. default: `-Wall`
    - More warnings: `-Wextra`
    - Even more warnings (can be picky): `-pedantic`
    - More info: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html

???

- A first thing you should do is enable a high level of compiler warnings
- You can use, in increasing order of pickiness, `-Wall` to get additional warnings, `-Wextra` to get even more warnings, and `-pedantic` to add yet more warnings
- See the link on the slide for details about what warnings are added by each option

---

# Static Analysis (2)

- **Static analysis tools**:
    - Clang Static Analyser: https://clang-analyzer.llvm.org/
    - Lint: https://docs.oracle.com/cd/E19205-01/820-4180/man1/lint.1.html
    - Coverity: https://scan.coverity.com/
    - cppcheck: https://cppcheck.sourceforge.io/

???

- There are more advanced static analysis tools, you have a few examples here: the Clang Static Analyser, Lint, Coverity, and cppcheck

--

Let's check out an example with the Clang static analyser.

???
- As an example, let's see how we can use the Clang static analyser

---

# Clang Static Analyser

.leftcol[

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>

int c;

int main() {
    int a = INT_MAX;
    int b = 1;

    c = a + b;  // Integer overflow!

    char buffer[8];
    char str[] = "this string is too long";

    strcpy(buffer, str);  // Buffer overflow!

    int *ptr = (int *)malloc(sizeof(int));
    *ptr = 42;
    free(ptr);
    *ptr = 99;  // Use-after-free!

    return 0;
}
```

.codelink[
`25-detecting-bugs/faulty.c`
]
]

???

- We have a faulty program here, it contains 3 bugs
- The first bug is an integer overflow: we store in `c` the result of adding 1 to the largest value that an `int` can hold
- The second bug is a buffer overflow: we copy into `buffer`, whose size is 8 bytes, a string that is larger than 8 bytes
- And the last bug is a use-after-free, where we dereference the pointer `ptr` after having freed the memory it points to

--

.rightcol[

```bash
$ gcc faulty.c -o faulty
$ ./faulty
```

- No warning/error at compile-time!
- No visible effect at runtime!

]

???

- Notice that with the default level of warnings, this program compiles fine, and it also runs without any visible error

---

# Clang Static Analyser (2)

```bash
$ clang --analyze faulty.c
faulty.c:22:10: warning: Use of memory after it is freed [unix.Malloc]
    *ptr = 99; // Use-after-free!
    ~~~~ ^
1 warning generated.
```

- Clang static analyser detects the use-after-free...
- ... but not the integer and buffer overflows
- For that we need **dynamic analysis tools**

???

- If we launch Clang's static analyser on our source code, we can see that it is able to detect the use-after-free bug
- However it does not detect the two other bugs
- For that we need to use dynamic analysis

---

# Dynamic Analysis

- **Dynamic analysis**: tries to detect errors **while running the program**
    - Pros: runtime context available, easier if sources unavailable (black box testing)
    - Cons: input-dependent coverage, scalability to many programs, high runtime overheads

???
- Dynamic analysis tries to detect errors while running the program
- Doing so, it gets access to more information than static analysis: runtime information
- It's also useful when the sources of the program we wish to analyse are not available

--

- **Compiler-based instrumentation** is a highly popular dynamic analysis approach: **sanitisers**
    - **AddressSanitizer (ASan)**: detects heap/stack/globals memory issues
        - Buffer overflows, use after free, double free
        - Memory leaks
    - **UndefinedBehaviorSanitizer (UBSan)**: integer overflows, invalid casts, misaligned pointers, division by 0
    - A few more: https://en.wikipedia.org/wiki/Code_sanitizer

???

- A very popular type of dynamic analysis is achieved through compiler-based instrumentation
- These are called sanitisers
- The most widespread is address sanitiser, which detects a wide range of memory errors that would not be caught at compile time, or at runtime without the instrumentation
- You also have the undefined behaviour sanitiser, which detects things like integer overflows, invalid casts, and so on
- Check out the link on the slide for more information about sanitisers

---

# ASan and UBSan

- Enabling ASan and UBSan on the previously-seen faulty program

```bash
# Compile with ASan enabled:
$ clang -fsanitize=address faulty.c -o faulty
$ ./faulty
=================================================================
==21543==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffcc881f268
# ...
```

???

- We can enable address sanitiser instrumentation on our faulty program as follows
- As you can see, it first catches the buffer overflow

--

```bash
# Enable ASan again, after having fixed the buffer overflow:
$ clang -fsanitize=address faulty.c -o faulty
$ ./faulty
=================================================================
==22504==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000010
# ...
```

???
- Once we have fixed that overflow we can recompile the program, still with address sanitiser, and launch it again
- This time we can see that it detects the use-after-free

--

```bash
# Compile with UBSan enabled:
$ clang -fsanitize=undefined faulty.c -o faulty
$ ./faulty
faulty.c:12:11: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
```

???

- And finally, when we enable the undefined behaviour sanitiser and launch the program, the integer overflow is detected

---

# Valgrind

- Older dynamic analysis tool to detect memory errors and leaks among other things
- Mostly **superseded by sanitisers** for this task
    - In addition to leaks Valgrind can detect certain memory errors
    - Sanitisers detect leaks too
- Still, **no need to recompile** with Valgrind

???

- There are other dynamic analysis tools
- Most of what came before the sanitisers has been rendered more or less obsolete by them
- We have seen Valgrind previously: in addition to reporting memory leaks, it can also detect certain memory errors
- Given that sanitisers also detect memory leaks, that makes Valgrind quite redundant
- However, note that with Valgrind there is no need to recompile the program to insert instrumentation, as one does with the sanitisers
- So Valgrind is still useful in contexts where we only have access to the application's binary and not its sources

---

# Fuzzing AKA Fuzz Testing

- **Fuzzing: injecting malformed input through a trust boundary to trigger bugs**
    - E.g. command line arguments, input files, network
- A form of dynamic analysis
- Highly popular modern approach helping to secure interfaces

???
- One last dynamic analysis technique is fuzzing
- It consists in blasting a trust boundary with malformed inputs in the hope of triggering bugs
- Examples of trust boundaries that are good candidates for fuzzing include the command line arguments, input files, network packets, and so on
- Fuzzing is highly popular these days, and has helped uncover a very large number of bugs in many projects

--

Let's see an example with the tool American Fuzzy Lop (AFL): https://github.com/google/AFL

???

- Let's briefly see an example with the AFL fuzzer

---

# Fuzzing (2)

Vulnerable program:

```c
#include <stdio.h>

int main(int argc, char *argv[]) {
    char name[32];  // Vulnerable buffer (too small for unchecked input)

    if (argc < 2) {
        printf("Usage: %s <file>\n", argv[0]);
        return 1;
    }

    FILE *f = fopen(argv[1], "r");
    if (!f) {
        printf("Error, can't open %s\n", argv[1]);
        return 1;
    }

    fread(name, 1, 512, f);  // Reads up to 512 bytes into a 32-byte buffer!
    fclose(f);

    printf("hello %s\n", name);
    return 0;
}
```

.codelink[
`25-detecting-bugs/fuzzme.c`
]

???

- This is a vulnerable program
- It opens a file and reads its content into a buffer
- It reads up to 512 bytes, however the destination buffer is only 32 bytes long, so there is a possibility of overflow here
- The name of the file to read comes from the command line

---

# Fuzzing (3)

```bash
# Install AFL
$ sudo apt install afl # or afl++ on very recent ubuntu/debian distributions

# Compile and instrument target program
$ afl-clang fuzzme.c -o fuzzme

# Create a "seed" input to kickstart the fuzzing process
$ mkdir input
$ echo "testname" > input/seed

# Start the fuzzing process (after a few seconds stop it with ctrl+c)
$ AFL_SKIP_CPUFREQ=1 afl-fuzz -i input -o output -- ./fuzzme @@

# Reproduce the bug found by AFL (payload file may have a different name on your machine)
$ clang -g -fsanitize=address fuzzme.c -o fuzzme
$ ./fuzzme output/crashes/id:000000,sig:11,src:000000,op:havoc,rep:128
```

???

- You have instructions on the slide for how to fuzz this program
- You can pause the video to study them in detail

--

.small[
There is a lot more to say about fuzzing. Further reading:
- Sutton et al., **Fuzzing: Brute Force Vulnerability Discovery**
- **The Fuzzing Book**: https://www.fuzzingbook.org/
- Fuzzing 101: https://github.com/antonio-morales/Fuzzing101
]

???

- One last thing about fuzzing: it's a huge field and you will probably hear more details about it in other units
- On the slide you can find a few pointers if you want to dig deeper.

---

## Other Static and Dynamic Analysis Approaches

- Widespread approaches: manual code review, unit testing
- Linters/style checkers
- Taint analysis
- Symbolic execution, abstract interpretation
- Formal verification, model checking

???
- There are a few other static and dynamic analysis techniques that can be used
- You are probably familiar with unit testing and manual code reviews, as well as with tools that check that your code follows a certain style
- There are other techniques such as formal verification, which will be covered in other units so I won't talk about them here

---

# Summary

- We covered various static/dynamic tools to detect bugs during the testing phase
- Unfortunately, **none of these approaches can guarantee the absence of bugs**.
- We also need runtime defences in production
    - To detect bugs
    - To make exploits harder to achieve and limit the damage an attacker can do when exploiting vulnerabilities

----

.leftcol[
.center[Feedback form: https://bit.ly/4oDLf4S]
]

.rightcol[
]

???

- To conclude, we covered the use of various automated tools to try to detect bugs in existing code
- They belong in two main categories, static and dynamic analysis approaches
- Unfortunately, even if you combine these with the secure coding practices we saw previously, none of these approaches will allow you to get rid of 100% of the bugs and vulnerabilities
- We also need defences executing at runtime in production, to make exploits harder and limit their damage