class: center, middle ### Secure Computer Architecture and Systems *** # Process Management ??? - Hi everyone, here we are going to talk about one of the core responsibilities of an operating system - Process management - As mentioned previously we will focus on Linux --- class: inverse, center, middle # Processes ??? - So what is a process --- # What is a Process - An instance of a running program ??? - A process is simply an instance of a running program in the system -- - Program is tricked by the OS into thinking it is alone running on the machine: - Memory: **each process has its own virtual address space** - CPU: **each process is transparently scheduled** in and out of the CPU ??? - Each program is tricked by the OS into thinking it runs alone and has access to 100% of the machine's resources - This way all applications in the system can run independently from each other - This is achieved as follows: regarding memory, each process has its own address space, and it can try to address most of it - And regarding the CPU, each process is scheduled in and out of the CPU completely transparently -- - A process also has: - **Handlers to system resources** maintained by the kernel: file descriptors, sockets, etc. - An **execution context**: values in CPU registers ??? - A process also has access to handlers representing system resources such as file descriptors, network sockets, etc. - And an execution context, which is the value in CPU registers at a given point in the program's execution --- # Process Identifiers (PIDs) PID: Unique integer identifying each process in the system ```c #include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main() { printf("my pid is: %d\n", getpid()); return 0; } ``` .codelink[
`14-process-management/pid.c`
] To see every process and its PID in the system: ```bash $ ps -e ``` ??? - Each process has a unique integer identifier in the system, called the PID - A process can get its PID with the `getpid` system call - One can also list all processes running in the system with their PID with the `ps -e` command --- # Process Tree - A process needs to be created by another process - Parent/child relationship ??? - Apart from the first process created from scratch right after the kernel boots, every other process is created from another process - This way we have a parent/child relationship between a creator process, the parent, and a created process, the child -- - At boot time, after the kernel is done initialising, it creates a first process (PID 1, sometimes named `init`) - `init` then creates children, that may create children, etc. - System has a **process tree**, display it e.g. with the `pstree` command ??? - After the kernel boots it creates the first process which gets PID 1 - This process has historically been called `init`, but can take different names --
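A process can also query its parent's PID with `getppid` (a minimal sketch, not one of the course's example files):

```c
#include <stdio.h>
#include <unistd.h>

int main() {
    // getppid() returns the PID of the process that created us
    printf("my pid: %d, my parent's pid: %d\n", getpid(), getppid());
    return 0;
}
```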
??? - The process tree illustrating the parent/child relationships of all processes in the system can be obtained with the `pstree` command --- # Process Creation: `fork` - A parent process calls `fork` to create a new child process ??? - The one and only way to create a process is through the `fork` primitive - It dates back to the 60s and is now part of the POSIX operating system interface standard - When a parent process calls fork it creates a child process -- - **Child is not created empty: it is a duplicate of its parent's state at the time `fork` was called:** - Copy of the address space - Copy of system resources e.g. file descriptors - Copy of the execution context (CPU register values) - `fork` involves the OS (`clone` system call), return path executed by both the parent and the child ??? - The child is not created empty: it is actually a duplicate of the parent: it inherits a copy of the parent address space, system resources such as file descriptors, and even a copy of the parent's CPU register content at the time fork was called - With Linux under the hood fork is implemented by the C standard library and calls the `clone` system call which performs the child creation and duplicates the parent -- .leftlargecol[
] .rightsmallcol[ After `fork` parent and child run **concurrently** ] ??? - When they return from fork, the parent and the child run concurrently - This is illustrated on the slide with a parent of PID 42 that calls fork a first time and creates a child PID 43, then calls fork a second time and creates another child PID 46 --- # Process Creation: `fork` (2) ```c int global = 42; int main() { int local = 10; printf("[%d] parent before update local = %d, global = %d\n", getpid(), local, global); int pid = fork(); switch(pid) { case -1: printf("fork error!\n"); return -1; case 0: // Child run printf("[%d] child, before update local = %d, global = %d\n", getpid(), local, global); local = 100; global = 100; printf("[%d] child after update local = %d, global = %d\n", getpid(), local, global); break; default: // Parent run local = 0; global = 0; printf("[%d] parent after update local = %d, global = %d\n", getpid(), local, global); } return 0; // executed by both } ``` .codelink[
`14-process-management/fork.c`
] ??? - You have an example of fork usage on the slide - Our program starts by printing the values of a local and a global variable, then calls fork and makes a switch on its return value - if it's -1, there was a problem with fork and we just exit - if it's 0, this is the return path of the child, we print the variables then set 100 as the value of both variables and print these values again - if it's something else, it corresponds to the return path of the parent, it sets 0 in both variables and prints them - Then both processes will leave the switch case body and return 0 from main - If we run this program we can see that the child sees the same values for the variables before it updates them - Once the variables are updated, we can see that each process has its own copy of the variables - This demonstrates that the address space of the child starts right after fork as a copy of that of the parent, but also that each process has its own, private, address space and it is free to read and write to it independently from other processes --- name: cow # `fork` Implementation by the OS - **How does the OS duplicate the address space?** - Copying the entirety of mapped memory upon `fork` would be a huge waste of memory and time - Instead, use paging to implement **on-demand, copy-on-write (CoW) address space duplication** ??? - How is the copy of the address space realised by the OS? - It's not a synchronous copy realised upon fork, there are megabytes or possibly gigabytes of mapped memory in the address space so that would take too much time and memory space - The copy is actually realised later, on-demand, when the memory is accessed --- template:cow
??? - When fork returns, the child gets a copy of the parent’s page table: address spaces are identical, mapped virtual pages point to the same physical pages --- template:cow
??? - Read accesses are performed normally: as long as neither the parent nor the child modifies the memory, they don’t need to see different content for it --- template:cow
??? - Only when the parent or the child writes to the address space is the corresponding physical memory copied and the page table updated - This is achieved at the granularity of a page, which in the vast majority of systems is 4 KB --- # Executing a Different Program ??? - Now what if we want to create a new process to execute a new program and not run a copy of the parent -- - `fork` + `execve` ```c int main() { char *args[] = {"/bin/ls", "-l", NULL}; char *envp[] = {NULL}; printf("[%d] Parent, forking\n", getpid()); switch(fork()) { case -1: printf("fork error!\n"); return -1; case 0: printf("[%d] child, calling execve()\n", getpid()); execve("/bin/ls", args, envp); printf("execve error!\n"); // should not reach here break; default: break; } return 0; } ``` .codelink[
`14-process-management/forkexec.c`
] ??? - We still need to use fork, but this time in conjunction with `execve` - You can see an example here, the parent calls fork - And the child, on its return path calls `execve` - `execve` takes as first parameter the new program's binary we want to execute, here it's "/bin/ls" - Followed by an array of strings, one for each argument (with the first being the binary's name) - And an optional set of environment variables, which is just NULL here - If we launch this program, we can see that the child executes `ls` successfully --- # `execve` Implementation by the OS - Kernel creates a **new (blank) address space** - Child duplicate of the parent address space is lost, program exec'd will start from its entry point ??? - Under the hood, when a process calls execve, the OS sets for it a new blank address space - Whatever was present in the old address space is completely lost -- - Kernel has a **loader**, starts by reading metadata about the program to load: - Segments to load - Program entry point - Interpreter (user space loader `ld-linux.so`) - Load the binary (static binary) or interpreter (dynamic binaries, scripts) ??? - The kernel has a loader that extracts from the binary to execute what segments should be loaded, where the program entry point is, and whether there is the need for a user space loader or interpreter - If it's a static binary it is then loaded in the address space - If it's a dynamically linked binary or an interpreted script, the user space loader or interpreter will rather be loaded, with the program/script to execute passed as parameter -- - Allocate a stack, populate it with information to send to the program: `argc`, `argv`, etc. - Return to user space at the program/interpreter entry point - Interpreter will load the program in user space ??? - Then the OS allocates a stack and populates it with what the program needs to initialise: among other things command line parameters and environment variables - Then the OS returns to user space at the program or interpreter entry point, and things start to run --- class: center, inverse, middle # Interactions Between Processes ??? - Let's now see how the OS lets processes interact with each other --- # Communication Between Processes - **Multiprocess applications**: web browsers, servers, GUI apps, etc. - Use several processes to exploit parallelism, run background operations, sandbox untrusted code, etc. - These processes work together: they need to **communicate** and **synchronise** with each other ??? - Some applications like web browsers or GUI software have multiple processes working together, to leverage parallelism, run background tasks, isolate untrusted code, etc. - These processes need to communicate and synchronise -- - Communication is achieved through **Inter-Process Communication (IPC)** mechanisms - Signals, pipes, sockets, shared memory, etc. ??? - Communication is done through inter-process communication mechanisms such as pipes, signals, sockets, etc. -- - **Synchronisation mechanisms** let processes coordinate so they don't step on each other's feet - Waiting for other processes, barriers, locks - All of this is provided by the OS, accessed with system calls ???
- And synchronisation lets processes coordinate so they don't step on each other's feet, we don't want things like 2 processes updating a shared data structure at the same time - All of these mechanisms are provided by the operating system and accessed through system calls --- # Signals - Signal: a notification sent by the kernel to a process - Sent upon certain events (e.g.
Ctrl+C pressed, segmentation fault) - Can also originate from another process - Apart from a signal type (integer id), does not carry any data - A process can install handlers that will be invoked when a given signal is received - If a signal with no handler is received, the OS kills the process ??? - A signal is a notification sent by the kernel to a process upon certain events, like when the program encounters a segmentation fault, tries to divide by zero, when the user presses certain combinations of keys, etc. - A process can instruct the kernel to send a signal to another process - A signal is really just a notification, apart from its type it does not carry any data - A process installs handlers for signals it wishes to receive and act upon - If a signal is received by a process that has no handler for it, the OS will terminate the process --- # Signals (2) ```c // Signal handler for SIGUSR1 void handle_sigusr1(int signum) { printf("[%d] signal received!\n", getpid()); fflush(stdout); // Ensure immediate output } int main() { // Install custom handler for SIGUSR1 struct sigaction sa; sa.sa_handler = handle_sigusr1; sigemptyset(&sa.sa_mask); // No additional signals blocked in handler sa.sa_flags = SA_RESTART; // Restart interrupted syscalls automatically if (sigaction(SIGUSR1, &sa, NULL) == -1) { perror("sigaction"); exit(EXIT_FAILURE); } while (1) { printf("[%d] Doing useful work\n", getpid()); sleep(1); // Simulate work } return 0; } ``` .codelink[
`14-process-management/signal.c`
] ??? - You have an example of a process installing a signal handler here - It prepares a struct sigaction object - Notice how it indicates, for the `sa_handler` field, the name of the function that will be executed when the signal is received - The sigaction object is then installed with the sigaction system call, in order to handle the signal of type `SIGUSR1` - From that point, if the signal `SIGUSR1` is received, the handler `handle_sigusr1` will run --- # Signals: Implementation by the OS - Kernel delivers signal **lazily**: checks if any signal needs to be delivered to a process right before it returns to user space - If a signal needs to be delivered, kernel determines the action (call handler or kill) ??? - Under the hood, the kernel delivers signals lazily: upon returning to user space in a process from an interrupt or a system call, the kernel checks if that process has any signal pending - If so it determines the action, either invoke the handler or kill the process if there is no handler -- - If it's a handler, modify the user execution context to be restored: - PC points to the handler - Stack contains signal information (e.g. id) and where to return from the handler (`sigreturn` system call) ??? - If a handler needs to run, the kernel modifies the execution of the process: it sets the instruction pointer to return to the handler's code, and puts on the stack information about the signal received and what to do when the handler finishes running -- - Return to user space, handler runs then `sigreturn` cleans up the stack frame and returns ??? - Then we return to user space, the handler runs and finally invokes the `sigreturn` system call that will clean things up and resume the normal process execution --
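The handler example above can be triggered from a second process with the `kill` system call (a minimal sketch, not one of the course's example files):

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

// Send SIGUSR1 to the PID given as command line argument
int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    if (kill(atoi(argv[1]), SIGUSR1) == -1) {
        perror("kill");
        return 1;
    }
    return 0;
}
```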
??? - All these steps are illustrated here, we have a process running in user space - Then a trap to the kernel following a system call or an interrupt - After the kernel is done processing this trap, before returning to user space it checks if a signal needs to be delivered - If so, it returns to user space at the handler, which then runs, and when the handler is done the process resumes its execution normally --- # Pipes and Sockets - Uni- (pipes) and bi- (sockets) directional **data communication channels** - Processes set them up with specific system calls, then read and write in them to exchange data - Visible from multiple processes as **pseudo files** on the filesystem (shared by processes) - Uses kernel buffers for data in transit - Kernel also puts processes to sleep e.g. when writing to a full pipe/reading from an empty pipe (see the sketch below)
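Pipes can also be anonymous: `pipe` creates one with no filesystem name, shared through `fork` (a minimal sketch, not one of the course's example files):

```c
#include <stdio.h>
#include <unistd.h>

int main() {
    int fds[2];
    pipe(fds);            // fds[0] is the read end, fds[1] the write end
    if (fork() == 0) {    // the child inherits both file descriptors
        close(fds[0]);
        write(fds[1], "hi", 2);
        close(fds[1]);
        return 0;
    }
    close(fds[1]);
    char buf[3] = {0};
    read(fds[0], buf, 2); // the kernel puts us to sleep until data arrives
    printf("parent read: %s\n", buf);
    close(fds[0]);
    return 0;
}
```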
??? - Pipes and sockets are data communication channels - Pipes are unidirectional, there is one writer process and one reader process, while sockets are bidirectional - Both mechanisms are visible to processes as pseudo files on the filesystem - This is because the filesystem is visible from all processes so it's easy to share a pipe or a socket between different processes this way - The kernel uses internal buffers to hold the data in transit on pipes and sockets - It is also involved in putting reading and writing processes to sleep when the communication channels are empty or full --- # Pipes and Sockets (2) ```c #define FIFO_PATH "/tmp/myfifo" int main() { mkfifo(FIFO_PATH, 0666); // Create a named pipe (FIFO) pid_t pid = fork(); if (pid == 0) { // --- Child Process: Writer --- int fd = open(FIFO_PATH, O_WRONLY); const char *msg = "Hello from child!\n"; write(fd, msg, strlen(msg)); close(fd); } else { // --- Parent Process: Reader --- int fd = open(FIFO_PATH, O_RDONLY); char buffer[128]; int n = read(fd, buffer, sizeof(buffer)-1); if (n > 0) { buffer[n] = '\0'; // Null-terminate printf("Parent received: %s", buffer); } close(fd); unlink(FIFO_PATH); // Clean up FIFO file } } ``` .codelink[
`14-process-management/fifo.c`
] ??? - Here is an example of usage of a pipe, also called a FIFO - We have a parent process that creates the pipe with the `mkfifo` system call - It then forks to create a child - The child opens the pipe in write mode and writes "hello from child" in it - Concurrently, the parent opens the pipe in read mode, reads from it, and displays what it read - Both processes close the pipe's file descriptor when they are done and the parent deletes the pipe at the end with the unlink system call --- # Shared Memory - All previous IPCs involve many system calls during intense communication - Not ideal from the performance point of view as user/kernel switches are costly ??? - The IPC examples we saw all involve quite a lot of system calls to be set up and to achieve communication - This is not ideal in terms of performance, system calls are costly because user/kernel switches take many CPU cycles -- - A more basic but faster mechanism is **shared memory**: physical pages mapped in the address space of 2 (or more) processes - Achieved by the OS by setting up the page tables of the processes to map the same physical pages ??? - A more basic but also faster mechanism is shared memory - Two processes can instruct the OS to let them share one or more physical memory pages --
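With POSIX shared memory, for example, even unrelated processes can map the same pages by name (a sketch; the name `/demo_shm` is made up and error handling is omitted):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
    // Create a named shared memory object and give it the size of one page
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, 4096);
    // Every process mapping "/demo_shm" ends up on the same physical page
    int *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    *shared = 42; // visible to all processes that mapped the object
    munmap(shared, 4096);
    close(fd);
    return 0;
}
```

A second process would call `shm_open("/demo_shm", O_RDWR, 0)` and `mmap` it the same way.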
??? - This is illustrated here - The kernel maps the same physical page within the address space of the two processes - They can then read and write to that page and hit the same locations in memory --- name: race ## Shared Memory: the Need for Synchronisation - The possibility of reading and writing to shared data concurrently brings the need for **synchronisation** between processes - Imagine a process preempted in the middle of updating a large data structure - A second process is scheduled and tries to read this inconsistent data structure: **race condition** ??? - The fact that we now have two processes that can read and write concurrently to the same areas of memory brings the need for synchronisation - If these processes do not agree on a form of protocol to access this shared memory, bad things will happen - Imagine a scenario in which a process 1 is interrupted in the middle of updating a large data structure - A process 2 starts to run and reads that data structure, which is only half valid - This is called a race condition, and obviously it's a bug --- template: race
??? - The issue is illustrated here: we have our data structure with 3 fields - Process 1 is running and wants to update the data structure - For it to be in a state that makes sense, all 3 fields should be updated --- template: race
??? - Process 1 updates the first field --- template: race
??? - Then the second field, but before it can update the third field it is preempted and the scheduler decides to run Process 2 on the CPU instead - Process 2 reads the entire data structure, which is in an inconsistent state - That's our race condition --- name: lock # Synchronisation - Pieces of code where processes access shared data are called **critical sections** - To avoid race conditions critical sections must be executed **atomically**, i.e. 1. A critical section can only be executed by 1 process at a time; and 2. When a process starts to execute a critical section, it must finish it before another process can start to execute that critical section - Atomicity ensured by using **locks** ??? - So the bits of code in your program where processes access shared data are called critical sections - To avoid race conditions, the critical sections of all processes accessing a given piece of shared data need to execute in a very particular fashion: they need to execute atomically - Atomicity means that these two rules must be enforced - First, a critical section can only be executed by one process at a time - Second, if a process starts to execute a critical section it must finish it before another process can start to execute another critical section accessing the same piece of shared data - This is enforced by a particular mechanism called locks --- template: lock
??? - A lock protects a piece of shared data - It works as follows - Say we have two processes that want to access some shared data - We have one lock protecting it - Each process is going to attempt to take the lock --- template: lock
??? - The lock is implemented in such a way that only one process can succeed in taking the lock and holding it - Let's assume here it's process 1 - Because that process got the lock it can go ahead and execute its critical section, i.e. read or write the shared data --- template: lock
??? - In the meantime the second process needs to wait for the lock, it is put to sleep by the OS --- template: lock
??? - When process 1 is done with accessing the shared data structure it exits its critical section and releases the lock - Process 2 then tries to take the lock again, it is free now so it gets it --- template: lock
??? - Process 2 can go ahead with its critical section and release the lock when it's done - This way we have ensured atomicity for the execution of the critical section: it was executed by one process at a time and there were no race conditions --- # Locks: Example ```c typedef struct { int field1; int field2; int counter; pthread_mutex_t lock; } shared_data_t; int main() { shared_data_t *shared; // Create anonymous shared memory shared = mmap(NULL, sizeof(shared_data_t), PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); // Initialise process-shared mutex pthread_mutexattr_t attr; pthread_mutexattr_init(&attr); pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED); pthread_mutex_init(&shared->lock, &attr); pthread_mutexattr_destroy(&attr); shared->counter = 0; ``` ??? - Let's illustrate here both how to establish shared memory between two processes, and how to use a lock to protect access to data living in shared memory - Here we first declare our shared data structure - It has two integer fields field1 and field2, as well as a counter we'll use to keep track of the number of times the object was updated - Finally it has a lock to protect it against concurrent accesses, the type of the lock is pthread_mutex_t - In main we use mmap to ask the OS for an area of memory large enough to hold our data structure, we want it to be readable and writable, and with `MAP_SHARED` we indicate that it will be shared with the child process we are going to create next - We also initialise the lock, also called a mutex, which stands for mutual exclusion lock - We indicate that it will be shared with the child process - And we also set the counter to zero --- # Locks: Example (continued) ```c pid_t pid = fork(); // code below will be executed by both the parent and child processes for (int i = 0; i < 5; i++) { pthread_mutex_lock(&shared->lock); // critical section starts: take the lock int old = shared->counter; shared->field1 = rand(); shared->field2 = rand(); shared->counter++; printf("[%d] counter %d -> %d\n", getpid(), old, shared->counter); pthread_mutex_unlock(&shared->lock); // critical section ends: release the lock usleep(100000); // small delay } // Wait for child in parent if (pid > 0) { wait(NULL); pthread_mutex_destroy(&shared->lock); munmap(shared, sizeof(shared_data_t)); } return 0; } ``` .codelink[
`14-process-management/lock.c`
] ??? - Then the parent forks - What is below fork will be executed by both the parent and the child - In a loop, they both repeat the following process - First they take the lock to indicate the start of their critical section - Then they update the shared data structure, writing some random values in its two fields and incrementing the counter - And when the critical section is over they release the lock - When the loop is over the parent uses the wait system call to wait for the child to exit, then cleans things up - The lock is destroyed with `pthread_mutex_destroy` - And the shared memory is released to the OS with munmap - As you can see this program works fine - If you are curious you can try commenting the pthread_mutex_lock and pthread_mutex_unlock statements which will get rid of the lock - Try to launch this program a few times, you will likely get some race conditions and will start to see strange values for the counter --- # Locks: Implementation by the OS - OS needs to be involved when accessing locks (puts waiters to sleep) - Used to be a **costly system call for every lock/unlock operation** ??? - The OS needs to be involved in lock taking/release operations because only the kernel can put processes to sleep or wake them up - So in the old days each lock take and release operation was a system call, which is a slow operation - This was very costly from the performance point of view -- - Optimisation: **futex, implementing the lock partially in user space** - Using a shared user space variable indicating if the lock is free or not, accessed with **atomic CPU instructions** - Switch to the kernel only when needed (a process needs to be put to sleep/woken up), as sketched below
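The core idea can be sketched as follows (a simplified toy lock, not glibc's actual algorithm; it assumes the futex word lives in memory shared by the processes):

```c
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

// 0 = free, 1 = taken
void futex_lock(atomic_int *f) {
    int expected = 0;
    // Fast path: if the lock is free, take it without entering the kernel
    while (!atomic_compare_exchange_strong(f, &expected, 1)) {
        // Slow path: sleep in the kernel as long as the value is still 1
        syscall(SYS_futex, f, FUTEX_WAIT, 1, NULL, NULL, 0);
        expected = 0;
    }
}

void futex_unlock(atomic_int *f) {
    atomic_store(f, 0);
    // Wake up one process sleeping on the futex, if any
    syscall(SYS_futex, f, FUTEX_WAKE, 1, NULL, NULL, 0);
}
```

A real implementation additionally tracks whether anyone is waiting, so that `futex_unlock` can skip the system call in the uncontended case.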
??? - The old way to deal with locks with one system call for each lock/release is illustrated on the left here - Linux and many other OSes implement an optimisation, which is called futex, for fast user space mutex - Part of the lock is implemented in user space, we have a futex data structure that is accessed atomically by processes using atomic instructions - System calls end up being made only when the kernel needs to be involved, i.e. when processes need to sleep or to be woken up - So when there is no contention the lock is accessed swiftly without any system call --- class: inverse, middle, center # Threads --- # Threads - **A thread is an execution flow in a process** - Each process has at least 1 thread - The execution flow that starts to execute at the program's entry point at load time ??? - One last thing to discuss, threads - A thread is an execution flow in a process - Each process has at least one, it's the execution flow that starts at main after the program is loaded -- - Can create more threads - E.g. Using the POSIX thread library (pthread) in C - Also supported in most languages ??? - And a process can create more threads - In C this is achieved with the POSIX thread library - And threads are available in most other languages too -- - **Threads belonging to the same process share the same address space** - They can communicate by exchanging references (pointers) and through global variables - Because **threads run concurrently**, this opens the door to race conditions and brings the **need for synchronisation** ??? - The key idea with threads is that all threads belonging to the same process share that process' address space - That makes it very easy for threads to communicate using global variables or pointers to anywhere in memory - They run concurrently, which means they also need to synchronise to avoid race conditions when accessing shared data --- # Threads: Example ```c typedef struct { pthread_mutex_t lock; int field1; int field2; int counter; } shared_data_t; int main() { pthread_t t1, t2; shared_data_t shared; shared.counter = 0; pthread_mutex_init(&shared.lock, NULL); pthread_create(&t1, NULL, thread_func, &shared); // Create two threads pthread_create(&t2, NULL, thread_func, &shared); pthread_join(t1, NULL); pthread_join(t2, NULL); // Wait for both threads to finish pthread_mutex_destroy(&shared.lock); // Cleanup return 0; } ``` ???
- Here is an example of a multithreaded program - It performs the same thing as the multiprocess application we saw previously, this time with threads - We have the same data structure with two fields, a counter, and a lock - This time the shared object will live on the stack of the main thread - We also declare two pthread_t objects representing the threads we want to create, t1 and t2 - We initialise the lock with pthread_mutex_init, then create and start the two threads with `pthread_create` - They take as 3rd parameter the name of the function they should run, `thread_func`, and as 4th parameter the argument to pass to that function, here it is the address of the shared data structure - The main thread then waits for t1 and t2 to finish with 2 calls to `pthread_join` --- # Threads: Example (continued) ```c void* thread_func(void* arg) { shared_data_t *shared = (shared_data_t*)arg; for (int i = 0; i < 5; i++) { pthread_mutex_lock(&shared->lock); int old = shared->counter; shared->field1 = rand(); shared->field2 = rand(); shared->counter++; printf("[%d] counter %d -> %d\n", gettid(), old, shared->counter); pthread_mutex_unlock(&shared->lock); usleep(100000); } pthread_exit(NULL); } ``` .codelink[
`14-process-management/thread.c`
] ??? - When they run both threads will execute this function - It executes the same loop as our multiprocess application: take the lock, update the shared data structure, then release the lock - Once they are done each thread calls `pthread_exit` --- # Threads: Implementation by the OS - From the OS point of view a thread is a task i.e. the smallest schedulable entity - Threads are created with the same system call used underneath `fork`: `clone` ??? - The OS does not schedule processes, it schedules threads - Threads are created under the hood with the same system call used to implement `fork`, `clone` -- - Threads belonging to the same process have the **same address space by using the same page table** ??? - The sharing of a single address space between all threads belonging to the same process is simple: they all have the exact same page table -- - Threads of a process report the same PID - Each thread in the system also has a unique thread identifier (TID) - Many scheduler-related system calls take a TID as parameter ??? - All threads of a process will report the same PID, but they also have a thread-level identifier named the TID - Most of the scheduler-related system calls, such as the ones updating priorities, actually take a TID as parameter and not a PID - This is because threads, and not processes, are the schedulable entities --- # Summary - **Process**: instance of a running program - Created with `fork`: child is a duplicate of the parent - Parent/children relationship, process tree - Made of one or several **threads**: execution flow, schedulable entity - Processes often need to - **Communicate** with IPCs e.g. signals or pipes - **Synchronise** with locks - Important to avoid data races in concurrent (multiprocess/multithreaded) environments ??? - To conclude, we have seen a lot of things here - We covered the notion of process, which is an instance of a running program - Processes are always created with fork, establishing this parent/child relationship and the system process tree - A process is made of one or more threads, which are execution flows sharing the same address space - Processes and threads often need to communicate with IPCs or shared memory, and to synchronise with locks when accessing shared data