Secure Architectures and Systems - OS Security Concepts Part 2

class: center, middle

### Secure Computer Architecture and Systems
***
# OS Security Concepts Part 2

???

- Hi everyone, welcome to OS security concepts part 2
- Here we are going to talk about the user-level access control mechanisms that are implemented by OS kernels

---
# UNIX/Linux File Permissions

- Linux inherits the **everything is a file** approach from UNIX
  - Regular files, devices, IPCs, OS metadata/configuration knobs, etc. (but technically not *everything*)

???
- Linux is heavily inspired by UNIX and in particular it follows one of its main principles which states that everything is a file
- Beyond regular files, this also applies to devices, IPCs, OS metadata and configuration knobs, among others

--
- As a result, many of its security mechanisms rely on **file permissions**
  - Define which users/processes can do what with which files

???
- As a result, a lot of the security mechanisms that define what user and processes can and cannot do rely on file permissions

--
- Same as UNIX, Linux is a **multi-user** OS
  - Users share the computer, do not trust each other
  - System administrator does not trust users

???
- Linux is also a multi-user operating system: several users that do not trust each other share the same machine
- We also have a privileged system administrator that does not trust the users
- So the OS must make sure the files and system resources are properly isolated to respect that trust model

--

|   | File A       | File B       | File C     |
|-----------|--------------|--------------|-------------|
| **Process 1 (user 1)** | read, write  | read         | -           |
| **Process 2 (user 2)** | read         | write        | read       |

.small[Lampson's **access control matrix**. First column: **subjects**, first row: **objects**, other cells: **permissions**]

???

- You probably already know about file permissions
- Here you have an example in the form of an access control matrix
- In the first column you have what is called subjects, here it is processes, but it could also be users
- In the first row we have objects, these are files that subjects may or may not access
- Finally in the rest of the matrix, at the intersection between a subject and an object, you have the list of permissions that subject has upon that object
- For example here process 1 can read and write file A, can read file B, and cannot access file C at all
- Process 2 has different permissions
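- As a rough illustration, the matrix lookup can be sketched in shell. The names `process1`, `fileA`, `matrix_perms` and `is_allowed` are hypothetical, and this is only a toy model, not how a kernel stores permissions

```bash
# Toy lookup in the access control matrix from the slide.
# Subject and object names are hypothetical labels, not real kernel objects.
matrix_perms() {  # usage: matrix_perms <subject> <object>
  case "$1:$2" in
    process1:fileA) echo "read write" ;;
    process1:fileB) echo "read" ;;
    process2:fileA) echo "read" ;;
    process2:fileB) echo "write" ;;
    process2:fileC) echo "read" ;;
    *)              echo "" ;;   # empty cell: no permission
  esac
}

# Succeeds (exit status 0) only if <perm> appears in the matching cell.
is_allowed() {  # usage: is_allowed <subject> <object> <perm>
  case " $(matrix_perms "$1" "$2") " in
    *" $3 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

is_allowed process1 fileA write && echo "process1 may write fileA"
is_allowed process1 fileC read  || echo "process1 may not read fileC"
```

- Each `case` branch is one cell of the matrix, and anything not listed falls through to the empty cell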

---
# UNIX/Linux File Permissions

- **Process identity**: a user ID (UID) and a group ID (GID) 
- Each file is associated with an **owner UID** and an **owner GID**
  - Processes executing with the file's owner UID can **change permissions** on that file (i.e. full access)
  - Processes executing with the file's owner GID can obtain additional permissions
- **Permissions** defined within each file's metadata as **permission bits**

???
- Each process running in the system is associated with a user ID, and a group ID
- In the vast majority of cases, when a user invokes a program, the process will take the user's user and group IDs
- Each file in the system is associated with an owner user and group ID
- So given a particular file, processes executing with a user ID equal to the file's owner UID can change permissions on that file, in other words they can grant themselves full read/write/execute permissions
- Processes executing with a group ID equal to the file's owner GID can also obtain additional permissions compared to the processes that run with unrelated UIDs and GIDs
- The permissions are defined within each file's metadata on the filesystem, there are specific bits for that named the permission bits

--

```bash
# Check a process' UID/GID
$ ps -o pid,user,group,uid,gid,comm -p <pid>

# Check a file's UID/GID and permission bits
$ ls -ln <file>
```

???

- If you want to check a process' UID and GID, you can use the ps command as follows
- And if you want to review a file's owner UID and GID, as well as the value of its permission bits, use the ls -ln command

---
# UNIX/Linux File Permissions

```bash
$ ls -ln /usr/bin/cp
-rwxr-xr-x 1 0 0 151152 Sep 20  2022 /usr/bin/cp
```
- `-rwxr-xr-x`: file type and permission bits for the owner UID, owner GID, and other users
- `1`: number of hard links
- `0`: owner UID
- `0`: owner GID
- `151152`: file size in bytes
- The rest is the modification timestamp and the file path

???

- Listing a file's permissions and owner UID and GID with ls gives you the following output
- The string with dashes and r/w/x characters gives us the file type and permissions, we'll zoom in on that in the next slide
- The first `1` after that is the number of hard links to the file
- The next two zeros here are respectively the file's owner UID and GID
- The number coming next is the file size in bytes
- And the rest of the information is a timestamp for the file's last modification, and the file path on the filesystem
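- If you want to pick these fields apart in a script, here is a minimal sketch that splits the sample line from the slide. The variable names are made up for the example, and it assumes the path contains no spaces

```bash
# Split the sample `ls -ln` line from the slide into its fields.
line='-rwxr-xr-x 1 0 0 151152 Sep 20  2022 /usr/bin/cp'
# Unquoted expansion lets the shell split on whitespace (runs of spaces collapse).
set -- $line
mode=$1; links=$2; uid=$3; gid=$4; size=$5
mtime="$6 $7 $8"; path=$9
echo "type and permissions: $mode"
echo "hard links:           $links"
echo "owner UID / GID:      $uid / $gid"
echo "size in bytes:        $size"
echo "last modification:    $mtime"
echo "path:                 $path"
```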

---
# UNIX/Linux File Permissions

- Permissions for each subject (owner/group/others) can be:
  - **Read access**: subject can (`r`) or cannot (`-`) read the content of the file
  - **Write access**: subject can (`w`) or cannot (`-`) write in the file
  - **Execute permission**: subject can (`x`) or cannot (`-`) execute the file as a program

???
- If we focus a bit on the permission string
- The first character represents the file type
- It's a dash for regular files, a d for directories, and you'll find other characters for special files representing things like devices or IPCs
- Then we have 3 blocks of 3 characters each, representing permissions for subjects
- The first 3 characters give the permissions for the file's owner UID, the second set of 3 characters gives the permissions for the file's owner GID, and the last 3 characters give the permissions for anyone else
- The permissions can be read access, permitted with an `r` and prohibited with a dash
- Write access, permitted with a `w` and prohibited with a dash
- And execute permissions, this is for executable programs, these permissions are enabled with an `x` and prohibited with a dash

--
.small[
Name | Owner | Group | Mode bits
-----|-------|-------|----------
`foo`  | `alice`   | `faculty` | `rwxr--r--`
`bar`  | `bob`     | `students` | `rw-rw-r--`
`baz`  | `charlie` | `faculty` | `rwxrwxrwx`
]

???
- Here is an example of how the permissions could be set up on a university lab machine that is shared between students and faculty
- Users are classified into groups, students and faculty
- And each file has a user owner, and a group owner
- `foo` is accessible with full permissions by Alice, but can only be read by the faculty group and other users
- `bar` is readable and writable by Bob, and anyone from the students group. It can only be read by other users.
- And `baz` is fully accessible by anyone
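- The way a class is selected can be sketched as follows. This is a simplified model, not the kernel's code: it assumes one group per user, ignores root, and the users `dave` and `erin` as well as the function `dac_check` are invented for the example

```bash
# Toy model of the classic permission-bit check (not the kernel's code).
# usage: dac_check <user> <user's group> <owner> <group> <mode bits> <r|w|x>
dac_check() {
  if   [ "$1" = "$3" ]; then triplet=$(printf '%s' "$5" | cut -c1-3)   # owner class
  elif [ "$2" = "$4" ]; then triplet=$(printf '%s' "$5" | cut -c4-6)   # group class
  else                       triplet=$(printf '%s' "$5" | cut -c7-9)   # others
  fi
  case "$triplet" in
    *"$6"*) echo allow ;;
    *)      echo deny ;;
  esac
}

dac_check alice faculty  alice faculty  rwxr--r-- w   # owner of foo: allow
dac_check dave  faculty  alice faculty  rwxr--r-- w   # same group as foo: deny
dac_check erin  students bob   students rw-rw-r-- w   # same group as bar: allow
dac_check alice faculty  bob   students rw-rw-r-- w   # unrelated to bar: deny
```

- Note that exactly one class applies: the owner is checked first, then the group, and everyone else falls into others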

---

# Authorisation Mechanisms

- User space configures permissions for files
  - `chown` to change file owner, `chmod` for permissions, etc.
- Authorisation checks made in the kernel on each access (open, read, write, etc.)

???
- Users and administrators use user space applications to configure permissions for files
- An authorised user can change the owner of a file with `chown`, and the file's permissions with `chmod`
- When files are accessed, the permission checks are performed by the kernel upon each access
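- Here is a small example of `chmod` on a scratch file, in both octal and symbolic form. `chown` is left out because changing a file's owner generally requires root. The variable names are just for the example

```bash
f=$(mktemp)                # scratch file owned by the current user
chmod 640 "$f"             # owner: rw-, group: r--, others: ---
# Keep only the first 10 characters: file type plus the nine permission bits.
mode_640=$(ls -l "$f" | cut -c1-10)
chmod u+x,o+r "$f"         # symbolic form: add execute for owner, read for others
mode_after=$(ls -l "$f" | cut -c1-10)
echo "$mode_640 -> $mode_after"   # prints: -rw-r----- -> -rwxr--r--
rm -f "$f"
```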

--
- Authentication processes (`login`, `sshd`) need to run as root
  - Switches to the identity of a user when they log in
  - All subsequent processes in the session inherit this identity

???
- When a user authenticates, the authentication program, like login or sshd, needs to run as root, which means it has system administrator privileges
- This is because if the authentication succeeds the process needs to switch to the identity of the authenticated user
- After that, all subsequent processes will inherit the user's identity

--
- Specific services need to be callable by users but require root permission
  - E.g. `passwd` to update the password file `/etc/shadow` not accessible to users
- Special **setuid** permission bit: program invoked by a user but run with the permissions of the file's owner (here root)

???

- You have some services that are supposed to be invoked by standard users, but that require root privileges
- An example is the passwd command, which users can invoke to change their password
- This program needs to update the central system password file `/etc/shadow`, which obviously cannot be accessed by normal users, only by root
- So passwd has a special permission bit named the **setuid** flag, that lets it be invoked by a standard user but actually run with root privileges
- This has important security considerations, which we will cover in more detail in the second lab exercise
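- To see what the setuid bit looks like, here is a sketch that sets it on a scratch file. The file is empty and owned by the current user, so nothing actually runs with elevated privileges. It assumes the filesystem lets the owner set the bit

```bash
f=$(mktemp)       # scratch file owned by the current user
chmod 4755 "$f"   # the leading 4 sets the setuid bit on top of rwxr-xr-x
# In the listing, the owner's x is displayed as s when setuid is set.
setuid_mode=$(ls -l "$f" | cut -c1-10)
echo "$setuid_mode"
# `test -u` succeeds when the setuid bit is set on the file.
[ -u "$f" ] && echo "setuid bit is set"
rm -f "$f"
```

- On a real system you would see the same `s` in the owner triplet when listing a setuid program such as `passwd`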

---
# Discretionary Access Control

- (Non-root) users can change the security configuration of the system
  - They can update mode bits and owner UID/GID of the files they own
- This is called **Discretionary Access Control (DAC)**

???
- So with Linux file permissions, non-administrator users can change the security configuration of the system because they manage the permissions to their own files
- They do so by updating permission bits and owner UID/GID for the files they own
- This is called discretionary access control, DAC

--
- Problem with DAC: it assumes users are always behaving correctly 
      - E.g. user A does not by mistake configure permissions so that user B can access A's private files
      - Or one of A's processes (outside the TCB!) could be compromised by an attacker and be acting maliciously
- These assumptions do not hold in reality
- We need a protection system that **maintains guarantees even when software outside the TCB may be malicious**

???
- DAC is not great from the security point of view
- It assumes that users are fully trusted and that they always behave correctly
- In reality, users can make mistakes
- Imagine a user A who botches a chmod command and by mistake lets other users access their private files, such as their private SSH key
- Some users or processes can also be actively malicious
- Imagine a remote attacker taking over a user process' execution flow with a use after free exploit
- That attacker could manipulate the program to lower the defences of the system by changing its security configuration through file permissions
- So unfortunately, the assumptions behind discretionary access control do not hold in reality, and it's not a very secure solution
- We need a protection system that maintains security guarantees even when software outside the trusted computing base is malicious
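- To make the chmod mistake concrete, here is a sketch that looks for files readable by everyone in a directory. The directory and file names are invented for the example

```bash
d=$(mktemp -d)                                    # scratch directory
printf 'secret\n' > "$d/id_key";    chmod 600 "$d/id_key"     # private, as intended
printf 'notes\n'  > "$d/notes.txt"; chmod 644 "$d/notes.txt"  # readable by others
# -perm -004 matches files whose read bit for "others" is set.
exposed=$(find "$d" -type f -perm -004)
echo "readable by everyone: $exposed"
rm -rf "$d"
```

- Nothing in DAC stops the owner from running `chmod 644` on the key as well, which is exactly the problem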

---
# Mandatory Access Control

- **Mandatory Access Control (MAC)**: protection system where security configuration can only be modified by trusted administrators

???
- This is the goal of mandatory access control protection systems
- With MAC the security configuration can only be modified by trusted administrators

--
- Every subject (process) and object (file, system resource) gets a **security label**
- Labels are used to define rules describing how processes can interact with each other and with system resources
  - Set of labels is defined by trusted administrators and is **immutable**
      - Cannot be changed by processes: access control is mandatory
      - Labels are assigned to processes and objects at creation time
      - Can only be changed later by trusted software

???
- It works as follows
- Every subject (such as a process) and every object (such as a file or system resource) gets a security label
- Labels define rules describing how processes can interact with each other and with system resources
- The set of labels is defined by the trusted administrator 
- Labels are assigned to processes and objects at creation time
- At runtime the security policy is immutable: trusted software can be used by administrators to update it, but a restart of the protection system would be needed for the update to take effect, something that can also only be done by an administrator

---
# Mandatory Access Control

.small[Adapted from the book *Operating System Security* by Trent Jaeger.]

???

- Here is an example of a mandatory access control policy
- We have our subjects on the left, two processes
- Process 1 gets assigned the secret label when it is created
- And process 2 gets assigned the public label
- On top we have our objects, two files
- File 1 gets assigned the top secret label, and file 2 initially gets assigned the confidential label
- Note that subject and object labels do not necessarily have to be the same
- Within the matrix you can see the permissions: process 1, being secret, cannot access top secret files like file 1, but can read and write confidential files like file 2
- Process 2, being public, can access neither file 1 nor file 2
- If the administrator wants to change the labelling, that would require a restart of the MAC system
- For example here file 2 becomes public and can then be read by process 2
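- The initial labelling can be sketched as a toy model. The function names are hypothetical and the only rule encoded is the one described above. Processes have no function to edit the labels or the policy, which is the point of MAC

```bash
# Labels assigned at creation time (taken from the example above).
label_of() {  # usage: label_of <process or file>
  case "$1" in
    process1) echo secret ;;
    process2) echo public ;;
    file1)    echo top_secret ;;
    file2)    echo confidential ;;
  esac
}

# Administrator-defined policy: <subject label>:<object label> gives permissions.
policy() {
  case "$1:$2" in
    secret:confidential) echo "read write" ;;
    *)                   echo "" ;;   # anything else: no permission
  esac
}

mac_check() {  # usage: mac_check <process> <file> <permission>
  case " $(policy "$(label_of "$1")" "$(label_of "$2")") " in
    *" $3 "*) echo allow ;;
    *)        echo deny ;;
  esac
}

mac_check process1 file2 write   # allow
mac_check process1 file1 read    # deny
mac_check process2 file2 read    # deny
```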

---
# SELinux

- **Security Enhanced Linux (SELinux)**: MAC in Linux

???
- Linux has a mandatory access control framework named SELinux

--
- Processes and system resources get assigned labels named **contexts**
- A context contains several fields including a type. Example of types:
   - A running web server process is of type `httpd_t`
   - Served content in `/var/www/html/` is of type `httpd_sys_content_t`
   - Ports traditionally used by web servers are of type `http_port_t`
   - etc.

???
- Processes, files and system resources get assigned labels that are called contexts
- A context has in particular a type, you have a few examples here
- A running web server process is of type `httpd_t`
- The files it serves are labelled `httpd_sys_content_t`
- And the ports it will usually listen to like port 80 or port 443 are labelled `http_port_t`
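- A context is usually written as `user:role:type:level`. Here is a sketch that extracts the type from such a string. The user, role and level values in the sample are assumed for illustration, and the split only works when the level itself contains no colon

```bash
# Example context string; the user, role and level values are assumed.
ctx='system_u:object_r:httpd_sys_content_t:s0'
# Split on ':' into the four fields user:role:type:level.
old_ifs=$IFS; IFS=:
set -- $ctx
IFS=$old_ifs
se_user=$1; se_role=$2; se_type=$3; se_level=$4
echo "user=$se_user role=$se_role type=$se_type level=$se_level"
```

- On a machine with SELinux enabled, `ls -Z` and `ps -Z` display the contexts of files and processes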

--
- Policy defined by trusted administrator
    - **Rules** explicitly describing the possible operations processes can perform on OS resources
    - Checked after file permissions, which still apply
- **No rule means deny by default**
  - Strict enforcement

???
- The rules are set by the administrator, for example the web server will get access to both the files it needs to serve, and the ports it needs to listen to
- SELinux checks are performed after traditional filesystem permission checks are done
- Note also that with SELinux no rule means deny by default, so the policies are quite strict and when they are well-defined, they represent a good way to achieve least privilege for the system

---
# SELinux

???

- You have an example of SELinux policy for 2 processes here, a web server and an SQL database
- Each is labelled with its own type
- The web server can access the usual port serving HTTPS requests, as well as the files it is supposed to serve on the filesystem
- The SQL database gets access to the filesystem location where the database is stored
- Because there is no rule allowing it, the web server cannot access the database file
- The database also cannot access the web server's port or the files it serves
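- The deny-by-default behaviour can be sketched with a toy allow-list. This is not SELinux policy syntax: the rule format, the permission names and the database types `mysqld_t` and `mysqld_db_t` are assumptions made for the example

```bash
# Toy allow-list, one rule per line: "<source type> <target type> <permission>".
# Only httpd_t, httpd_sys_content_t and http_port_t come from the slides.
rules='httpd_t httpd_sys_content_t read
httpd_t http_port_t bind
mysqld_t mysqld_db_t read
mysqld_t mysqld_db_t write'

te_check() {  # usage: te_check <source type> <target type> <permission>
  # grep -x matches whole lines, -F disables regex, -q suppresses output.
  if printf '%s\n' "$rules" | grep -qxF "$1 $2 $3"; then
    echo allow
  else
    echo deny   # no matching rule: denied by default
  fi
}

te_check httpd_t httpd_sys_content_t read   # allow
te_check httpd_t mysqld_db_t read           # deny: no rule
te_check mysqld_t http_port_t bind          # deny: no rule
```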

---
# SELinux: Pros & Cons

- Pros:
  - **Strict and fine-grained security enforcement**
    - E.g. if Apache is compromised the attacker can only access files marked as such by the policy
        - Vs. all files accessible by the user/others with traditional file permissions
        - Better application of the least privilege principle
  - **Widely available and required for certain compliance standards**

???
- As we saw MAC implemented with SELinux is very strict and is a good way to enforce least privilege
- For example if the web server is compromised by an attacker, the attacker will only be able to access whatever SELinux allows the web server to access
- With traditional file permissions the attacker would get access to many more system resources
- If the web server was running as root, which would be terrible from the security point of view, the attacker would gain full access to the entire machine
- Even if the server was not running as root, the attacker would still get access to everything the user it runs on behalf of can access
- Another benefit of SELinux is that it is widely available, and required for certain compliance standards

--
- Cons:
  - **Complexity of configuration, management, and troubleshooting**
      - Main barrier to adoption, Google "how to disable SELinux"
  - **Too strict**
      - Legitimate actions blocked with overly strict/outdated policies

???
- Unfortunately there are also some downsides to using SELinux
- The main issue is how difficult it is to configure, manage, and troubleshoot
- Many people really struggle with it, just Google "how to disable SELinux" to get an idea
- It is also very strict, sometimes too strict, so it creates some false positives at runtime, flagging legitimate behaviour as security issues

---
# Linux Security Modules

.leftcol[
- SELinux is mostly concerned with **policies**: rules defining which processes can do what with which system resources
- To enforce policies SELinux relies on the **Linux Security Modules** (LSM) framework
- LSM provides **mechanisms** to implement access control systems (security modules) within the kernel
- Used by SELinux but also Smack, TOMOYO, AppArmor, etc.
]

.rightcol[
<div style="text-align:center"><img src="include/lsm.svg" width=300 /></div>
]

???

- SELinux is mostly concerned with the rules defining what processes and users can do with what files and system resources
- These rules represent what's called a policy
- To enforce these rules we need a mechanism that will hook into file and resource accesses and will deny or approve these accesses based on what policy SELinux wants to enforce
- This mechanism is called Linux Security Modules, LSM
- It is a mechanism framework on top of which many access control systems are built, SELinux but also systems like Smack, TOMOYO Linux, or AppArmor

---
# Linux Security Modules

.leftcol[
- LSM exposes **hooks** on kernel code paths right before the access to internal, security sensitive, kernel resources
  - E.g. before opening/accessing a file
- Security modules (policies) can then permit/deny operations, log them for auditing
- Kernel objects have an opaque (`void *`) `security` field for holding security metadata
]

.rightcol[
<div style="text-align:center"><img src="include/lsm-hooks.svg" width=330 /></div>

.small[Adapted from Wright et al., **Linux Security Modules: General Security Support for the Linux Kernel**]
]

???
- From a high level point of view, LSM works as follows
- It uses hooks on the relevant kernel code paths, at the point the kernel accesses security critical resources
- For example at the time a file is opened
- The security modules in place, for example SELinux, can then permit, deny operations, and/or log them for auditing
- Many kernel data structures have a generic void * pointer named security that can be used to hold any kind of relevant security metadata
- You have an example here on the right: when a process opens a file, the kernel starts processing the system call and looks up the inode for the file in question
- It performs some basic error checks, for example does the file exist
- It then performs the discretionary access control checks first on traditional UNIX file permissions
- And then we have the LSM hook, plugged into the policy in place, for example SELinux
- Whatever handler is called there will allow, deny or log the operation
- And if the operation is allowed, it can then be performed
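- The order of these checks can be sketched as a toy model in shell. The real hooks are C functions inside the kernel: every function name and hard-coded decision below is hypothetical and only stands in for the corresponding step

```bash
# Toy model of the open path: error checks, then DAC, then the LSM hook.
file_exists() { [ "$1" != "/no/such/file" ]; }   # stand-in for the inode lookup
dac_permits() { [ "$1" != "/etc/shadow" ]; }     # stand-in for the permission bits
lsm_permits() { [ "$1" != "/srv/db/data" ]; }    # stand-in for the module's policy

toy_open() {  # usage: toy_open <path>
  file_exists "$1" || { echo "error: no such file"; return 1; }
  dac_permits "$1" || { echo "denied by DAC"; return 1; }
  lsm_permits "$1" || { echo "denied by LSM"; return 1; }
  echo "access granted"
}

toy_open /var/www/html/index.html   # passes all three steps
toy_open /etc/shadow                # stopped by DAC, the LSM hook is never reached
toy_open /srv/db/data               # passes DAC, stopped by the LSM hook
```

- Because the hook comes last, a security module can only further restrict what DAC already allows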

---
# The Confused Deputy Problem
<div style="text-align:center"><img src="include/confused-deputy-1.svg" width=500 /></div>

.small[Norm Hardy, *The Confused Deputy (or why capabilities might have been invented)*]

???

- Access control systems are not effective in all situations
- Here is a classical computer security problem called the confused deputy, which access control systems do not handle well
- Imagine a shared, multi user system used to compile some code
- Users have a home folder `/home/user` with their source code src.c
- They invoke the compiler, passing the source file as a parameter, as well as the name of the executable they wish to create, a.out
- They also can dump debug symbols into a separate file, here named debug-info
- The system administrator also configured the compiler to write down a summary of language usage statistics
- This gets written in a system directory, `/sysx/language-stats.txt`

---
# The Confused Deputy Problem
<div style="text-align:center"><img src="include/confused-deputy-2.svg" width=500 /></div>

???
- That folder `/sysx/` is supposed to be accessed only by root. However, because the compiler needs to write to it when invoked by the user, the administrator configures the access control system to let the compiler program write in that folder to dump the language usage statistics
- In `/sysx` there is also a security critical file that contains some important billing information, billing.txt
- This file should not be accessible to normal users

---
# The Confused Deputy Problem
<div style="text-align:center"><img src="include/confused-deputy-3.svg" width=500 /></div>

Deputy (compiler) confused into accessing a file (billing.txt) that the user it runs on behalf of does not have permission to access

???

- If a user learns somehow about the existence of the file billing.txt, for example by discussing with a colleague, they can trick the compiler into overwriting billing.txt
- For example by indicating it as target for writing the debug symbols, as presented here
- So in effect we have a program with a certain amount of privilege, that we name a deputy (here the compiler), confused into accessing a file that the user it runs on behalf of does not have the permission to access
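- The attack can be simulated in a scratch directory. In this sketch everything runs as the same user, so it only models the logic: `toy_compile` stands for the privileged compiler and all names are invented for the example

```bash
root=$(mktemp -d)                       # scratch tree standing in for the filesystem
mkdir -p "$root/sysx" "$root/home/user"
echo "important billing data" > "$root/sysx/billing.txt"
echo "int main(void) { return 0; }" > "$root/home/user/src.c"

# The deputy: writes debug info to whatever path the caller names,
# using its own authority rather than the caller's.
toy_compile() {  # usage: toy_compile <source> <debug output path>
  echo "debug symbols for $1" > "$2"
  echo "C: 1 file" >> "$root/sysx/language-stats.txt"
}

# Intended use, then the attack: the user names billing.txt as debug output.
toy_compile "$root/home/user/src.c" "$root/home/user/debug-info"
toy_compile "$root/home/user/src.c" "$root/sysx/billing.txt"
billing_after=$(cat "$root/sysx/billing.txt")
echo "billing.txt now contains: $billing_after"
rm -rf "$root"
```

- The deputy never asks on whose behalf it is opening the debug file, so its write permission on `/sysx` is applied to a path chosen by the user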

---
# Capability Systems

- Can't solve the confused deputy problem easily with access control lists (MAC/DAC)

???
- There is no easy way to solve the confused deputy problem with access control systems

--
- **Capability systems** address the issue. A capability is a token that conflates:
  - The **designation** of a system resource, e.g. a file; and
  - The **permissions** to access that resource
- In other words if you can name a resource you can access it
- Differs from access control lists where the name and permission to access e.g. a file are separate
  - E.g. a user can refer to `/etc/shadow` with its path even though they cannot read it

???

- An approach different from access control lists can be used here, it is called capability systems
- A capability is a token that can be held by a subject
- The capability brings together the ability to designate a resource, for example the ability to name a file and the permissions to access that resource
- In other words, if you can name a resource, you can access it
- This differs from access control systems, where you can name resources that you can't access: for example, as a standard user you can run `ls /etc/shadow` although you can't read the file

---
# Capability Systems

- Capabilities are **unforgeable**
  - The only way to obtain one is to have another security domain make a copy of one of its capabilities and give it to you
- The OS starts with all the capabilities, i.e. all permissions to all resources
- Creates `init` process with a large set of capabilities
- `init` spawns other processes, parents in the process tree decide which capabilities their children get

???

- A key idea behind capabilities is that a subject cannot forge a capability out of thin air: the only way to obtain one is to have another security domain make a copy of one of its capabilities, and transmit it to you
- Given this, the way capability systems work is that after the OS boots it has access to all the capabilities, in other words it can access all resources in the system
- The OS then creates the first process, let's call it init, giving it a subset of the system's capabilities
- `init` creates other processes, and subsets its own capabilities based on what each process needs
- Sets of capabilities continue to be subset from parents to children
- Because processes cannot forge capabilities, there is no way for them to increase the permissions that were given by their parents
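- The subsetting rule can be sketched as follows. Capabilities are modelled as plain text tokens, which a real system would never do since tokens must be unforgeable. The token format, the `delegate` function and the path `/usr/bin/cc` are assumptions made for the example

```bash
# Capabilities modelled as space-separated "<rights>:<resource>" tokens.
# A child receives only the requested tokens that its parent actually holds.
delegate() {  # usage: delegate "<parent caps>" "<requested caps>"
  granted=""
  for cap in $2; do
    case " $1 " in
      *" $cap "*) granted="$granted $cap" ;;   # parent holds it: copy to child
    esac                                        # otherwise the request is dropped
  done
  echo "${granted# }"
}

init_caps="rw:/home/user x:/usr/bin/cc rw:/sysx"
shell_caps=$(delegate "$init_caps" "rw:/home/user x:/usr/bin/cc")
cc_caps=$(delegate "$shell_caps" "rw:/home/user rw:/sysx")
echo "shell:    $shell_caps"
echo "compiler: $cc_caps"   # rw:/sysx is missing: the shell never held it
```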

---
# Capability Systems
<div style="text-align:center"><img src="include/capability-1.svg" width=600 /></div>

???
- You can see an illustration of how capabilities can help solve the confused deputy problem we previously mentioned
- For the sake of simplicity we'll only consider capabilities relating to file access
- When the OS boots up, it is initialised with all the capabilities, meaning it has access to the entire filesystem
- It creates the process init which itself creates login, both of these run as root because the user is not authenticated yet, and they both get a copy of the entire set of capabilities letting them access the entire filesystem 
- Once login authenticates the user, it spawns a shell
- This shell will run on behalf of the user, so login subsets its capabilities for the shell: it corresponds to only what the user can access, which is read write access to the content of the user's home directory and execute access to the compiler

---
# Capability Systems
<div style="text-align:center"><img src="include/capability-2.svg" width=600 /></div>

???

- When the user runs the compiler, the shell subsets its capabilities again and gives the compiler only what it needs
- That is read write access to the source file, executable output, and debug symbols output
- The compiler also requires a write capability to `/sysx/language-stats.txt`
- It could be passed down the capability subsetting chain, however that would let something like the shell access the file in question which is not really needed
- That capability can instead be obtained by the compiler from the kernel, which has full filesystem access

---
# Capability Systems
<div style="text-align:center"><img src="include/capability-3.svg" width=600 /></div>

???

- Because the compiler does not have a capability to access the billing file, it won't be able to overwrite it
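- A rough analogy in a UNIX shell is to hand the compiler an already-open file descriptor instead of a path. This is only an approximation of a capability system, and `toy_compile_fd` and the directory layout are invented for the example. In this simulation everything runs as one user, whereas on a real system the user's shell would simply fail to open billing.txt

```bash
root=$(mktemp -d)                       # scratch tree standing in for the filesystem
mkdir -p "$root/sysx" "$root/home/user"
echo "important billing data" > "$root/sysx/billing.txt"

# The compiler writes debug output to an already-open descriptor (fd 3)
# instead of opening a path chosen by the caller.
toy_compile_fd() {
  echo "debug symbols" >&3
}

# The caller opens the file itself, so it can only pass on what it can open.
toy_compile_fd 3> "$root/home/user/debug-info"
debug_after=$(cat "$root/home/user/debug-info")
billing_after=$(cat "$root/sysx/billing.txt")
echo "debug-info: $debug_after"
echo "billing.txt: $billing_after"
rm -rf "$root"
```

- The compiler never turns a user-supplied name into an access using its own authority, so there is nothing to confuse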

---

# Summary

- Traditional file permissions are an instance of **Discretionary Access Control**
  - (Untrusted) users configure permissions, not very robust but easy to use
- **Mandatory Access Control** is stronger, permissions configuration rather static and defined by trusted administrators
  - More complex to define, manage, troubleshoot
- Access control lists suffer from the **confused deputy** problem
  - Privileged entity tricked into misusing its authority by an unprivileged attacker
  - Issue addressed with **capability systems**

???

- To sum up
- We covered traditional UNIX file permissions, they are an instance of discretionary access control in which the users can define part of the system's security configuration, which is not particularly robust but easy to use
- We also looked at mandatory access control, in which only trusted administrators can define the security configuration
- This is stronger but harder to define and maintain
- Finally, we covered capability systems, which are better than access control ones for certain classes of security problems such as the confused deputy