Secure Architectures and Systems - Software Compartmentalisation Policies

class: center, middle

### Secure Computer Architecture and Systems
***
# Software Compartmentalisation Policies

???

- Hi everyone
- In this video, we are going to talk about compartmentalisation policies

---
# Compartmentalisation Policies

- Deciding for the compartmentalised application:
  - **How many compartments there should be** and
  - **What goes in which compartment**

???

- What is a compartmentalisation policy?
- It is a series of choices made at the design stage when compartmentalising
- More precisely, for a target application it defines how many compartments there should be and which parts of the application go into which compartment
- Say we have an existing monolithic application to compartmentalise
- As illustrated on the slide, there are different possible compartmentalisation policies
- And of course choosing a particular policy will have important consequences on the security and performance of the future compartmentalised application

---
# Compartment Selection Method

- **Code-centric** or **spatial** approaches
  - Protection domains (compartments) are **regions of code**
- **Data-centric** or **temporal**/**horizontal** approaches
  - Compartments are temporal **units of execution**, e.g. a thread/process

???

- A first important choice regarding the policy is how to organise compartments
- You have 2 main choices here
- Code-centric or spatial approaches will split the source code into different compartments
- An example of a code-centric policy here would be: put each library of my program within its own compartment
- Another example is illustrated on the slide
- We have a web server to compartmentalise, and choosing to put the main server code in one compartment, and the SSL library in another is a code-centric compartmentalisation policy
- Now on the other hand, you have data-centric, also called temporal or horizontal, compartmentalisation policies
- These rather put execution flows within their own compartments, for example each thread or each process of an application is within its own compartment
- You have an example on the slide with our web server, imagine it spawns a certain number of worker threads
- These workers execute more or less the same code, but each runs within its own compartment
- Then of course you can combine both approaches, see the hybrid example on the slide with the main application's code in one compartment, one library in a second compartment, plus additional compartments, one per worker thread
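
As a minimal sketch of the three options for the web server example, the functions below pick a compartment from the executing library (code-centric), from the executing worker (data-centric), or from both (hybrid). The compartment identifiers, the `libssl` name and the convention that a negative worker id means "main thread" are assumptions made for illustration, not part of any specific framework.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical compartment identifiers, for illustration only. */
enum { COMP_MAIN = 0, COMP_SSL = 1, COMP_WORKER_BASE = 100 };

/* Code-centric: the compartment depends on which code region executes. */
int code_centric_compartment(const char *library) {
    assert(library != NULL);
    return strcmp(library, "libssl") == 0 ? COMP_SSL : COMP_MAIN;
}

/* Data-centric: the compartment depends on which worker thread executes. */
int data_centric_compartment(int worker_id) {
    assert(worker_id >= 0);
    return COMP_WORKER_BASE + worker_id;
}

/* Hybrid: the SSL library keeps its own compartment, each worker gets
 * one, and the remaining main-thread code (worker_id < 0) shares one. */
int hybrid_compartment(const char *library, int worker_id) {
    assert(library != NULL);
    if (strcmp(library, "libssl") == 0)
        return COMP_SSL;
    if (worker_id >= 0)
        return COMP_WORKER_BASE + worker_id;
    return COMP_MAIN;
}
```

With the code-centric policy two workers running the same server code share a compartment, whereas with the data-centric one they never do.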

---
# Compartment Granularity

- From coarser to finer-grain: library/software package, linkage unit, function, etc.

| | **Coarser granularities** | **Finer granularities** |
|-| --------------------------| ------------------------|
| **Pros** | Reduce compartmentalisation effort, lower performance impact | Better privilege reduction |
| **Cons** | Low degree of privilege reduction | Higher compartmentalisation complexity and performance impact |
  
<div style="text-align:center"><img src="include/granularity.svg" width="800"/></div>

???

- Compartment granularity denotes how large or small compartments are
- Coarse granularity means that each compartment contains a lot of code
- Generally this translates into a lower number of compartments, which reduces the compartmentalisation effort and the performance slowdown because there are fewer security domain switches
- The degree of privilege reduction is also limited due to the large size of compartments
- Fine compartmentalisation granularity means having a lot of small compartments
- This is great from the privilege reduction point of view, as it limits what an attacker can access when they subvert 1 compartment
- On the downside, this large number of compartments requires a lot of engineering to put in place, and the many security domain crossings it involves at runtime can slow things down significantly
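
To make the performance side of this trade-off concrete, here is a small sketch that counts security domain switches along a call trace under two policies. The four functions, the trace and both policies are made-up assumptions; a real system would also weigh the cost of each switch.

```c
#include <assert.h>
#include <stddef.h>

/* A domain switch happens whenever two consecutive entries of the
 * trace belong to different compartments under the given policy. */
size_t count_domain_switches(const int *trace, size_t len,
                             const int *comp_of) {
    assert(trace != NULL && comp_of != NULL);
    size_t switches = 0;
    for (size_t i = 1; i < len; i++)
        if (comp_of[trace[i]] != comp_of[trace[i - 1]])
            switches++;
    return switches;
}

/* Hypothetical program: functions 0 and 1 belong to the application,
 * functions 2 and 3 to a library. */
const int coarse_policy[4] = {0, 0, 1, 1}; /* one compartment per library  */
const int fine_policy[4]   = {0, 1, 2, 3}; /* one compartment per function */

/* Hypothetical execution: sequence of function ids being run. */
const int example_trace[6] = {0, 1, 2, 3, 2, 0};
```

On this trace the library-level policy incurs 2 switches and the function-level policy 5, while the function-level policy confines an attacker to a single function's compartment.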

---
# Automation

???

- Automation is an ideal goal in compartmentalisation
- It would be awesome if we could take a monolithic program and give it to a framework, maybe a compiler, that would produce a compartmentalised version fully automatically, without any engineering effort or expert knowledge

--
- **Manual methods**
  - Adopted by most existing compartmentalisation efforts
  - Depend heavily on developer expertise, prone to human error and unable to guarantee correctness

???
- Unfortunately this is a bit of an unrealistic goal
- As a matter of fact, most compartmentalisation efforts are carried out fully manually
- They take quite a bit of engineering effort and entirely rely on the expertise of the programmers
- They are also prone to human error, and it's not really possible to prove that the resulting compartmentalisations are entirely correct

--
- **Guided manual approaches**
  - Assist developers with tools and feedback loops to reduce errors and improve boundary definition
  - Offer stronger guarantees against issues such as interface vulnerabilities

???
- Some approaches can be classified as "guided manual"
- They are mostly manual, but programmers rely on tools that can help the manual compartmentalisation process
- They can help by pointing out things like good compartment boundaries or untrusted data to check at cross-compartment boundaries

---
# Automation (2)

- **Policy-refinement methods**
  - Developer indicates high-level policies with e.g. code annotations
  - Framework automates the installation of a low-level concrete policy
  - Hard to automate interface safety

???
- Add more automation, and you get policy-refinement approaches
- With these, the developer indicates the high-level policy to apply, for example by specifying in a configuration file which library should go into which compartment
- Or by marking untrusted and security-critical data with code annotations and letting the system make sure both categories don't end up in the same compartment
- The frameworks implementing this method offer a good deal of automation; however, certain aspects of the job, like securing interfaces, are still very hard to automate today
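
A rough sketch of what such a refinement could look like: high-level labels are lowered to concrete compartment numbers, and a check confirms that untrusted and security-critical items never share a compartment. The label names, the compartment numbering and both functions are hypothetical and do not correspond to an existing framework.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* High-level labels a developer could attach with annotations. */
enum label { LABEL_NONE, LABEL_UNTRUSTED, LABEL_SENSITIVE };

/* Lower a label to a concrete compartment id (assumed numbering). */
int refine_label(enum label l) {
    switch (l) {
    case LABEL_UNTRUSTED: return 1; /* sandbox compartment */
    case LABEL_SENSITIVE: return 2; /* safebox compartment */
    case LABEL_NONE:      return 0; /* default compartment */
    }
    assert(0 && "unknown label");
    return 0;
}

/* A placement is valid if no compartment holds both an untrusted and
 * a security-critical item. */
bool placement_is_valid(const enum label *labels, const int *comps,
                        size_t n) {
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++) {
            bool mixed =
                (labels[i] == LABEL_UNTRUSTED && labels[j] == LABEL_SENSITIVE) ||
                (labels[i] == LABEL_SENSITIVE && labels[j] == LABEL_UNTRUSTED);
            if (mixed && comps[i] == comps[j])
                return false;
        }
    return true;
}
```

Note that this only covers placement: nothing here secures the interfaces between the resulting compartments.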

--
- **Full automation**
  - Requires no effort from the programmer
  - Computing data dependencies without manual refinement may weaken the degree of privilege reduction

???
- As mentioned fully automated methods are an ideal goal, but they can't be achieved properly in reality
- Not only are things like interface safety hard, or rather likely impossible, to entirely automate
- But also full automation raises concerns about lowered security guarantees
- For example full automation requires extensive use of static analysis techniques which tend to overestimate, so if such techniques are used to identify data that should be shared between several compartments, it will likely lead to a certain amount of oversharing

--
- Overall, mostly/fully automated methods **trade off security to lower engineering effort**

???

- Overall, the more automation is used, the lower the engineering effort is, but also the lower the security guarantees obtained from compartmentalisation will be

---
# Policy Languages

.leftcol[
- **Code annotations** (e.g. compiler attributes) expressing semantics about shared/private data and sensitive code
- **Placement rules**: higher level policy placement rules defining which parts of the application go into which compartment
- Offer various **granularities of expressivity** and address different **trust models**
]

.rightcol[
```c
int function(char *parameter) {
  // treat all data as private by default,
  // mark shared data as such
  int __shared(compartment1) *glob_ref = // ...
  // or treat all data as shared by default,
  // mark private data as such
  char __private password[128];
}
```

```yaml
# libredis, libopenjpg, and libxml each in a
# separate compartment, rest of the code in
# another compartment

default: comp0

libraries:
- libredis: comp1
- libopenjpg: comp2
- libxml: comp3
```
]

???

- There are various ways to express policies within the code of the compartmentalised program
- Many existing compartmentalisation frameworks will make use of code annotations
- These annotations allow things like marking some data as security-sensitive or untrusted, marking data as private to a compartment or shared between multiple compartments, or indicating compartment boundaries
- You have an example of annotations on the top right here, with a global reference marked as shared with a compartment, and a password variable marked as private to the containing compartment
- You also have higher-level placement rules, for example the configuration file on the bottom right here places each library within its own compartment, and the rest of the code within an additional compartment
- Overall, compartmentalisation frameworks let you express policy information in the code at various granularities (variables vs. entire libraries in our examples), and this also makes it more or less easy to enforce certain trust models vs. others
- For example if you can only mark data as untrusted for sandboxing, but you cannot mark data as security-sensitive, it's unlikely that the framework in question supports the safebox isolation model
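
As an illustration of how a framework might consume placement rules like those in the configuration file on the slide, here is a minimal lookup sketch. The rule table mirrors the YAML example; the struct and function names are assumptions, and a real framework would parse the file rather than hard-code the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One placement rule: a library name and its target compartment. */
struct placement_rule {
    const char *library;
    const char *compartment;
};

/* Hard-coded equivalent of the configuration file on the slide. */
static const char *const default_compartment = "comp0";
static const struct placement_rule rules[] = {
    {"libredis",   "comp1"},
    {"libopenjpg", "comp2"},
    {"libxml",     "comp3"},
};

/* Return the compartment of a library, falling back to the default
 * compartment for any code not covered by an explicit rule. */
const char *compartment_for(const char *library) {
    assert(library != NULL);
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++)
        if (strcmp(rules[i].library, library) == 0)
            return rules[i].compartment;
    return default_compartment;
}
```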

---
# Analysis Techniques

- For automated approaches, **analysis techniques** are used to determine permissions, compartment boundaries, and/or shared/private data flow

???

- For approaches using automation, some analysis techniques must be used to determine permissions, compartment boundaries, and the status of data such as shared or private

--
- **Static analysis**
  - Generally complete but overestimates: leads to various consequences such as oversharing
  - Scales well to many applications/policies, but may not scale to large code bases

???

- Static analysis is generally the default choice
- It scales well to many applications and different policies and, contrary to dynamic analysis, it is complete, which is important
- Unfortunately it overestimates, which leads to issues like oversharing when used to identify shared data
- It can also be quite slow and resource demanding, and may not scale to very large code bases
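
The oversharing effect can be sketched with a toy model: the analysis reports every access that may happen, and a buffer is marked shared as soon as two compartments may touch it, even when one of those accesses can never occur at runtime. The data structure, the `feasible` ground-truth flag and the example accesses are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One potential access reported by a (hypothetical) static analysis. */
struct may_access {
    int buffer;      /* buffer identifier                          */
    int compartment; /* compartment of the accessing code (>= 0)   */
    bool feasible;   /* ground truth, unknown to the analysis      */
};

/* True if at least two different compartments access the buffer. */
static bool shared_between_compartments(const struct may_access *acc,
                                        size_t n, int buffer,
                                        bool feasible_only) {
    int first = -1;
    for (size_t i = 0; i < n; i++) {
        if (acc[i].buffer != buffer)
            continue;
        if (feasible_only && !acc[i].feasible)
            continue;
        assert(acc[i].compartment >= 0);
        if (first == -1)
            first = acc[i].compartment;
        else if (acc[i].compartment != first)
            return true;
    }
    return false;
}

/* Conservative verdict: every may-access counts. */
bool statically_shared(const struct may_access *acc, size_t n, int buffer) {
    return shared_between_compartments(acc, n, buffer, false);
}

/* Ground truth: only accesses that can really happen count. */
bool actually_shared(const struct may_access *acc, size_t n, int buffer) {
    return shared_between_compartments(acc, n, buffer, true);
}

/* Buffer 0: the access from compartment 1 is infeasible at runtime.
 * Buffer 1: genuinely accessed by both compartments. */
const struct may_access example_accesses[4] = {
    {0, 0, true}, {0, 1, false}, {1, 0, true}, {1, 1, true},
};
```

Buffer 0 ends up shared although only compartment 0 ever uses it, which is exactly the oversharing that weakens privilege reduction.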

--
- **Dynamic analysis**
  - Incomplete, underestimates: compartments may be underprivileged
      - Permission fault at runtime under legitimate behaviour
  - Good scalability to large programs, poor scalability to many programs/policies

???
- Dynamic analysis can be used, but it's a bit more problematic
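- First, it is limited to the program coverage achieved during the analysis, so it is incomplete, which is concerning from the security point of view
- It underestimates: compartments may end up underprivileged and trigger permission faults at runtime under legitimate behaviour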
- It scales well to large programs, but may be difficult to scale to many programs and policies
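
The underprivilege problem can be sketched as follows: permissions are granted only for accesses observed during profiling runs, so a legitimate access that the runs never exercised is denied later. The fixed-size permission matrix and the function names are assumptions, not the mechanism of a particular tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define N_COMPARTMENTS 2
#define N_RESOURCES 3

/* Permission matrix learned from profiling runs. */
struct policy {
    bool allowed[N_COMPARTMENTS][N_RESOURCES];
};

/* Start from a deny-everything policy. */
void policy_init(struct policy *p) {
    memset(p, 0, sizeof *p);
}

/* Profiling phase: grant whatever access is observed. */
void record_access(struct policy *p, int comp, int res) {
    assert(comp >= 0 && comp < N_COMPARTMENTS);
    assert(res >= 0 && res < N_RESOURCES);
    p->allowed[comp][res] = true;
}

/* Enforcement phase: anything not observed earlier is a permission fault. */
bool check_access(const struct policy *p, int comp, int res) {
    assert(comp >= 0 && comp < N_COMPARTMENTS);
    assert(res >= 0 && res < N_RESOURCES);
    return p->allowed[comp][res];
}
```

If profiling only sees compartment 0 touch resources 0 and 1, a later and perfectly legitimate access to resource 2 (say, on a rarely taken error path) fails the check.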

--
- **Hybrid** methods exist but generally static and dynamic analyses do not compose well

???
- A few hybrid methods have been presented, but overall static and dynamic analyses do not compose well
- Indeed when you mix them you obtain a mix of the drawbacks of both approaches which is not ideal

---
# PL Genericity

- Most policy definition approaches focus on **one or a class of programming languages**
  - Tackle domain-specific problems e.g. pointer aliasing in C
  - Leverage language-specific features:
      - Rich type information in C++ to partially automate interface safety
      - Software fault isolation to confine WebAssembly sandboxes

???

- The vast majority of policy definition methods are not generic and focus on one or a class of programming languages
- This is because they need to tackle specific problems, such as pointer aliasing in C
- But also because many approaches leverage language-specific features
- For example the RLBox framework focuses on C++ and uses the rich type information exposed by the language to partially automate some interface safety checks
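
To give a flavour of the idea of using types for interface safety, here is a loose sketch written in C to match the other examples in this deck. It is not RLBox's API: the wrapper struct and the verifier are hypothetical, and C cannot actually prevent code from reading the wrapped field directly, which is precisely where C++'s richer type system helps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A value returned by an untrusted compartment. Wrapping it in its own
 * type means it cannot be passed where a plain size_t is expected. */
struct tainted_size {
    size_t unverified;
};

/* Intended as the only way to unwrap the value: it is copied to *out
 * only if it is a valid index for an array of `bound` elements. */
bool verify_index(struct tainted_size t, size_t bound, size_t *out) {
    assert(out != NULL);
    if (t.unverified >= bound)
        return false;
    *out = t.unverified;
    return true;
}
```

A caller would index its array only after `verify_index` succeeds, so an out-of-range index coming from the sandboxed side is rejected at the boundary.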

---
# Summary

- A **compartmentalisation policy** defines how many compartments and what goes into what compartment for a given application
- 2 main classes: **code-centric** and **data-centric**
- Various **granularities** of compartmentalisation trade off security against performance and engineering effort
- Various approaches with different degrees of **automation**
  - From fully manual to fully automated
  - Middle ground solutions requiring annotation or higher level rules
  - Automation is achieved with static or dynamic analysis and trades some security for reduced engineering effort

???

- To sum up, a compartmentalisation policy defines how many compartments there should be and what part of our target application goes into what compartment
- There are 2 main classes, code and data centric
- We covered various approaches with different degrees of automation, from fully manual to fully automated, and middle ground solutions requiring more or less engineering effort and expertise
- Automation is achieved with static or dynamic analysis, and generally trades some security for reduced engineering effort