Secure Architectures and Systems - Lightweight Virtualization: Containers & Unikernels

class: center, middle

### Secure Computer Architecture and Systems
***
# Lightweight Virtualisation

???
- Hi everyone, welcome to this last video of the virtualisation part of the unit
- Here we'll talk about lightweight virtualisation

---
class: center, middle, inverse

# Lightweight Virtualisation: Motivation & Definition

???
- I'll start with a motivating example, before we define the concept of lightweight virtualisation

---
# Lightweight Virtualisation
<div style="text-align:center"><img src="include/lightweight1.svg" width=700 /></div>

???
- Imagine you want to run a website
- And you don't want to leave your personal machine up and running 24/7, so you decide to rent a virtual machine in the cloud
- You choose a cloud provider, say AWS, and you select a Linux distribution to install on your VM, for example Ubuntu

---
# Lightweight Virtualisation
<div style="text-align:center"><img src="include/lightweight2.svg" width=700 /></div>

???
- Zooming in a little bit, we have the cloud provider hardware managed by the provider's hypervisor, for example QEMU/KVM
- And your VM running on top of that virtualisation layer
- Inside your VM you have a guest operating system, it's the Linux kernel
- And all the user space software coming with the Ubuntu distribution

---
# Lightweight Virtualisation
<div style="text-align:center"><img src="include/lightweight3.svg" width=700 /></div>

???
- Recall that you want to run a website
- So inside your VM you will install a web server like Apache, and its library dependencies, things like perl, libssl, etc.

---
# Lightweight Virtualisation
<div style="text-align:center"><img src="include/lightweight4.svg" width=700 /></div>

???
- And all this software, which is what you actually want to run in the VM, uses only a subset of the features provided by the kernel

---
# Lightweight Virtualisation
<div style="text-align:center"><img src="include/lightweight5.svg" width=700 /></div>

???
- So on the illustration here what you really need to run are the blue boxes: the web server, its dependencies, and the subset of kernel features it requires
- That's it
- All the gray areas are installed and maybe even running but not needed
- We call it software bloat
- And it's a bit of an issue

---
# Lightweight Virtualisation
<div style="text-align:center"><img src="include/lightweight6.svg" width=700 /></div>

???
- Indeed software bloat leads first to an increased attack surface
- All the software installed in the Linux distribution and all the background programs running that you don't really need are potential targets that an attacker can take over as a first step to attack your environment
- Remember, probabilistically, the more software you run, the higher the chance that a vulnerability is present somewhere
- Second, software bloat represents additional costs
- You are paying the cloud provider for disk, memory and CPU cycles used to store and run software you don't even need
- Third, for a fixed budget, software bloat also causes performance loss, because all this memory and these CPU cycles are not used to run what really needs to run, which is your web server
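
A rough way to see the probabilistic argument is to assume, purely for illustration, that each installed component independently contains an exploitable vulnerability with some small probability `p_each`. The chance that at least one of `n` components is vulnerable is then `1 - (1 - p_each)^n`. The sketch below evaluates that expression; the independence assumption, the value of `p_each`, and the component counts are made-up figures, not measurements.

```python
def p_any_vulnerable(n_components: int, p_each: float) -> float:
    """Probability that at least one of n independent components is vulnerable."""
    if n_components < 0:
        raise ValueError("n_components must be non-negative")
    if not 0.0 <= p_each <= 1.0:
        raise ValueError("p_each must be a probability in [0, 1]")
    # Complement of "no component is vulnerable".
    return 1.0 - (1.0 - p_each) ** n_components


if __name__ == "__main__":
    # Hypothetical figures: p_each and the component counts are illustrative only.
    p_each = 0.001
    scenarios = [("web server + dependencies", 20), ("full distribution", 2000)]
    for label, n in scenarios:
        print(f"{label:>26}: {n:5d} components -> {p_any_vulnerable(n, p_each):.1%}")
```

Under these assumptions the probability only ever grows with the number of components, which is the intuition behind reducing the installed software to what the application needs.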

---
# Lightweight Virtualisation: Definition

- Lightweight virtualisation refers to virtualisation solutions aiming at
  providing, compared to traditional virtual machines:
  - **Lower memory footprint**: KBs to a few MBs per virtualised instance
  - **Faster boot times**: in the order of micro/milliseconds
  - Lower disk footprint: in the order of KBs/MBs

???
- So lightweight virtualisation tackles this issue by providing the following, compared to traditional virtual machines
- First, lower memory footprint, in the order of kilobytes to a few megabytes of systems software overhead for each virtualised instance, compared to hundreds of megabytes or gigabytes of memory consumption for traditional VMs
- Second, boot times in micro or milliseconds, compared to seconds or minutes for traditional VMs
- Third, lower disk footprint, once again we are talking about a few kilobytes or megabytes

--
- These metrics concern the systems software; part of the boot time and memory/disk footprint will be application-specific

???
- Obviously these metrics concern the per-VM systems software, in particular the operating system
- The part of the boot time and memory/disk footprint that relates to an application will stay the same whether it runs in a lightweight or in a traditional VM

--
- Can approach these lightweightness objectives with small traditional Linux VMs: **micro VMs**
- Brief overview of two (quite different) technologies that go one step further: **containers** and
**unikernels**

???
- Today there are 3 modern technologies that make it possible to achieve the aforementioned lightweightness objectives
- First, stripped down Linux VMs, called micro-VMs
- These can be quite minimalist but there are two technologies that take things one step further in terms of lightweightness
- Containers and unikernels

---
class: middle, center, inverse

# Containers

???
- Let's first talk about containers

---
# Introducing Containers

.leftcol[
- **Containers: process-level sandboxing technologies**
  - Enforced by the **operating system**
  - Sometimes called OS-level virtualisation
]

.rightcol[
<div style="text-align:center"><img src="include/intro.svg" width=500 /></div>
<br>
]

???
- Containers are a process-based sandboxing technology, enforced by the operating system
- Contrary to a traditional virtual machine, illustrated on the left of the picture here, a container is a process or a group of processes for which the OS restricts the visibility of system resources
- This way the software running in the container is sandboxed, and it also has the impression that it is running alone in the system, like in a virtual machine

---
# Containers: Key Idea

- **The OS restricts the visibility on system resources for a process
  or a set of processes**
  - Filesystem, network interfaces, PIDs, etc.

???
- The resources whose visibility can be reduced or changed for the container include the filesystem, the system's users, visible PIDs, IPC objects, and system clocks, among others

--
- **The OS also controls hardware resource allocation/usage between such isolated processes**
  - CPU scheduling cycles, memory, disk/network bandwidth, etc.

???
- The OS can also control the allocation to the container of certain resources, including CPU scheduling cycles, memory available, disk and network bandwidth usable, among others

--
- Conceptually it achieves the same isolation goals as a VM: process(es) are confined in this virtualised environment, **a container**
???
- Conceptually, by reducing or changing the visibility of resources, and limiting their allocation to a process or a group of processes, containers achieve the same isolation goals as a virtual machine, without the need for a hypervisor and a system-level VM

--
- **A container is much lighter than a traditional VM**
  - Per-container system memory/disk footprint close to 0
  - Boot time is that of spawning a process, i.e. microseconds

???
- This is much lighter than using a traditional virtual machine
- The boot time is that of spawning a process, a few microseconds, and the memory footprint is close to 0

--
- Still containers are not an ideal form of virtualisation
  - **Security issues**

???
- Still containers are not perfect, and as we will see they suffer from significant security concerns

---
# Containers: Use Cases

.leftlargecol[.medium[
- **Software development/testing/deployment**
  - Develop, build and test in a controlled, identical environment
  - Deploy in the same environment as the development one (repeatability)
      - Can be deployed on any machine supporting containers independently of the host configuration
]]

.rightsmallcol[
<div style="text-align:center"><img src="include/docker.png" width=200 /></div>
<br>
]

???
- Containers are useful in most scenarios where virtualisation is beneficial
- They are extensively used in software development, where they make it possible to bring up a homogeneous environment to develop, build, and test an application, for the entire development and testing team
- Containers can also be used for deployment, as they represent a lightweight way to package an application with all of its dependencies

---
# Containers: Use Cases  (2)

.leftlargecol[.medium[
- **Lightweight (low cost) & elastic virtualisation**
  - Containers consume few resources and can be brought up/destroyed very fast
  - Cloud services such as Gmail and Facebook make extensive use of containers
  - Serverless computing (e.g. AWS Lambda)
]]

.right3col[
<div style="text-align:center"><img src="include/lambda.png" width=100 /></div>
]

???
- Because they are so lightweight, containers can replace traditional VMs for many aspects of the cloud requiring very quick initialisation and execution of a particular task
- Services such as Gmail or Facebook make extensive use of containers for such tasks
- You may also have heard about AWS Lambda, which provides serverless computing services
- With this paradigm, the developer programs cloud machines with small stateless functions executed on demand when certain events happen, for example a user visits a web page
- These functions generally run within containers

---

## Containers: Restricting System Resources Visibility with Namespaces

- **Filesystem/mount points** (~chroot)
  - Container cannot see host/other containers' file systems
  - E.g. can run a Fedora rootfs in a container on a Debian host

???
- The operating system can restrict visibility for a container over the following resources
- First, filesystems and mount points
- The container is generally given its own root filesystem from a base image, and it cannot access the host's filesystem

--
- **Network stack**
  - Container has its own IP, virtual bridged/routed network

???
- The container also has its own instance of the network stack, including its own IP, with a virtual bridged or routed network

--
- **Processes: PIDs and IPCs**
  - Isolated PID set, cannot see or communicate with host/other containers' processes

???
- A container also has its own isolated set of PIDs, one for each process it runs
- The container cannot see or communicate with external processes

--
- **Host and domain name**

???
- The container can set the hostname, which is the machine's name, to something different from what the host sees
- Same thing for the domain name

--
- **User IDs**
  - Can have root privileges inside container

???
- User names and IDs can also be different within the container compared to what's on the host
- In most scenarios a user will simply take the identity of root within the container

--
- Achieved with Linux through a feature called **namespaces**

???
- All of these things are achieved with a mechanism implemented by the Linux kernel which is called namespaces
- A namespace can be created for each of the resources we listed: mount points, PIDs, IPCs, etc.
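
On a Linux machine, one way to observe namespaces is to look at the symbolic links under `/proc/self/ns`, where each link names a namespace type and an identifier. The sketch below is a minimal, read-only illustration and does not create any namespace; the helper names `current_namespaces` and `shared_namespaces` are made up for this example, and the function returns an empty dictionary on systems where that directory does not exist.

```python
import os

NS_DIR = "/proc/self/ns"


def current_namespaces(ns_dir: str = NS_DIR) -> dict:
    """Map each namespace type (e.g. 'pid', 'net', 'mnt') to its identifier.

    Returns an empty dict when ns_dir is unavailable (e.g. on a non-Linux host).
    """
    namespaces = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces
    for name in entries:
        try:
            # Each entry is a symlink whose target looks like 'pid:[4026531836]'.
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry disappeared or is not readable
    return namespaces


def shared_namespaces(a: dict, b: dict) -> list:
    """Namespace types for which two processes report the same identifier."""
    return sorted(k for k in a.keys() & b.keys() if a[k] == b[k])


if __name__ == "__main__":
    for ns_type, ident in current_namespaces().items():
        print(f"{ns_type:>20} -> {ident}")
```

Running this once on the host and once inside a container should show different identifiers for the namespaces the container does not share with the host; comparing the two dictionaries with `shared_namespaces` lists the ones that are still shared.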

---
## Containers: Controlling Hardware Resources Access with Control Groups

- **Memory**
  - Limits memory and swap usage

???
- A second thing the OS does is control the allocation of hardware resources to containers
- This concerns the following resources
- First, memory, one can set the maximum amount of memory and swap a container can use

--
- **CPU**
  - Limit CPU usage (can be for example 1.5 CPU) and CPU sets
  - Control CFS quotas

???
- Second, the CPU
- The OS can control how many CPU cycles a container can use, and the container can also be restricted to a subset of the machine's cores
- Things like quotas for CFS can also be configured on a per-container basis

--
- **Devices**
  - Enable/disable access to certain devices

???
- Containers can also be restricted to seeing only certain devices

--
- **Block I/O**
  - Control throughput

???
- Their disk throughput can be rate limited

--
- **Network I/O**
  - Control network traffic priority

???
- As well as their network bandwidth

--
- Achieved under Linux through a feature called **control groups**

???
- This is all achieved by another mechanism implemented in the Linux kernel, named control groups
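
With version 2 of control groups, limits are expressed by writing values into interface files of the container's cgroup directory, such as `cpu.max`, `memory.max`, `memory.swap.max` and `cpuset.cpus`. The sketch below only computes the strings that would be written and never touches the filesystem; the helper name `cgroup_v2_settings` and the example limits are assumptions made for illustration. It shows how the "1.5 CPU" limit from the slide maps to a quota and a period, both in microseconds.

```python
def cgroup_v2_settings(cpus: float, memory_bytes: int, swap_bytes: int = 0,
                       cpuset: str = "", period_us: int = 100_000) -> dict:
    """Build the values one would write to cgroup v2 interface files.

    This is a hypothetical helper: it returns file name -> content pairs
    and does not write anything to /sys/fs/cgroup.
    """
    if cpus <= 0 or memory_bytes <= 0 or swap_bytes < 0 or period_us <= 0:
        raise ValueError("limits must be positive (swap may be 0)")
    # cpu.max holds "<quota> <period>": the group may run for at most
    # <quota> microseconds of CPU time in every <period> microseconds.
    quota_us = round(cpus * period_us)
    settings = {
        "cpu.max": f"{quota_us} {period_us}",
        "memory.max": str(memory_bytes),
        "memory.swap.max": str(swap_bytes),
    }
    if cpuset:
        # cpuset.cpus restricts the group to the listed cores, e.g. "0-1".
        settings["cpuset.cpus"] = cpuset
    return settings


if __name__ == "__main__":
    # Illustrative limits: 1.5 CPUs, 256 MiB of memory, no swap, cores 0 and 1.
    for filename, value in cgroup_v2_settings(1.5, 256 * 1024**2, cpuset="0-1").items():
        print(f"{filename}: {value}")
```

Container runtimes typically perform this kind of translation on the user's behalf, so in practice one rarely writes these files by hand.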

---
# Containers vs. VMs

Pros of containers and VMs:

| Containers | VMs |
| ---------- | --- |
| Low memory/disk usage | OS diversity |
| Fast boot times | Kernel version |
| Per host density | Performance isolation |
| Nesting | **Security** |

???
- If we list the respective benefits of containers versus traditional virtual machines, we get the following
- Containers are very lightweight, meaning they have a low memory and disk usage and very fast boot times
- Their lightweightness makes it possible to create a very high number of containers on a single machine, it's not uncommon to run hundreds or even thousands of containers on a host
- Nested virtualisation is also easy with containers, in other words it's simple to create a container within a container
- Regarding virtual machines, they are still useful when one wants to run an operating system different from Linux, something that is difficult to do efficiently with containers because they rely on the namespaces and control groups technologies available only with Linux
- Several studies have also shown that performance isolation is stronger with VMs than with containers, meaning it's more difficult for a malicious VM to steal resources by abusing them
- Finally, the degree of isolation for the sandboxing enforced by virtual machines is considered much stronger than that of containers

---
# Containers and Security
<div style="text-align:center"><img src="include/secu1.svg" width=800 /></div>

???
- To understand why the isolation in VM environments is considered stronger than in container environments, let's consider both setups
- We have a container environment on the left, with several containers running on top of the OS kernel
- And on the right a VM environment, with several VMs running on top of a hypervisor

---
# Containers and Security
<div style="text-align:center"><img src="include/secu2.svg" width=800 /></div>

???
- If we establish in both cases a trust model from the cloud provider point of view, the virtualization layer is trusted, that is the OS for the container environment, and the hypervisor for the VM one
- The instances of either containers or virtual machines are obviously untrusted

---
# Containers and Security
<div style="text-align:center"><img src="include/secu3.svg" width=800 /></div>

???
- The threat model in virtualised environments is often the following
- One of the containers or VMs is malicious, and tries to perform an escape attack, that is, to get access to the virtualisation layer's memory, or to the memory allocated for other containers or VMs
- As we have seen in the past, hardware enforced isolation mechanisms such as the page table or extended page tables will prevent direct access from the malicious entity to other VMs or containers
- The real threat lies in the virtualisation layer, which can be invoked by the malicious VM or container
- If this invocation manages to trigger a bug in the virtualization layer, the isolation may be broken and the attacker could access the virtualization layer's or other containers'/VMs' memory

---
# Containers and Security
<div style="text-align:center"><img src="include/secu4.svg" width=800 /></div>

???
- So how complex is this interface between what we trust and what we don't trust in both cases?
- Here "how complex" translates into "how hard it is" to secure this interface and make sure there are no bugs

---
# Containers and Security
<div style="text-align:center"><img src="include/secu5.svg" width=800 /></div>

???
- Well in the case of containers that interface is very complex: it is the system call interface, which is made of hundreds of system calls, some of them like ioctl presenting thousands of subfunctions
- There is just no way we can guarantee that the implementation of all these system calls is bug free
- Actually system call fuzzers like syzkaller regularly find bugs on that interface
- Conversely, the interface between a VM and the hypervisor managing it is much simpler
- It's just a few traps

---
# Containers and Security

- **The isolation enforced by a host OS between containers is not trusted to
  be as strong as that enforced between VMs by a hypervisor**
  - Due to the size and complexity of the interface between a container
    and the host OS: the **system call interface**
???
- So the isolation between containers is not considered to be as strong as that between VMs because of the complexity of the interface between containers and the privileged layer, the OS kernel

--
- Attempts at securing containers:
  - Run containers within virtual machines...

???
- To illustrate my point, note that many actors running containers in production will actually run containers within virtual machines, to benefit from their strong isolation
- You have a few examples of technologies on the slide
- These approaches try to reduce as much as possible the memory footprint and boot times of Linux VMs, creating what they call micro VMs; however, this still kills most of the lightweightness benefits of containers

--
- **But can we get both lightweightness and security?**

???
- In that context one important question is: can we get both the lightweightness benefits of containers, combined with the security benefits of virtual machines?

---
class: center, middle, inverse

# Unikernels

???
- That's what unikernels are for
- We already encountered that OS model in a previous video
- Here I'm going to give you a more in-depth definition and description

---
# Unikernels

???
- If we go back to our motivating example where we had all this software bloat

---
# Unikernels

???
- Using unikernels we would solve that issue this way
- With a unikernel you compile your application's code as well as all of its dependencies with a very small operating system layer into a static binary which merges the application and the operating system
- You can then run this binary as a kernel, in a virtual machine, on top of a hypervisor

---
# Unikernels: Definition

- **Unikernel: application + dependencies + thin OS compiled as a static binary running
on top of a hypervisor**

???
- Ok so once again a unikernel is an application's sources, its library dependencies, and a small OS layer compiled all together into a static binary executed in a lightweight virtual machine on top of a hypervisor
--
- **Single-***
  - **Single purpose**: run 1 application
      - Want to run multiple applications? Run multiple unikernels
???
- A unikernel instance is single purpose: it runs one application
- If you want to run multiple applications, you need to run multiple unikernel instances
--
  - **Single process**
      - Want to run a multiprocess application? Run multiple unikernels
      - However, SMP (multicores) and multithreading are supported
???
- A unikernel instance is also a single process virtual machine
- Once again to run a multiprocess application you generally have to run several unikernel instances
--
  - **Single binary** and single address space for application + kernel
      - No user/kernel protection needed

.small[
Madhavapeddy et al., “Unikernels: Library Operating Systems for the Cloud”, ASPLOS’13
]

???
- And finally in a unikernel instance the VM runs a single binary, containing the application, its dependencies, and the kernel
- All of this code runs within a single address space, and there is no user/kernel protection
- This is because there is only a single application running in one unikernel instance, and the inter-application isolation is already enforced by running apps as separate unikernel instances
- The unikernel model was originally proposed in this seminal paper in 2013

---
# Unikernels: Benefits

.leftlargecol[.medium[
- **A form of lightweight virtualisation**
  - Contain and run only what is absolutely necessary to the application
  - Cost advantage: memory/disk footprint reduction
  - **Considered as a secure alternative to containers**
      - Unikernels are virtual machines!
- **Per-application tailored kernel** (Exokernel model)
  - Specialisation for lightweightness but also performance
- **Reduced OS noise, increased performance**
  - Low system call latency: app + kernel in ring 0, system calls are function calls
  - Sub-second boot time
]]

.rightsmallcol[
<br>
<div style="text-align:center"><img src="include/lightweight8.svg" width=150 /></div>
]

???
- With that model, unikernels present a series of benefits
- First, it's a form of lightweight virtualisation
- Because a unikernel instance only runs the code absolutely necessary for the application in question, and because the OS layer is so small, we get similar benefits as with containers in terms of low memory/disk footprint, and fast boot times
- Second, because they run as virtual machines, unikernels are well isolated and considered a secure alternative to containers in many scenarios
- Third, the OS layer within a unikernel instance can be specialised towards the application it runs, similar to what we get with an exokernel
- And finally, because the operating system is so small and simple, it does not get in the way as much as larger OSes such as Linux
- This translates into increased performance
- A noteworthy point regarding performance is the system call latency
- With unikernels, because there is no user/kernel isolation, system calls are simple function calls, which makes them much faster

---
# Unikernels: Application Domains

- Cloud applications: servers, microservices, SaaS
- Embedded virtualisation, Edge computing, IoT
- Network Function Virtualisation, HPC, VM introspection, malware analysis, secure
desktop applications
- etc.

???
- Given these benefits unikernels have plenty of application domains
- We motivated them with cloud environments such as server or microservice software
- But they have also been explored in the domains of embedded virtualization, edge computing and IoT
- Network function virtualization, high performance computing, and various security critical domains such as VM introspection, malware analysis, and secure desktop environments

--
- **Contrary to containers which are a mature and widespread technology,
  unikernels are still at the stage of research prototypes**

???
- Still an important point to note is that most unikernels are still at the stage of research prototype
- This is different from containers which as you may know are a production-ready technology

---
# Unikernel Projects

- Unikernels can be classified based on the targeted language for supported applications
  - Pure memory-safe languages (OCaml, Erlang, Haskell): [MirageOS](https://mirage.io/),
    [LING](https://github.com/cloudozer/ling), [HalVM](https://github.com/GaloisInc/HaLVM)
  - C/C++, semi-POSIX API: [Unikraft](https://unikraft.org/),
    [HermiTux](https://ssrg-vt.github.io/hermitux/),
    [HermitCore](https://hermitcore.org/), [OSv](http://osv.io/),
    [Rumprun](https://github.com/rumpkernel/rumprun),
    [Lupine](https://github.com/hckuo/Lupine-Linux)
  - Rust/Go: [Hermit](http://hermit-os.org/),
    [Clive](https://lsub.org/clive/)
  - More: https://unikernelalliance.org/projects/

???
- You have a few examples of unikernel projects on the slide, feel free to check them out
- They can be classified in various ways, here I chose to organise them based on the programming languages they support
- Some of these are relatively unstable and poorly maintained academic research artifacts
- The most mature project, the one that is the closest to production ready status, is Unikraft
- So if you can only try one, I would advise to focus on that one

---
# Unikernels: Performance Benefits

- **Kernel + application in protection ring 0, system calls are function calls**
  - Significant decrease in syscall latency

<div style="text-align:center"><img src="include/uk-perf2.png" width=390 /></div>
.center[.medium[Redis throughput under various setups (higher is better)]]

.small[S. Kuenzer et al., *Unikraft: Fast, Specialized Unikernels the Easy Way*, EuroSys'21]

???
- Just to illustrate the unikernel performance benefits that come from their low latency system calls
- The graph is taken from the Unikraft paper
- It shows the throughput for Redis, which is a very popular key value store, in millions of requests per second
- You have various setups on the X axis including different unikernels, as well as vanilla Linux
- As you can see Unikraft is the fastest solution
- Even if it runs virtualised on top of QEMU/KVM, it is still a bit faster than Linux non-virtualised, and also much faster than Linux in a VM

---
# Wrapping Up

- Heavyweight nature of traditional VMs calls for **lightweight virtualisation**
  in many scenarios
  - Low system memory/disk footprint, fast boot times
- **Containers** are a form of OS-level lightweight virtualisation
  - Widespread use, in particular for software development/testing/deployment
  - Security concerns
- **Unikernels** are single-purpose minimal VMs
  - Lightweight and secure
  - Performance benefits
  - Still at the stage of research prototypes

???
- To conclude, in certain scenarios the heavyweight nature of traditional virtual machines brings the need for lightweight virtualization
- We have covered 2 important technologies
- First, containers: a form of virtualisation applied by the OS kernel without creating a VM
- They are used everywhere today, in particular to develop, test, and deploy software
- However they are not as secure as VMs, which raises concerns regarding their use in production
- Second, we also saw unikernels: single-purpose, minimal virtual machines
- They are both lightweight and secure, and can bring certain performance benefits
- However most unikernel solutions are still at the stage of research prototypes, contrary to containers which are production ready