Compartmentalising Tinyexpr

Introduction

TinyExpr is a mathematical expression parsing and evaluation engine. It is available as a C library that can be integrated in C/C++ projects to provide mathematical expression evaluation features. In this part of the exercise, we will integrate a simplified version of TinyExpr in various programs and sandbox it within its own process.

Sandboxing TinyExpr makes sense from a security point of view. First, it processes input (the mathematical expressions to evaluate) that may come from untrusted sources such as the command line, files, or the standard input. Second, TinyExpr must parse these expressions before evaluating them: as seen during the lecture, parsers can be quite complex and particularly prone to bugs, which is concerning given that they often handle untrusted input.

Our Simplified Version of TinyExpr

We will be working on a simplified version of TinyExpr, which supports only the evaluation of mathematical expressions in interpreted mode, without any variables. The features we left out would make compartmentalising TinyExpr much more complex, something we want to avoid.

Download the source code of our simplified TinyExpr here, and unzip the archive somewhere on your file system. The sources are made of the following files:

  • tinyexpr.c: the library's implementation (you don't need to study that code in detail).
  • tinyexpr.h: a header file describing the interface exposed by the library.
  • example.c: a minimal program using the library to evaluate a particular mathematical expression.
  • example2.c: a small program using the library to evaluate a mathematical expression passed on the command line.
  • test-suite.c: a test suite checking that the library's behaviour is correct over a few mathematical expressions.
  • benchmark.c: a benchmark measuring how fast the library evaluates a particular mathematical expression.

TinyExpr: Exposed Interface

To use TinyExpr, a program's C code must include tinyexpr.h and make use of the interface exposed in that header:

#ifndef TINYEXPR_H
#define TINYEXPR_H

#ifdef __cplusplus
extern "C" {
#endif

/* Parses the input expression, evaluates it, and frees it. */
/* Returns NaN on error. */
double te_interp(const char *expression, int *error);

#ifdef __cplusplus
}
#endif

#endif /*TINYEXPR_H*/

The interface is very simple: a program calls the exposed function te_interp to evaluate a mathematical expression, which is passed as the first (string) parameter. The second parameter is a pointer to an integer that is set to a non-zero value if an error occurs, in which case the function returns NaN. If all goes well, the function returns a double containing the result of the expression's evaluation.

The interface exposed by TinyExpr gives us an idea of what data transfers will need to happen when it is compartmentalised and sandboxed within its own process:

  • The data pointed to by te_interp's parameters lives within the main program's process, and will need to be transferred into TinyExpr's compartment.
  • After the function runs, the double result and the error code will need to be transferred back to the main program's compartment.
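The two transfers listed above can be sketched with fork and a pair of pipes. This is a minimal illustration under stated assumptions, not the solution expected for the exercise: sandboxed_interp, sandbox_main, struct reply and the read_full/write_full helpers are hypothetical names, and eval_stub is a stand-in that only parses a single number so that the block compiles without tinyexpr.c. In a real integration the child would call te_interp instead.

```c
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for te_interp (assumption): parses one number only. */
static double eval_stub(const char *expression, int *error) {
    char *end;
    double value = strtod(expression, &end);
    int bad = (end == expression || *end != '\0');
    if (error) *error = bad;
    return bad ? NAN : value;
}

/* Reply sent from the sandbox back to the main program. */
struct reply { double result; int error; };

/* Transfer exactly n bytes; return 0 on success, -1 on failure or EOF. */
static int read_full(int fd, void *buf, size_t n) {
    char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0) return -1;
        p += r; n -= (size_t)r;
    }
    return 0;
}

static int write_full(int fd, const void *buf, size_t n) {
    const char *p = buf;
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w <= 0) return -1;
        p += w; n -= (size_t)w;
    }
    return 0;
}

/* Child side: receive the expression, evaluate it, send the reply, exit. */
static void sandbox_main(int in_fd, int out_fd) {
    size_t len;
    struct reply rep;
    char *expr;

    memset(&rep, 0, sizeof rep);
    if (read_full(in_fd, &len, sizeof len) != 0 || len == 0) _exit(1);
    expr = malloc(len);
    if (expr == NULL || read_full(in_fd, expr, len) != 0) _exit(1);
    expr[len - 1] = '\0';               /* never trust the terminator */
    rep.result = eval_stub(expr, &rep.error);
    _exit(write_full(out_fd, &rep, sizeof rep) == 0 ? 0 : 1);
}

/* Parent side: same signature as te_interp, but evaluation happens in a child. */
double sandboxed_interp(const char *expression, int *error) {
    int to_child[2], to_parent[2];
    struct reply rep = { NAN, -1 };     /* reported if the sandbox fails */
    size_t len = strlen(expression) + 1;
    pid_t pid;

    if (pipe(to_child) != 0) goto done;
    if (pipe(to_parent) != 0) { close(to_child[0]); close(to_child[1]); goto done; }

    pid = fork();
    if (pid == 0) {
        close(to_child[1]); close(to_parent[0]);
        sandbox_main(to_child[0], to_parent[1]);    /* never returns */
    }
    close(to_child[0]); close(to_parent[1]);

    if (pid > 0) {
        struct reply got;
        /* Send the length, then the string; read back result and error. */
        if (write_full(to_child[1], &len, sizeof len) == 0 &&
            write_full(to_child[1], expression, len) == 0 &&
            read_full(to_parent[0], &got, sizeof got) == 0)
            rep = got;
    }
    close(to_child[1]); close(to_parent[0]);
    if (pid > 0) waitpid(pid, NULL, 0);
done:
    if (error) *error = rep.error;
    return rep.result;
}
```

A caller would use sandboxed_interp exactly like te_interp, for example sandboxed_interp("42.5", &err). Note that forking gives the child its own address space but does not by itself restrict what the child may do, and this sketch forks once per call, which is simple but costly. If the child dies before replying, the caller sees NaN and an error of -1.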

Building Programs with the Makefile

The Makefile describes build rules and dependencies for each of the executables: the two minimal examples, the test suite, and the benchmark. You can rebuild everything by typing in a terminal:

make
gcc -g -Wall -o example example.c tinyexpr.c -lm
gcc -g -Wall -o example2 example2.c tinyexpr.c -lm
gcc -g -Wall -o test-suite test-suite.c tinyexpr.c -lm
gcc -g -Wall -o benchmark benchmark.c tinyexpr.c -lm

The make system will check the modification date of each source file and rebuild only what is needed:

touch example2.c # Simulate the fact that we modified only example2.c
make
gcc -g -Wall -o example2 example2.c tinyexpr.c -lm

Example Programs

The program example.c is a good example of usage of the library:

#include "tinyexpr.h"
#include <stdio.h>

int main(int argc, char *argv[]) {
    const char *c = "sqrt(5^2+7^2+11^2+(8-2)^2)";
    int err;

    double r = te_interp(c, &err);

    if(err) {
        printf("ERROR evaluating %s\n", c);
        return -1;
    }

    printf("The expression %s evaluates to: %f\n", c, r);
    return 0;
}

As previously mentioned, the program includes tinyexpr.h and calls te_interp to evaluate a mathematical expression c. See how the error code err is checked after the call to te_interp above. If all goes well the result is displayed by the program.

You can compile example with make, or manually as shown below, and then run it:

gcc example.c tinyexpr.c -o example -lm
./example
The expression sqrt(5^2+7^2+11^2+(8-2)^2) evaluates to: 15.198684

Here, -lm tells the linker to link the program against the C math library (libm), which implements the mathematical functions called by tinyexpr.

Study also the second example program example2.c: it takes a mathematical expression from the command line and prints what it evaluates to:

gcc -g -Wall -o example2 example2.c tinyexpr.c -lm
./example2 "3*5"
Evaluating:
        3*5
result: 15.000000

Test Suite and Benchmark

The test suite program test-suite.c lets us validate the library's functionality. It is recommended that you use it regularly when compartmentalising, to check that you are not breaking functionality:

gcc -g -Wall -o test-suite test-suite.c tinyexpr.c -lm
./test-suite
# ...
ALL TESTS PASSED (200/200)

The benchmark benchmark.c can be used to measure the performance of the library, and compare it to the performance of native C mathematical operations:

gcc -g -Wall -o benchmark benchmark.c tinyexpr.c -lm
./benchmark 
Expression: sqrt(5^2+7^2+11^2+(8-2)^2)
Evaluated result: 15.1986841536
Native result: 15.1986841536
Total time: 0.143495 seconds
Evaluations per second: 696889
Native total time: 0.000143 seconds
Native evaluations per second: 701139351

As you can see, the library is about 1000x slower than native C operations. This is expected, as interpreting a mathematical expression takes a lot of time: it involves parsing the expression and determining what operations to run. Native execution, on the other hand, directly runs the relevant operations.
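To make the native side of this comparison concrete, here is a rough sketch of how such a measurement could be written. It is not the code of benchmark.c: native_eval and time_evals are hypothetical names, and the timing uses clock() from the C standard library as an assumption about how CPU time is measured.

```c
#include <math.h>
#include <time.h>

/* Native C version of the benchmarked expression,
 * sqrt(5^2+7^2+11^2+(8-2)^2): no parsing, just the operations. */
double native_eval(void) {
    return sqrt(pow(5, 2) + pow(7, 2) + pow(11, 2) + pow(8 - 2, 2));
}

/* Calls fn `iterations` times and returns the CPU time used, in seconds.
 * The results are summed into *sink so that the loop has an observable
 * result (an optimising compiler may still simplify a constant fn). */
double time_evals(double (*fn)(void), long iterations, double *sink) {
    double total = 0.0;
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        total += fn();
    clock_t end = clock();
    if (sink) *sink = total;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Dividing the iteration count by the returned time gives an evaluations-per-second figure. Timing a wrapper around te_interp with the same helper would give the interpreted figure to compare against.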