Pointers: Applications
In the previous lecture we introduced pointers; in this one we describe the scenarios in which they are useful. Slides are available for this lecture. All the code samples given here can be found online, alongside instructions on how to set up the environment needed to build and run them.
Allow a Function to Access the Calling Context
Recall that arguments are passed by copy in C. So, with arguments of regular types, a function cannot change its calling context.
#include <stdio.h>

void add_one(int param) {
    param++; // increments the local copy only
}

int main(int argc, char **argv) {
    int x = 20;
    printf("before call, x is %d\n", x); // prints 20
    add_one(x);
    printf("after call, x is %d\n", x); // prints 20
    return 0;
}
Before and after the call, x's value is the same.
A copy of x was made upon the call to add_one.
That copy gets incremented by that function and then discarded when it returns.
Let's illustrate what happens in memory.
Each function stores its local variables as well as arguments somewhere in memory.
We have an area for main and an area for add_one.
In main's area we have x. When add_one is called, a copy of x is made in the area belonging to that function, in the form of the parameter param.
If we instead want add_one to increment the value of x as seen by main, i.e. to modify memory belonging to its calling context (main), we can use pointers:
#include <stdio.h>

void add_one(int *param) {
    (*param)++; // increments the int that param points to
}

int main(int argc, char **argv) {
    int x = 20;
    printf("before call, x is %d\n", x); // prints 20
    add_one(&x);
    printf("after call, x is %d\n", x); // prints 21
    return 0;
}
We now call add_one by passing as parameter the address of x.
Let's illustrate what happens in memory:
In main's area we have x, let's say it's located at address 0x128 (we'll use fictitious addresses for all the examples of this lecture).
In add_one's area, when that function is called, some space is reserved for param and filled with x's address, 0x128.
Through param, add_one now has access to x's location in memory.
As a result, add_one can increment x by dereferencing param.
Let a Function "Return" More Than a Single Value
Pointers can be used to access the calling context when we want a function to "return" more than a single value. Here is an example of a function that returns both the product and the quotient of two numbers:
#include <stdio.h>

// we want this function to return 3 things: the product and quotient of n1 by n2,
// as well as an error code in case division is impossible
int multiply_and_divide(int n1, int n2, int *product, int *quotient) {
    if (n2 == 0) return -1; // can't divide if n2 is 0
    *product = n1 * n2;
    *quotient = n1 / n2;
    return 0;
}

int main(int argc, char **argv) {
    int p, q, a = 10, b = 5;
    if (multiply_and_divide(a, b, &p, &q) == 0) {
        printf("%d*%d = %d\n", a, b, p); // prints 10*5 = 50
        printf("%d/%d = %d\n", a, b, q); // prints 10/5 = 2
    }
    return 0;
}
The function must also return an error code if the divisor is 0, so there are 3 things to return. The error or success code is returned normally, while the quotient and product are returned through pointers.
It works as follows: main reserves space for the quotient and product by creating two variables p and q.
Then it passes their addresses to multiply_and_divide, which performs the operations and writes the results in p and q through the pointers product and quotient.
In memory things look like that:
In practice, this is another example of a function (multiply_and_divide) modifying the memory of its calling context (main).
To generalise, with pointers a function can read or write any memory location whose address it is given, not only its own local variables.
Inefficient Function Calls with Large Data Structure Copies
Let's see another use case for pointers.
typedef struct {
// lots of large (8-byte) fields:
double a; double b; double c; double d; double e; double f;
} large_struct;
large_struct f(large_struct s) { // very inefficient in terms of
s.a += 42.0; // performance and memory usage!
return s;
}
int main(int argc, char **argv) {
large_struct x = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
large_struct y = f(x);
printf("y.a: %f\n", y.a);
}
The function takes the large_struct as a parameter, updates it, and returns it.
The data structure is large, with six fields of 8 bytes each, totalling 48 bytes on x86‑64.
We know C passes parameters by copy, so when f is called there is a first copy corresponding to f's parameter s.
Next, f updates the struct member.
And finally there is a second copy when we return the struct into y.
In memory it looks like that:
This is very inefficient both in terms of performance and memory usage.
The two copy operations take time because the struct is large.
And in memory there are 3 instances of the struct (x, y, and s), which takes up a lot of space.
Efficient Function Calls with Pointers
We can fix that issue with a pointer.
The goal is to keep a single copy of the variable and update it in f through a pointer, similar to what we saw in the previous examples.
We change the parameter type to a pointer and call the function with the address of x.
In the function we dereference the pointer before accessing the field:
void f(large_struct *s) { // now takes a pointer parameter
(*s).a += 42.0; // dereference to access x
return;
}
int main(int argc, char **argv) {
large_struct x = {1, 2, 3, 4, 5, 6};
f(&x); // pass x's address
printf("x.a: %f\n", x.a);
return 0;
}
When the function is called we now have the pointer, which is a small parameter of 64 bits, i.e. 8 bytes.
Only the pointer is copied when f is called.
In f we can directly update the struct through the pointer; nothing needs to be copied or returned back to main.
This is much more efficient: we don't have to hold multiple copies of the large struct and there are no expensive copy operations.
C Arrays are Pointers
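Arrays and pointers are closely related in C: in most expressions, the name of an array is converted to a pointer to its first element, so passing an array to a function passes only that pointer. This matters because arrays can be quite large, and it would be very inefficient to pass them by copy. Here is a function that takes an int pointer as parameter: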
void negate_int_array(int *ptr, int size) { // function taking pointer as parameter
for(int i=0; i<size; i++) // also need the size to iterate properly
ptr[i] = -ptr[i]; // use square brackets like a standard array
}
int main(int argc, char **argv) {
int array[] = {1, 2, 3, 4, 5, 6, 7};
negate_int_array(array, 7); // to get the pointer just use the array's name
for(int i=0; i<7; i++)
printf("array[%d] = %d\n", i, array[i]);
return 0;
}
Notice how we call it: we just pass the name of an array variable. Also notice how we can use square brackets on a pointer variable to index it like an array.
This function negates all the elements of an int array.
Indexing Arrays with Pointers, Pointer Arithmetic
Consider the following examples:
int int_array[2] = {1, 2};
double double_array[2] = {4.2, 2.4};
char char_array[] = "ab";
printf("int_array[0] = %d\n", int_array[0]);
printf("int_array[1] = %d\n", int_array[1]);
printf("*(int_array+0) = %d\n", *(int_array+0)); // pointer arithmetic!
printf("*(int_array+1) = %d\n", *(int_array+1)); // +1 means + sizeof(element type) bytes
printf("double_array[0] = %f\n", double_array[0]);
printf("double_array[1] = %f\n", double_array[1]);
printf("*(double_array+0) = %f\n", *(double_array+0));
printf("*(double_array+1) = %f\n", *(double_array+1));
printf("char_array[0] = %c\n", char_array[0]);
printf("char_array[1] = %c\n", char_array[1]);
printf("*(char_array+0) = %c\n", *(char_array+0));
printf("*(char_array+1) = %c\n", *(char_array+1));
In memory we have the following:
The array int_array has 2 elements, each 4 bytes.
The array double_array has 2 elements, each 8 bytes.
The array char_array has 3 elements, counting the termination character, each 1 byte.
Remember that array elements are stored contiguously in memory. We can refer to each element of each array with either the square brackets or the dereference operator.
When we use the dereference operator, things work as follows: the name of an array corresponds to a pointer to its first element, so dereferencing it gives that first element.
To access the second element we increase the pointer by one element, hence the +1 before dereferencing.
Arithmetic operations on pointers, such as this addition, are called pointer arithmetic.
One must be careful with it.
The expression int_array + 1 is interpreted by the compiler as the base address of int_array plus the size of one element, and not plus one byte.
In other words, the actual number of bytes that will be added to the base address of int_array depends on the type of the elements contained in the array.
For example, each element of int_array is 4 bytes.
Each element of double_array is 8 bytes.
Each element of char_array is 1 byte.
Pointers and Data Structures
Data structures are often passed as pointers. Furthermore, structs themselves can have pointer fields.
typedef struct {
int int_member1;
int int_member2;
int *ptr_member;
} my_struct;
We can create a pointer to a previously declared struct variable:
my_struct ms = {0};  // the previously declared variable (zero-initialised here as an assumption)
my_struct *p = &ms;
To access one of the int members, we have two choices.
The first is the classic * operator for dereferencing, followed by a dot (.) for field access, as in (*p).int_member1.
Don't forget the parentheses, as . takes precedence over *.
There is also a shortcut: the arrow ->.
It can be used on a struct pointer to access a field.
The arrow first dereferences the pointer and then accesses the field:
p->int_member2 = 2; // s->x is a shortcut for (*s).x
Further, one can get the address of an individual struct member with the & operator.
Here we set the pointer field equal to the address of one of the integer fields:
p->ptr_member = &(p->int_member2);
In memory this example looks like that:
Then we can print all the members' values and dereference the member pointer:
printf("p->int_member1 = %d\n", p->int_member1);
printf("p->int_member2 = %d\n", p->int_member2);
printf("p->ptr_member = %p\n", (void *)p->ptr_member); // %p expects a void *
printf("*(p->ptr_member) = %d\n", *(p->ptr_member));
Pointer Chains
A pointer is a variable, so it can itself be pointed to: we can create a pointer to a pointer and, by doing so, construct pointer chains. Consider this example:
int value = 42;       // integer
int *ptr1 = &value;   // pointer to integer
int **ptr2 = &ptr1;   // pointer to pointer to integer
int ***ptr3 = &ptr2;  // pointer to pointer to pointer to integer
// %p expects a void *, hence the casts
printf("ptr1: %p, *ptr1: %d\n", (void *)ptr1, *ptr1);
printf("ptr2: %p, *ptr2: %p, **ptr2: %d\n", (void *)ptr2, (void *)*ptr2, **ptr2);
printf("ptr3: %p, *ptr3: %p, **ptr3: %p, ***ptr3: %d\n", (void *)ptr3,
       (void *)*ptr3, (void *)**ptr3, ***ptr3);
In memory, we have a layout as follows:
Here we have a value, and we create a first pointer to it.
Then we have a second pointer pointing to the first one.
The type of this pointer to a pointer is int **, which means pointer to pointer to int.
We also create a pointer to a pointer to a pointer to int, of type int ***.
Note in the code the use of several star operators to dereference pointers to pointers.
For example, **ptr2 means accessing the value that is pointed to by what is pointed to by ptr2.
In other words, a first * gives us access to ptr1, and a second gives us access to value.
Pointers to pointers are useful for creating dynamically allocated arrays, as we will see next.