class: center, middle

### Secure Computer Architecture and Systems

***

# Pointers in C

???

- Hi everyone and welcome to this second video on C
- Here we will cover a central concept in that language: pointers

---

class: middle, center, inverse

# Pointers: Definition

???

- Let's first define what a pointer is

---

# The Virtual Address Space

- Each program sees the memory it can access as a very large array of bytes, **the address space**
- Each slot in this array has an **address**, from 0 to ~128 TB on modern 64-bit CPUs
- The program accesses memory with load and store instructions at target addresses
- With virtual memory the address space is private to each program

???

- As you know, each program running on the CPU accesses memory with load and store instructions
- These target memory locations called addresses, and the set of addresses a program can read from and write to is named the virtual address space
- It is very large: for Linux on 64-bit processors it ranges from address 0 to 128 TB
- Its size is unrelated to the amount of physical memory present in the machine; in fact most of these virtual addresses are not mapped to physical memory
- Finally, there is one address space for each program in the system
- Address spaces are private and programs do not see each other's address space

---

# Addresses

- **An address is a location in memory**
- The address of a variable is the address of the first byte holding this variable
- You can obtain the address of a variable with the `&` operator

```c
int x = 42;
printf("%p\n", (void *)&x); // print the address of x (typically shown in hexadecimal)
```

.codelink[
`03-c-pointers/address.c`
]
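
A minimal sketch of the idea that an address is just a byte index. The helper names `byte_distance` and `print_address` are made up for this illustration and are not taken from `address.c`. Two neighbouring `int` elements of an array should sit `sizeof(int)` bytes apart.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: number of bytes separating two addresses
   that belong to the same array. */
ptrdiff_t byte_distance(const void *from, const void *to) {
    return (const char *)to - (const char *)from;
}

/* Hypothetical helper: print an address with %p, the printf
   conversion intended for pointers. */
void print_address(const char *label, const void *addr) {
    printf("%s lives at %p\n", label, (void *)addr);
}
```

With `int pair[2]`, calling `byte_distance(&pair[0], &pair[1])` gives `sizeof(int)`, which is 4 on most current platforms.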

???

- An address is the index of a byte in the address space, in other words it's a location in memory
- As we have seen previously, program variables live in memory, and we can determine their address: it is that of the first byte holding the variable
- In the example illustrated here the address of `x` is the address of the first byte holding it, which is 0xd35442fc
- In your code you can obtain the address of a variable by using the & operator

---

# Pointers

- **A pointer is a variable whose value is an address**
- Declared with `*` preceded by the type of data it references, e.g. `int *`
- The pointed content can be accessed by **dereferencing** the pointer with the `*` operator

```c
int x = 42;
int *ptr = &x;        // ptr is a pointer to int and _points to_ x
printf("%d\n", *ptr); // dereference ptr, print the value of x
```

.codelink[
`03-c-pointers/pointer.c`
]
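
Dereferencing works for writing as well as for reading. The sketch below uses a made-up function, `write_through_pointer`, that is not part of `pointer.c`, to show a variable being changed without ever being assigned by name.

```c
/* Hypothetical demo: modify x only through a pointer to it. */
int write_through_pointer(void) {
    int x = 42;
    int *ptr = &x;   // ptr holds the address of x
    *ptr = 7;        // store 7 at the address held in ptr, i.e. into x
    return x;        // x changed although it was never assigned by name
}
```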

???

- Now we can define a pointer: a pointer is simply a variable whose value is an address
- It can be the address of another variable or of something else
- You declare a pointer with the * operator, preceded by the type of the data it references; for example here we have `ptr` which is a pointer to `int`, so its type is `int *`
- `ptr` holds the address of `x`, we say that `ptr` points to `x`
- A key operation we can perform on a pointer is to access the memory it points to, through the pointer
- This is done with the * operator; you have an example here in which we print the value of `x` through `ptr`
- The action of accessing the data pointed to by a pointer with the `*` operator is called dereferencing the pointer

---

class: center, middle, inverse

# Pointers: Usage

???

- Let's see how useful pointers can be in C

---

# Argument Passing in C

- **C passes arguments by value**: a copy of each argument's value is made at the time a function is called

```c
void swap(int a, int b) {
    int tmp = a;
    a = b;
    b = tmp;
}

int main() {
    int x = 10;
    int y = 100;
    swap(x, y);
    printf("x=%d, y=%d\n", x, y);
    // x is still 10 and y is still 100: the swap operated
    // on a and b in the function swap's stack frame
}
```

.codelink[
`03-c-pointers/swap-v1.c`
]

???

- You may know that in C, upon function call, a copy of the arguments is made in memory to create the parameters of the called function
- Check out this example, which is a naive attempt at swapping the values of two variables in main
- x and y are passed as arguments to swap, which swaps the values of its parameters named a and b
- If we run this program we'll see that the values of x and y are unchanged after the function call
- Indeed when swap was called the program made a copy of the values of x and y for the parameters a and b, swapped the copies' values, and then discarded them as the function swap returned

---

name: params

# Passing References

- If we want `swap` to exchange the values of `x` and `y` in `main`'s memory, we can pass their addresses as parameters

```c
*void swap(int *a, int *b) {
*    int tmp = *a;
*    *a = *b;
*    *b = tmp;
}

int main() {
    int x = 10;
    int y = 100;
*    swap(&x, &y);
    printf("x=%d, y=%d\n", x, y);
}
```

.codelink[
`03-c-pointers/swap-v2.c`
]

???

- We can make a working version of this program with pointers
- Here we updated the function's parameter types to be pointers to integers
- The swap is now performed on the pointed values by dereferencing the pointers
- In main, we no longer pass x and y to the function but rather their addresses

---

template: params

???

- If you run this program you will see that the values of x and y are successfully swapped after the call to swap
- What happens is that a copy of the addresses of x and y is created for the pointers a and b

---

template: params

???

- So we have a pointing to x, and b pointing to y

---

template: params

???

- By dereferencing these pointers, the swap function is able to update the pointed values and perform the swap within the area of memory allocated for the main function

---

# Passing References

- Passing references with pointers is extensively used to:
  - Have a function modify its calling context (previous example)
  - "Return" more than one value
  - Avoid costly data copies for arrays/large data structures

```c
// we want this function to "return" 3 things: the product and quotient of n1 by n2,
// as well as an error code in case the division is impossible
int multiply_and_divide(int n1, int n2, int *product, int *quotient) {
    if(n2 == 0) return -1; // Can't divide if n2 is 0

    *product = n1 * n2;
    *quotient = n1 / n2;
    return 0;
}

int main(int argc, char **argv) {
    int p, q, a = 10, b = 5;

    if(multiply_and_divide(a, b, &p, &q) == 0) {
        printf("10*5 = %d\n", p);
        printf("10/5 = %d\n", q);
    }
}
```

.codelink[
`03-c-pointers/multiply-and-divide.c`
]

???

- What we just saw is how pointers can be used to let a function manipulate its calling context
- This is useful in other scenarios, for example when you want a function to return more than a single value, or things like arrays
- You have an example here, with the `multiply_and_divide` function taking 2 numbers `n1` and `n2` and storing their product and quotient in variables allocated by the caller, whose addresses are passed as third and fourth parameters to `multiply_and_divide`
- The function also returns an error code to indicate the success or failure of the operation
- Another key point is that passing or returning a pointer to/from a function is also very quick: a pointer is an address so its size is just 8 bytes on modern 64-bit architectures
- If you need to pass or return large data structures, consider passing pointers to the objects in question instead, which is much more efficient

---

# C Arrays are Pointers

- The variable representing an array is a pointer to the first byte of the array in memory

```c
void negate_int_array(int *ptr, int size) { // function taking pointer as parameter
    for(int i=0; i
```

.codelink[
`03-c-pointers/negate-int-array.c`
]
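
A minimal sketch of the indexing equivalence this slide relies on: `ptr[i]` reads the same element as `*(ptr + i)`. The functions `negate_all` and `sum_all` are hypothetical illustrations and are not the contents of `negate-int-array.c`.

```c
#include <stddef.h>

/* Negate every element, indexing the pointer like an array. */
void negate_all(int *ptr, size_t size) {
    for (size_t i = 0; i < size; i++)
        ptr[i] = -ptr[i];
}

/* Sum every element using explicit pointer arithmetic:
   *(ptr + i) reads the same element as ptr[i]. */
int sum_all(const int *ptr, size_t size) {
    int total = 0;
    for (size_t i = 0; i < size; i++)
        total += *(ptr + i);
    return total;
}
```

An array such as `int a[3]` can be passed directly, as in `negate_all(a, 3)`: the array name is converted to a pointer to its first element at the call.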

???

- In C arrays are implemented with pointers
- The variable representing an array is a pointer to the first byte of the array in memory
- Check out the example here: in main we define an array of integers with 7 elements, and we pass it as parameter to negate_int_array, which negates all elements of an array
- As you can see this function takes a pointer to integer as parameter
- Within the function's body that pointer is indexed with square brackets like a standard array
- This is the equivalent of adding an offset `i` to the pointer and dereferencing the result
- As you can see the array "array" and the pointer ptr are equivalent: they both represent a pointer to the first byte of the array in memory

---

# Custom Data Structures and Pointers

- With a pointer to a custom data structure, access a field by:
  - Dereferencing the pointer with `*` then using the `.` operator
      - Need parentheses because of precedence!
  - Or, simpler: use the "shortcut" operator `->`

```c
typedef struct {
    int x;
    float f;
    char *s;
} my_struct;

my_struct ms = {42, 2.5, "hello"};
my_struct *ptr = &ms;

*printf("%d\n", (*ptr).x); // prints "42"
*printf("%s\n", ptr->s);   // prints "hello", equivalent to (*ptr).s
```

.codelink[
`03-c-pointers/struct-ptr.c`
]

???

- In C we very often create pointers to custom data structures
- Here we have an example with `ptr` pointing to `ms`
- To access a field of `ms` we need to dereference the pointer first with the star operator, then access the field with the dot operator
- Parentheses are needed here because of operator precedence
- This is a bit cumbersome, so as a shortcut you can use the arrow operator, which both dereferences the pointer on its left and accesses the field on its right

---

# Pointer Chains

- A pointer is a variable and itself has an address, i.e. a location in memory
- So it can be pointed to!

```c
int value = 42;      // integer
int *ptr1 = &value;  // pointer to integer
int **ptr2 = &ptr1;  // pointer to pointer to integer
int ***ptr3 = &ptr2; // pointer to pointer to pointer to integer

printf("ptr1: %p, *ptr1: %d\n", ptr1, *ptr1);
printf("ptr2: %p, *ptr2: %p, **ptr2: %d\n", ptr2, *ptr2, **ptr2);
printf("ptr3: %p, *ptr3: %p, **ptr3: %p, ***ptr3: %d\n", ptr3, *ptr3, **ptr3,
    ***ptr3);
```

.codelink[
`03-c-pointers/chain.c`
]
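
A short sketch of why a pointer to a pointer can be useful: a function receiving an `int **` can either retarget the caller's pointer (one dereference) or write to the final integer (two dereferences). The helpers `redirect` and `set_through_chain` are hypothetical and not part of `chain.c`.

```c
/* Hypothetical helper: make the pointer that pp points to refer to new_target. */
void redirect(int **pp, int *new_target) {
    *pp = new_target;   // one dereference: overwrite the int * itself
}

/* Hypothetical helper: write a value at the end of a two-level chain. */
void set_through_chain(int **pp, int value) {
    **pp = value;       // two dereferences: reach the int
}
```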

???

- Because a pointer is a variable it can itself be pointed to by another pointer
- In that case we talk of a pointer to pointer
- And with this we can create pointer chains linking memory locations
- You have an example here: we have a value, an int
- `ptr1` is a pointer to `int`, an `int *`, and points to the value
- Then we want to have something that points to `ptr1`
- That's a pointer to pointer to int, an `int **`
- It's `ptr2`, whose value is the address of `ptr1`
- Then we create another level in the chain with `ptr3`, which is a pointer to pointer to pointer to int, an `int ***`, which points to `ptr2`
- And next we print the value of each pointer and what it points to
- The chain is illustrated on the slide

---

# Function Pointers

```c
#include <stdio.h>

void greet_v1(char *name) {
    printf("Good morning, %s!\n", name);
}

void greet_v2(char *name) {
    printf("Good evening, %s!\n", name);
}

int main() {
    // declare a function pointer, function returns void and takes a char * as parameter:
    void (*func_ptr)(char *);
    char *username = "Pierre";

    func_ptr = greet_v1; // set func_ptr to point to greet_v1
    func_ptr(username);  // call greet_v1 through the pointer

    func_ptr = greet_v2; // set func_ptr to point to greet_v2
    func_ptr(username);  // call greet_v2 through the pointer

    return 0;
}
```

.codelink[
`03-c-pointers/function-ptr.c`
]

???

- Function pointers are a special type of pointer
- Rather than holding the address of some data, they hold the address of machine code
- More precisely a function pointer pointing to function f takes as value the address of the first byte of machine code of the function f
- A function pointer is declared as shown on the slide, with the return type of the pointed function (here void), the name of the pointer (here `func_ptr`), and the type or types of parameters of the pointed function, here a `char *`
- A function pointer can then be set to point to various functions fitting the prototype it was defined with; here we make it point to greet_v1 and then greet_v2
- This is a simple assignment using the function name to obtain its address
- The function pointer can be called by using its name as one would do for a function

---

# Summary

- A pointer: variable that stores an address corresponding to a memory location
  - Can access that location through the pointer
- Useful to let a function manipulate its calling context and avoid expensive copies of large data structures exchanged through function calls
- We have seen pointer-related aspects of arrays and custom data structures, and we also briefly covered function pointers