class: center, middle

### COMP26020 Programming Languages and Paradigms Part 1: C Programming

***

# The C Standard Library Part 2
## File I/O and Error Management

???
- Hello everyone
- In this second video regarding the standard library, we will talk about reading and writing to files, as well as error management

---

# File I/O

```c
int open(const char *pathname, int flags, mode_t mode);
```
.codelink[[man](https://linux.die.net/man/2/open)]

- `open` creates a reference to a file: **file descriptor**
- `pathname`: path of the file
- `flags`:
    - Access mode: `O_RDONLY`, `O_WRONLY`, `O_RDWR`
    - Create the file if it does not exist: `O_CREAT`
    - Truncate the file size to 0 if it exists: `O_TRUNC`
    - more: https://linux.die.net/man/2/open
- File permissions if created: `mode`

```c
// open /home/pierre/test, read-write mode, and create the file if it does not
// exist with r/w/x permissions for the owner
int file_descriptor = open("/home/pierre/test", O_RDWR | O_CREAT, S_IRWXU);
```

???
- When we want to access a file we first need to open it
- The `open` function is used to create a reference to a file
- This reference is named a file descriptor
- `open` takes several parameters
- `pathname` is the path of the file, for example `/home/pierre/test`
- `flags` specifies how the file will be used
- You can combine several flags with the bitwise OR operator `|`
- The access mode indicates if we will only read the file (`O_RDONLY`), only write the file (`O_WRONLY`) or both (`O_RDWR`)
- We can ask for the file to be created if it does not exist with `O_CREAT`
- We can truncate the file size to 0 if it exists with `O_TRUNC`
- There are many more described in the [manual](https://linux.die.net/man/2/open)
- The final argument `mode` indicates the file permissions if the file is created; see the man page for the accepted values
- `open` returns the file descriptor in the form of an `int`

---

# File I/O

```c
ssize_t read(int fd, void *buf, size_t count);
```
.codelink[[man](https://linux.die.net/man/3/read)]

- `read`
attempts to read bytes from a file
- `fd` is the file descriptor, previously created with `open`
- `buf` is the address of the buffer that will receive the content read
- `count` is the number of bytes to *try* to read
- Returns the number of bytes *actually* read

```c
int bytes_read = read(file_descriptor, buffer, 10);
if(bytes_read == -1) { /* error */ }
else if(bytes_read != 10) { /* technically not an error */ }
```

???
- Once we have the file descriptor we can access the file
- Use the `read` function to read from the file
- The first parameter is the file descriptor to read from
- The second is the address of a buffer that will receive the data that is read
- The third is the number of bytes to try to read
- The `read` function returns the number of bytes that were actually read
- You can use it to check for errors
- Note that it is not an error if the number of bytes read is smaller than what is requested, for example you could have reached the end of the file
- A return value of `-1`, however, does indicate an error

---

# File I/O

```c
ssize_t write(int fd, const void *buf, size_t count);
```
.codelink[[man](https://linux.die.net/man/3/write)]

- `write` attempts to write `count` bytes from `buf` into the file corresponding to `fd`
- Returns the number of bytes actually written

```c
int bytes_written = write(file_descriptor, buffer, 10);
if(bytes_written == -1) { /* error */ }
else if(bytes_written != 10) { /* technically not an error */ }
```

```c
int close(int fd);
```
.codelink[[man](https://linux.die.net/man/3/close)]

- `close` terminates file I/O on a given descriptor

???
- The `write` function is used to write to the file; it works in the same way as `read`
- We have the file descriptor to write to
- The address of a buffer containing the data we wish to write
- And the number of bytes to write
- It returns the number of bytes that were actually written
- Same as for `read`, it is not an error if the number of bytes written is smaller than what is requested, for example the device can be full
- However it is an error when it returns `-1`
- Once you are done with a file descriptor you need to close it with `close`

---

```c
#include <stdio.h>
#include <sys/types.h> // needed for open
#include <sys/stat.h>  // needed for open
#include <fcntl.h>     // needed for open
#include <unistd.h>    // needed for read and write
#include <string.h>

int main(int argc, char **argv) {
    int fd1;
    char *buffer = "hello, world!";

    fd1 = open("./test", O_WRONLY | O_TRUNC | O_CREAT, S_IRUSR | S_IWUSR);
    if(fd1 == -1) {
        printf("error with open\n");
        return -1;
    }

    /* write 'hello, world!' in the file */
    if(write(fd1, buffer, strlen(buffer)) != strlen(buffer)) {
        printf("issue writing\n");
        close(fd1);
        return -1;
    }

    /* write it again */
    if(write(fd1, buffer, strlen(buffer)) != strlen(buffer)) {
        printf("issue writing\n");
        close(fd1);
        return -1;
    }

    close(fd1);
    return 0;
}
```
.codelink[
`13-standard-library-2/file-write.c`
]

???
- Here is an example where we first open a file
- With these flags we specify that we will perform only write operations
- That if the file exists we want its size to be reduced to 0 upon open
- This effectively destroys all the content of the file if there was any
- And we also specify that we want to create the file if it does not exist, with read and write permissions for the current user
- Next, we perform a write operation in the file
- We have the file descriptor and we write from `buffer`, which contains "hello, world!"
- We set the number of bytes to write to the length of the string in that buffer
- For simplicity we exit if we cannot fully write the buffer
- And then we perform the same write operation again
- Next we close the file and exit

---
name: file

# File I/O: Write

```c
int fd = open( /* ... */, O_CREAT | O_TRUNC, /* ... */ );
write(fd, buffer, strlen(buffer));
write(fd, buffer, strlen(buffer));
```

???
- Here is a breakdown of what happens

---
template: file

- At `open` time:
???
- We have the file on disk on the top, let's assume it's empty at the beginning
- On the bottom we have the program memory, which includes the buffer
- When the file is opened, there is an internal offset value that is set at the beginning of the file on disk, that is address 0 in the file

---
template: file

- After first `write`:
???
- When we perform the first write the buffer is written in the file starting from the offset, then the offset is placed right after what was written
- Note that we don't write the termination character.

---
template: file

- After second `write`:
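
The final position of the offset can also be checked from the program itself. This is a minimal sketch, assuming `lseek` (not covered in this video) is used to query the current offset; the helper name `offset_after_two_writes` and the path passed to it are hypothetical:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes "hello, world!" twice into path and returns
 * the file offset afterwards, or -1 on error. */
off_t offset_after_two_writes(const char *path) {
    const char *buffer = "hello, world!";
    ssize_t len = (ssize_t)strlen(buffer); /* 13, the '\0' is not counted */
    off_t offset;

    int fd = open(path, O_WRONLY | O_TRUNC | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    /* each successful write advances the offset by the bytes written */
    if (write(fd, buffer, len) != len || write(fd, buffer, len) != len) {
        close(fd);
        return -1;
    }

    /* moving 0 bytes from the current position only reports the offset */
    offset = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return offset;
}
```

Calling `offset_after_two_writes("./test")` returns 26 when both writes fully succeed: two times the 13 bytes of the string, without any termination character.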
???
- Then we write the buffer a second time in the file starting at the offset, and the offset is shifted to the end of what was written

---

# File I/O: Read

```c
// Assume ./test was previously created with the write example program
char buffer2[10];
int fd2 = open("./test", O_RDONLY, 0x0);
int bytes_read;

if(fd2 == -1) {
    printf("error open\n");
    return -1;
}

/* read 9 bytes */
if(read(fd2, buffer2, 9) != 9) {
    printf("error reading\n");
    close(fd2);
    return -1;
}

/* fix the string and print it */
buffer2[9] = '\0';
printf("read: '%s'\n", buffer2);

/* read 9 bytes again */
bytes_read = read(fd2, buffer2, 9);
if(bytes_read != 9) {
    printf("error reading\n");
    close(fd2);
    return -1;
}

/* fix the string and print it */
buffer2[9] = '\0';
printf("read: '%s'\n", buffer2);

close(fd2);
```
.codelink[
`13-standard-library-2/file-read.c`
]

???
- Now things work very similarly for read operations
- In this example we open the file we previously created in read-only mode
- We read 9 bytes from it into `buffer2` and we display the content of `buffer2`
- We do this operation twice
- Note how we manually write the string termination character in the last byte of `buffer2`, right after what was read

---
name: read

# File I/O: Read

```c
char buffer2[10];
int fd2 = open(/* ... */);
read(fd2, buffer2, 9);
read(fd2, buffer2, 9);
```

???
- Let's break down what happens

---
template: read

- At `open` time:
???
- At open time we have the file content, and the program memory that includes the 10-byte `buffer2`

---
template: read

- After the first call to `read`:
???
- A first read operation of 9 bytes reads the first 9 bytes of the file into the first 9 bytes of `buffer2` and shifts the offset right after
- We also fix up the string by manually writing the termination character

---
template: read

- After the second call to `read`:
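
The two fixed-size reads can be generalised into a loop that follows the offset until the end of the file. This is a minimal sketch, assuming two hypothetical helpers, `read_chunk` and `print_chunks`; it places the termination character right after the bytes *actually* read, so a short final read is handled too:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads up to n bytes from fd into out (which must hold
 * at least n + 1 bytes) and terminates the string right after the bytes
 * actually read. Returns what read returned: -1 on error, 0 at end of file. */
ssize_t read_chunk(int fd, char *out, size_t n) {
    ssize_t bytes_read = read(fd, out, n);
    if (bytes_read >= 0)
        out[bytes_read] = '\0';
    return bytes_read;
}

/* Hypothetical helper: prints the file 9 bytes at a time. Returns the number
 * of chunks printed, or -1 on error. */
int print_chunks(const char *path) {
    char buffer2[10];
    ssize_t bytes_read;
    int chunks = 0;

    int fd2 = open(path, O_RDONLY);
    if (fd2 == -1)
        return -1;

    /* each read continues from where the previous one left the offset */
    while ((bytes_read = read_chunk(fd2, buffer2, 9)) > 0) {
        printf("read: '%s'\n", buffer2);
        chunks++;
    }

    close(fd2);
    return bytes_read == -1 ? -1 : chunks;
}
```

On the 26-byte file produced by the write example, `print_chunks("./test")` prints `'hello, wo'`, `'rld!hello'` and then the 8 remaining bytes `', world!'`.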
???
- A second read operation reads the next 9 bytes from the file and shifts the offset again
- Once again we need to manually set the termination character

---
class: middle, inverse, center

# Random Numbers

???
- Now let's talk briefly about generating random numbers

---

# Random Numbers

.leftcol[
```c
// V1: get numbers between 0 and RAND_MAX
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    for(int i=0; i<10; i++)
*       printf("%d ", rand());
    printf("\n");
    return 0;
}
```
.codelink[
`13-standard-library-2/random-v1.c`
]
]

.rightcol[
```c
// V2: numbers between 0 and 99 with modulo
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    for(int i=0; i<10; i++)
*       printf("%d ", rand()%100);
    printf("\n");
    return 0;
}
```
.codelink[
`13-standard-library-2/random-v2.c`
]
]

```c
// V3: display a different sequence each time we launch the program
#include <stdio.h>
#include <stdlib.h>
#include <time.h>   // needed for time

int main(int argc, char **argv) {
*   srand(time(NULL)); // init random seed
    for(int i=0; i<10; i++)
        printf("%d ", rand()%100);
    printf("\n");
    return 0;
}
```
.codelink[
`13-standard-library-2/random-v3.c`
]

???
- For this we have the `rand` function
- In the first code snippet on the slide we have a simple program in which we call it 10 times
- Each call will return a pseudo-random number between 0 and a large constant named `RAND_MAX`
- If we want to constrain the number generated to fall within a particular interval we can use the modulo operator
- In the second example we'll get only numbers ranging between 0 and 99
- Finally, you may notice that the sequence is always the same across several runs of the first and second programs
- This is not a very random behaviour
- It is due to the way the numbers are generated: they are computed in sequence from a value called the seed
- A given seed will always yield the same sequence
- So to get different sequences across multiple executions, we can initialise the seed based on the current time
- Note that in some cases it is good to have a fixed seed to ensure reproducible results

---
class: middle, inverse, center

# Error Management

???
- Finally, let's talk about error management

---

# Error Management

- The variable `errno` can be used to get more information about the failure of many functions of the standard library

```c
/* ... */
#include <errno.h> // needed for errno (perror itself is declared in stdio.h)

int main(int argc, char **argv) {
    int fd = open("a/file/that/does/not/exist", O_RDONLY, 0x0);

    /* Open always returns -1 on failure, but it can be due to many different
     * reasons */
    if(fd == -1) {
        printf("open failed! errno is: %d\n", errno);

        /* errno is an integer code corresponding to a given reason. To format
         * it in a textual way use perror: */
        perror("open");
    }

    return 0;
}
```
.codelink[
`13-standard-library-2/errno.c`
]

- Errors are listed in the function man page

???
- When a function from the library fails, a variable maintained by the C library is set with an error code
- The name of this variable is `errno`
- Here we try to open a file that does not exist, so `open` fails and returns `-1`
- We can print the value of `errno`; it's an integer so it's not very helpful by itself
- We can get a string describing the error using `perror`, which internally looks at `errno` and prints a textual description of the error on the standard error stream

---

# Summary

- File I/O
- Getting random numbers
- Error management

----

.center[Feedback form: https://bit.ly/3Cv5l9W]
???
- Let's recap
- In this video we talked about the C standard library, more precisely about file I/O, random number generation, and error management
- In the next video we will talk about string to integer conversion, and additional functions to access files