We have already seen three file streams that are accessible to every C program: stdin, stdout and stderr. These streams are open by default (stdin for input, stdout and stderr for output), and we do not have to worry about closing them when we are finished. As we have seen, they are typically passed as arguments to functions, e.g.
fgets(buffer, sizeof(buffer), stdin);
fprintf(stderr, "Error!");
We can also open files of our own for reading, writing, or both by using the fopen() and fclose() functions, which are part of the standard library. We can use these functions to write programs that read and write files (provided we have appropriate file system permissions). For example, consider the following program, which is a simple version of the Unix command cat. It takes the filename specified on the command line and writes the contents of the file to standard output.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    FILE *fp;
    int c;

    /* Expect exactly one command line argument: the file name */
    if (argc != 2) {
        fprintf(stderr, "Must specify a file name.\n");
        exit(1);
    }

    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Unable to open specified file.\n");
        exit(1);
    }

    /* Copy the file to standard output one character at a time */
    while ((c = fgetc(fp)) != EOF)
        putchar(c);

    fclose(fp);
    return 0;
}
One of the first things you should notice about this program is that main() now has formal parameters:
main(int argc, char **argv)
These parameters give the program a way to access command line arguments when the program is invoked. (They are analogous to the String args[] parameter of Java's main() method.) The first parameter, argc, is the number of command line arguments (including the name of the program executable itself). The second parameter, argv, is a pointer to an array of character strings that contain the name of the program being executed and the command line arguments. (You may also see argv declared as char *argv[]. In a parameter list, this is synonymous with char **argv.)
For example, if we compile the program and run it as follows:
$ ./a.out cat.c
argc will be set to 2 and argv will point to an array containing pointers to the strings holding the program name and the command line arguments. Therefore, in the invocation above, argv[0] will contain the string "./a.out" and argv[1] will contain the string "cat.c".
The following short program iterates over all the command line arguments (including the program name) and displays them. (K&R p.115 gives another couple of similar examples.)
#include <stdio.h>

int main(int argc, char **argv)
{
    int i;

    /* argv[0] is the program name; the arguments follow it */
    for (i = 0; i < argc; i++)
        printf("argv[%d] is %s\n", i, argv[i]);

    return 0;
}
Note that for the program invocation:
./a.out < input.txt
./a.out has no command line arguments (apart from itself). The redirection symbol and the input.txt filename are not treated as command line arguments. Instead, the shell interprets the redirection symbols < and >, ties stdin and stdout, respectively, to the specified files, and drops the symbols and the file names from the command line before the program is actually executed.
Returning to our original cat.c program, we first ensure that the user specified a filename as a command line argument. If the user did, then argc will be 2 (one for the command/program name itself and one for the actual command line argument). If the user did not, then we fprintf() an error to stderr and terminate.
fopen() and fclose()
If a file name was supplied on the command line, then we open the file using the fopen() function. This function takes two arguments: the name of the file to open (in this case, argv[1]) and the mode, which is passed in as a string. Because we are opening the file for read access only, we specify the mode as "r". If the fopen() call succeeds, then it returns a FILE pointer, which we use to access the file contents; if it fails, it returns NULL.
In order to access the contents of the file, we use the fgetc() function, which takes a FILE pointer and returns an integer representing the next character in the input file. fgetc() returns an integer type and not a character type because we have to allow for the possibility that the return value will be EOF (end of file), which is typically -1 and must be distinguishable from every valid character. The default char type may not be able to represent negative numbers.
Another popular character input function (which also returns an integer) is getchar(), which returns the next character from standard input (stdin). getchar() is sometimes #defined as fgetc(stdin).
We continue to read each character from the input file, displaying it using putchar(), until fgetc() returns EOF, at which point we close the file using fclose().
Consider the following program that takes a file name on the command line (say, filename) and creates another file named filename.rev in which the lines are written in reverse order and the characters of each line are reversed too. For example, if the file hello contained the text
Hello World
then running the program:
$ ./a.out hello
would create the file hello.rev, with the contents:
dlroW olleH
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>

#define BUFFER_LEN (80+1+1)

const char *extension = ".rev";

typedef struct node {
    char buffer[BUFFER_LEN];
    struct node *next;
} t_node;

int main(int argc, char **argv)
{
    FILE *fp;
    char buffer[BUFFER_LEN];
    t_node *list = NULL;
    char *outfile;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s filename\n", argv[0]);
        exit(1);
    }

    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Unable to open input file \"%s\": %s\n",
                argv[1], strerror(errno));
        exit(1);
    }

    /* Push each line onto the front of the list, so that the
     * last line read ends up at the head */
    while (fgets(buffer, sizeof(buffer), fp) != NULL) {
        t_node *cur = (t_node *) malloc(sizeof(t_node));

        if (cur == NULL) {
            fprintf(stderr, "No memory to store input line\n");
            exit(1);
        }
        strcpy(cur->buffer, buffer);
        cur->next = list;
        list = cur;
    }
    fclose(fp);    /* Close our input file stream */

    /* Don't forget '+ 1' for the nul byte */
    outfile = (char *) malloc(strlen(argv[1]) + strlen(extension) + 1);
    if (outfile == NULL) {
        fprintf(stderr, "No memory to store output file name\n");
        exit(1);
    }
    strcpy(outfile, argv[1]);
    strcat(outfile, extension);

    if ((fp = fopen(outfile, "w")) == NULL) {
        fprintf(stderr, "Unable to open output file \"%s\": %s\n",
                outfile, strerror(errno));
        exit(1);
    }

    /* Walk the list, writing each line's characters backwards */
    while (list) {
        int i;
        t_node *del = list;

        for (i = strlen(list->buffer) - 1; i >= 0; --i)
            fputc(list->buffer[i], fp);
        list = list->next;
        free(del);
    }

    fclose(fp);    /* Close our output file stream */
    free(outfile); /* Free up the memory that we
                    * allocated for the output file name */
    return 0;
}
The first part of this program works similarly to the earlier one, except that now we read the input a line at a time using fgets() (note the fp argument to fgets()) and store each line in a linked list. Because each new line is added at the head of the list, the lines come out in reverse order when we later traverse it. After we finish reading the file, we fclose() it, then dynamically allocate a new character buffer to hold the name of the new file we are about to create. We then open this new file for write access by specifying a mode of "w". Note that opening a file for write access will cause a file with the same name (and appropriate write permissions) to be overwritten without warning. We then iterate over our linked list of line buffers and write the characters in each buffer backwards. Note that we free each node after we are finished with it. We then close our output file and deallocate the memory that we allocated for the output file name.
strerror() and errno
We use a couple of new concepts in the preceding program, namely strerror() and errno. strerror() is declared in string.h and errno in errno.h. Many system functions (e.g. fopen()), when they fail, set a global variable called errno to an integer value that represents the reason why the function failed. The function strerror() takes this integer value and converts it into a human readable string. So, in our code above, if the call to fopen() fails, we output an error message to the stderr stream that gives the name of the file and a readable string indicating why the fopen() call failed.
As an aside, the calls to fclose() in the above program can also fail; however, unlike fopen(), which returns NULL, fclose() returns EOF (a non-zero value) upon failure. Strictly speaking, we should be checking the return value of fclose() too. Upon failure, fclose() will also set errno.
To see the various error messages that can be displayed, consider the following invocations of the above program.
$ ./a.out nofile
Unable to open input file "nofile": No such file or directory
$ chmod 000 revfile.c
$ ./a.out revfile.c
Unable to open input file "revfile.c": Permission denied
$ chmod 600 revfile.c
$ ./a.out revfile.c
$ cat revfile.c.rev
}
;0 nruter
/* eman elif tuptuo eht rof detacolla *
ew taht yromem eht pu eerF */ ;)eliftuo(eerf
/* maerts elif tuptuo ruo esolC */ ;)pf(esolcf
... etc. ...
Note that if we wanted to write each line without the characters reversed, we could have replaced the for loop and the fputc() call with either:
fprintf(fp, "%s", list->buffer);
or better yet,
fputs(list->buffer, fp);
Note that the position of the FILE * parameter (i.e. fp) in the parameter list isn't consistent across these functions: fprintf() takes it first, whereas fputs(), fputc() and fgets() take it last.
Other file modes are also available to the programmer ("a" for append is another popular mode). There are also ways to seek to an arbitrary offset within a file (fseek()). Arbitrary data types (including structures and arrays) can also be read from and written to files using fread() and fwrite().
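Note that files written using fwrite() on one architecture may not be readable by fread() on a different architecture, because byte order, type sizes and structure padding can differ between them. These two functions should not be used to store such data directly if portability is desired.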
Last modified: Fri Jan 31 15:49:55 2003