
February 02, 2004 (Monday)

Pointers to functions (K&R § 5.11, 6.7)

In C, whenever we use the name of a function without the function call operator (the function call operator is simply the pair of parentheses that follows the function name), C will treat the function name as a pointer to the function. This is similar to the way C treats an array name without subscripts as a pointer to the first element of the array. For example, consider the following program:

#include	<stdio.h>
#include	<ctype.h>

int alpha_test(int ch) { return isalpha(ch) != 0; }
int digit_test(int ch) { return isdigit(ch) != 0; }

int countchars(char *, int (*)(int));

int main()
{
	int	(*func)(int);

	func = alpha_test;
	printf ("alpha_test('h') = %d\n", (*func)('h'));

	func = digit_test;
	printf ("digit_test('h') = %d\n", func('h'));

	printf("Alphas: %d\n", countchars("1a2b3d4", alpha_test));
	printf("Digits: %d\n", countchars("1a2b3d4", digit_test));
	return 0;
}

int countchars(char *str, int (*fp)(int))
{
	int ret = 0;

	for (; *str; str++)
		ret += fp((unsigned char) *str);	/* cast keeps the <ctype.h> argument non-negative */
	return ret;
}
ptrfunc.c

In the main() program, we define a variable called func whose type is pointer to a function that takes an int argument and returns an int. Note the parentheses in (*func) -- without these parentheses, we would have int *func(int); which would declare func to be a function that takes an int argument and returns a pointer to int. In other words, without the parentheses around *func, we would have a function prototype and not a pointer to a function. The declaration of a pointer to a function (e.g. int (*func)(int)) looks a bit peculiar and takes some time to get used to.

Once we have defined a variable of pointer-to-function type, we can assign a function to the variable by simply using the assignment operator. For example:

func = alpha_test;

At this point, func now points to the function alpha_test. We can now call this function via the func function pointer by saying:

(*func)('h')

The first pair of parentheses causes the pointer dereference to occur, and the second pair of parentheses actually calls the function with the enclosed arguments.

Next, we can assign another function to func:

func = digit_test;

Now, func points to the digit_test function. Again, we can invoke the digit_test function indirectly by writing:

func('h')

This syntax is equivalent to the (more cumbersome) syntax (*func)('h'). In other words, when using func('h'), where func is a pointer to a function, the dereferencing happens automatically. Which style you use depends upon your own preference.
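As a minimal sketch of this equivalence (the names square and calls_agree are hypothetical and are not part of ptrfunc.c), the following function calls the same target through each spelling and checks that the results match:

```c
/* Hypothetical helper used only for this illustration. */
int square(int n) { return n * n; }

/*
 * Calls the function through fp using each equivalent spelling and
 * returns 1 if every call produced the same result, 0 otherwise.
 */
int calls_agree(int (*fp)(int), int arg)
{
	int via_pointer = fp(arg);      /* dereference happens automatically */
	int via_deref   = (*fp)(arg);   /* explicit dereference */
	int via_double  = (**fp)(arg);  /* *fp converts back to a pointer, so extra *s are harmless */

	return via_pointer == via_deref && via_deref == via_double;
}
```

Both calls_agree(square, 7) and calls_agree(&square, 7) are valid calls: just as the dereference is optional when calling, the & is optional when taking the address of a function.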

Using pointers to functions as parameters

As with variables of any other type, we can even pass a pointer to a function to other functions, as demonstrated by the countchars() function. This function takes a string (or more accurately, a char *) argument and a pointer to a function that takes an int argument and returns an int value. It then iterates over the characters in the string using the supplied function argument to test each character of the string. Note that we have no idea what type of characters countchars() will count simply by looking at the definition of the countchars() function alone. What it counts depends entirely upon the behaviour of the function passed to it. The first time countchars() is called by the main() program, it will count alphabetic characters; the second time it is called, it will count digit characters. Even though the same countchars() function is being called, its behaviour is dictated by the function that is passed in as its second argument.

Note the prototype for the countchars() function:

int countchars(char *, int (*)(int));

The type of the second parameter is a pointer to a function that takes an int argument and returns an int return value. This prototype is equivalent to the prototype below. All we did was drop the names of the arguments (which is okay when declaring a function prototype):

int countchars(char *str, int (*fp)(int));

We will be seeing more examples of using pointers to functions as parameters to other functions when we study the C++ standard library.
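The C standard library uses the same technique: qsort() from <stdlib.h> takes a pointer to a comparison function and calls it whenever it needs to order two elements. The sketch below is only an illustration; the names compare_ints and median_of_three are hypothetical.

```c
#include <stdlib.h>

/* Comparison callback in the form qsort() expects: it returns a negative,
 * zero, or positive value depending on how *a orders relative to *b. */
static int compare_ints(const void *a, const void *b)
{
	int x = *(const int *) a;
	int y = *(const int *) b;

	return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* Sorts three values with qsort() and returns the middle one. */
int median_of_three(int a, int b, int c)
{
	int v[3];

	v[0] = a;
	v[1] = b;
	v[2] = c;
	qsort(v, 3, sizeof(v[0]), compare_ints);
	return v[1];
}
```

As with countchars(), qsort() itself knows nothing about the ordering it produces; passing a different comparison function would sort the same array in a different order.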

Function Dispatch Tables

As another demonstration of pointers to functions, consider a program that implements a very simple calculator that can process two numeric operands combined using a variety of operators. The calculator determines the result of the operation and displays the result. The behaviour of the program (named calc) is demonstrated below (notice that the operands and operator should be separated by a space):

$ ./calc
> 1 + 1
2
> 787 * 674
530438
> 15 & 5
5
> 128 | 64
192
> 128 + 64
192
> 27 / 3
9
> 34 - 87674
-87640
> abc + def
Invalid input.
> 7 ^ 8
Invalid operator: ^
> 6 % 2
Invalid operator: %
> 8 hello 9
Invalid operator: hello
> ^D
$

We can implement this calculator by reading in the operands and the operator and then use a rather large conditional control structure which determines which operator to apply by testing the string representation of the operator supplied by the user and then performing the requested operation as follows:

if (strcmp(op_str, "+") == 0)
	return op1 + op2;
else if (strcmp(op_str, "-") == 0)
	return op1 - op2;
else if (strcmp(op_str, "*") == 0)
	return op1 * op2;
/* ... and so on, with one else-if branch per remaining operator */

Another way of achieving the same result is to use pointers to functions, as demonstrated by the following code:

#include	<stdio.h>
#include	<string.h>

#define MAX_BUF	80

int add(int arg1, int arg2)	{ return arg1 + arg2; }
int sub(int arg1, int arg2)	{ return arg1 - arg2; }
int mul(int arg1, int arg2)	{ return arg1 * arg2; }
int div(int arg1, int arg2)	{ return arg1 / arg2; }
int and(int arg1, int arg2)	{ return arg1 & arg2; }
int  or(int arg1, int arg2)	{ return arg1 | arg2; }

typedef int (*t_op_func)(int, int);

typedef struct  {
	char     *op_name;
	t_op_func op_func;
} t_dispatch;

/* declare the dispatch_lookup function */
t_op_func dispatch_lookup(char *, char **, int);


int main()
{
	char		 buffer[MAX_BUF + 1];
	char		 op_name[MAX_BUF + 1];
	int		 arg1, arg2;
	t_op_func	 fp;
	char		*err;

	printf("> ");
	while (fgets(buffer, sizeof(buffer), stdin) != NULL) {
		if (sscanf(buffer, "%d %s %d", &arg1, op_name, &arg2) != 3)
			fprintf(stderr, "Invalid input.\n");
		else if ((fp = dispatch_lookup(op_name, &err, arg2)) != NULL)
			printf("%d\n", (*fp)(arg1, arg2));
		else
			fputs(err, stderr);	/* err may contain user input, so never use it as a format string */
		printf("> ");
	}
	return 0;
}

t_op_func dispatch_lookup(char *op_name, char **err, int arg2)
{
	t_dispatch dispatch[] = {
		{ "+",	add }, { "-",	sub }, 
		{ "*",	mul }, { "/",	div },
		{ "&",	and }, { "|",	 or }
	};
	int		i;
	static char	error[MAX_BUF + 20 + 1];

	for (i = 0; i < sizeof(dispatch) / sizeof(dispatch[0]); i++) {
		if (strcmp(dispatch[i].op_name, op_name) == 0) {
			if (strcmp(op_name, "/") == 0 && arg2 == 0) {
				*err = "Division by 0\n";
				return NULL;
			}
			*err = NULL;
			return dispatch[i].op_func;
		}
	}
	sprintf(error, "Invalid operator: %s\n", op_name);
	*err = error;
	return NULL;
}
calc.c

In the above program, we define functions for all the operations that the calculator will allow. For example:

int add(int arg1, int arg2)	{ return arg1 + arg2; }
int sub(int arg1, int arg2)	{ return arg1 - arg2; }
...

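We then create a dispatch table inside the dispatch_lookup() function. The table is simply an array of structures -- each structure consists of two members: op_name, the string representation of the operator, and op_func, a pointer to the function that implements that operator.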

When called with an operator string, the dispatch_lookup() function iterates over this table, searching for the operator string (op_name) in the table. If it finds the string in the table, it returns the corresponding pointer to the function that implements the operator to the main() function. main() then uses this function pointer to invoke the appropriate function:

printf("%d\n", (*fp)(arg1, arg2));

Again, remember that standard C lets you invoke the function through the pointer using the regular function call syntax, without having to dereference the pointer first. So, if you want, you could just say:

printf("%d\n", fp(arg1, arg2));

and the result will be the same. If dispatch_lookup was unable to find the operator string in the dispatch table, it will return NULL along with an error string via the err parameter which will then be displayed by the main() function.

If we wanted to add support for another operator to the calculator program, we only have to add a new function to implement the operation and add an appropriate entry to the dispatch array in the dispatch_lookup() function.
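As a rough sketch of such an extension, the fragment below adds a remainder operator. It is not calc.c itself: the table is trimmed to two entries and moved to file scope, and the names mod and find_op are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef int (*t_op_func)(int, int);

typedef struct {
	const char *op_name;
	t_op_func   op_func;
} t_dispatch;

static int add(int arg1, int arg2) { return arg1 + arg2; }
static int mod(int arg1, int arg2) { return arg1 % arg2; }  /* the new operation */

static const t_dispatch dispatch[] = {
	{ "+", add },
	{ "%", mod }    /* the new table entry: the only other change needed */
};

/* Returns the function registered for op_name, or NULL if there is none. */
t_op_func find_op(const char *op_name)
{
	size_t i;

	for (i = 0; i < sizeof(dispatch) / sizeof(dispatch[0]); i++)
		if (strcmp(dispatch[i].op_name, op_name) == 0)
			return dispatch[i].op_func;
	return NULL;
}
```

The lookup loop is untouched by the addition. One caveat: like division, the remainder operator is undefined when the second operand is 0, so a real version would need the same guard that dispatch_lookup() applies to "/".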

typedefs and Pointers to Functions

typedefs are very helpful when used to declare pointer-to-function types and can make complicated declarations involving such pointers easier to read. For example, consider the following typedef used in the code above:

typedef int (*t_op_func)(int, int);

With this typedef, we can now create pointer-to-function variables by simply saying:

t_op_func	fp;

instead of the more clumsy:

int (*fp)(int, int);

The typedef is even more helpful when specifying the return type of the dispatch_lookup() function:

t_op_func dispatch_lookup(char *, char **, int);

This says that dispatch_lookup is a function that accepts a char* argument, a char** argument and an int argument and returns a t_op_func type which, according to the typedef, is a pointer to a function that takes two int arguments and returns an int.

Without the typedef, the dispatch_lookup() function prototype declaration would look like this:

int (*dispatch_lookup(char *, char **, int))(int, int);

This means the same thing as the earlier declaration but looks very confusing, even to a seasoned C programmer.
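One way to convince yourself that the two spellings name the same type is to declare a function with one and define it with the other. In this small sketch (choose_op is a hypothetical function, not part of calc.c), the compiler accepts the pair only because the declaration and the definition agree:

```c
typedef int (*t_op_func)(int, int);

static int add(int arg1, int arg2) { return arg1 + arg2; }
static int sub(int arg1, int arg2) { return arg1 - arg2; }

/* Declared using the typedef ... */
t_op_func choose_op(int want_add);

/* ... and defined without it. A mismatch between the two forms would
 * be reported as a conflicting-types error at compile time. */
int (*choose_op(int want_add))(int, int)
{
	return want_add ? add : sub;
}
```

The result can be called directly, as in choose_op(1)(5, 3), which invokes add(5, 3).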

sscanf() and sprintf() (K&R § 7.2, B1.3)

The above code makes use of two other functions that have not yet been discussed. The sscanf() function operates similarly to scanf() except that instead of taking input from stdin, it takes its input from a string (i.e. a char*). We use sscanf() in the above program in order to eliminate the buffer overrun problem that could result if we used scanf() alone.

To see how this works, consider the case where a hostile user enters an excessively long operator string. If we had just used

scanf("%d %s %d", &arg1, op_name, &arg2);

then, no matter how large an array we used for op_name to store the operator string read by scanf(), the user would always be able to create a string one byte longer and end up compromising the program.

In the above program, however, we use a combination of fgets() and sscanf() in order to eliminate this possibility:

while (fgets(buffer, sizeof(buffer), stdin) != NULL) {
	if (sscanf(buffer, "%d %s %d", &arg1, op_name, &arg2) != 3) 
			...

fgets() will read at most MAX_BUF characters from stdin, so it will not overflow the buffer array. Then, sscanf() uses this buffer as the source of its input. It will try to parse the characters in the buffer array into two numbers and an operator string. Note that because the operator string op_name is defined to be the same size as buffer, there is no way that the op_name buffer can overflow -- the worst that can happen is that the program would generate an Invalid input. message.

sprintf() operates very much the same way as printf() or fprintf() except that instead of sending its output to stdout or a file, it writes its output to a character buffer. In the above program, if the user supplies an invalid operator, we want to return an error message that contains a short error message and the name of the invalid operator. We sprintf() the error message and the invalid operator into a local static error buffer, then return (via the err parameter) a pointer to that buffer. Note that we make the static error buffer 20 bytes larger than MAX_BUF to accommodate the initial Invalid operator: characters and the newline which will all be stored inside the array in addition to the operator name itself.
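A complementary safeguard, not used in calc.c, is to bound each call itself: a field width in the %s conversion limits how many characters sscanf() stores, and snprintf() (added in C99) limits how many characters are written to the output buffer. The sketch below assumes a 15-character operator limit and a deliberately small 24-byte error buffer; the names OP_MAX, count_fields and stored_error_length are hypothetical.

```c
#include <stdio.h>

#define OP_MAX 15   /* assumed operator length limit; must match the %15s below */

/* Returns how many of the three fields were parsed from line. The %15s
 * width stops sscanf() from storing more than OP_MAX characters (plus
 * the terminating '\0') into op_name, whatever the input length. */
int count_fields(const char *line)
{
	char op_name[OP_MAX + 1];
	int  arg1, arg2;

	return sscanf(line, "%d %15s %d", &arg1, op_name, &arg2);
}

/* Formats the error message into a small buffer and returns the number
 * of characters actually stored (excluding the '\0'), or -1 on error.
 * snprintf() never writes more than sizeof(error) bytes; it truncates. */
int stored_error_length(const char *op_name)
{
	char error[24];
	int  needed = snprintf(error, sizeof(error),
	                       "Invalid operator: %s\n", op_name);

	if (needed < 0)
		return -1;
	return needed < (int) sizeof(error) ? needed : (int) sizeof(error) - 1;
}
```

snprintf() returns the length the full message would have had, so comparing that value against the buffer size is how a caller detects truncation.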

I/O using fread() and fwrite() (K&R § B1.5)

Consider the following data structure (declared in person.h) which represents a person's name, address and date of birth:

#define MAX_BUF 80

typedef struct {
	char name[MAX_BUF + 1];
	char address[MAX_BUF + 1];
	int  year, month, day;
} t_person_info;
person.h

To store structures of this type to a file, we can use the fprintf() function. For example, if p was a variable of type t_person_info and fp was an open file stream, then we could write:

fprintf(fp, "%s,%s,%d/%d/%d\n", p.name, p.address, p.year, p.month, p.day);

This would store, in the file denoted by fp, a human-readable representation of the person's information. The data could later be read back in using fscanf().

We could also write out the structure by using the fwrite() function which will treat the structure as a big blob of binary data. We can even use this function to write an entire array of data without having to explicitly loop over each element of the array. For example, the following program demonstrates the use of the fwrite() function to write an array of t_person_info objects to a data file person.dat:

#include	<stdio.h>

#include	"person.h"

int main()
{
	t_person_info	people[] = {
		{ "Person A", "Address A", 1956, 1, 2 },
		{ "Person B", "Address B", 1966, 3, 4 },
		{ "Person C", "Address C", 1976, 5, 6 }
	};
	FILE	*fp;
	int	 num_people = sizeof(people) / sizeof(*people);

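	/* "wb" opens the file in binary mode, which matters on systems
	 * that translate line endings in text mode */
	if ((fp = fopen("person.dat", "wb")) == NULL) {
		fprintf(stderr, "Unable to write people file\n");
		return 1;
	}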

	/* Store the number of people */ 
	fwrite(&num_people, sizeof(int), 1, fp);

	/* Store the people's information */
	fwrite(&people, sizeof(people), 1, fp);

	fclose(fp);

	return 0;
}
fwrite.c

After defining the array and opening the file for write access, we store the number of t_person_info structures in the file as well as the contents of the t_person_info structures themselves:

fwrite(&num_people, sizeof(int), 1, fp);
fwrite(&people, sizeof(people), 1, fp);

The fwrite() function takes a generic pointer, the size of each data element pointed to by the pointer, the number of data elements to write and the open file pointer. It then writes to the file all the bytes in memory starting from the initial pointer, for a total length given by the product of the second and third parameters. The representation of this data, especially numbers, will typically not be readable by humans -- the data will be stored in machine format. The second call to fwrite() above could equivalently be rewritten as follows, with the same effect:

fwrite(&people, sizeof(*people), num_people, fp);

Remember that fwrite() will write the numbers in their binary equivalent. For example, consider the integer variable definition:

int num = 414374835;

If we had used:

fprintf(fp, "%d", num);

then the nine bytes 34 31 34 33 37 34 38 33 35 (shown in hexadecimal; they are the ASCII codes for the digits of the number 414374835) will be written to the file. However, using:

fwrite(&num, sizeof(int), 1, fp);

only the four bytes b3 db b2 18 will be written to the file, which is the representation of the number in little-endian format (assuming a 4-byte int). Needless to say, representing large numbers using the binary equivalent can be much more compact than using their corresponding "string" representation.
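The difference between the two representations can be inspected from within a program. The following sketch uses hypothetical helper names and assumes nothing about the host beyond what its comments state; it reports the length of the text form and the first byte of the in-memory form.

```c
#include <stdio.h>
#include <string.h>

/* Number of characters that fprintf(fp, "%d", n) would write for n. */
int text_length(int n)
{
	char digits[32];   /* ample for any int on common platforms */

	return sprintf(digits, "%d", n);
}

/* Returns the first byte of n as it is laid out in memory, which is the
 * first byte that fwrite(&n, sizeof(int), 1, fp) would write. */
int first_byte_in_memory(int n)
{
	unsigned char bytes[sizeof(int)];

	memcpy(bytes, &n, sizeof(n));
	return bytes[0];
}

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void)
{
	return first_byte_in_memory(1) == 1;
}
```

On a little-endian machine first_byte_in_memory(414374835) is 0xb3, the least significant byte; on a big-endian machine with a 4-byte int it would be 0x18.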

Once we have written the information to a file, we can read the data using fread() as demonstrated by the program below:

#include	<stdio.h>
#include	<stdlib.h>

#include	"person.h"

int main()
{
	t_person_info	*people;
	FILE		*fp;
	int	 	 num_people;
	int		 i;

	if ((fp = fopen("person.dat", "rb")) == NULL) {
		fprintf(stderr, "Unable to read people file\n"); 
		return 1;
	}

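	/* Read the number of people */
	if (fread(&num_people, sizeof(int), 1, fp) != 1 || num_people < 1) {
		fprintf(stderr, "Invalid people file\n");
		fclose(fp);
		return 1;
	}

	people = (t_person_info *) malloc(sizeof(t_person_info) * num_people);
	if (people == NULL) {
		fprintf(stderr, "Out of memory\n");
		fclose(fp);
		return 1;
	}

	/* Read the people's information; fread() returns the number of
	 * complete records read, so only those records are displayed */
	num_people = (int) fread(people, sizeof(t_person_info), num_people, fp);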

	fclose(fp);

	for (i = 0; i < num_people; i++) {
		printf("%s\n\t%s\n\t%4d/%02d/%02d\n",
			people[i].name, people[i].address, 
			people[i].year, people[i].month, people[i].day);
	}
	free(people);
	return 0;
}
fread.c

After we open the file for reading, we read in the number of records stored in the file and dynamically allocate an appropriately sized array to store all of the records. Then, using a single function call, we read in the entire contents of the array using fread():

fread(people, sizeof(t_person_info), num_people, fp);

fread() takes, as arguments, a pointer to the location that will store the results of the fread() operation, the size of each data element pointed to by the pointer, the number of data elements to read and a file stream.
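Both fwrite() and fread() return the number of complete elements transferred (not the number of bytes), so a result smaller than the requested count signals an error or, for fread(), end of file. The following sketch shows one way to check those return values; the function name roundtrip_count is hypothetical, and tmpfile() is used only so the example does not depend on person.dat.

```c
#include <stdio.h>

/* Writes three ints to a temporary file with fwrite(), reads them back
 * with fread(), and returns how many elements came back intact, or -1
 * if the file could not be created or written. */
int roundtrip_count(void)
{
	int    out[3] = { 10, 20, 30 };
	int    in[3]  = { 0, 0, 0 };
	int    matched = 0;
	size_t i, n;
	FILE  *fp = tmpfile();          /* temporary binary file, removed on close */

	if (fp == NULL)
		return -1;

	/* A short count from fwrite() means the write failed part-way. */
	if (fwrite(out, sizeof(out[0]), 3, fp) != 3) {
		fclose(fp);
		return -1;
	}
	rewind(fp);

	/* A short count from fread() means end-of-file or a read error. */
	n = fread(in, sizeof(in[0]), 3, fp);
	fclose(fp);

	for (i = 0; i < n; i++)
		if (in[i] == out[i])
			matched++;
	return matched;
}
```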

The code then loops over all the people in the array and displays the information in a human-readable format using printf(). After this is done, the dynamically allocated memory is freed and the program terminates.

Caution: files produced by calling fwrite() on raw variables and structures are notoriously non-portable across different architectures. For example, if you fwrite() a file on an architecture which stores its integers in little-endian format and then attempt to fread() that file on a machine that stores its numbers in big-endian format, then the data read will be quite different from the data stored. Differences in the size of int and in structure padding between compilers cause similar problems.
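A common way around the byte-order problem is to choose one byte order for the file and convert explicitly, one byte at a time, rather than writing an int directly. The sketch below picks big-endian order arbitrarily; all of the function names are hypothetical, and it is an illustration of the idea rather than a complete file format.

```c
/* Stores the low 32 bits of value into buf[0..3], most significant byte
 * first, regardless of the host's own byte order. */
void put_u32_be(unsigned char *buf, unsigned long value)
{
	buf[0] = (unsigned char) ((value >> 24) & 0xff);
	buf[1] = (unsigned char) ((value >> 16) & 0xff);
	buf[2] = (unsigned char) ((value >> 8) & 0xff);
	buf[3] = (unsigned char) (value & 0xff);
}

/* Rebuilds the value from four bytes produced by put_u32_be(). */
unsigned long get_u32_be(const unsigned char *buf)
{
	return ((unsigned long) buf[0] << 24) |
	       ((unsigned long) buf[1] << 16) |
	       ((unsigned long) buf[2] << 8)  |
	        (unsigned long) buf[3];
}

/* Encodes value into a 4-byte buffer and decodes it again. */
unsigned long roundtrip_u32(unsigned long value)
{
	unsigned char buf[4];

	put_u32_be(buf, value);
	return get_u32_be(buf);
}

/* Returns byte number index (0 to 3) of the encoded form of value. */
int encoded_byte(unsigned long value, int index)
{
	unsigned char buf[4];

	put_u32_be(buf, value);
	return buf[index];
}
```

The four bytes filled in by put_u32_be() can then be written with fwrite(buf, 1, 4, fp) and will decode to the same value on any machine, because the shifts operate on the value of the number and not on its layout in memory.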


Last modified: February 4, 2004 20:03:31 NST (Wednesday)