A structure is a collection of heterogeneous values of arbitrary type which form a single logical unit in the context of a program. Each value within the structure is called a field and can be accessed by specifying the structure's variable name, followed by a dot followed by the field name within the structure.
Like arrays, structures can be initialized by enclosing the field initializers, (specified in the same order as they were in the structure declaration) within braces. Once an array has been initialized, the only way to change its values is by assigning to its individual fields; the entire structure cannot be reassigned to as it was during initialization (this is similar to arrays). Fields which are not initialized by the initializer are set to zero. If no initializer at all is specified for a structure, then all of its fields are undefined initially (again, like arrays).
Structures may even be nested and its nested fields can be accessed
by chaining the dot operators as demonstrated by the
following program, which nests a pair
structure inside a
floor
structure:
#include <stdio.h> #define MAX_ROW 10 #define MAX_COL 10 struct pair { int row, col; }; struct floor { char plan[MAX_ROW][MAX_COL]; struct pair position; }; int main() { struct pair pair_var = { 3, 2 }; /* Stand alone structure */ struct floor floor_var = { /* Initialize the two-dimensional array 'plan' */ { ".......", ".*.....", "....*.*", ".*...a*" }, /* Initialize 'position' */ {3, 5} /* Nested sub-structure */ }; int i; printf("pair_var's (row,col) is (%d,%d)\n", pair_var.row, pair_var.col);; puts("floor_var:"); for (i = 0; i < sizeof(floor_var.plan)/sizeof(floor_var.plan[0]); i++) { if (floor_var.plan[i][0] != '\0') puts(floor_var.plan[i]); } printf("floor_var's 'position' field is (%d,%d)\n", floor_var.position.row, floor_var.position.col); return 0; }
In the declaration of the two structures:
struct pair { int row, col; }; struct floor { char plan[MAX_ROW][MAX_COL]; struct pair position; };
pair
and floor
are sometimes referred to
as structure tag names. Note that whenever we define a
structure variable, we use the keyword struct
followed
by the structure's tag name to denote the variable's type. Obviously,
having to use struct
each time we wish to define a structure
variable is tedious. A very common idiom in C
is to declare structures using a typedef
, which creates a
synonym for the specified type:
typedef struct pair { int row, col; } t_pair; typedef struct floor { char plan[MAX_ROW][MAX_COL]; t_pair position; } t_floor;
Now, whenever we want to define a struct floor
, we
can simply say t_floor floor_var
instead of the more
cumbersome struct floor floor_var
. Note that the
t_
prefix is merely a convention and is not required
by C. Its purpose is to provide a way to
distinguish between a structure types and structure variables. If we
consistently declare all our structures with a leading t_
,
then we can define structure variables using the same name as the
structure typedef, but without the t_
prefix. (e.g.
t_floor floor
).
Using typedef
, we can even declare the structures without
the tag name at all:
typedef struct { int row, col; } t_pair; typedef struct { char plan[MAX_ROW][MAX_COL]; t_pair position; } t_floor;
When we do this, we can still create structure variables using
t_floor floor_var
, but we can no longer use the
original syntax struct floor floor_var
because the
structure tag no longer exists. Defining structures without a tag
name is quite common.
typedef
's, while commonly used for structure declaration,
can also be used to create simple abstract data types. For example,
MUN student IDs can be represented by an array of nine digits:
char student_id[10]; /* Don't forget the nul byte! */
Instead of having to type this everytime we wish to create a variable
to hold a student id, we can use a typedef
to create a new,
more abstract type representing a student ID as follows:
typedef char t_student_id[10];
Now, whenever we wish to create a variable that can store a student id,
we can simply use our new type t_student_id
:
t_student_id student_id;
We can define a linked list in C by using a self-referential structure. Consider the following example:
#include <stdio.h> #include <stdlib.h> #include <string.h> #define BUFFER_LEN 20 typedef char t_name[BUFFER_LEN]; typedef struct { t_name name; int num; } t_course; typedef struct node { t_course course; struct node *next; } t_node; int main() { t_course course[] = { { "COMP3710", 25 }, { "COMP3724", 20 }, { "COMP4718", 15 } }; int i = 0; t_node *list = NULL; /* Pointer to the front of the list */ t_node *it = NULL; /* Iterator */ /* Fill up the list */ for (i = 0; i < sizeof(course)/sizeof(course[0]); ++i) { t_node *cur = (t_node *) malloc(sizeof(t_node)); strcpy(cur->course.name, course[i].name); cur->course.num = course[i].num; cur->next = list; /* Insert node at front of list */ list = cur; /* Update the front of the list */ } /* Display the list */ for (it = list; it != NULL; it = it->next) printf("%s: %d\n", it->course.name, it->course.num); /* Free up the list */ while (list) { t_node *del = list; list = list->next; free(del); } return 0; }
The above program copies a collection of course names and their enrollments to a linked list, traverses the linked list to print out the courses and the frees up the linked list (this is a very round about way of printing out this list, but the code is meant for demonstration purposes only).
We define the following t_node
structure which contains
a structure containing the data we wish to store in the linked list
(e.g. a t_course
structure) and a pointer to
another node structure called next
. Note that because we
refer to the node structure inside the node structure itself, we have to
declare the structure with a tag name (node
). Also note
that we are not actually creating a nested structure inside itself,
as that would be impossible. Instead the struct node
contains a pointer to another struct node
.
typedef struct node { t_course course; struct node *next; } t_node;
We can define an array of structures in much the same way as we define an array of any other type. The contents of the entire array is enclosed by braces and the values of the fields of each structure element is similarly enclosed in braces in the initializer list:
t_course course[] = { { "COMP3710", 25 }, { "COMP3724", 20 }, { "COMP4718", 15 } };
Unlike initializing a two-dimensional array, the braces surrounding each structure element are required.
->
operator (K&R § 6.2)
The program then proceeds to iterate over all the elements in the
array. On each iteration, it dynamically allocates a new node
for the linked list and copies all the relevant data from the array
element to the node. Note that cur
represents
a pointer to the newly created node. Before we can access its
fields, we must dereference cur
first. Then we can
use the dot notation to access a field. However, because the dot
operator has higher precedence that the dereference operator,
we cannot refer to the course
field name of cur
by writing:
*cur.course
because this would try to apply the dot operator to cur
and cur
is not a structure (it's a pointer to a
structure). Instead, we must use parenthesis to force the dereferencing
to take place first:
(*cur).course
Syntactically, this is very clumsy to type each time you want to refer
to a field of a structure when you have a pointer to the structure
available. To make things cleaner, C provides
the ->
operator which has the same affect as the code
above. Therefore, we can get access to the course
field
of cur
by writing:
cur->course
The following two lines add the newly created node to the head
of the list. Note that the first node added to the list will
have a next
pointer of NULL
(which is
the initial value of list
). We use this NULL
pointer as a sentinel value when traversing the list.
cur->next = list; /* Insert node at front of list */ list = cur; /* Update the front of the list */
By hooking up the list in the manner shown above, we, in effect, build up the list backwards (i.e. the first element in array of courses will be the last element in the linked list).
The for
loop simply iterates over each element of the newly
created list by using the t_node
pointer it
.
On each iteration, it displays the course and the current enrollment.
This loop stops when we hit the end of the list (i.e. when
it
equals NULL
).
Finally, we traverse over the list a final time using a while
loop, deallocating each node as we go. Because the list is being
destroyed, there is no need to use the it
pointer; we can
simply use our original list
pointer instead.
Note that we save a temporary pointer to the current linked list node in
the del
pointer, update the list
iterator, then
free the node pointed to by del
. Attempting to free the
list without the temporary node pointer would yield unpredictable results
(K&R § 7.8.5):
/* How NOT to free up a linked list */ while (list) { free(list); list = list->next; }
The above code fragment attempts to use the pointer list
after the memory it pointed to was freed. This is a serious,
(but unfortunately, common) mistake in code written by beginners.
Pointers to structures can be used to represent many other data structures as well including trees and hash tables.
When we pass a structure, all the elements in the structure are copied
and assigned to the function's formal parameter. (This includes arrays.)
Any changes made to the structure's fields are not reflected in the
context of the calling function. Passing structures this way can be
quite expensive. It is more common instead to pass the structure's
address to the function instead of the entire structure itself.
If you do not wish for the function to be able to modify the structure,
you can declare the structure parameter as const
.
As a demonstration of passing structure pointers to functions, consider the following modularized version of the previous program:
#include <stdio.h> #include <stdlib.h> #include <string.h> #define BUFFER_LEN 20 typedef char t_name[BUFFER_LEN]; typedef struct { t_name name; int num; } t_course; typedef struct node { t_course course; struct node *next; } t_node; void store_list(t_node **); void display_list(const t_node *); void free_list(t_node *); int main() { t_node *list = NULL; /* Pointer to the front of the list */ store_list(&list); display_list(list); free_list(list); return 0; } void store_list(t_node **list) { t_course course[] = { { "COMP3710", 25 }, { "COMP3724", 20 }, { "COMP4718", 15 } }; int i = 0; /* Fill up the list */ for (i = 0; i < sizeof(course)/sizeof(course[0]); ++i) { t_node *cur = (t_node *) malloc(sizeof(t_node)); strcpy(cur->course.name, course[i].name); cur->course.num = course[i].num; cur->next = *list; /* Insert node at front of list */ *list = cur; /* Update the front of the list */ } } /* Display the list */ void display_list(const t_node *list) { for (; list != NULL; list = list->next) printf("%s: %d\n", list->course.name, list->course.num); } /* Free up the list */ void free_list(t_node *list) { while (list) { t_node *del = list; list = list->next; free(del); } }
There are some notable differences between the two versions of the programs.
main()
function is considerably simpler due
to the increased modularity.
list
is updated by store_list()
and
because we want this change to be reflected in the main()
function, we must pass the address of list
to
store_list()
and dereference list
in this
function, as necessary.
display_list()
uses the list
parameter
directly when traversing the linked list (i.e. there is no
it
iterator variable). Because list
is passed
by value, a copy of the address is passed to display_list()
.
Changes to the list
parameter that take place inside
the function are not reflected in the list
parameter
of the main()
function. Because the list
parameter is already initialized correctly when it is received
by display_list()
, the initialization part of the
for
loop is empty.
list
parameter is passed in as
a const t_node *
. This means that
display_list()
is not permitted to change any of
the fields of the structure pointed to by list
.
The function may change the list
parameter itself
(list = list->next
).
Last modified: Wed Jan 29 15:22:28 2003