January 28 (Wednesday) February 02 (Monday)
So far, all the programs we have studied have been contained in a single source file. For nontrivial programs, it is helpful to break up the program into several source files. Each file (sometimes called a module or a translation unit) would typically contain a collection of logically related functions, some of which would be callable from other modules.
For example, consider the source code for Assignment #3. The source code is distributed over several different source files:
Filename | Description | Functions/Linkage |
---|---|---|
main.c | contains the main() function, which calls functions present in other files. It is not uncommon for programs to have a main.c module that contains the main() function. | main()/extern |
imgdims.c | contains all the functions that are responsible for calculating the dimensions of various image types. | bytes_to_num()/static, gifpng_file_dims()/extern, jpg_file_dims()/extern |
imgtype.c | contains functions related to identifying various image types (PNG, GIF89a, JPG). | get_img_type()/extern |
process.c | contains functions that open files, store relevant image information in a linked list of structures, sort images by dimension and free the linked list. This module contains functions that are called by main() and that call functions present in the other non-main.c modules. | get_basename()/static, free_images()/extern, sort_by_pixels()/extern, process_file()/extern |
(The partitioning of source files in this particular example is a
little overly aggressive, as some of the files (main.c and imgtype.c)
define only one function. In more practical cases, file modules may
contain tens of functions.)
The make program
The assignment uses a Makefile to compile the program. The Makefile
contains the names of the source files and rules on how the program is
to be built. The make program uses the Makefile to determine which
source files need to be recompiled. If only a few source files have
changed, then, generally speaking, only those will have to be recompiled.
For each compiled file, the compiler generates an object file, which
has the same name as the original source file but has a .o or .obj
extension. When all the files have been compiled to object files,
another program called the linker puts all the object files together
to generate the final executable. The make program uses a series of
explicit and implicit rules to determine how to compile and build the
final executable.
A more detailed description of make is beyond the scope of these notes,
but if you are curious, you can read a tutorial on the web.
Most current Integrated Development Environments (IDEs) provide automated
program building through the use of project files. These project
files are certainly easier to create and modify in the context of an IDE.
Unfortunately, these project files tend to be large, proprietary, binary
files which are nearly impossible to use outside of an IDE, making
portability difficult (even on the same operating system/hardware
architecture). Their formats also tend to change from release to release,
which makes the transition from one version of a vendor's IDE to the
next problematic. Curiously, an IDE may allow one to generate a Makefile
from a project file.
As we learned earlier, before a function is called, it should be declared,
either with a prototype or by having the function definition occur in the
file before its invocation. This is generally pretty easy to do when we
are dealing with a single file. However, in the context of a program
which has multiple files, things become a bit trickier. For example,
in the assignment, consider the function process_file(). This function
is defined in the file process.c, but is called by the main() function
(defined in main.c). How (and where) do we declare this function so that
the main() function (and possibly other functions) can see
process_file()'s prototype? One option would be to declare the function
in each file that uses it. Unfortunately, this would be tedious and
error prone. If we later changed the function's interface (i.e. its
return type and/or arguments), then we would have to change all
occurrences of the function prototype in all the files that declared it.
Instead, we declare the function prototype once in a header file and
#include that header file in all files that call this function. In our
example, we declare process_file() in img.h and #include "img.h" in all
the modules that call process_file(). Note that the filename being
included is enclosed in double-quotes rather than angle brackets. This
tells the preprocessor to look for the file in the directory of the
including source file first, rather than only in the "standard" place
(typically /usr/include on Unix systems). We include img.h in process.c
too because this forces us to keep the function definition and
declaration consistent. If we change process_file()'s interface and try
to recompile it without changing the prototype in the header file, the
compiler will generate an error telling us that the prototype
declaration and the function definition are inconsistent.
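The arrangement described above can be sketched in a single listing. This is only an illustration: in a real project the three commented sections would be separate files, and the names shape.h, area.c and rect_area() are hypothetical, not taken from the assignment.

```c
/* ---- shape.h (hypothetical header): the one shared prototype ---- */
extern long rect_area(int width, int height);

/* ---- area.c (hypothetical): would begin with #include "shape.h" ---- */
/* Because the prototype is visible here, the compiler compares it with
 * this definition and reports an error if the two disagree. */
long rect_area(int width, int height)
{
    return (long)width * height;
}

/* ---- main.c (hypothetical): would also begin with #include "shape.h" ---- */
long total_pixels(void)
{
    /* The arguments of this call are checked against the prototype. */
    return rect_area(640, 480);
}
```

If the definition were changed to take three arguments while the prototype still listed two, compiling the section marked area.c would fail, which is exactly the consistency check we want.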
This strategy is used for all functions that are defined in one file but
called from another file. We collect all these function declarations and
store them in img.h, which is then included by all the files.
Some of the function declarations in this header file are listed below:
/* Defined in imgtype.c */
extern const t_img_info *get_img_type(FILE *);

/* Defined in process.c */
extern void process_file(const char *, t_image_node **);
extern void sort_by_pixels(t_image_node *);
extern void free_images(t_image_node *);
Note that these function declarations (and their corresponding definitions
in the .c files) require knowledge of the t_image_node and t_img_info
structure types; therefore, we declare those types in img.h as well.
(img.h also contains a couple of #define macros as well as several
typedefs.)
Because all the files in the program rely on img.h, whenever this header
file changes, we must recompile all the source files before a new
executable can be built. (Incidentally, a Makefile provides a way to
specify the reliance of one file on another through dependencies. The
Makefile for Assignment #3 demonstrates how to do this.)
The keyword extern signifies the linkage of the function. By default,
functions have external linkage anyway, so declaring a function extern
is redundant in most cases, but is still helpful from a consistency
point of view. Note that all functions that we did not declare in the
img.h header file are defined as static functions in their respective
source files (e.g. bytes_to_num() in imgdims.c and get_basename() in
process.c). By declaring a function to be static, we are saying that it
has internal linkage (i.e. its name is not visible outside this source
file, so functions in other files cannot call it by name). Using static
functions helps to make the source files more encapsulated. You can
think of the static functions as being "helper" (or private) functions
for the other functions in the source file.
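As a small sketch of this public/private split, the listing below stands in for one source file. The function names and signatures are hypothetical and are not the ones used in the assignment.

```c
#include <stddef.h>

/* Internal linkage: a private helper, visible only in this source file. */
static unsigned long bytes_to_ulong(const unsigned char *bytes, size_t n)
{
    unsigned long value = 0;
    size_t i;

    /* Combine the bytes, most significant byte first. */
    for (i = 0; i < n; i++)
        value = (value << 8) | bytes[i];
    return value;
}

/* External linkage (the default): other files may call this function
 * once they have seen its prototype, typically via a header file. */
unsigned long read_width(const unsigned char *header)
{
    return bytes_to_ulong(header, 2);
}
```

Another source file could define its own static function called bytes_to_ulong() without any conflict, because neither name is visible to the linker.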
Multiple files may also need to share global variables. A global
variable can be declared as extern in a header file, e.g.:
extern char marker;
and subsequently defined in exactly one of the .c files:
char marker = 'a';
Any file that wants access to this variable just has to include the header file in which the variable is declared. It's okay for a source file to include a header file containing variable, structure or function declarations that the source file does not use.
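The following single-file sketch shows the declaration and the definition side by side. The comments mark where each part would live in a multi-file program; the header name globals.h and the function next_marker() are hypothetical.

```c
/* ---- would live in a header (hypothetical globals.h) ---- */
extern char marker;      /* declaration only: no storage is allocated */

/* ---- would live in exactly one .c file ---- */
char marker = 'a';       /* definition: storage allocated and initialized */

/* ---- any file that includes the header can use the variable ---- */
char next_marker(void)
{
    marker++;            /* modifies the single shared object */
    return marker;
}
```

It is legal for the declaration and the definition to appear in the same translation unit, which is what happens when the defining .c file includes its own header.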
Notice the important distinction between declaring a variable and defining
it. The declaration merely tells the compiler the type of the variable,
whereas the definition actually allocates space for it (and initializes
it, if an initializer was specified). Generally speaking, statements of
the form extern type varname are declarations, whereas statements of the
form type varname are definitions. A global variable should never be
defined in more than one place (e.g. do not define variables in header
files); otherwise, the linker may generate a multiple definition error.
Variables may be declared in several places, so it is okay to declare a
variable in a header file and include that header file in multiple
source files. The types of a variable's definition and its declarations
must be consistent. For example, if you define a variable as an array in
a source file, then do not declare it as a pointer in a header file.
Remember that arrays and pointers are different types.
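To see why the array/pointer mismatch matters, the sketch below defines an array and, separately, a pointer to it, and compares their sizes. The names banner and alias are hypothetical.

```c
#include <stddef.h>

char banner[16] = "hello";     /* an array object: 16 chars of storage */
const char *alias = banner;    /* a separate pointer object holding an address */

/* sizeof applied to the array gives the size of the whole array... */
size_t banner_size(void)
{
    return sizeof banner;
}

/* ...but applied to the pointer gives only the size of the pointer. */
size_t alias_size(void)
{
    return sizeof alias;
}
```

If a header declared extern char *banner; while the source file defined char banner[16], code in other files would treat the first bytes of the array as if they were an address, which is undefined behaviour. The matching declaration is extern char banner[]; (or extern char banner[16];).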
One more caveat: do not confuse macro definition (e.g. #define) with
variable definition. The two are completely different.
static global variables (K&R §4.6)
If we want to define a global variable in a file but restrict its scope
so that it is accessible only by functions defined in that source file,
we can define the variable outside all functions (typically near the
top of the file) and declare it static. For example, in imgdims.c, we
define:
static const t_img_data jpg_marker = '\xff';
near the top of the source file, before any function definitions
(i.e. as a global variable). Now, all functions in that source file can
access the variable, but functions in other source files cannot. Other
source files can define their own variables named jpg_marker, but those
would be distinct from this one. Variables created in this way are
sometimes referred to as file-scope variables.
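A minimal sketch of a file-scope variable shared by two functions in the same source file follows. The names images_seen, note_image() and image_count() are hypothetical.

```c
/* File-scope variable with internal linkage: only this file can name it. */
static int images_seen = 0;

/* Both functions below read or modify the same images_seen object. */
void note_image(void)
{
    images_seen++;
}

int image_count(void)
{
    return images_seen;
}
```

Code in other source files can change or inspect the counter only through note_image() and image_count(), which keeps the variable itself private to this file.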
Note that a global variable that is defined to be static is very
different from a local variable that is defined to be static.
In the first case, the static keyword denotes the (internal) linkage of
the variable, whereas in the second case, static denotes the storage
class of the variable: the local variable keeps its storage, and
therefore its value, for the entire run of the program rather than only
for the duration of one function call. The concept of storage class was
described in the context of the runtime stack during the previous lecture.
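The difference in storage class can be seen by comparing a static local variable with an ordinary (automatic) one. Both function names here are hypothetical.

```c
/* static local: initialized once, keeps its value between calls. */
int next_id(void)
{
    static int id = 0;

    id++;
    return id;
}

/* automatic local: created and initialized anew on every call. */
int always_one(void)
{
    int id = 0;

    id++;
    return id;
}
```

Successive calls to next_id() return 1, 2, 3 and so on, whereas always_one() returns 1 every time. In both functions the name id is visible only inside the function body; static changes how long the variable lives, not where it can be named.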
Last modified: January 29, 2004 23:50:27 NST (Thursday)