January 26 (Monday) January 30 (Friday)
The runtime stack is an implicit stack that is maintained automatically during program execution; we've alluded to this stack in previous lectures. The basic idea behind the stack is that everytime a function invocation takes place, a new stack frame (sometimes called an activation record) is pushed on the stack. Generally, the following information is pushed on the runtime stack during each function invocation (though different compilers may do things differently for efficiency and performance reasons):
static
) local variables are also pushed onto
the stack. Generally speaking, the compiler automatically takes care of
the storage attributes of these variables. As a result, these variables
have auto
storage.
Some variables may be pushed on the runtime stack shortly after
the program starts but before the main()
function is
actually called. Global variables (both static
and
non-static
) are stored on the stack when the program
starts. They are popped off the stack only after the program finishes.
The lifetime of these variables extends over the lifetime of the entire
program. static
global variables have the same storage
attributes as non-static
global variables, but their
linkage attributes are internal rather than external (i.e.
functions in only one source module can access them. We'll be talking
more about linkage attributes when we discuss multifile compilation in
the next lecture). Global variables are initialized to zero.
static
variables
Local static
variables have storage (lifetime) attributes
similar to global variables but with much more limited scope. Local
static
variables (like non-static
local
variables) can only be accessed inside the functions in which they are
defined. static
local variables are initialized when the
function containing the variable is first executed. They do
not get re-initialized each time the function containing
them is executed. By default, static
local variables are
initialized to zero.
Like global variables, space is automatically allocated for local
static
variables at the very beginning of program execution.
(i.e. a local static
variable is not pushed onto
the stack on each invocation of the function in which it resides).
As a result, static
local variables retain their values
across subsequent invocations of the functions in which they reside.
Recursive functions can use local static
variables to reduce
the amount of information that has to be pushed onto the stack on each
invocation. This is particularly true if the value of the variable is
constant over each invocation.
Local static
variables cannot be used if the function is to
be used in a multithreaded program. Some classic C library functions use
static
local variables (e.g. strtok()
)
and as a result would have to be made re-entrant if they are used in a
multithreaded context.
Consider the following (contrived) example:
#include <stdio.h>
int global = 1;
int func(int formal)
{
static int static_local;
int local = 0;
++static_local;
++local;
return local + global + formal + static_local;
}
int
main()
{
int actual = 2;
int ret_value;
ret_value = func(actual);
++global;
++actual;
ret_value = func(actual);
return 0;
}
stack.c
Initially, the global variable and static
local variable
are pushed on the stack and initialized (either explicitly by the
programmer or implicitly, to 0, by the compiler). Note that space is made
for static_local
even though the function it resides in
hasn't even been called yet. Also note that this variable is implicitly
initialized to zero (unlike regular local variables, which must be
initialized explicitly).
static_local (0)
|
global (1)
|
When the main()
function is called, space is allocated on
the stack for main()
's return value (an int
)
and main()
's local variables (actual
and
ret_value
) are pushed on the stack. Note that actual
is initialized to 2
, but ret_value
is uninitialized.
ret_value (?)
|
actual (2)
|
int (?)
|
static_local (0)
|
global (1)
|
Now, when we call func()
for the first time, we make room on
the stack for its return value and push the return address on the stack
(in this case, the return address would be the address of the instruction
that assigns the result of the function to the ret_value
variable). Then we copy and push the function's formal parameter on
the stack. Finally the function's local variable is pushed on the stack:
local (0)
|
formal (2)
|
return address |
int (?)
|
ret_value (?)
|
actual (2)
|
int (?)
|
static_local (0)
|
global (1)
|
After performing the two preincrements, the runtime stack will have the following values:
local (1)
|
formal (2)
|
return address |
int (?)
|
ret_value (?)
|
actual (2)
|
int (?)
|
static_local (1)
|
global (1)
|
After we do the additions:
local + global + formal + static_local = 1 + 1 + 2 + 1 = 5
the return
is performed, causing the stack
frame to be popped and control to jump to the return address
on the stack, giving us the resulting stack:
int (5)
|
ret_value (?)
|
actual (2)
|
int (?)
|
static_local (1)
|
global (1)
|
The assignment of ret_value
to the return value of the
function is then performed by popping the return value off the stack.
The global
and actual
variables are then
incremented, giving us the stack:
ret_value (5)
|
actual (3)
|
int (?)
|
static_local (1)
|
global (2)
|
We then call the function again, pushing the appropriate values and addresses on the stack as before.
local (0)
|
formal (3)
|
return address |
int (?)
|
ret_value (5)
|
actual (3)
|
int (?)
|
static_local (1)
|
global (2)
|
We increment local
and static_local
, as before.
Note that static_local
had retained its value from the
last invocation of the function, so its value now gets updated to
2
. This gives us the following runtime stack contents:
local (1)
|
formal (3)
|
return address |
int (?)
|
ret_value (5)
|
actual (3)
|
int (?)
|
static_local (2)
|
global (2)
|
After we do the additions:
local + global + formal + static_local = 1 + 2 + 3 + 2 = 8
the return
is performed, causing the stack
frame to be popped and control to jump to the return address
on the stack, giving us the resulting stack:
int (8)
|
ret_value (5)
|
actual (2)
|
int (?)
|
static_local (2)
|
global (2)
|
Again, ret_value
is then assigned the return value of the
function by popping the stack:
ret_value (8)
|
actual (2)
|
int (?)
|
static_local (2)
|
global (2)
|
main()
then finishes by returning zero which is eventually
returned to the operating system as the remainder of the runtime stack
is destroyed.
Note that there are some simplifications made in the above example, but the general idea should be clear.
A dangling pointer is a pointer that points to a memory location that was once valid but is now bogus. For example, consider the following invalid C program:
int *p;
void func1()
{
int a = 0;
p = &a;
}
void func2()
{
int b = 1;
}
main()
{
func1();
printf ("%d\n", *p);
func2();
printf ("%d\n", *p);
}
bad.c
The p
pointer pointed to a location on the runtime stack that
is no longer valid when func1()
ends. When func2()
executes, p
(may) just happen to point to the part of the
runtime stack now occupied by func2()
's local
variable b
. The end result is that the two printf()
calls may display different values. For example, running the above program
on a Linux machine produces:
0 1
The memory area pointed to by p
was on the runtime stack
during the execution of func1()
-- it pointed to the memory
area occupied by func1()'s
local variable a
.
When func1()
finished, the runtime stack was popped, meaning
that p
no longer points an allocated area of memory.
When func2
is then invoked, the runtime stack allocates room
for the local variable of func2()
, namely, b
.
In this particular example, p
now points to the memory
location occupied by b
. Therefore, when we initialize
b
, we are also changing the value to which p
points. Hence the change in output observed above.
strtok()
(K&R § B3)
C provides some rudimentary parsing functions which can be used
to extract information from strings (i.e. character
arrays). One function for parsing strings is called strtok()
which is roughly analogous to Java's StringTokenizer
class.
Its use is demonstrated by the program below:
#include <stdio.h>
#include <string.h>
#define MAX_BUF 80
int
main()
{
char *st;
char buffer[MAX_BUF + 1];
const char *delim = ",\n:";
if (fgets(buffer, sizeof(buffer), stdin) == NULL)
return 1;
if ((st = strtok(buffer, delim)) == NULL)
return 1;
do {
puts(st);
} while ((st = strtok(NULL, delim)) != NULL);
return 0;
}
strtok.c
We read in a string from standard input using fgets()
and store the characters in buffer
. We then call
strtok()
for the first time, passing in the string we
want to tokenize and a string of delimiters that delimit the substrings
that we are trying to extract. The first call to strtok()
will return a pointer to the first character of the first token in the
string.
We then enter a do...while
loop which displays the string
and re-calls strtok()
, this time with a first parameter of
NULL
. This signals strtok()
to use the string
that we supplied to the original invocation of strtok()
and
to find the next token in the string that is delimited by the supplied
array of delimiter characters. When all the tokens have been extracted,
strtok()
will return NULL
and the loop will
terminate.
Because strtok()
uses its own static
array to
store the string, the code is not suitable for multithreaded programs.
Also, note that strtok()
actually modifies the characters
within the array passed to it (it places nul bytes in the strings as it
extracts tokens). For these two reasons, use of the strtok()
function is not encouraged, but can still be useful if you are aware of
its limitations.
scanf()
(K&R § 7.4)
A more succinct way of parsing strings is to use the scanf()
function, which we saw earlier. Below, we are using this function
to parse and store a date followed by a colon followed by a series of
words:
#include <stdio.h>
#define MAX_BUF 80
int
main()
{
int year, month, day;
char entry[MAX_BUF + 1];
if (scanf("%d-%d-%d: %[^\n]", &year, &month, &day, entry) != 4) {
fprintf(stderr, "Invalid input\n");
return 1;
}
printf("Year: %d\nMonth: %d\nDay: %d\nEvent: %s\n",
year, month, day, entry);
return 0;
}
scanf.c
The format string denotes how the input is formatted. As expected,
%d
means that an integer is expected on the input.
The %[^\n]
conversion specifier means read and store
all characters upto, but not including, the first newline. If we
had used the %s
specifier, then the scanf()
function call would just input a single word (i.e. all characters
upto, but not including the first whitespace). Note that the above code
has a potential buffer overflow problem if more than 80 characters are
entered on standard input.
The scanf()
function will return the number of
conversion specifiers it was able to apply. In the above example,
if all four conversion specifiers were used during the input, then
scanf()
will return 4
. In the event that
all four values were extracted from the string, the above program will
display the values.
There is also a file version of scanf()
called
fscanf()
which takes a FILE *
pointer as
its first argument. This can be used to parse input coming from a
file other than standard input (e.g. a file you opened using
fopen()
).
Last modified: January 29, 2004 02:04:53 NST (Thursday)