February 09 (Monday) February 13 (Friday)
Although the program discussed during the last lecture was relatively small, we can break it up into several source modules which can be compiled separately and linked together to form a single executable. This is similar to what we did with the C program earlier. You can download and unzip the following zip file to see how the textbook modularizes the program across several modules.
As in C, #include
files form the basis of `declaration
communication' between source modules. Structure and function
declarations are placed in header files -- any source modules that uses
the structures and/or functions must then include the corresponding
header file.
#include "filename.h" ...
We use the .h
extension and enclose our own header files in
quotation marks when including them in a source file. As in C, we include
the header file in both the source module that defines the functions
and the source modules that invoke the functions so that the consistency
between the function invocation and function definition can be ensured.
One problem that can occur in C++ (and in C too, for that matter) is one of multiple inclusion, in which the same header file gets included multiple times, thereby resulting in potential problems during compilation. To get around this problem, we use a guard at the top of each header file that we create:
#ifndef GUARD_filename_h #define GUARD_filename_h ... #endif // GUARD_filename_h
This uses the preprocessor to test whether the GUARD_filename_h
macro has been #define
d. If not, then the header file is
included, as normal, and the first thing that the include
file does is to #define GUARD_filename_h
so that if there
is an attempt by the compiler to re-include this file (either directly
or indirectly) in the same source module, the first #ifndef
will prevent this. We must end the #ifndef
with a
corresponding #endif
preprocessor directive at the end
of the header file.
In header files, you should never use using
declarations.
Instead, within your header files you should always fully qualify the
types and objects used (e.g. by prefacing all standard namespace
entities with std::
). All using
declarations
should be done only in the context of source modules where their effect
is limited to the source file alone. If you put using
declarations in your header file, you are making the tacit assumption
that other developers who include your header files will also want
the accompanying using
declarations of your header file
in their source module -- this is not always the case and can lead to
naming conflicts.
This chapter discusses further operations that you can perform
on vector
s and how to iterate over the elements in a
vector
using iterators. Some of the efficiency
concerns with vector
objects are discussed and a new
container, the list
is also introduced.
vector
Say we want to partition our vector
of
Student_info
objects into two separate groups:
those that pass the course and those that fail. We introduce the
fgrade()
predicate function to determine whether or not a
student has passed the course:
// predicate to determine whether a student failed
bool fgrade(const Student_info& s)
{
return grade(s) < 60;
}
fgrade.cpp
This function calls the grade(const Student_info &)
function
and returns true
if the student has a failing grade.
The function has been added to the grade.cpp
source module
described earlier. We can then use this function to write another
function that will partition a given vector
of students
into those who passed and those who failed:
#include <vector>
#include "Student_info.h"
#include "grade.h"
using std::vector;
// separate passing and failing student records: first try
vector<Student_info> extract_fails(vector<Student_info>& students)
{
vector<Student_info> pass, fail;
for (vector<Student_info>::size_type i = 0;
i != students.size(); ++i)
if (fgrade(students[i]))
fail.push_back(students[i]);
else
pass.push_back(students[i]);
students = pass;
return fail;
}
fails_vec1.cpp
The extract_fails()
function iterates sequentially over
all the elements in the students
vector and calls the
fgrade()
function on each student. If the student has a
failing grade, the student's record is stored in a fail
vector; otherwise, it is stored in a pass
vector -- both of
these vectors are local to the extract_fails()
function.
After the loop has completed, the pass
vector is assigned
to the students
parameter. This parameter was passed by
reference, therefore, any changes to this parameter will be reflected
by the calling function (main()
in this case). Finally,
the function returns the vector
of students who failed the
course as the return value for this function.
This function can be called in our main()
program as follows.
#include <algorithm>
#include <vector>
#include <iostream>
#include <string>
#include "Student_info.h"
#include "grade.h"
//driver program for grade partitioning examples
using std::cin;
using std::cout;
using std::endl;
using std::sort;
using std::string;
using std::vector;
vector<Student_info> extract_fails(vector<Student_info>& v);
int main()
{
vector<Student_info> vs;
Student_info s;
while (read(cin, s))
vs.push_back(s);
sort(vs.begin(), vs.end(), compare);
vector<Student_info> fails = extract_fails(vs);
for (vector<Student_info>::size_type i = 0; i < fails.size(); ++i)
cout << fails[i].name << " " << grade(fails[i]) << endl;
return 0;
}
ext_main.cpp
vector::erase()
method
Unfortunately, in our first attempt at the extract_fails()
function, the pass
and fail
vectors hold
duplicate copies of the elements in the original students
vector. As a result, just prior to the students =
pass;
assignment, the function has essentially two copies of the
vector
storing the students. This is wasteful use of memory.
We can do better by removing the actual failures from the list as we
encounter them during our iteration as demonstrated below:
#include <vector>
#include "Student_info.h"
#include "grade.h"
using std::vector;
// second try: correct but potentially slow
vector<Student_info> extract_fails(vector<Student_info>& students)
{
vector<Student_info> fail;
vector<Student_info>::size_type i = 0;
// invariant: elements [0,i) of `students' represent passing grades
while (i != students.size()) {
if (fgrade(students[i])) {
fail.push_back(students[i]);
students.erase(students.begin() + i);
} else
++i;
}
return fail;
}
fails_vec2.cpp
This above function will no longer keep two entire copies of the vector
in memory. Instead, as the loop encounters a failing student, it will
store the student's record in a fail
vector
and the remove that student record from the students
vector.
After the loop has finished, the students
vector, which
was passed by reference, will contain all the passing students
and the fail
vector will be returned by the function.
There are a few subtle points with the above code:
vector
by merely supplying
an index. Instead, we can compute an offset relative to the value
returned by the students.begin()
method. This method, which
we saw earlier when sorting the list of students, returns a value that
denotes the first element in the list. By adding i
to this
value, we will get a value that that denotes the ith element
of the list. We can then supply this value to the erase()
method of the students
vector, thereby removing the element
from the vector.
erase()
an element we should not increment our index variable
i
. If we did, then i
would not reference the
element after the one we just deleted. To ensure that i
indexes the correct element, we increment i
only when we
did not remove an element from the vector.
size()
of the
students
vector on each iteration of the while
loop. If we cached the number return by the size()
method
in another variable, say num
, and then proceeded to delete
students from the array, then num
would no longer be an
accurate count of how many students were in the vector. As a result, the
while
loop would run off the end of the students
vector and would access elements in the vector that are out of bounds.
Because extract_fails()
is always accessing the elements
sequentially, we can use another type called an iterator to
access the student records individually and achieve the same behaviour
as before. This is demonstrated by the following modified version of
the extract_fails()
function.
#include <vector>
#include "Student_info.h"
#include "grade.h"
using std::vector;
// version 3: iterators but no indexing; still potentially slow
vector<Student_info> extract_fails(vector<Student_info>& students)
{
vector<Student_info> fail;
vector<Student_info>::iterator iter = students.begin();
while (iter != students.end()) {
if (fgrade(*iter)) {
fail.push_back(*iter);
iter = students.erase(iter);
} else
++iter;
}
return fail;
}
fails_iters.cpp
Note that we have gotten rid of the i
index variable
entirely, but we are still able to access the elements of the
vector
sequentially. The use of an iterator rather than an
index variable further reinforces the sequential nature of how we access
the elements of the vector. You should note the following features of
the above function:
iter
, as follows:
vector<Student_info>::iterator iter = students.begin();
The iterator is initialized to the first element (i.e.
students.begin()
). We then use this iterator to
step through all the elements in the students
vector
by using a while
loop.
If the student has a failing grade, we store it in the fail
vector and then remove the student record from the students
vector using erase()
. Note that the erase()
method takes the iterator as its argument. If the student didn't fail,
we update the iterator by incrementing it. This moves the iterator to
the next element in the vector. When iter
reaches one passed
the end of the vector, the conditional iter != students.end()
will be false
and the loop will terminate. Remember that
students.end()
denotes the position that is one passed the
end of the vector.
Student_info
record denoted by the iterator,
we can dereference the iterator variable in much the same way that
we dereference a pointer. For example:
if (fgrade(*iter)) { fail.push_back(*iter); ... }
This is analogous to the previous version of extract_fails()
when we used an index variable to access the elements of the
vector
:
if (fgrade(students[i])) { fail.push_back(students[i]); ... }
As another example, we can iterate over all the students in the vector and display their names as follows:
for (vector<Students_info>::const_iterator iter = students.begin(); iter != students.end(); ++iter) { cout << (*iter).name << endl; }
As you recall from C, we must do the dereference before the dot operator
due to precedence. C++ also supports C's ->
operator
to make the above syntax less clumsy:
Note that we create acout << iter->name << endl;
const_iterator
, instead of a regular
iterator
because the above code fragment does not actually
change the contents of the student
vector. Whenever you
are just reading the elements of a container (and not modifying the
container), you should use a const_iterator
.
erase()
an element, our iterator (and any iterators
denoting elements after it in the vector
) become invalid.
Fortunately, the erase()
method will return an iterator
value which denotes the element following the element we just deleted.
Whenever we erase an element, we update iter
by assigning it
the return value from the erase()
method.
iter = students.erase(iter);
end()
iterator on
each iteration of the loop. If we had written the code as:
vector<Student_info>::iterator iter = students.begin(), end_iter = students.end(); while (iter != end_iter) { ...
then this introduces a subtle bug because whenever an element
is erased from a vector
, the iterator that
denotes the erased element as well as all the iterators that denote elements
after it (including end_iter
in the above case) become
invalid. Therefore, in the conditional of the while
loop we reevaluate the end()
iterator.
Incidentally, whenever we add elements to a vector
(using
insert()
, for example), the insertion may result in the
entire vector being moved to another part of memory. As a result
all iterators that denote elements in the original
vector
will become invalid.
list
type
While the above versions of extract_fails()
are correct, there
is a problem with its efficiency because erasure from (or insertion into)
the middle of a vector
can be very inefficient due to all
the element shifting that must take place. Fortunately, the standard
C++ library supports a list
container which allows for
rapid insertion into and deletion from anywhere in the container, thereby
mitigating the inefficient behaviour of the vector
.
Here is the main()
function that changes the implementation
to use a list
container instead of a vector
container:
#include <list>
#include <string>
#include "grade.h"
#include "Student_info.h"
using std::cin;
using std::cout;
using std::endl;
using std::list;
using std::string;
list<Student_info> extract_fails(list<Student_info>& v);
int main()
{
list<Student_info> vs;
Student_info s;
while (read(cin, s))
vs.push_back(s);
vs.sort(compare);
list<Student_info> fails = extract_fails(vs);
list<Student_info>::iterator i;
for (i = fails.begin(); i != fails.end(); ++i)
cout << i->name << " " << grade(*i) << endl;
return 0;
}
list_main.cpp
You'll note that the main()
function is very similar to
the original main()
function that we saw earlier except
that the vector
has been replaced with a list
and that we are using iterators instead of an index variable to step
through the elements of the list
containing the failing
students. We cannot use the global list
container to
sort the list of students. However, the list
type has
its own sort()
member function which can be used to sort
the students stored in the vs
list.
Here is the implementation of the extract_fails()
function using the list
container. Note, again, that its
implementation is identical to the vector
version that
used iterators, except that vector
has been replaced
by list
.
#include <list>
#include "Student_info.h"
#include "grade.h"
using std::list;
// version 4: use `list' instead of `vector'
list<Student_info> extract_fails(list<Student_info>& students)
{
list<Student_info> fail;
list<Student_info>::iterator iter = students.begin();
while (iter != students.end()) {
if (fgrade(*iter)) {
fail.push_back(*iter);
iter = students.erase(iter);
} else
++iter;
}
return fail;
}
fails_list.cpp
Unfortunately, while vector
s allow random access via the
indexing
operator or by adding a number to an iterator,
e.g.
students.begin() + i
the list
type does not support such random access techniques.
The choice of whether to use a list
or a vector
container depends primarily on how the container is to be used. We'll
discuss advantages and disadvantages of both containers in the next
lecture.
Last modified: February 11, 2004 17:02:11 NST (Wednesday)