February 09, 2004 (Monday)

Chapter 4: Organizing programs and data

This chapter discusses the concepts of functions and structures for organizing your program and data. Many of the ideas are very similar to what we have seen earlier in C, and the same basic motivation and concepts apply in C++. However, there are some differences in how the two languages deal with functions and structures.

Determining grade for one student (modularized) (K&M § 4.1)

The spirit of the following program is very similar to the program that was discussed in Chapter 3 -- the program computes the final grade for a single student. However, instead of defining all the functionality inside the main() function, the code has been broken up into several functions to increase modularity. Breaking up a program into several small functions can make your programs easier to understand and also makes it easier to reuse code.

#include <algorithm>
#include <iomanip>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

using std::cin;                 
using std::cout;
using std::domain_error;
using std::endl;
using std::istream;
using std::ostream;
using std::setprecision;
using std::sort;
using std::streamsize;
using std::string;
using std::vector;

// compute the median of a `vector<double>'
// note that calling this function copies the entire argument `vector'
double median(vector<double> vec)
{
	typedef vector<double>::size_type vec_sz;

	vec_sz size = vec.size();
	if (size == 0)
		throw domain_error("median of an empty vector");

	sort(vec.begin(), vec.end());

	vec_sz mid = size/2;

	return size % 2 == 0 ? (vec[mid] + vec[mid-1]) / 2 : vec[mid];
}

// compute a student's overall grade from midterm and final exam grades
// and homework grade
double grade(double midterm, double final, double homework)
{
	return 0.2 * midterm + 0.4 * final + 0.4 * homework;
}

// compute a student's overall grade from midterm and final exam grades
// and vector of homework grades.
// this function does not copy its argument, because `median' does so for us.
double grade(double midterm, double final, const vector<double>& hw)
{
	if (hw.size() == 0)
		throw domain_error("student has done no homework");
	return grade(midterm, final, median(hw));
}

// read homework grades from an input stream into a `vector<double>'
istream& read_hw(istream& in, vector<double>& hw)
{
	if (in) {
		// get rid of previous contents
		hw.clear();

		// read homework grades
		double x;
		while (in >> x)
			hw.push_back(x);

		// clear the stream so that input will work for the next student
		in.clear();
	}
	return in;
}


int main()
{
	// ask for and read the student's name
	cout << "Please enter your first name: ";
	string name;
	cin >> name;
	cout << "Hello, " << name << "!" << endl;

	// ask for and read the midterm and final grades
	cout << "Please enter your midterm and final exam grades: ";
	double midterm, final;
	cin >> midterm >> final;

	// ask for the homework grades
	cout << "Enter all your homework grades, "
	        "followed by end-of-file: ";

	vector<double> homework;

	// read the homework grades
	read_hw(cin, homework);

	// compute and generate the final grade, if possible
	try {
		double final_grade = grade(midterm, final, homework);
		streamsize prec = cout.precision();
		cout << "Your final grade is " << setprecision(3)
		     << final_grade << setprecision(prec) << endl;
	} catch (domain_error) {
		cout << endl << "You must enter your grades.  "
			"Please try again." << endl;
		return 1;
	}

	return 0;
}
main1.cpp

Functions

You can define functions in C++ in much the same way as you do in C. You specify the return type of the function, the function name and the function's arguments, then the function body enclosed in braces. As in C, all functions in C++ must be defined (or declared) before use.

There are two major differences between functions in C and C++. The first has to do with function overloading and the second is with respect to parameter passing. There is also a third aspect of functions that is different between C and C++, namely function inlining. All three of these topics are discussed in the sections below.

Overloaded functions

In the above grading program, there are two functions with the same name -- grade(). If we were to attempt something similar in C, we would get "multiple definition" errors because two functions cannot have the same name. However, in C++ we can give two or more functions the same name as long as the numbers and/or types of parameters differ amongst all the functions. This is known as function overloading. The benefit of function overloading is that it lets the programmer give the same name to functions that perform essentially the same action but with different types (or numbers) of parameters.

In the above program we have two ways of computing a student's overall grade. The first grade() function takes the student's midterm and final marks and also takes a single double which represents all the homework grades (this double could be an average or a median of all the homework assignment grades, for example). The second grade() function takes a midterm and final mark, as before, but the third parameter is a vector of all the homework grades. Note that the second grade() function actually calls the first grade() function. This should not be mistaken for a recursive call.

Note that we cannot overload on the basis of return type because the compiler cannot necessarily determine, at the point of invocation, which function to call if the only difference between the two functions is the return type.
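
As a minimal sketch of these rules (the name describe() is hypothetical and is not part of the grading program), the following defines three functions that share a name but differ in the number or types of their parameters. The compiler picks one of them from the arguments given at each call. The commented-out declaration shows the case that is rejected because only the return type differs.

```cpp
#include <string>
#include <vector>

// Hypothetical overloads for illustration: same name, different parameter lists.
std::string describe(double)
{
	return "one double";
}

std::string describe(double, double)
{
	return "two doubles";
}

std::string describe(const std::vector<double>&)
{
	return "vector of doubles";
}

// Not allowed: this differs from describe(double) only in its return type.
// int describe(double);
```

Calling describe(1.0) selects the first function, describe(1.0, 2.0) the second, and passing a vector<double> selects the third.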

Pass by reference

As in C, parameters are normally passed by value, meaning that the formal parameter's value is actually a copy of the actual parameter's value. However, in addition to pass by value, C++ also supports passing by reference. To understand pass by reference, consider the program below, which is adapted from the swap.c program we saw earlier.

#include	<iostream>

void swap(int &a, int &b)
{
	int	t;
	t = a;
	a = b;
	b = t;
}

int main()
{
	int     a = 1, b = 2;

	std::cout << "Originally:               a = "
		<< a << ", b = " << b << std::endl;

	/* Pass references to integers */
	swap(a, b);
	std::cout << "After return from swap()  a = "
		<< a << ", b = " << b << std::endl;

	return 0;
}
swap.cpp

In the above program, the parameters to the swap() function are passed by reference, as denoted by the & character before the parameter names. You can think of a reference as being an alias (as opposed to a copy) of another variable. Therefore, whenever we modify a and/or b in the swap() function, those changes will be reflected in the main() function, so the values of the two variables, a and b, will indeed be swapped. Note that we cannot tell whether parameters are being passed by reference or by value merely by looking at the function invocation -- we need to have access to the function definition or declaration.

In the grading program above, we note that read_hw() is passed the vector of homework assignment grades by reference. Therefore, any changes made to the hw vector in this function will also occur to the homework vector in the main() function because the two variables are synonymous with each other.

The second grade() function takes a const reference to a vector<double>. This may seem contradictory, but passing a const reference has a useful purpose from an efficiency perspective. By passing a const reference, the program will not have to make a local copy of the vector that contains all the assignment grades. If this vector contained a lot of grades, then this copying could be prohibitively expensive. This overhead can be eliminated by passing a reference, and by making the reference const we prohibit the grade() function from modifying the vector.

Finally note that the median() function is passed the vector of homework grades by value and not by reference. The reason for this is that median() actually changes the value of its parameter (it sorts the vector) and we do not want these changes to be reflected back in the calling function's parameter. Therefore, by passing by value, we can make any changes necessary to the copy of the vector without affecting the actual parameter in the context of the calling function.

In summary there are three primary parameter passing semantics in C++: pass by value, pass by reference and pass by const reference. You should be aware of the semantics of all three techniques as well as when to use them.
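
The three semantics can be sketched side by side. The function names below are hypothetical and chosen only for illustration; each one takes a vector<double> in a different way, and the comments state what the caller can expect to happen to its own vector.

```cpp
#include <algorithm>
#include <vector>

// Pass by value: the function sorts its own copy; the caller's vector is untouched.
double smallest_by_value(std::vector<double> v)
{
	std::sort(v.begin(), v.end());
	return v.empty() ? 0.0 : v[0];
}

// Pass by reference: the function modifies the caller's vector directly.
void append_by_reference(std::vector<double>& v, double x)
{
	v.push_back(x);
}

// Pass by const reference: no copy is made, and the function cannot modify v.
std::vector<double>::size_type count_by_const_reference(const std::vector<double>& v)
{
	return v.size();
}
```

After calling smallest_by_value(), the caller's vector is still in its original order, whereas after append_by_reference() it has one more element.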

inline functions

Although not used in the above example, we can preface a function definition with the keyword inline, which asks the compiler to substitute the body of the function wherever the function is called. This can eliminate the overhead associated with the function call (e.g. pushing parameters on the stack, popping return values off the stack, jumping to the function's address, etc.) and can make code execute more efficiently. This is similar in effect to using macros as "functions" in C with the #define preprocessor directive, except that an inline function's arguments are type checked and evaluated only once. A couple of notes about inline functions. First, there is no guarantee that the compiler will actually inline the function; if it does not, regular function call semantics apply. Second, the compiler must be able to see the definition of an inline function at compile time before it can inline the code. For this reason, inline functions are typically placed in header files.
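
For comparison, here is a small sketch of a macro and an inline function that compute the same thing. Both names (SQUARE_MACRO and square) are hypothetical and do not appear in the grading program.

```cpp
// A macro performs textual substitution, so an argument with a side effect
// may be evaluated more than once.
#define SQUARE_MACRO(x) ((x) * (x))

// An inline function evaluates its argument exactly once and is type checked.
// `inline' is only a request; the compiler may still emit an ordinary call.
inline int square(int x)
{
	return x * x;
}
```

Either form can be used as square(n + 1) or SQUARE_MACRO(n + 1), but only the function is safe to call with an argument such as a function call that should run just once.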

Exceptions

The median() and grade() functions both throw exceptions if the vector of homework assignment grades passed to them is empty. The throw keyword is used in conjunction with the name of an exception (domain_error, in this case). The exception itself can be constructed with a corresponding string to give more details about the exception. When an exception is thrown, execution of the function essentially stops and the runtime stack unwinds until we reach a point at which the exception can be handled. In the grading program, we note that the main() function has a try/catch block which can trap exceptions thrown (either directly or indirectly) by statements inside the try block. If an exception is thrown, execution jumps to the nearest enclosing catch block that can handle the exception and the block of code corresponding to that catch block will be executed. Note that there can be several catch blocks for a single try block -- each one handling a different type of exception. If there is no handler for the exception, then the program will abort.

In the catch block, we can extract the string which was used to construct the exception by using the what() method of the exception object that we caught. The what() method is used in the next program in this chapter.

There are many types of exceptions including logic_error, invalid_argument and out_of_range. See the top of p.73 of K&M for more exception names. The exceptions are all declared in the <stdexcept> header.
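
The following sketch shows one try block with two catch blocks, and the use of what() to recover the message. The helper names first_grade() and try_first_grade() are hypothetical. Unlike the grading program, this sketch catches by const reference, which is a common choice because it avoids copying the exception object.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns the first grade, or throws if there is none.
double first_grade(const std::vector<double>& hw)
{
	if (hw.empty())
		throw std::domain_error("no grades to read");
	return hw[0];
}

// Calls first_grade() and reports what happened as a string.
std::string try_first_grade(const std::vector<double>& hw)
{
	try {
		first_grade(hw);
		return "ok";
	} catch (const std::domain_error& e) {
		// tried first: matches the exception thrown by first_grade()
		return std::string("domain_error: ") + e.what();
	} catch (const std::exception& e) {
		// fallback for any other standard exception
		return std::string("exception: ") + e.what();
	}
}
```

The catch blocks are tried in order, so the more specific handler is listed before the general one.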

In the grading program both the median() and second grade() functions throw a domain_error exception -- the median one throws the exception with a generic error string while the exception thrown by grade() has a string which is more relevant (and meaningful) to the end-user of the grading program.

The following program fragment has a subtle bug due to the possibility of an exception being raised by the grade() function.

	try {
		streamsize prec = cout.precision();
		cout << "Your final grade is " << setprecision(3)
		     << grade(midterm, final, homework) << setprecision(prec) << endl;
	} catch (domain_error) {
		cout << endl << "You must enter your grades.  "
			"Please try again." << endl;
		return 1;
	}

If the grade() function throws an exception, then it is possible that spurious output may result because the literal "Your final grade is " may have already been displayed before the exception is raised. It is also possible that if grade() raised an exception, the setprecision(prec) manipulator may not be sent to cout, thereby not resetting the precision of the output stream to its original value. By computing the final grade in a statement separate from the output, we ensure that the preceding problems do not arise.

vector::clear() and istream::clear()

The read_hw() function has a few subtleties. First, we check the state of the input stream parameter in before doing anything with it. If the stream is in a bad state (i.e. when used in a conditional context it yields false), then we leave the stream alone and return immediately from the function. Second, we must clear out the hw vector in case the vector had values in it from a previous invocation. In the context of this program, this won't occur since the read_hw() function is called only once. Still, clearing out the vector (by using the vector's clear() method) makes for good defensive programming. Finally, after the loop has executed to read in all the homework grades, the in stream is going to be in an invalid state -- either the end of file was reached or we encountered an input that was not a student grade. If we don't clear out this failure/end-of-file state in in, then the calling function may get the impression that the input of the student's homework grades was unsuccessful if it examines the istream reference returned by read_hw(). To prevent this, we call the input stream's clear() method to clear the input stream's `invalid' status. If there was further input (e.g. another student's record to read), then that input will be available during the next read operation performed by the program.
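
The effect of the stream's clear() can be observed with a string stream standing in for standard input. In this sketch, read_numbers() follows the same pattern as read_hw(), and word_after_numbers() is a hypothetical helper that shows the input following the numbers is still readable once the error state has been reset.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads doubles until extraction fails, then resets the stream's error state,
// following the same pattern as read_hw().
std::istream& read_numbers(std::istream& in, std::vector<double>& out)
{
	if (in) {
		out.clear();           // vector::clear(): discard previous contents
		double x;
		while (in >> x)
			out.push_back(x);
		in.clear();            // istream::clear(): reset the failure state
	}
	return in;
}

// Hypothetical demonstration: reads the numbers at the start of text,
// then returns the next whitespace-delimited word (empty if there is none).
std::string word_after_numbers(const std::string& text)
{
	std::istringstream in(text);
	std::vector<double> numbers;
	read_numbers(in, numbers);
	std::string word;
	in >> word;
	return word;
}
```

Given the text "90 85 Smith", the loop stops at Smith and, because the state was cleared, the following read picks up that word. Without the call to in.clear(), that read would fail.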

Determining Overall Grades for many Students (K&M § 4.2)

The following program demonstrates how we can compute the final overall grade for several students, instead of just one as the previous program did. This program uses a structure, which we saw in C, to aggregate the student's name and the marks associated with each student. The major difference in the program is in the main() function, which creates a vector of Student_info objects and has two loops: one for reading in each student's name and marks and another to display the students and their respective overall grades in a neatly formatted table.

#include <algorithm>
#include <iomanip>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

using std::max;
using std::cin;
using std::cout;
using std::domain_error;
using std::endl;
using std::istream;
using std::ostream;
using std::setprecision;
using std::sort;
using std::streamsize;
using std::string;
using std::vector;

struct Student_info {
	string name;
	double midterm, final;
	vector<double> homework;
};	// note the semicolon--it's required

// compute the median of a `vector<double>'
// note that calling this function copies the entire argument `vector'
double median(vector<double> vec)
{
	typedef vector<double>::size_type vec_sz;

	vec_sz size = vec.size();
	if (size == 0)
		throw domain_error("median of an empty vector");

	sort(vec.begin(), vec.end());

	vec_sz mid = size/2;

	return size % 2 == 0 ? (vec[mid] + vec[mid-1]) / 2 : vec[mid];
}

// compute a student's overall grade from midterm and final exam grades
// and homework grade
double grade(double midterm, double final, double homework)
{
	return 0.2 * midterm + 0.4 * final + 0.4 * homework;
}

// compute a student's overall grade from midterm and final exam grades
// and vector of homework grades.
// this function does not copy its argument, because `median' does so for us.
double grade(double midterm, double final, const vector<double>& hw)
{
	if (hw.size() == 0)
		throw domain_error("student has done no homework");
	return grade(midterm, final, median(hw));
}

double grade(const Student_info& s)
{
	return grade(s.midterm, s.final, s.homework);
}

// read homework grades from an input stream into a `vector<double>'
istream& read_hw(istream& in, vector<double>& hw)
{
	if (in) {
		// get rid of previous contents
		hw.clear();

		// read homework grades
		double x;
		while (in >> x)
			hw.push_back(x);

		// clear the stream so that input will work for the next student
		in.clear();
	}
	return in;
}

istream& read(istream& is, Student_info& s)
{
	// read and store the student's name and midterm and final exam grades
	is >> s.name >> s.midterm >> s.final;

	read_hw(is, s.homework);  // read and store all the student's
				  // homework grades
	return is;
}

bool compare(const Student_info& x, const Student_info& y)
{
	return x.name < y.name;
}

int main()
{
	vector<Student_info> students;
	Student_info record;
	string::size_type maxlen = 0;

	// read and store all the records, and find the length of the
	// longest name
	while (read(cin, record)) {
		maxlen = max(maxlen, record.name.size());
		students.push_back(record);
	}

	// alphabetize the records
	sort(students.begin(), students.end(), compare);

	for (vector<Student_info>::size_type i = 0;
	     i != students.size(); ++i) {

		// write the name, padded on the right to maxlen + 1 characters
		cout << students[i].name
		     << string(maxlen + 1 - students[i].name.size(), ' ');

		// compute and write the grade
		try {
			double final_grade = grade(students[i]);
			streamsize prec = cout.precision();
			cout << setprecision(3) << final_grade
			     << setprecision(prec);
		} catch (domain_error e) {
			cout << e.what();
		}

		cout << endl;
	}

	return 0;
}
main2.cpp

Structures

Both C and C++ support the concept of the structure with the struct keyword. In this example, the Student_info structure has a string for the student name, two doubles for the midterm and final marks and a vector of doubles for the homework grades. Note that in C++, writing:

struct Student_info {
	string name;
	double midterm, final;
	vector<double> homework;
};

will create a structure type called Student_info. We can then create a Student_info variable by writing:

int main()
{
	Student_info student;
	...
}

Note that in C, if we had used a similar structure declaration, we would have to write:

int main()
{
	struct Student_info student;
}

to define the variable. In C++, there is no need to resort to the typedef hack that we saw in C if we want to drop the struct from our definition of structure variables. In C++, Student_info is a true type. When declaring variables of this type it is not necessary to preface it with the struct keyword.

Related to this structure are three functions which read, compare and calculate the grade when given Student_info variables.

Comparing Student Names

When doing the output, we want to display the names of the students in alphabetical order. In order to do this, we can use the sort() function from the <algorithm> header. However, in order for sort() to work correctly, it must be told how to compare two different students. Fortunately, this is relatively easy to do. All we do is define a function, say, compare(), which takes two const references to Student_info structures and returns true if the first student's name is lexicographically less than the second student's name:

bool compare(const Student_info& x, const Student_info& y)
{
	return x.name < y.name;
}

(Note that you can use the relational operators <, >=, == etc. to compare string objects.)

Now, when calling the sort() function, we can simply pass a pointer to this function (which is sometimes called a predicate in C++) to the sort() function, which will cause the vector of students to be sorted:

	sort(students.begin(), students.end(), compare);

Calculating a Student's Grade

We add yet another grade() function, this one taking a const reference to a Student_info structure. The function basically delegates the real work to the grade(double, double, const vector<double>&) function to calculate the overall grade for that student:

double grade(const Student_info& s)
{
	return grade(s.midterm, s.final, s.homework);
}

Reading the Students and their Grades

Finally, we have a read() function which takes an input stream reference (istream&) and a reference to a Student_info structure. This function simply inputs the student name, midterm and final mark, then it calls the read_hw() function to input each student's homework assignment marks. In this context, the subtleties of read_hw() described earlier now make sense because we are reading in more than one student from standard input.

istream& read(istream& is, Student_info& s)
{
	// read and store the student's name and midterm and final exam grades
	is >> s.name >> s.midterm >> s.final;

	read_hw(is, s.homework);  // read and store all the student's
				  // homework grades
	return is;
}

The max() function

The max() function used in the main() function is declared in the <algorithm> header. This function simply returns the larger of its two arguments. One very important point regarding max() is that both arguments must have the same type (in this case, std::string::size_type); otherwise the program won't compile. We'll see the reason for this when we talk about template functions in a later chapter.

The max() function is used to determine the length of the longest name of a student. This is used in order to line up the student names and the overall final grade during output.
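
The same-type requirement can be sketched as follows. The function names longest() and longest_from_int() are hypothetical. The second one assumes its int argument is not negative and converts it explicitly so that both arguments to max() have the same type.

```cpp
#include <algorithm>
#include <string>

// Returns the larger of a running maximum and the length of name.
// Both arguments to std::max have type std::string::size_type, so the call compiles.
std::string::size_type longest(std::string::size_type maxlen, const std::string& name)
{
	return std::max(maxlen, name.size());
}

// With mismatched argument types the call is rejected by the compiler.
// Converting one argument explicitly is one way to make the types agree
// (assumes maxlen is not negative).
std::string::size_type longest_from_int(int maxlen, const std::string& name)
{
	// std::max(maxlen, name.size());   // error: int vs. std::string::size_type
	return std::max(static_cast<std::string::size_type>(maxlen), name.size());
}
```

This is why maxlen is declared as string::size_type in main() rather than as an int.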

Iterating over the vector of students

When we want to display all the students in the students vector, we use the following looping technique:

	for (vector<Student_info>::size_type i = 0;
	     i != students.size(); ++i) {
		...
	}

Note that we are defining an index variable i of the appropriate type directly inside the for statement itself. We then loop until i equals the number of students in the vector. There is another way to iterate over the elements in a container using iterators. We'll see an example of that in future lectures.


Last modified: February 11, 2004 15:44:05 NST (Wednesday)