February 27 (Friday) March 03 (Wednesday)

March 01, 2004 (Monday)

Chapter 8: Writing generic functions

C++ supports the concept of generic functions which are abstract forms of the functions that we have written so far. The purpose of generic functions is to allow us to write a single function that have the same behaviour for several types. Instead of having to define several overloaded functions that have essentially the same body but whose only difference is in the types that it uses, we can write a single template function and let the compiler/linker take care of instantiating the actual functions for us.

A generic `max` function

For example, consider the following two versions of the max() function:

int max(const int &i, const int &j)
{
	return i > j ? i : j;
}

double max(const double &i, const double &j)
{
	return i > j ? i : j;
}

These two function essentially have identical behaviour. The only difference is that the types are different — the first function determines the maximum of two ints and the second determines the maximum of two doubles. Instead of manually defining these two functions (and another max() for strings and another one for chars etc.), we can factor out the type and combine the functions above into a single, generic function as follows:

template <class T>
T max(const T &i, const T &j)
{
	return i > j ? i : j;
}

The template <class T> signifies that this function is a generic function that takes one type as a parameter. This type will be denoted by the name T. (The name T is typically used by convention, but we could use another name if we wanted to.)

Within the template function, we replace all occurrences of the differing types with with the generic type T. Now, whenever we call our new max() function, the compiler (or linker) will determine the types of the parameters and will create (or instantiate) the appropriate function for us. For example, if we call:

max(3, 6);

then the compiler/linker will notice that we are passing two int parameters and will instantiate the following max() function for us:

int max(const int &i, const int &j)
{
	return i > j ? i : j;
}

If we call:

max(12.34, 56.78);

then, using the template function that we defined, the compiler/linker will instantiate the following max() function for us:

double max(const double &i, const double &j)
{
	return i > j ? i : j;
}

Analogous functions will be instantiated if we called max() with two chars, or two strings, etc. The following program demonstrates the use of the template function. Note that we name our function my_max() because the standard library already defines its own max() function.

#include <iostream>
#include <string>

using std::endl;
using std::cout;
using std::string;

template <class T>
T my_max(const T &left, const T &right)
{	
	return left > right ? left : right;
}

int main()
{
	string	s1 = "hello", s2 = "world";

	cout << my_max(12, 15) << endl;
	cout << my_max(1232.43, 54.2) << endl;
	cout << my_max(s1, s2) << endl;

	// Error: parameters are of different type
	// cout << my_max(12, 15.0) << endl;

	return 0;
}
max.cpp

Note the following about template functions:

As demonstrated in above example, we cannot call our my_max() function with two parameters that have different types. Doing so would create an ambiguity that the compiler/linker cannot resolve since it cannot tell which version of the function to instantiate. A compiler/linker error will result.
Note that we can call our my_max() function using only types that support the > operator. Attempts to use this function on a type that does not support this operator will generate a compiler/linker error. In the general case, the types of the parameters passed into the function must support all the operations that are used on the type by the generic function.
Sometimes, when creating the function, not all the generic types appear in the parameter list. For example, consider the following program:
```
template <class T>
T zero()
{
	return 0;
}

int main()
{
	zero();
	return 0;
}
zero.cpp
```
The zero() function returns a generic type, but does not have any generic parameters. When calling this function, we must explicitly specify which version of the zero() function to instantiate. Unfortunately, just calling zero(), as done by the above program, doesn't give the compiler enough information to instantiate an appropriate form of the zero() function — what type should T have? As a result, we must pass the type to the function as part of the invocation, for example:
```
zero<double>();
```
Template functions can also take different types as parameters. We can specify them using the template syntax as follows:
```
template <class T, class U>
```

A generic `median()` function

Earlier, we created a function called median() which would determine the median of a vector of doubles:

double median(vector<double> vec)
{
	typedef vector<double>::size_type vec_sz;

	vec_sz size = vec.size();
	if (size == 0)
		throw domain_error("median of an empty vector");

	sort(vec.begin(), vec.end());

	vec_sz mid = size/2;

	return size % 2 == 0 ? (vec[mid] + vec[mid-1]) / 2 : vec[mid];
}
old-median.cpp

We can make this function more generic so that it will work with any numeric type by replacing the double type with a generic type and using the template keyword when defining the function.

template <class T>
T median(vector<T> v)
{
	typedef typename vector<T>::size_type vec_sz;

	vec_sz size = v.size();
	if (size == 0)
		throw domain_error("median of an empty vector");

	sort(v.begin(), v.end());

	vec_sz mid = size/2;

	return size % 2 == 0 ? (v[mid] + v[mid-1]) / 2 : v[mid];
}
median.cpp

This new median() function will now work with a vector containing any numeric type, including ints. However, because it is using division on the values of the vector (amongst other operations), we cannot call this function on any type that does not support division. For example, we cannot call this function on a vector of strings. This function also uses the typename keyword:

typedef typename vector<T>::size_type vec_sz;

Whenever we attempt to access a type that is defined by a template parameter, we must prefix the entire type with the typename keyword. In the above code, we are attempting to retrieve the size_type type from the template type vector<T>. The typename keyword tells the compiler that size_type denotes the name of a type within the vector<T> class.

Iterators categories

In C++ there are five categories of iterators. The category to which an iterator belongs depends primarily upon what operators the iterator supports. For example, if the iterator supports the pre-/post-increment operators (++), the dereference operators (* and ->) and the equality operators (== and !=), then it can be classified as an input iterator.

The following list gives a summary of each of the iterators as well as an example of their use. The iterators are discussed more fully in Section 8.2.

Input iterator: Sequential unidirectional access, read only.
e.g.

template <class In, class X>
In find(In begin, In end, const X& x)
{
	while (begin != end && *begin != x)
		++begin;
	return begin;
}
it-input.cpp

Output iterator: Sequential unidirectional access, write only.
e.g.

template <class In, class Out>
Out copy(In begin, In end, Out dest)
{
	while (begin != end)
		*dest++ = *begin++;
	return dest;
}
it-output.cpp

Forward iterator: Sequential unidirectional access, read and write access.
e.g.

template <class For, class X>
void replace(For beg, For end, const X& x, const X& y)
{
	while (beg != end) {
		if (*beg == x)
			*beg = y;
		++beg;
	}
}
it-forward.cpp

Bidirectional iterator: Sequential bidirectional access, read and write.
e.g.

template <class Bi>
void reverse(Bi begin, Bi end)
{
	while (begin != end) {
		--end;
		if (begin != end)
			swap(*begin++, *end);
	}
}
it-bid.cpp

Random iterator: Can efficiently access any element for read and write.
e.g.

template <class Ran, class X>
bool binary_search(Ran begin, Ran end, const X& x)
{
	while (begin < end) {
		// find the midpoint of the range
		Ran mid = begin + (end - begin) / 2;

		// see which part of the range contains `x';
		// keep looking only in that part
		if (x < *mid)
			end = mid;
		else if (*mid < x)
			begin = mid + 1;
		// if we got here, then `*mid == x' so we're done
		else return true;
	}
	return false;
}
it-rand.cpp

Note that with the exception of the output iterators, each of the listed categories subsumes the categories above it.

Input and output iterators

The standard library defines two special iterators in the <iterator> header called istream_iterator and ostream_iterator. When used in conjunction with the algorithms from the <algorithm> header, we can perform input/output without using any explicit looping mechanisms. For example, consider the following program:

#include	<iostream>
#include	<iterator>
#include	<string>
#include	<vector>
#include	<fstream>
#include	<algorithm>

using std::string;		using std::vector;
using std::ifstream;		using std::endl;	
using std::istream_iterator;	using std::ostream_iterator;
using std::cout;		

int
main()
{
	ifstream in("/usr/share/dict/words");
	vector<string> dictionary;

	copy(istream_iterator<string>(in), istream_iterator<string>(),
		back_inserter(dictionary));

	copy(dictionary.begin(), dictionary.end(),
		ostream_iterator<string>(cout, "\n"));

	return 0;
}
it-io.cpp

During input, the copy() algorithm function is passed two istream_iterator objects. Both iterator types are are instantiated using the string type. The first (beginning) iterator is constructed using the fstream object, in. This instructs the copy() algorithm to acquire strings from file denoted by in. The second (ending) iterator is constructed without any parameters. This iterator will essentially denote end-of-file or a stream error condition. Thus, the copy() function will read strings from the input file, storing them in the dictionary vector until the end of file or a stream error is encountered:

copy(istream_iterator<string>(in), istream_iterator<string>(),
	back_inserter(dictionary));

During output, we again use the copy() algorithm function to copy all the elements from the dictionary vector to standard output cout. To do this, we instantiate a string form of an ostream_iterator and we construct this object by specifying the stream to which we wish to send the data (cout in this case) as well as the separator that will be displayed between each element:

copy(dictionary.begin(), dictionary.end(),
	ostream_iterator<string>(cout, "\n"));

If we didn't want to store the dictionary words in a vector, then we can simultaneously display the contents of the file as we copy() the strings from the input: fstream:

copy(istream_iterator<string>(in), istream_iterator<string>(),
	ostream_iterator<string>(cout, "\n"));

Chapter 9: Defining new types

When we previously defined the Student_info record, we stored all the elements in a struct, including the name, midterm mark, final mark and the homework assignment grades. Unfortunately, this does not provide a sufficiently high degree of abstraction for users of the structure. Anyone using this structure would be required to directly access the structure members so as to read and store values within the structure. This could lead to the data members accidentally (or intentionally) being put in an inconsistent state.

Instead of accessing the data members directly, we can introduce the concept of an functional interface to the data which users would use to indirectly access/modify members of the type. While we had such functions (e.g. grade(), read() etc.) in our earlier implementation, they were not part of the structure in which we defined the data members. C++ allows us to combine data members and the corresponding methods (functions) that operate on them into one cohesive unit called the class.

Header file for the `Student_info` class

The header file below represents the new Student_info class.

#ifndef GUARD_Student_info
#define GUARD_Student_info

#include <string>
#include <vector>

class Student_info {
public:
	Student_info();              // construct an empty `Student_info' object
	Student_info(std::istream&); // construct one by reading a stream
	std::string name() const { return n; }
	bool valid() const { return !homework.empty(); }

	// as defined in 9.2.1/157, and changed to read into
	// `n' instead of `name'
	std::istream& read(std::istream&);

	double grade() const;    // as defined in 9.2.1/158
private:
	std::string n;
	double midterm, final;
	std::vector<double> homework;
};

bool compare(const Student_info&, const Student_info&);

#endif
Student_info.h

You should note the following features about the header file.

As we've seen before, we use a guard to protect against multiple inclusion of the header file. We also do not use using declarations in our header file — instead, we use std::vector std::string etc..
We use the new keyword class instead of struct. Within the class, we partition the members into a public and private section using the public and private protection labels. The members public section can be accessed by all users of our class, whereas members in the private section can be accessed only the by the methods defined by the class. By default, all elements in a class are private. Therefore, we must make sure our public methods are defined in the public portion of the class. By contrast, all members in a struct are public, by default.
The Student_info class defines two constructors. These are methods whose name is the same as the class itself and are used to create an initial state for an object of the class. The first constructor takes no parameters and initializes the Student_info object to a default state. The second constructor uses its istream reference parameter in order initialize its data members by reading data from the stream. The actual definitions for these two function will take place outside the class when we present the corresponding Student_info.cpp source module below.
Two parameter-less member functions called name() and valid() are defined inside the class definition itself. The name() function provides a means to access the string representing the student name n. The valid() function will return a boolean indicating whether or not the student has done any of the homework assignments.
All member functions act on the object through which they were invoked. For example, if we create a Student_info object called s, we can access the student's name for this object by calling s.name(). The name() function takes an implicit first parameter which denotes the corresponding Student_info object, s.
Note that neither of these two functions modify the data members of the class; therefore they are annotated with the const qualifier. Also, because these functions are defined inside the class itself the compiler will attempt to inline these functions, thereby eliminating the function call overhead.
Two other member functions grade() and read() are also declared but not defined in the header. Like the constructors, we will define them in the Student_info.cpp source module.
Finally, in the private section of the class, we define the data members that hold the student's name, final and midterm marks and a vector containing the homework assignments. Because these members are in the private section, only the member functions themselves, (e.g. name(), read(), etc) can actually access/modify them.
The header file also declares (but does not define) a global compare() function that is not a member of the Student_info class. This function is used to sort the students by name during output. It is defined in the Student_info.cpp module.

Last modified: March 6, 2004 15:42:06 NST (Saturday)