Alexander Stepanov’s Standard Template Library provided a huge push towards making template programming an important part of C++, and helped to insure that it was included as part of the first standard in 1998. But adoption of templates by most programmers has been more of an incremental process, as opposed to a revolutionary one - many of us have literally decades of Object Oriented == The One True Path to unlearn. And to top things off, templates needed some refinement over the years before they could do many of the things we expected of them.
In this article, I’ll show you how generic programming with templates makes a noticeable improvement in a pedestrian but important task I deal with constantly in my data compression work. Then I’ll show you the annoying but understandable problems with this approach circa 1998, and finally, how to resolve them with features added in TR1 and C++11.
Most data compression algorithms read their input data using well-supported byte-oriented I/O. Text files are read a byte at a time, and most image files are blocked into data structures that align to byte boundaries.
But the output of compressors tends to be in odd sizes. For example, venerable LZW compression output usually starts with nine-bit wide codes which grow as time goes on, through ten, eleven, and twelve bits, up to some arbitrary size.
The standard C++ libraries don’t have I/O APIs that support these odd sizes, so any compressor implementation has to build these I/O routines into their program.
And of course, the inverse task, decompression, needs to be able to read data in non-standard sizes in order to write uncompressed data. This is effectively the same problem, just turned on its head.
To illustrate how this works, I’ll use the example of arithmetic compression in this article. Canonical CACM89 arithmetic compression outputs one bit at a time. So whether you are writing to files, sockets, or shared memory, you are going to need to have some kind of shim in your library that converts those single bits into bytes suitable for use with standard libraries.
The Solution Before Templates
If you are writing a dedicated compressor that is purpose-built for some application, you would probably just embed some dedicated code in your input and output routines. This article is less important for a one-off solution like that.
Things get more complicated if you are a library writer. If you want to provide an arithmetic compression capability to end users, you don’t have any idea in advance where their I/O is going. So the classical OOP solution to this is to define a base class with virtual methods that the compressor needs, then require library users to implement derived classes that provide all missing functions and obey certain rules.
In this case, my arithmetic compressor will have dedicated routines that do the bit-fiddling,
and then rely on a shim class to provide byte-oriented input and output. This means the library
can read or write from a disparate variety of sources. My pre-template solution to this is
illustrated with a partial code segment from
bitio_00.cpp. This uses classic OOP to
create input and output classes, which are then called from inside the compressor.
If I were going to perform compression on a pair of
iostream objects, I would need
simple derived classes that implemented the necessary shim code for the
getByte() functions. A simple version of the input class used in
bitio_00.cpp is shown here:
This works, and at least for the library writer it is fairly satisfactory.
As a library user it is not really optimal. I’d like to be able to use a library without having to create shim classes. Of course the library writer is free to implement common shim classes - which can then be used as input for more customized approaches.
Withholding any other judgment, this is an object oriented solution that is fairly simple to understand and implement, and there are plenty of libraries out there that use this sort of implementation. You probably find it somewhat familiar and can imagine how you might deal with it as a user.
So Where Are The Problems?
Although the OOP solution to this problem works, there are a couple of problems.
First, as library writers start identifying additional things they need in their I/O classes, the abstract base classes sometimes get larded with additional methods, making it more and more complicated to implement specialized versions. While not a problem in this specific example, it definitely is a problem in the real world.
For example, the arithmetic compressor writer might decide that block I/O offered some performance
improvements, leading him or her to add methods for
getBytes(). This kind of creeping elegance leads to complex base classes that annoy
However, this is more a stylistic problem, and not necessarily tied directly to OOP, although it does seem to be a natural outgrowth of that type of programming.
A more important problem, and one that is really a big deal for data compression, is the cost of
using virtual base classes to implement I/O. Functions like
putBit() are called
repeatedly in tight loops. The fact that they have to invoke virtual functions is annoying - these
are hard for the C++ compiler to optimize at compile time, and so each call is an expensive
(usually one level of pointer indirection) subroutine call.
What would be much better, and what the template-based approach allows for, is to define the input and output routines as parameterized types using templates. The compiler can then generate the code that calls those functions inline with the compression code - avoiding those expensive subroutine calls and giving the compiler leeway to optimize at will. In some circumstances, this can result in a substantial improvement in runtimes.
The 1998 Template-based Solution
So how would my code look if I ditched OOP and used a generic solution? My sample code in
bitio_01.cpp has a compressor class that is declared like this:
With this kind of template programming, I can construct my compressor with an input object of any class that has a
getByte() method, and an output object of any class that has a
As a library writer, I am going to go ahead and provide a convenience class that takes an
std::ostream object and implements the needed function. My template class,
output_bytes<T>, is defined for
std::ostream only. I include a
default implementation that will cause a compiler error if you attempt to construct it with some
other type of object - this is a 1998-circa attempt to perform compile-time assertions.
With that in place, I can now pass in
ostream objects to my compressor without the user having to implement a shim class. I use a helper function to deal with some of the mechanics needed to actually construct the compressor and run it:
Templates Lack OOP Semantics
So in the code above, I’ve created shim classes that should work with anything derived from
std::iostream. I should be able to read and write to files, the console, memory,
This is true, but you will note right away that this template-based solution won’t quite match up with your expectations. If you change the code shown above to look like this:
You’ll get the error showing that you have tried to instantiate the generalized class, not the
1>------ Build started: Project: traits, Configuration: Debug Win32 ------ 1>Compiling... 1>bitio_01.cpp 1>bitio_01.cpp(121) : error C2248: 'output_bytes<T>::output_bytes' 1> cannot access private member declared in class 'output_bytes<T>' 1> with 1> [ 1> T=std::ofstream 1> ] 1> bitio_01.cpp(15) : see declaration of 'output_bytes<T>::output_bytes' 1> with 1> [ 1> T=std::ofstream 1> ]
So what’s going on here? My template class is specialized for
std::ofstream is an
std::ostream object, is it not?
No, it isn’t. As OOP programmers we casually say that an
std::ostream object, but this is syntactic shorthand for a deeper OOP
principle. At the surface level, these are two separate classes, quite distinct from one
And therein lies the problem - template instantiation is simple-minded - when you create a
specialized version of class
std::ostream, the compiler
will construct it for that one class, and that one class only.
So what this means is that my convenience class
not quite as convenient as I would like it to be. Anyone using it for class derived from
ostream will have to upcast their arguments:
The Enable If Idiom
This should be a problem I can fix. What I want to do is modify my template definition of
output_bytes so that it works with
std::ostream or any class derived from it.
This can be done with the help of two relatively new classes.
std::enable_if is a
template class that was added to the library with C++11, and
std::is_base_of is a
struct added with TR1 in 2007. I’ll show the new definition below, then go on to try to analyze
what is happening:
Comparing this code with what was seen in the 1998 template version, you’ll see that the only real changes are in the part of the class definition in which template parameters are declared.
First, the base class now has two parameters - the first being a type that will nominally be a
class derived from
std::ostream, with the second being an arbitrary type parameter
Enable, which defaults to type
void. When instantiating objects
output_bytes, I’m still going to just use the single type specifier,
as shown below.
So where does the second type parameter come into play?
In the specialization of
output_bytes that I have created for classes derived from
std::ostream, you’ll see that I am using an expression for this second parameter:
This expression evaluates to a type of void if T is type
std::ostream or a class
derived from it. If T does not match up this way, then the expression is not defined.
So in some cases a template parameter is defined, and in other cases it is not defined. Let’s see how this works. The inner part of the expression is:
If you look up
you’ll see that the instantiation of that struct has a constant static member named
value, which is true if T is derived from
std::ostream (in this case),
and false if it is not. Since this is a static constant member of the struct, it is known at
compile time and can be used as a template argument.
This means that we are instantiating
std::enable_if with a constant that is either
true or false. This is where things get interesting. We are passing
std::enable_if<T>::value as the second argument to
output_bytes<T,Enable>. Looking at the definition for
you’ll see that if the argument passed to it is true, it has a typedef for
If the template argument is false, the typedef does not exist.
So if T is the desired
std::ostream derived class, we are passing in two type
arguments to the template definition. But if T is some other unwanted class, the second argument
doesn’t exist. What happens then?
It turns out that C++ has a very specific rule in place for this scenario:
Substitution Failure Is Not an Error,
or SFINAE. What this means is that in this case, the compiler finds that it can’t provide the
second argument to the class definition for
output_bytes, so it simply skips trying
to instantiate it. Not an error.
This means that the slightly wonky definition does exactly what I want it to do - it instantiates
std::ostream classes, and doesn’t for others.
Changing a template specialization so that it applies to an entire class hierarchy instead of just a single class can be done fairly easily using modern C++ tools. It involves some fairly benign changes to a class declaration, and imposes no runtime cost.
As always, this type of template metaprogramming is somewhat difficult to get your head around if you are a traditional C++ procedural programmer, as so many of us are. You have to get used to the concept of passing types as parameters at compile time, and this is just completely new territory.
As C++ adds more support for traits-based programming in the post-C++11 future, it may be that creating this type of class becomes simpler. But for now, it works pretty well without too much pain, as long as you can get your compiler updated to TR1 or later.
provides libraries that implemented
long before TR1, so you should be able to implement this type of code with just about any C++
compiler in use today. A home-grown implementation of
enable_if is trivial, and is
included in the sample code: