Dr. Dobb's Portal
Article on DDJ site
This article discusses the use of C++ hash containers to improve storage of subproblem results when using dynamic programming (DP.) Memoization is a key part of dynamic programming, which is conventionally done by storing subproblem results in simple tables or lists. Using hash tables instead of these simpler structures will allow you to use dynamic programming while retaining your algorithm’s natural recursive structure, simplifying design and making your code easier to follow. I’ll provide a fully-developed example of an algorithm, and show how it can be adapted to use hash table memoization.
Dynamic Programming (DP) is a useful technique for algorithm development that is saddled with an unfortunate name. When we refer to greedy algorithms, or the use of divide and conquer techniques, the name provides excellent semantic clues as to what is going on. With dynamic programming, no such luck, but I’m afraid were stuck with name for historical reasons.
Dynamic programming is frequently used to solve optimization problems - that is, problems where a number of choices can be made to lead to some maximum or minimum value. Optimization problems can often be solved efficiently using straightforward iterative or divide and conquer techniques, but dynamic programming becomes useful when those other techniques lead to overlapping subproblems.
The Wikipedia article on dynamic programming gives a good example of what is meant by overlapping subproblems. Imagine that you are a C programmer who decides to use a recursive function call to calculate the nth Fibonacci number:
This code is correct, but it leads to a lot of extra work. Figure 1 shows a map of the recursive calls made using this algorithm:
The Call Map for fib(6)
fib(4) is called twice, which leads to a large duplicated tree of recursive calls.
All in all, a huge amount of wasted effort.
Dynamic programming provides a good way to avoid duplicate work done solving overlapping subproblems, but for the results to be useful, the problem has to adhere to the optimal substructure property.
Saying that a problem has this property is just a way of saying that we can find an optimal solution for problem A by breaking it down into problems B and C, both of which are optimal as well. Chapter 15 of Introduction to Algorithms gives nice examples of cases where this property holds, and where it doesn’t.
Defeating Overlapping Subproblems
Dynamic programming aims to defeat this problem of repeatedly solving overlapping subproblems. Conventionally, this is done by converting the algorithm to one that uses a bottoms up approach, combined with some global storage for intermediate results.
As an example, we could solve the Fibonacci problem using a bottoms-up approach with C++ code that looks something like this:
In this case we only calculate the value of each fib(n) once, completely eliminating the duplicate calculation of those duplicate subproblems.
Most examples of DP that you find in textbooks or on the web will use a table or array to hold the intermediate results from problems. Examples of algorithms that use this type of memoization include Sequence Alignment, Matrix Chain Multiplication, and Optimal Binary Search Trees.
Using the tabular approach for storing subproblems often works well with a bottoms up implementation of the algorithm. DP algorithms typically start by solving the smallest subproblems, storing the results, combining some of those, storing the results in a new level of the table, and so on, until the top case is reached. Again, most of the tutorial DP examples you see will combine a bottoms-up approach with tabular subproblem storage.
From my perspective, the traditional approach to DP suffers from a couple of problems.
First, it seems to me that people feel more comfortable attacking problems in a top-down fashion as opposed to bottoms-up. This is certainly a matter of psychology, not computer science, but it is relevant to consider human factors when designing for maintenance, testability, and review. In many cases the expression of the algorithm seems more natural when given in the original top-down mode, and requires fewer mental gymnastics on the part of whoever is studying the algorithm.
The second problem is the traditional use of array structures to hold subproblems. The problem starts to creep into play when you look at algorithms like Matrix Chain Multiplication, in which half of the storage space is not even used. Things get even worse when you use a problem like that described later in this article, in which subproblems just don’t fit into a row/column organizational format.
Both of these problems can be addressed effectively by using standard C++ hash tables. The procedure is as follows:
- Implement your algorithm using the basic, inefficient recursive implementation.
- Create a C++ hash table with global scope that will hold your subproblem results. The key for the hash table should be a concatenation of the input parameters to your function, and the value will be the return type from your function.
- At the entry point to your function, check the hash table for the presence of an existing value for your subproblem, using the input variables to form the key. If the value is present, instead of executing your function, return immediately.
- At the exit for your function, store the value you are about to return in the hash table, using the input parameters as the key.
With these minor changes, which can often be accomplished in change to as few as three lines of code, you can convert your inefficient recursive algorithm to use dynamic programming, without having to refactor to a bottoms-up implementation, and without having to shoehorn your results into tabular format.
A Simple Example
Applying this technique to our Fibonacci example, we get a routine that looks like this:
So the first time we encounter fib(4), we have to make the recursive calls to
But the second time around, the result has been stored in the map, and it is returned without any
additional calculations taking place.
Hash Table Refresher
Hash tables were part of the
Standard Template Library
when the language standard was ratified in 1998. And while most of the Standard Template Library
was incorporated into the standard library at that time, the
were excluded. According to Bjarne Stroustroup
they would have been in C++98 had we had the time to do a proper detailed design and specification job. There was no doubt that a hash_map was needed as an alternative to map for large tables where the key was a character string and we could design a good hash function. In 1995, Javier Barreirro, Robert Fraley and David Musser tried to get a proposal ready in time for the standard and their work became the basis for many of the later hash_maps . The committee didn’t have the time, though..
While these containers will be added to the standard library under the awkward names
is adopted, there is no reason we have to wait that long. Despite the fact that they are missing
from the standard, virtually every modern C++ compiler has adopted a reasonable variant of
hash_map container does a great job of memoization as described in this article, with a
caveat. As I said earlier, you’ll need the key to your hash table to be some subset of the input
parameters. That’s all fine if your key is a single value using a built-in type, such as
std::string, or a pointer. These key values are supported implicitly by most
Things get a little more complicated if you are using multiple values or your own structures as a key into the map. Owing to the lack of a standard, implementing non-standard keys requires slightly different techniques, depending on your compiler.
I’ll give you examples here that work with the two compilers you are most likely to encounter: Visual C++ .NET 2003/2005, or gcc 3.x.
Let’s say you’re doing a study tracking peoples reading habits by age, and you want to implement the following code:
You won’t be able to compile this code as-is, for at least three reasons:
- You need to include the correct header file, which unfortunately differs depending on which compiler you are using.
- You need to hoist the
hash_mapname into your namespace.
- You need to define the hash function and at least one comparison function for the key class,
Item 3 is only needed if you are using something other than a basic type for your class, and it’s the trickiest. Defining the comparison is necessary so that the hashing library code can verify key equality during lookups or collisions.
In both cases, the function is usually pretty easy to write - you just compose it using existing functions. Examples for Microsoft and gcc compilers circa 2007 are given below.
Sample Code for gcc
gcc puts the hashing header files in the
ext folder, and uses the
namespace, isolating the non-standard library extensions into a ghetto of sorts. Both of these
are dealt with in the first two lines of the sample code below.
The gcc implementation of
hash_map needs two template parameters to handle keys that aren’t
defined as built-in types: one class that has an operator which returns a hash index given an
input key, and a second that returns a boolean value for equality test. I pack both of these into
a single class,
SampleTraits. The traits class is then passed to the
definition as the third and fourth template parameters.
When you are defining the hash function, you need to somehow combine hash values for each of the
members of your structure. Creating hash keys is somewhat of an art, so I try to use the ones
provided by the library writer as the basis for my hash function. In this case I take the hash
values for the two members of
class Sample and simply XOR them together, which should provide a
If you are using g++ 4.x, you may be better off using the tr1
unordered_map class. While it is
not quite part of the standard yet, it should be soon, and that will insure its future support and
Visual C++ .NET 2003/2005
Microsoft’s compilers have similar issues, but deal with them differently. (That’s the problem with non-standard features.)
hash_map header file is accessed from the normal C++ header folder, the class
itself, as was the case with g++, is defined in a different namespace,
stdext. Again, that is
dealt with in the first two lines of the sample code.
The remaining two issues are resolved by defining a global
hash_value class that has an
operator that returns a hash key for a given object of
class Sample, and a global comparison
operator that is used to test for equality of two objects of
class Sample. (It is not unusual
to use the less-than operator to test for equality, by testing both
a > b and
a < b, we
determine whether the objects are equal or not.)
Some Notes on Usage
The member functions of most implementations of
hash_map will be identical to those of the
std::map container. Storing an entry in a
hash_map can be done using an overloaded
operator that makes the container look like an associative array:
Looking up a value from the table is done using a call to
hash_map::find(). This method takes a
reference to a key as its argument, and returns an iterator which points to the end of the
container on failure, or to a key/value pair on success. The pair is stored in an
std::pair<key type,value type> object, which allows you to access the elements with the
second members of that object:
A Weightier Example
Demonstrating the value of a technique with a problem like Fibonacci numbers can be less than convincing - it’s pretty easy to devise reasonable solutions to the problems in an ad hoc manner.
For a more substantial example, I’ve slightly revised a problem from the Cormen, Lieserson, Rivest, and Stein book:
|Before leaving Microsoft to work full-time on philanthropic ventures, Bill Gates has one problem left to solve: his employees are overweight. After a company health fair with a mandatory weigh-in, Bill had HR prepare a standard org chart which includes the number of extra pounds each employee is carrying around. Bill decides the best approach to the problem is to create a regular exercise class for his overweight employees. But with a twist - he decides that he doesn't want any employee to be in a class with his or her direct superior, so as to avoid any hint at coercion. Your task is to take that org chart, plus Bill's no-supervisor constraint, and invite as many employees as you can, maximizing the total excess weight in the class.|
A sample version of the org chart is shown below in Figure 2. Keep in mind that this sample is fairly small, but Bill’s chart will normally have almost 80,000 employees.
The Excess Weight Org Chart
The employee names are not too imaginative, but you can see that employee ‘a’ is carrying 1 extra pound, employee ‘b’ 2, and so on.
Because of the constraints on the problem, we know we can’t invite all employees. We need to decide who to invite, but still obey the rule that no employees are there with their supervisors.
This problem is very amenable to a conventional recursive solution. To determine the maximum weight that can be achieved at a given node, we can use something like the pseudo code shown here:
The logic is straightforward. At each node we calculate two possible values for the maximum
weight. The first value,
weight1, defines the maximum weight if the node does not attend.
Since the node is not attending, the weight will consist of the sum of all that node’s immediate
The second value,
weight2, is used to calculate the maximum value of the node when the node
does attend. Since the node is attending, none of its immediate children can attend, which means
the total weight will be the sum of the node’s weight, plus the max of all of its children’s
You can immediately see that this algorithm shares a characteristic with the Fibonacci algorithm. When evaluating a node, we make recursive calls at two different depths in the calling tree, and this results in overlapping subproblem evaluation. For example, when implementing this pseudo code on the organizational chart showing in Figure 2, the list of nodes passed in to GET-MAX-WEIGHT will be:
|Call Path||Weight 1||Weight 2|
|a||b c||d e f g|
|a->b||d e||h i j k|
|a->b->e||j k||q r|
|a->c||f g||l m n p|
You’ll note that there are 17 nodes, but because of duplicates, we call GET-MAX-WEIGHT 31 times, leading to much extra work. And that level of work gets into the seriously excessive levels when faced with the entire Microsoft 80,000 person organizational chart.
So naturally, Dynamic Programming comes to the rescue. It turns out that this particular problem fits into the category well. We have overlapping subproblems, we have optimal substructure, so we can employ DP.
The Dynamic Programming Solution
A true bottoms-up approach is probably impractical for this problem.
First, it’s a lot of work to traverse a tree one level at a time, starting at the lowest level.
Second, there are problems with the tabular subproblem storage scheme usually favored in dynamic
programming. It’s not entirely clear where in a table we would store the result from
GET-MAX-WEIGHT( q ).
An approximation of the bottoms up approach could be achieved by traversing the tree in postorder, visiting all the children of a node first, then visiting the node itself. In this scenario we would visit the nodes from Figure 2 in the following order:
At every non-leaf node, we would have all the information we needed to calculate
GET‑MAX‑WEIGHT(node) without actually actually recalculating any results -
the subproblem results will have already been calculated and stored.
Of course, traversing the tree in postorder is actually what we’re already doing already in the recursive implementation. Seeing this, my solution to the algorithm was to use a C++ hash table to perform memoization of the subproblems. This has a few advantages:
- My implementation looks just like the recursive definition of the algorithm. It just has a few additional lines needed to check for stored subproblems.
- I don’t have to worry about finding a tabular storage method - I store the results based on a hash of the the node pointer, making lookup and storage nice and simple.
- I don’t have to worry about how to do bottoms-up traversal of the org chart. My recursive definition visits the nodes in the order I want without any changes.
I’ll start by simply presenting my modified version of
GetMax(node), then present the
infrastructure that goes with it. Finally, I’ll discuss the last step in any dynamic programming
problem: using the stored subproblem results to get a solution.
The Memoized GetMax()
The C++ code to solve this problem is completely defined in a class called Memo, which relies on
a tree whose nodes are defined by
class Node. The two class definitions are shown here:
To run an instance of this problem, I first construct a tree by having the
constructor read in a definition file. The resulting root node is then passed in to
Memo::GetMax(), which works its way through all the subproblems and calculates the
maximum total weight for this problem.
The code in
GetMax() should appear nearly identical to that in the pseudocode version
GET‑MAX‑WEIGHT(node), with one critical difference: in my
I check to see if a hashed value of
GetMax() has been saved upon entry to the routine,
and if it has, I short-circuit execution and return that value immediately.
Likewise, at the end of the routine, I store the newly calculated maximum value in the hash table.
The key to this hash table is a pointer to a node - it could just as easily have been the name of the node, but using the pointer has the advantage of making the program a little more invulnerable to input error - duplicate node names won’t break the algorithm.
The use of a pointer as a key into a
hash_map is a technique that can be used to work
around some of the issues I discussed earlier in the article. Because the hash
containers offer built-in support for pointers of any type, use of a pointer as a key
allows you to avoid having to define your own hash function and comparison function, which is nice.
This routine is conceptually pretty simple. The first four lines check to see if I’ve cached a
previously seen solution to the problem in the
mSolutions hash_map. If I have, the
mMaxWeight member of that solution information is returned immediately, avoiding a duplicated
If the solution is not already stored off, I have to do the real work. This means evaluating both
possible solutions - the weight if this node attends, and thus excludes all its immediate
children, and the weight if this node does not attend, and thus includes the children. Those two
values are calculated and stored in local variables
Given those two values, its a simple matter to decide which of the two is larger, and return that one. However, there are a couple of issues to deal with before returning.
First, I want to make sure that we store the solved subproblem in the solution table. Both
return statements do that by writing the value to the
mMaxWeight member of the correct entry
That would be enough to get the maximum value, but I may also want to create a solution for the
problem - which means not only knowing the maximum weight achieved, but a list of attendees. In
order to get that list of attendees, I need to also keep track of whether the maximum value
achieved at this node was done so by attending or not attending. I do that by updating a member
mAttending in the
Solution object. (Note that the constructor sets
false upon creation.)
Getting The Max Value
If all I wanted to know was the maximum value that could be achieved, this single method in
class Memo would be enough. Running a program that calls this routine for the tree shown in
Figure 2 produces the following output:
GetMax(h) GetMax(i) GetMax(d) GetMax(q) GetMax(r) GetMax(j) GetMax(k) GetMax(e) GetMax(b) GetMax(l) GetMax(m) GetMax(f) GetMax(n) GetMax(p) GetMax(g) GetMax(c) GetMax(a) Max = 31
GetMax() prints out each subproblem it’s trying to solve, and you can see that each node is
only called once. The listing also shows you that the nodes are evaluated in postorder, which is
the order we would have needed to simulate with a pure bottoms-up implementation.
Getting the Solution
As I said, the implementation as is returns the maximum value, but to be really useful we need to actually know which of the people on the org chart are going to be invited to attend Bill’s exercise class.
To come up with this list, we need to traverse the tree, only looking at nodes that are included as part of the solution. Nodes that are not included are those nodes that are skipped over because their parent is attending.
Just knowing that a node is included as part of the solution is not quite enough. A node could be included, yet not be attending because the maximum value for that node is achieved by not attending.
The code that actually develops the solution is called
MarkIncluded(). It is called with the
root node as the target, then successively walks down through the tree, marking child nodes that
are part of the solution. As each node is marked, I can see if it is going to be attending, and
its name is printed out:
The logic is pretty simple. When this method is called for a node, that node is automatically marked as included in the solution. Depending on whether this node has chosen to attend or not, I either mark all of its children as included, or mark all of its children’s children.
Adding this code to the program lets me produce this output for the tree shown in Figure 2:
Attendees: d e q r c l m n p
A quick check of the weights of those nodes shows that their sum is 31, which agrees with the output shown above.
I like being able to see the results of of my algorithm in a graphical format, so I added one
class Memo. This method takes the final tree and prints it
out in a format that can be processed into a nice graph using dot, one of the programs
package, a free set of tools from AT&T.
The final result of running the data set shown in Figure 2 through my test program is shown in Figure 3.
Program Output Using GraphViz
In Figure 3, nodes that are not attending are outlined with a dotted border. This means that either their parent node has chosen to attend, excluding them, or their optimal solution calls for them to not attend.
The individual node weight and name are shown in the two cells in the top half of the node. The
bottom half has two cells that show the calculated value of
GetMax() for that node, and the
decision made as to whether that node attends or not in order to reach that maximum value.
Running the Test Program
The test program I used to create this output is a simple program that reads in a tree definition file, outputs the program data to the console, and creates a GraphViz format output file. It’s run with two arguments, an input definition file (one sample is included), and the desired output file name:
memo tree01.def tree01.dot
Project and/or make files are included for Visual Studio 2003/2005 and gcc.
Note that the format of the tree definition file is documented in source code.
main() routine for the test program (minus some error handling) looks like this:
This program should give you an idea of how to harness the power of Dynamic Programming in order to solve some fairly difficult problems with relatively simple algorithms. Dynamic Programming is the key to this, and memoization with C++ hash maps helps make it possible.
Evolving a language in and for the real world: C++ 1991-2006,
Bjarne Stroustrup, ACM HOPL-III. June 2007. http://research.att.com/~bs/hopl-almost-final.pdf
 Graphviz. The free project from AT&T that supports graph visualization.
 Source Code for sample program
 Cormen, Thomas. et. al. Introduction to Algorithms. Cambridge: MIT Press, 2001.