 Dr. Dobb's Portal October, 2007 Article on DDJ site

This article discusses the use of C++ hash containers to improve storage of subproblem results when using dynamic programming (DP.) Memoization is a key part of dynamic programming, which is conventionally done by storing subproblem results in simple tables or lists. Using hash tables instead of these simpler structures will allow you to use dynamic programming while retaining your algorithm’s natural recursive structure, simplifying design and making your code easier to follow. I’ll provide a fully-developed example of an algorithm, and show how it can be adapted to use hash table memoization.

#### Background

Dynamic Programming (DP) is a useful technique for algorithm development that is saddled with an unfortunate name. When we refer to greedy algorithms, or the use of divide and conquer techniques, the name provides excellent semantic clues as to what is going on. With dynamic programming, no such luck, but I’m afraid were stuck with name for historical reasons.

Dynamic programming is frequently used to solve optimization problems - that is, problems where a number of choices can be made to lead to some maximum or minimum value. Optimization problems can often be solved efficiently using straightforward iterative or divide and conquer techniques, but dynamic programming becomes useful when those other techniques lead to overlapping subproblems.

The Wikipedia article on dynamic programming gives a good example of what is meant by overlapping subproblems. Imagine that you are a C programmer who decides to use a recursive function call to calculate the nth Fibonacci number:

This code is correct, but it leads to a lot of extra work. Figure 1 shows a map of the recursive calls made using this algorithm: Figure 1 - The Call Map for fib(6)

For example, `fib(4)` is called twice, which leads to a large duplicated tree of recursive calls. All in all, a huge amount of wasted effort.

Dynamic programming provides a good way to avoid duplicate work done solving overlapping subproblems, but for the results to be useful, the problem has to adhere to the optimal substructure property.

Saying that a problem has this property is just a way of saying that we can find an optimal solution for problem A by breaking it down into problems B and C, both of which are optimal as well. Chapter 15 of Introduction to Algorithms, 3rd Edition (The MIT Press) gives nice examples of cases where this property holds, and where it doesn’t.

#### Defeating Overlapping Subproblems

Dynamic programming aims to defeat this problem of repeatedly solving overlapping subproblems. Conventionally, this is done by converting the algorithm to one that uses a bottoms up approach, combined with some global storage for intermediate results.

As an example, we could solve the Fibonacci problem using a bottoms-up approach with C++ code that looks something like this:

In this case we only calculate the value of each fib(n) once, completely eliminating the duplicate calculation of those duplicate subproblems.

Most examples of DP that you find in textbooks or on the web will use a table or array to hold the intermediate results from problems. Examples of algorithms that use this type of memoization include Sequence Alignment, Matrix Chain Multiplication, and Optimal Binary Search Trees.

Using the tabular approach for storing subproblems often works well with a bottoms up implementation of the algorithm. DP algorithms typically start by solving the smallest subproblems, storing the results, combining some of those, storing the results in a new level of the table, and so on, until the top case is reached. Again, most of the tutorial DP examples you see will combine a bottoms-up approach with tabular subproblem storage.

#### DP Problems

From my perspective, the traditional approach to DP suffers from a couple of problems.

First, it seems to me that people feel more comfortable attacking problems in a top-down fashion as opposed to bottoms-up. This is certainly a matter of psychology, not computer science, but it is relevant to consider human factors when designing for maintenance, testability, and review. In many cases the expression of the algorithm seems more natural when given in the original top-down mode, and requires fewer mental gymnastics on the part of whoever is studying the algorithm.

The second problem is the traditional use of array structures to hold subproblems. The problem starts to creep into play when you look at algorithms like Matrix Chain Multiplication, in which half of the storage space is not even used. Things get even worse when you use a problem like that described later in this article, in which subproblems just don’t fit into a row/column organizational format.

Both of these problems can be addressed effectively by using standard C++ hash tables. The procedure is as follows:

1. Implement your algorithm using the basic, inefficient recursive implementation.
2. Create a C++ hash table with global scope that will hold your subproblem results. The key for the hash table should be a concatenation of the input parameters to your function, and the value will be the return type from your function.
3. At the entry point to your function, check the hash table for the presence of an existing value for your subproblem, using the input variables to form the key. If the value is present, instead of executing your function, return immediately.
4. At the exit for your function, store the value you are about to return in the hash table, using the input parameters as the key.

With these minor changes, which can often be accomplished in change to as few as three lines of code, you can convert your inefficient recursive algorithm to use dynamic programming, without having to refactor to a bottoms-up implementation, and without having to shoehorn your results into tabular format.

#### A Simple Example

Applying this technique to our Fibonacci example, we get a routine that looks like this:

So the first time we encounter fib(4), we have to make the recursive calls to `fib(3)` and `fib(2)`. But the second time around, the result has been stored in the map, and it is returned without any additional calculations taking place.

#### Hash Table Refresher

Hash tables were part of the Standard Template Library when the language standard was ratified in 1998. And while most of the Standard Template Library was incorporated into the standard library at that time, the `hash_set` and `hash_map` classes were excluded. According to Bjarne Stroustroup :

they would have been in C++98 had we had the time to do a proper detailed design and specification job. There was no doubt that a hash_map was needed as an alternative to map for large tables where the key was a character string and we could design a good hash function. In 1995, Javier Barreirro, Robert Fraley and David Musser tried to get a proposal ready in time for the standard and their work became the basis for many of the later hash_maps . The committee didn’t have the time, though..

While these containers will be added to the standard library under the awkward names `unordered_map` and `unordered_set` when TR1 is adopted, there is no reason we have to wait that long. Despite the fact that they are missing from the standard, virtually every modern C++ compiler has adopted a reasonable variant of `hash_map` and `hash_set`. The `hash_map` container does a great job of memoization as described in this article, with a caveat. As I said earlier, you’ll need the key to your hash table to be some subset of the input parameters. That’s all fine if your key is a single value using a built-in type, such as `int`, a `std::string`, or a pointer. These key values are supported implicitly by most implementations of `hash_map`.

Things get a little more complicated if you are using multiple values or your own structures as a key into the map. Owing to the lack of a standard, implementing non-standard keys requires slightly different techniques, depending on your compiler.

I’ll give you examples here that work with the two compilers you are most likely to encounter: Visual C++ .NET 2003/2005, or gcc 3.x.

Let’s say you’re doing a study tracking peoples reading habits by age, and you want to implement the following code:

You won’t be able to compile this code as-is, for at least three reasons:

1. You need to include the correct header file, which unfortunately differs depending on which compiler you are using.
2. You need to hoist the `hash_map` name into your namespace.
3. You need to define the hash function and at least one comparison function for the key class, `Sample`.

Item 3 is only needed if you are using something other than a basic type for your class, and it’s the trickiest. Defining the comparison is necessary so that the hashing library code can verify key equality during lookups or collisions.

In both cases, the function is usually pretty easy to write - you just compose it using existing functions. Examples for Microsoft and gcc compilers circa 2007 are given below.

#### Sample Code for gcc

gcc puts the hashing header files in the `ext` folder, and uses the `__gnu_cxx` namespace, isolating the non-standard library extensions into a ghetto of sorts. Both of these are dealt with in the first two lines of the sample code below.

The gcc implementation of `hash_map` needs two template parameters to handle keys that aren’t defined as built-in types: one class that has an operator which returns a hash index given an input key, and a second that returns a boolean value for equality test. I pack both of these into a single class, `SampleTraits`. The traits class is then passed to the `hash_map` type definition as the third and fourth template parameters.

When you are defining the hash function, you need to somehow combine hash values for each of the members of your structure. Creating hash keys is somewhat of an art, so I try to use the ones provided by the library writer as the basis for my hash function. In this case I take the hash values for the two members of `class Sample` and simply XOR them together, which should provide a decent value.

If you are using g++ 4.x, you may be better off using the tr1 `unordered_map` class. While it is not quite part of the standard yet, it should be soon, and that will insure its future support and compatibility.

#### Visual C++ .NET 2003/2005

Microsoft’s compilers have similar issues, but deal with them differently. (That’s the problem with non-standard features.)

Although the `hash_map` header file is accessed from the normal C++ header folder, the class itself, as was the case with g++, is defined in a different namespace, `stdext`. Again, that is dealt with in the first two lines of the sample code.

The remaining two issues are resolved by defining a global `hash_value` class that has an operator that returns a hash key for a given object of `class Sample`, and a global comparison operator that is used to test for equality of two objects of `class Sample`. (It is not unusual to use the less-than operator to test for equality, by testing both `a > b` and `a < b`, we determine whether the objects are equal or not.)

#### Some Notes on Usage

The member functions of most implementations of `hash_map` will be identical to those of the standard `std::map` container. Storing an entry in a `hash_map` can be done using an overloaded operator that makes the container look like an associative array:

Looking up a value from the table is done using a call to `hash_map::find()`. This method takes a reference to a key as its argument, and returns an iterator which points to the end of the container on failure, or to a key/value pair on success. The pair is stored in an `std::pair<key type,value type>` object, which allows you to access the elements with the `first` and `second` members of that object:

#### A Weightier Example

Demonstrating the value of a technique with a problem like Fibonacci numbers can be less than convincing - it’s pretty easy to devise reasonable solutions to the problems in an ad hoc manner.

For a more substantial example, I’ve slightly revised a problem from the Cormen, Lieserson, Rivest, and Stein book:

Before leaving Microsoft to work full-time on philanthropic ventures, Bill Gates has one problem left to solve: his employees are overweight. After a company health fair with a mandatory weigh-in, Bill had HR prepare a standard org chart which includes the number of extra pounds each employee is carrying around.

Bill decides the best approach to the problem is to create a regular exercise class for his overweight employees. But with a twist - he decides that he doesn’t want any employee to be in a class with his or her direct superior, so as to avoid any hint at coercion.

Your task is to take that org chart, plus Bill’s no-supervisor constraint, and invite as many employees as you can, maximizing the total excess weight in the class.

A sample version of the org chart is shown below in Figure 2. Keep in mind that this sample is fairly small, but Bill’s chart will normally have almost 80,000 employees. Figure 2 - The Excess Weight Org Chart

The employee names are not too imaginative, but you can see that employee ‘a’ is carrying 1 extra pound, employee ‘b’ 2, and so on.

Because of the constraints on the problem, we know we can’t invite all employees. We need to decide who to invite, but still obey the rule that no employees are there with their supervisors.

This problem is very amenable to a conventional recursive solution. To determine the maximum weight that can be achieved at a given node, we can use something like the pseudo code shown here:

The logic is straightforward. At each node we calculate two possible values for the maximum weight. The first value, `weight1`, defines the maximum weight if the node does not attend. Since the node is not attending, the weight will consist of the sum of all that node’s immediate children.

The second value, `weight2`, is used to calculate the maximum value of the node when the node does attend. Since the node is attending, none of its immediate children can attend, which means the total weight will be the sum of the node’s weight, plus the max of all of its children’s children.

You can immediately see that this algorithm shares a characteristic with the Fibonacci algorithm. When evaluating a node, we make recursive calls at two different depths in the calling tree, and this results in overlapping subproblem evaluation. For example, when implementing this pseudo code on the organizational chart showing in Figure 2, the list of nodes passed in to GET-MAX-WEIGHT will be:

 Call Path Weight 1 Weight 2 root a a b c d e f g a->b d e h i j k a->b->d h i a->b->e j k q r a->b->e->j q r a->c f g l m n p a->c->f l m a->c->g n p

You’ll note that there are 17 nodes, but because of duplicates, we call GET-MAX-WEIGHT 31 times, leading to much extra work. And that level of work gets into the seriously excessive levels when faced with the entire Microsoft 80,000 person organizational chart.

So naturally, Dynamic Programming comes to the rescue. It turns out that this particular problem fits into the category well. We have overlapping subproblems, we have optimal substructure, so we can employ DP.

#### The Dynamic Programming Solution

A true bottoms-up approach is probably impractical for this problem.

First, it’s a lot of work to traverse a tree one level at a time, starting at the lowest level. Second, there are problems with the tabular subproblem storage scheme usually favored in dynamic programming. It’s not entirely clear where in a table we would store the result from `GET-MAX-WEIGHT( q )`.

An approximation of the bottoms up approach could be achieved by traversing the tree in postorder, visiting all the children of a node first, then visiting the node itself. In this scenario we would visit the nodes from Figure 2 in the following order:

h i d q r j k e b l m f n p g c a

At every non-leaf node, we would have all the information we needed to calculate `GET‑MAX‑WEIGHT(node)` without actually actually recalculating any results - the subproblem results will have already been calculated and stored.

Of course, traversing the tree in postorder is actually what we’re already doing already in the recursive implementation. Seeing this, my solution to the algorithm was to use a C++ hash table to perform memoization of the subproblems. This has a few advantages:

• My implementation looks just like the recursive definition of the algorithm. It just has a few additional lines needed to check for stored subproblems.
• I don’t have to worry about finding a tabular storage method - I store the results based on a hash of the the node pointer, making lookup and storage nice and simple.
• I don’t have to worry about how to do bottoms-up traversal of the org chart. My recursive definition visits the nodes in the order I want without any changes.

I’ll start by simply presenting my modified version of `GetMax(node)`, then present the infrastructure that goes with it. Finally, I’ll discuss the last step in any dynamic programming problem: using the stored subproblem results to get a solution.

#### The Memoized GetMax()

The C++ code to solve this problem is completely defined in a class called Memo, which relies on a tree whose nodes are defined by `class Node`. The two class definitions are shown here:

To run an instance of this problem, I first construct a tree by having the `Node` constructor read in a definition file. The resulting root node is then passed in to `Memo::GetMax()`, which works its way through all the subproblems and calculates the maximum total weight for this problem.

The code in `GetMax()` should appear nearly identical to that in the pseudocode version of `GET‑MAX‑WEIGHT(node)`, with one critical difference: in my implementation of `GetMax()`, I check to see if a hashed value of `GetMax()` has been saved upon entry to the routine, and if it has, I short-circuit execution and return that value immediately.

Likewise, at the end of the routine, I store the newly calculated maximum value in the hash table.

The key to this hash table is a pointer to a node - it could just as easily have been the name of the node, but using the pointer has the advantage of making the program a little more invulnerable to input error - duplicate node names won’t break the algorithm.

The use of a pointer as a key into a `hash_map` is a technique that can be used to work around some of the issues I discussed earlier in the article. Because the hash containers offer built-in support for pointers of any type, use of a pointer as a key allows you to avoid having to define your own hash function and comparison function, which is nice.

This routine is conceptually pretty simple. The first four lines check to see if I’ve cached a previously seen solution to the problem in the `mSolutions hash_map`. If I have, the `mMaxWeight` member of that solution information is returned immediately, avoiding a duplicated subproblem.

If the solution is not already stored off, I have to do the real work. This means evaluating both possible solutions - the weight if this node attends, and thus excludes all its immediate children, and the weight if this node does not attend, and thus includes the children. Those two values are calculated and stored in local variables `weight_attending` and `weight_not_attending`.

Given those two values, its a simple matter to decide which of the two is larger, and return that one. However, there are a couple of issues to deal with before returning.

First, I want to make sure that we store the solved subproblem in the solution table. Both `return `statements do that by writing the value to the `mMaxWeight `member of the correct entry in `mSolutions`.

That would be enough to get the maximum value, but I may also want to create a solution for the problem - which means not only knowing the maximum weight achieved, but a list of attendees. In order to get that list of attendees, I need to also keep track of whether the maximum value achieved at this node was done so by attending or not attending. I do that by updating a member called `mAttending` in the `Solution` object. (Note that the constructor sets `mAttending` to `false` upon creation.)

#### Getting The Max Value

If all I wanted to know was the maximum value that could be achieved, this single method in `class Memo` would be enough. Running a program that calls this routine for the tree shown in Figure 2 produces the following output:

``````GetMax(h)
GetMax(i)
GetMax(d)
GetMax(q)
GetMax(r)
GetMax(j)
GetMax(k)
GetMax(e)
GetMax(b)
GetMax(l)
GetMax(m)
GetMax(f)
GetMax(n)
GetMax(p)
GetMax(g)
GetMax(c)
GetMax(a)
Max = 31
``````

`GetMax()` prints out each subproblem it’s trying to solve, and you can see that each node is only called once. The listing also shows you that the nodes are evaluated in postorder, which is the order we would have needed to simulate with a pure bottoms-up implementation.

#### Getting the Solution

As I said, the implementation as is returns the maximum value, but to be really useful we need to actually know which of the people on the org chart are going to be invited to attend Bill’s exercise class.

To come up with this list, we need to traverse the tree, only looking at nodes that are included as part of the solution. Nodes that are not included are those nodes that are skipped over because their parent is attending.

Just knowing that a node is included as part of the solution is not quite enough. A node could be included, yet not be attending because the maximum value for that node is achieved by not attending.

The code that actually develops the solution is called `MarkIncluded()`. It is called with the root node as the target, then successively walks down through the tree, marking child nodes that are part of the solution. As each node is marked, I can see if it is going to be attending, and its name is printed out:

The logic is pretty simple. When this method is called for a node, that node is automatically marked as included in the solution. Depending on whether this node has chosen to attend or not, I either mark all of its children as included, or mark all of its children’s children.

Adding this code to the program lets me produce this output for the tree shown in Figure 2:

``````Attendees: d e q r c l m n p
``````

A quick check of the weights of those nodes shows that their sum is 31, which agrees with the output shown above.

#### Graphical Output

I like being able to see the results of of my algorithm in a graphical format, so I added one additional method, `PrintTree()` to `class Memo`. This method takes the final tree and prints it out in a format that can be processed into a nice graph using dot, one of the programs in the Graphviz package, a free set of tools from AT&T.

The final result of running the data set shown in Figure 2 through my test program is shown in Figure 3. Figure 3 - Program Output Using GraphViz

In Figure 3, nodes that are not attending are outlined with a dotted border. This means that either their parent node has chosen to attend, excluding them, or their optimal solution calls for them to not attend.

The individual node weight and name are shown in the two cells in the top half of the node. The bottom half has two cells that show the calculated value of `GetMax()` for that node, and the decision made as to whether that node attends or not in order to reach that maximum value.

#### Running the Test Program

The test program I used to create this output is a simple program that reads in a tree definition file, outputs the program data to the console, and creates a GraphViz format output file. It’s run with two arguments, an input definition file (one sample is included), and the desired output file name:

``````memo tree01.def tree01.dot
``````

Project and/or make files are included for Visual Studio 2003/2005 and gcc.

Note that the format of the tree definition file is documented in source code.

The `main()` routine for the test program (minus some error handling) looks like this:

This program should give you an idea of how to harness the power of Dynamic Programming in order to solve some fairly difficult problems with relatively simple algorithms. Dynamic Programming is the key to this, and memoization with C++ hash maps helps make it possible.

#### References

 Evolving a language in and for the real world: C++ 1991-2006, Bjarne Stroustrup, ACM HOPL-III. June 2007. http://research.att.com/~bs/hopl-almost-final.pdf
 Graphviz. The free project from AT&T that supports graph visualization.
 Source Code for sample program
 Cormen, Thomas. et. al. Introduction to Algorithms, 3rd Edition, The MIT Press, 2001. 