<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Mark Nelson &#187; Web Articles</title>
	<atom:link href="http://marknelson.us/category/web-articles/feed/" rel="self" type="application/rss+xml" />
	<link>http://marknelson.us</link>
	<description>Programming, mostly.</description>
	<lastBuildDate>Fri, 13 Apr 2012 19:25:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Headline Writing Gone Bad</title>
		<link>http://marknelson.us/2010/10/01/headline-writing-gone-bad/</link>
		<comments>http://marknelson.us/2010/10/01/headline-writing-gone-bad/#comments</comments>
		<pubDate>Fri, 01 Oct 2010 15:36:37 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Snarkiness]]></category>
		<category><![CDATA[Web Articles]]></category>
		<category><![CDATA[Writing]]></category>

		<guid isPermaLink="false">http://marknelson.us/2010/10/01/headline-writing-gone-bad/</guid>
		<description><![CDATA[Microsoft has added a new keyword to C# as part of the 4.0 release earlier this year. Objects that are typed as dynamic bypass normal static type checking, allowing C# to have the flexibility of other scripting languages. This is all well and good, but the headline writers of the blogosphere have taken a decided [...]]]></description>
			<content:encoded><![CDATA[<p>Microsoft has added a new keyword to C# as part of the 4.0 release earlier this year. Objects that are typed as <i>dynamic</i> bypass normal static type checking, giving C# some of the flexibility of dynamically typed scripting languages.</p>
<p>This is all well and good, but the headline writers of the blogosphere have taken a decided wrong turn with their naming of this feature:</p>
<p><a href="http://www.codeproject.com/Articles/73856/Csharp-4-0-Dynamic-Programming.aspx" class="newpage">C# 4.0: Dynamic Programming</a><br/><br />
<a href="http://www.nikhilk.net/CSharp-Dynamic-Programming-JSON.aspx" class="newpage">C# 4.0, Dynamic Programming and JSON</a><br/><br />
<a href="http://www.codeguru.com/csharp/.net/net_general/visualstudionetadd-ins/article.php/c17991" class="newpage">Dynamic Programming Using C# 4.0 and Microsoft Visual Studio 2010</a><br/><br />
<a href="http://geekswithblogs.net/sdorman/archive/2008/11/16/c-4.0-dynamic-programming.aspx" class="newpage">C# 4.0: Dynamic Programming</a><br/></p>
<p>Note the misuse of the term <i>Dynamic Programming</i>. Everyone who takes an introductory algorithms course learns that the term <a href="http://en.wikipedia.org/wiki/Dynamic_programming" class="newpage">Dynamic Programming</a> has been in use for over fifty years, and refers to a method for solving problems by decomposition. It&#8217;s a useful technique that I&#8217;ve <a href="http://marknelson.us/2007/08/01/memoization/" class="newpage">covered here</a> in the past, and any skilled programmer should be familiar with it.</p>
<p>No, it&#8217;s not the end of the world, but people who are writing about Computer Science really ought to know something about Computer Science, don&#8217;t you think?</p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2010/10/01/headline-writing-gone-bad/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Yet Another Word Puzzle</title>
		<link>http://marknelson.us/2007/11/13/yet-another-word-puzzle/</link>
		<comments>http://marknelson.us/2007/11/13/yet-another-word-puzzle/#comments</comments>
		<pubDate>Tue, 13 Nov 2007 20:58:45 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Web Articles]]></category>

		<guid isPermaLink="false">http://marknelson.us/2007/11/18/yet-another-word-puzzle/</guid>
		<description><![CDATA[Dr. Dobb's Portal, November 2007. As I've confessed in the past, I'm a sucker for word puzzles. My recent post on a Will Shortz puzzle from NPR Morning Edition ended up provoking a surprising amount of comment, much of it in the vein of Watch me solve it better, faster, and with [...]]]></description>
			<content:encoded><![CDATA[<table border="0" width="500">
<tr>
<td width="300"><img alt="DDJ Portal Logo" src="http://marknelson.us/attachments/misc/logo_ddj.gif" /></td>
<td><strong>Dr. Dobb's Portal</strong><br/>November, 2007<br />
         <a href="http://ddj.com/cpp/202806370" class="newpage">Article on DDJ site</a>
  </td>
</tr>
</table>
<p>As I've confessed in the past, I'm a sucker for word puzzles. My <a href="http://marknelson.us/2007/04/01/puzzling/" class="newpage">recent post</a> on a Will Shortz puzzle from NPR Morning Edition ended up provoking a surprising amount of <a href="http://thraxil.org/users/anders/posts/2007/10/30/A-Simple-Programming-Puzzle-Seen-Through-Three-Different-Lenses/" class="newpage">comment</a>, much of it in the vein of <i>Watch me solve it better, faster, and with more style using language XXX</i>.</p>
<p>I certainly enjoyed watching other people solve the problem, and found their solutions instructive. As the XP crowd has figured out, we programmers spend too much time working on our own problems and not enough time watching how other people work. There's a lot to learn, both good and bad, from getting a peek inside another person's head.</p>
<h4>Out of the Blue</h4>
<p>Which brings me to the puzzle at the center of this article.<br />
<span id="more-106"></span></p>
<table border="0" align="left">
<tr>
<td><center><img src="http://marknelson.us/attachments/2007/yet-another-word-puzzle/BethKatz.jpg" hspace="5"><br/>Beth Katz</center></td>
</tr>
</table>
<p>In what at first seemed to be an incident completely unrelated to word play, I had a pleasant email exchange with <a href="http://cs.millersville.edu/~katz/" class="newpage">Beth Katz</a>, who was teaching a Data Structures class at <a href="http://cs.millersville.edu/" class="newpage">Millersville University</a>. I happened to look at Beth's current homework assignment for her class, and you can imagine my reaction when I saw the problem she had posted for her class:</p>
<blockquote><p>We define word reduction as removing a single letter from a word while leaving the remaining letters in their original order so that the resulting sequence of characters is also a word. A good word can be reduced step-by-step until all that is left is <em>a</em> or <em>i</em>. Your program will answer the question: what is the biggest word in the given dictionary that can be reduced?</p></blockquote>
<p>Beth gave a short example of a good word: <i>planets</i>:</p>
<pre>
planets
plants
pants
pant
ant
an
a
</pre>
<p>As you can see, you can remove one letter at a time, and each time you are left with a valid word one character shorter. </p>
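<p>To make the definition concrete, here is a minimal sketch that checks whether a list of words forms a valid reduction chain. It is not taken from the downloadable project: the function names <span class="inline_code">is_one_letter_reduction</span> and <span class="inline_code">is_valid_chain</span> are hypothetical, and it assumes a C++11 compiler with <span class="inline_code">std::unordered_set</span> standing in for the dictionary.</p>

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Returns true if `shorter` can be produced by deleting exactly one
// character from `longer` while keeping the other characters in order.
bool is_one_letter_reduction(const std::string& longer, const std::string& shorter)
{
    if (longer.size() != shorter.size() + 1)
        return false;
    std::size_t i = 0;
    // Skip the prefix the two words have in common.
    while (i < shorter.size() && longer[i] == shorter[i])
        ++i;
    // After dropping longer[i], the two remainders must match exactly.
    return longer.compare(i + 1, std::string::npos,
                          shorter, i, std::string::npos) == 0;
}

// Returns true if every word in `chain` is in `dictionary`, each step
// removes a single letter, and the chain ends in "a" or "i".
bool is_valid_chain(const std::vector<std::string>& chain,
                    const std::unordered_set<std::string>& dictionary)
{
    if (chain.empty())
        return false;
    for (std::size_t k = 0; k < chain.size(); ++k) {
        if (dictionary.count(chain[k]) == 0)
            return false;
        if (k > 0 && !is_one_letter_reduction(chain[k - 1], chain[k]))
            return false;
    }
    return chain.back() == "a" || chain.back() == "i";
}
```

<p>Feeding it the seven words of the <i>planets</i> chain, along with a dictionary that contains all of them, should return true. Skipping a step (going straight from <i>planets</i> to <i>pants</i>, for example) should return false.</p>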
<p>This makes for an interesting problem indeed. I've read that the average English speaker has a vocabulary of perhaps 15,000 to 20,000 words, but many reasonable word lists have upwards of 100,000 English words. How many of these words qualify as <em>good</em> words?</p>
<p>As I discussed in the previous word puzzle article, the highly evolved pattern matching facility in the human mind is often pretty good at solving these problems, and I think this is the case (in a limited way) for this particular problem. If I give you a word (like <i>planets</i> above), I think you'd be able to find a possible reduction path quickly, subject to the limitations of your own vocabulary.</p>
<p>But the human mind is not so good at certain variations on the same problems. Asking you for the biggest word that fits this pattern presents you with an almost impossible task. Basically, it requires you to be able to iterate through the words you know, ordered by length, and test each one. Unless you are subject to some pretty incredible flashes of insight, I think you're going to need a computer for this.</p>
<h4>Going Bottom Up</h4>
<p>Maybe the feeling isn't universal among programmers, but when I look at a problem, my first instinct is usually to try a top-down approach. For this problem, a top-down approach would mean identifying the longest words in the dictionary, then attempting to decompose them into successively shorter words.</p>
<p>This approach will work, but a little mental analysis shows that it might be a little resource heavy. Imagine that you are decomposing a 10 letter word by taking away one letter at a time. In the worst case, you might find all 10 possible 9 letter words in the dictionary, and then you could have as many as 90 8 letter candidates to check below those, and so on. Although in the general case you might only find one or two matches, particularly at the long lengths, even the potential for factorial growth leaves some room for concern.</p>
<p>So I took a shot at a bottom-up approach instead. It isn't usually my first choice, but I think you'll see that in this case it yields a much more satisfactory and efficient solution to the problem.</p>
<h5>The Inner Loop</h5>
<p>For this program to succeed, it must terminate its processing with a container that holds the longest good words in the dictionary. For this particular problem, my choice of container is the C++ <span class="inline_code">hash_set</span>, which is non-standard but universally implemented.</p>
<p>My bottom-up approach means that I will fill in the <span class="inline_code">hash_set</span> container for words of length 1 first, then words of length 2, and so on. For reasons of efficiency, it works out better if I keep a separate <span class="inline_code">hash_set</span> for each word length, so the <span class="inline_code">hash_set</span> objects are actually stored in a <span class="inline_code">map</span> that is indexed on the word size:</p>
<p><span class="inline_code">std::map&lt;size_t,hash_set&lt;std::string&gt; &gt; good_words</span></p>
<p>To fill in the hash for size <strong>i</strong>, I need a loop that iterates over all words of size <b>i</b>, removing one character at a time and then testing to see if the result is a good word of size <b>i-1</b>. If it is, I add the original word to <span class="inline_code">good_words[ i ]</span>. When I'm done iterating over all words of that size, I have a <span class="inline_code">hash_set</span> that contains all good words of that size, and I can move up to the next larger size.</p>
<p>So if we're testing words of size <b>i</b> the innermost part of the loop will look like this:</p>
<div class="igBar"><span id="lcpp-9"><a href="#" onclick="javascript:showPlainTxt('cpp-9'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-9">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">size_t</span> j = <span class="nu0">0</span> ; j &lt;i ; j++ <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; string test_word = word;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; test_word.<span class="me1">erase</span><span class="br0">&#40;</span> j, <span class="nu0">1</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
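<div class="de2">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> good_words<span class="br0">&#91;</span> i-<span class="nu0">1</span> <span class="br0">&#93;</span>.<span class="me1">find</span><span class="br0">&#40;</span> test_word <span class="br0">&#41;</span> != good_words<span class="br0">&#91;</span> i-<span class="nu0">1</span> <span class="br0">&#93;</span>.<span class="me1">end</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>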
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; good_words<span class="br0">&#91;</span> i <span class="br0">&#93;</span>.<span class="me1">insert</span><span class="br0">&#40;</span> word <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">break</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
In this inner part of the loop, we repeatedly remove a character from the word, testing to see if the resulting word appears in the list of words saved in the previous size hash set. If a match occurs, the word is inserted into the hash set for the current word size, and the loop breaks.</p>
<p>This bottom-up approach seems like the ideal way to build my lists of good words for a couple of reasons. First, I don't go to the expense of adding words to the hash sets unless they are good words. Second, determining that a given word is a good word only requires a test against the words at level i-1; I don't have to test for a complete path down to 'a' or 'i'.</p>
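<p>Putting the pieces together, the level-by-level build might be sketched as follows. This is not the code from the downloadable project: it assumes a C++11 compiler, swaps in <span class="inline_code">std::unordered_set</span> for <span class="inline_code">hash_set</span>, seeds level 1 with 'a' and 'i' regardless of the dictionary, and uses a hypothetical function name, <span class="inline_code">build_good_words</span>.</p>

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <unordered_set>
#include <vector>

typedef std::unordered_set<std::string> WordSet;

// Fills good_words level by level and returns the length of the longest
// level that holds a good word (at least 1, because of the seed).
std::size_t build_good_words(
    const std::map<std::size_t, std::vector<std::string> >& words_by_length,
    std::map<std::size_t, WordSet>& good_words)
{
    // Assumption: the only one-letter good words are "a" and "i".
    good_words[1] = WordSet{"a", "i"};
    std::size_t longest = 1;
    for (std::size_t i = 2; ; ++i) {
        auto level = words_by_length.find(i);
        if (level == words_by_length.end())
            break;  // no words of this length, so nothing longer can be good
        WordSet& current = good_words[i];
        const WordSet& previous = good_words[i - 1];
        for (const std::string& word : level->second) {
            // Same inner loop as above: drop one character at a time.
            for (std::size_t j = 0; j < word.size(); ++j) {
                std::string test_word = word;
                test_word.erase(j, 1);
                if (previous.find(test_word) != previous.end()) {
                    current.insert(word);
                    break;
                }
            }
        }
        if (current.empty())
            break;  // an empty level means no longer word can be good
        longest = i;
    }
    return longest;
}
```

<p>The early exit relies on the observation made above: a word of size <b>i</b> can only be good if some word of size <b>i-1</b> is good, so the first empty level ends the search.</p>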
<h5>Building The Input Data</h5>
<p>In the previous section I mentioned that the innermost loop was going to be called as I iterated over all the dictionary words of a given size. So how do I get all the words of a given size?</p>
<p>The first thing that might occur to you is that you could read in all the words from the dictionary, then sort them by size. This approach would work, but given that the standard sort routines available to you in the C++ library are all going to work in O(n&middot;lgn) time, it might get kind of expensive as the dictionary grows to hundreds of thousands of words.</p>
<p>The good news is that with this data set we're in position to take advantage of a linear sort. Yes, we can sort data and do substantially better than O(n&middot;lgn) when we know that the  data to be sorted is constrained to a small set of values.</p>
<p>In this case, I just create one linked list for each word size, and as I read the input file, I append each input word to the appropriate linked list. This would be a true linear algorithm if I capped the word length at a fixed number, say 25, and kept the lists in an array, but for convenience I actually store the lists in a map that looks like this:</p>
<p><span class="inline_code">std::map&lt;size_t,std::list&lt;std::string&gt; &gt; words_by_length</span></p>
<p>As a result the input code runs in close to linear time. The actual loop that reads in the data is nice and simple:</p>
<div class="igBar"><span id="lcpp-10"><a href="#" onclick="javascript:showPlainTxt('cpp-10'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-10">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">while</span> <span class="br0">&#40;</span> input <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; string word;</div>
</li>
<li class="li1">
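<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> !<span class="br0">&#40;</span> input &gt;&gt; word <span class="br0">&#41;</span> <span class="br0">&#41;</span> <span class="kw2">break</span>; <span class="co1">// stop before storing an empty word at end of input</span></div>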
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; words_by_length<span class="br0">&#91;</span> word.<span class="me1">size</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#93;</span>.<span class="me1">push_back</span><span class="br0">&#40;</span> word <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; count++;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> <span class="br0">&#40;</span> count % <span class="nu0">100</span> <span class="br0">&#41;</span> == <span class="nu0">0</span> <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw3">cout</span> &lt;&lt;count &lt;&lt;<span class="st0">"<span class="es0">\r</span>"</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
Once all the words are read in, I can access the list of words of a given size with a simple map lookup: <span class="inline_code">words_by_length[ i ]</span>.</p>
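<p>For comparison, the fixed-size variant mentioned above might look like the sketch below. It is an illustration rather than the project's code: the function name <span class="inline_code">bucket_by_length</span> and the <span class="inline_code">max_length</span> parameter are hypothetical, vectors replace the linked lists, and words longer than the cap are simply skipped.</p>

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads whitespace-separated words and drops each one into the bucket for
// its length, so buckets[n] ends up holding every word of length n.
std::vector<std::vector<std::string> > bucket_by_length(std::istream& input,
                                                        std::size_t max_length)
{
    std::vector<std::vector<std::string> > buckets(max_length + 1);
    std::string word;
    while (input >> word) {             // stops cleanly at end of input
        if (word.size() <= max_length)  // assumption: longer words are ignored
            buckets[word.size()].push_back(word);
    }
    return buckets;
}

// Convenience overload for trying the function on an in-memory string.
std::vector<std::vector<std::string> > bucket_by_length(const std::string& text,
                                                        std::size_t max_length)
{
    std::istringstream input(text);
    return bucket_by_length(input, max_length);
}
```

<p>Each word is touched once and placed by direct indexing, which is what makes the pass linear in the number of words.</p>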
<h4>Word Lists</h4>
<p>One final detail before I can compile my library - I need some lists of words! Lists of words are not hard to come by, although coming up with a suitable one for this exercise might take some work.</p>
<p>One of the first places to look is on your local Linux system. My system has a list of words in <code>/usr/share/dict/words</code>, which is used by the spell checker application, and possibly by the password app. On one of my systems, this dictionary has a whopping 483,524 words, which means it is packed with obscure words. Just as an example, a typical 11-character good word found using <code>/usr/share/dict/words</code> has this head-scratching derivation:</p>
<pre>
abranchiate
branchiate
branchiae
branchia
branchi
branch
ranch
rach
ach
ch
h
</pre>
<p>Probably the first thing you want to do with that file is go through and remove all the single letter words except 'a' and 'i', but even so, you're going to be boggled by some of what you see.</p>
<p>Another good alternative is the collection of word lists distributed on Project Gutenberg as the <a href="http://www.gutenberg.org/etext/3201" class="newpage">Moby Word List</a>. This includes a wide variety of lists of various sizes.</p>
<p>Beth Katz had <a href="http://cs.millersville.edu/~katz/cs362/examples/dictionaries/" class="newpage">several good dictionaries</a> listed along with her homework assignment, including one called <a href="http://cs.millersville.edu/~katz/cs362/examples/dictionaries/kids.dict" class="newpage">kids.dict</a> that is nice and short, making it good for debugging runs.</p>
<p>Finally, Google searches for "word lists" will turn up many other good choices.</p>
<h4>Wrapping it up</h4>
<p>Once I had the bottom-up good word builder working with the file-based word list, I was ready to put it all together. The core of <span class="inline_code">main()</span> now looks like this:</p>
<div class="igBar"><span id="lcpp-11"><a href="#" onclick="javascript:showPlainTxt('cpp-11'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-11">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1">map&lt;size_t,list&lt;string&gt; &gt; words_by_length;</div>
</li>
<li class="li2">
<div class="de2">read_words<span class="br0">&#40;</span> argv<span class="br0">&#91;</span> <span class="nu0">1</span> <span class="br0">&#93;</span>, words_by_length <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">map&lt;size_t,hash_set&lt;string&gt; &gt; good_words;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw4">size_t</span> longest = build_up_words<span class="br0">&#40;</span> words_by_length,</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;good_words <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">print_good_words<span class="br0">&#40;</span> longest,</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; words_by_length,</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; good_words <span class="br0">&#41;</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p>
Procedure <span class="inline_code">read_words()</span> simply reads all the words from the file into the map of <span class="inline_code">list</span> containers called <span class="inline_code">words_by_length</span>, as described earlier.</p>
<p>Then <span class="inline_code">build_up_words()</span> collects all the good words and stores them, organized by size, in the <span class="inline_code">map</span> of <span class="inline_code">hash_set</span> containers called <span class="inline_code">good_words</span>. The core of that routine was described earlier.</p>
<p>I show the results of the top two levels in <span class="inline_code">print_good_words()</span>, which is simple enough to read up on in the source code.</p>
<h4>The Results</h4>
<p>By now you are no doubt dying to see some results from the program. But first, to scope the size of the problem, here's the output from the program as the <span class="inline_code">good_words</span> hashes are populated on a run against <code>SINGLE.TXT</code>, a 354,985 word database:</p>
<pre>
Loading words from: SINGLE.TXT
Loaded 354985 words from SINGLE.TXT
Found 431 eligible words out of 431 total at length 2
Found 2075 eligible words out of 2092 total at length 3
Found 6213 eligible words out of 6758 total at length 4
Found 11322 eligible words out of 15047 total at length 5
Found 12495 eligible words out of 28473 total at length 6
Found 8939 eligible words out of 40094 total at length 7
Found 4295 eligible words out of 49528 total at length 8
Found 1210 eligible words out of 51216 total at length 9
Found 174 eligible words out of 43964 total at length 10
Found 20 eligible words out of 36082 total at length 11
Found 0 eligible words out of 28009 total at length 12
</pre>
<p>So that means about 13% of the words in this vocabulary were good words. I'm a little surprised that it's that high. To add some sanity to the mix, I removed all the single character words from <code>SINGLE.TXT</code> with the exception of 'a' and 'i', and the ratio went down to a more reasonable 8%.</p>
<p>You can also see, as you would expect, that the proportion of good words goes down at each level. At lengths 2, 3, and 4 nearly all words are good words, but by the time we get to length 11, we're down to less than one-tenth of one percent good.</p>
<p>Even with my modified version of <code>SINGLE.TXT</code>, you're bound to get plenty of esoteric words when working your way through the derivation of an 11 or 10 character good word. Of the 18 words of eleven characters, the derivation that works best with my vocabulary would be the following:</p>
<pre>
sparklingly
sparkingly
sparingly
springly
springy
spring
sprig
prig
pig
pi
i
</pre>
<p>With the more manageable scrabble.dict dictionary, containing 79,340 words, some of the first sequences that pop out include:</p>
<pre>
shopping hopping hoping oping ping pig pi i
breaches beaches baches aches aces ace ae a
marchese marches arches aches aces ace ae a
prawning pawning awning awing wing win in i
stablest stalest stales tales ales als as a
bravoing braving raving ravin rain ain in i
failings filings flings lings lins ins is i
relaters elaters elates elate late ate ae a
roadster roaster raster rater rate ate ae a
semioses semises seises seise seis sis is i
clambers lambers lamber lamer lame lam am a
claviers clavers lavers avers aves ave ae a
shrieves shrives shives hives hies his is i
stalkier talkier talker taker take tae ae a
statutes statues states tates ates ate ae a
swarming warming waring wring ring rin in i
brambled rambled ambled amble able ale ae a
stratous stratus status stats tats tas as a
paddlers paddles padles pales ales als as a
thirling tirling tiring iring ring rin in i
trucking trucing truing ruing ring rin in i
brawlier brawler bawler baler bale ale ae a
frilling filling filing fling ling lin in i
carouses arouses arouse arose arse are ae a
</pre>
<p>No doubt there are still plenty of obscure words here, but remember, this dictionary is probably composed of at least 50% words that aren't in your working vocabulary.</p>
<h4>Efficiency</h4>
<p>When run against a word list with 350K+ entries on my anemic notebook computer, it takes almost 10 seconds for the program to terminate, including display time. The vast majority of that time is spent checking words for goodness, which requires removing characters one at a time, then checking to see if their smaller descendants are in the word list.</p>
<p>Obviously, if you want to optimize this program for better performance, that's the place to do it. My guess would be that the <span class="inline_code">std::string</span> class member to erase characters from a word is probably far from optimal, and could be replaced by a hand-coded routine designed to do the same task with much greater speed.</p>
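<p>One possible shape for such a hand-coded routine is sketched below. It is a guess at an optimization, not something measured here or taken from the project: the name <span class="inline_code">has_good_reduction</span> is hypothetical, and <span class="inline_code">std::unordered_set</span> again stands in for <span class="inline_code">hash_set</span>. Instead of copying the word and erasing a character on every pass, it keeps one buffer and moves the "gap" along by restoring a single character each time.</p>

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Returns true if deleting any single character from `word` yields a
// member of `shorter_good_words`.
bool has_good_reduction(const std::string& word,
                        const std::unordered_set<std::string>& shorter_good_words)
{
    if (word.empty())
        return false;
    // Start with the word minus its first character.
    std::string candidate(word, 1);
    for (std::size_t j = 0; ; ++j) {
        // At this point candidate equals word with character j removed.
        if (shorter_good_words.count(candidate) != 0)
            return true;
        if (j == candidate.size())
            break;  // every deletion position has been tried
        // Restoring word[j] at position j moves the gap to position j + 1.
        candidate[j] = word[j];
    }
    return false;
}
```

<p>For a word of length n this performs one string allocation and n single-character writes, in place of n copies and n calls to <span class="inline_code">erase()</span>. Whether that matters in practice depends on the library and would need to be timed.</p>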
<h4>Issues With Non-Standard Library Functions</h4>
<p>Because hashed containers did not make it into the original C++ standard, there is a somewhat higher level of peril when using them. Problems ranging from syntactic inconsistency to lack of performance guarantees definitely make <span class="inline_code">hash_set</span> and <span class="inline_code">hash_map</span> second class citizens compared to the other standard containers. I saw a good example of this when I first started work on this article.</p>
<p>When solving this problem, the first thing that seemed obvious to me was that we were going to be storing references to dictionary words in hash tables. And I thought it might be interesting to see how well the C++ hash classes were going to be able to handle input data with hundreds of thousands of words.</p>
<p>I thought a good test program would be one that simply reads in the text file and adds it to a hash set:</p>
<div class="igBar"><span id="lcpp-12"><a href="#" onclick="javascript:showPlainTxt('cpp-12'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-12">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1">hash_set&lt;std::<span class="me2">string</span>&gt; words;</div>
</li>
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">while</span> <span class="br0">&#40;</span> input <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; std::<span class="me2">string</span> word;</div>
</li>
<li class="li1">
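<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> !<span class="br0">&#40;</span> input &gt;&gt; word <span class="br0">&#41;</span> <span class="br0">&#41;</span> <span class="kw2">break</span>; <span class="co1">// stop before inserting an empty word at end of input</span></div>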
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; words.<span class="me1">insert</span><span class="br0">&#40;</span> word <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; count++;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> <span class="br0">&#40;</span> count % <span class="nu0">100</span> <span class="br0">&#41;</span> == <span class="nu0">0</span> <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;count &lt;&lt;<span class="st0">"<span class="es0">\r</span>"</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
Under Visual C++ .NET 2005 on a fairly slow laptop, I immediately saw a nasty problem with the Microsoft implementation of <span class="inline_code">hash_set&lt;T&gt;</span>. Every time the counter hit a power of two, there was a bit of a pause. The pause grew longer and longer as the count grew larger, until by the time I was up to 128K words, it stretched out to many seconds.</p>
<p>Lesson learned. Hash table resizing can be expensive under some implementations of this non-standard class. Resizing hash tables can be difficult in any implementation, but the additional requirements imposed on C++ library containers add significantly to the work that must be done at this point.</p>
<p>I hoped that I would find a <span class="inline_code">reserve()</span> method or a constructor option that would let me preallocate a <span class="inline_code">hash_set</span> with perhaps 200K buckets, but this doesn't seem to be possible with Microsoft's implementation. The good news is that the <span class="inline_code">hash_set</span> replacement in <a href="http://en.wikipedia.org/wiki/Technical_Report_1" class="newpage">TR1</a>, <span class="inline_code">unordered_set</span>, will impose a requirement that conforming libraries allow for a bucket count as part of the container's constructor.</p>
<p>It turned out to not be too important, however. As I worked on the implementation of the algorithm, I drastically reduced the number of strings that were stored in any one hash, making this a moot point.</p>
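<p>For reference, the kind of preallocation discussed above can be sketched with <span class="inline_code">std::unordered_set</span>, the container that was eventually standardized in C++11. This example assumes a C++11 compiler and is unrelated to the Microsoft <span class="inline_code">hash_set</span>; the helper name <span class="inline_code">make_word_set</span> is hypothetical.</p>

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Creates an empty set that starts out with at least `expected_words`
// buckets, so early insertions should not trigger a rehash.
std::unordered_set<std::string> make_word_set(std::size_t expected_words)
{
    // The constructor argument is a minimum bucket count.
    std::unordered_set<std::string> words(expected_words);
    return words;
}
```

<p>An existing set can be adjusted the same way with its <span class="inline_code">reserve()</span> member, which sizes the table for a given number of elements rather than buckets.</p>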
<h4>Source Code</h4>
<p>You can download the <a href="http://marknelson.us/attachments/2007/yet-another-word-puzzle/Perfect.zip" class="newpage">project</a> here, including project files for Visual Studio 2003 and 2005, and a simple Makefile for gcc. The code has been checked under gcc 3.4, but I make no claims that it will work with all later versions of the library.</p>
<h4>References</h4>
<p>Ward, Grady. "Moby Word Lists by Grady Ward - Project Gutenberg." Main Page - Gutenberg. 12 Nov. 2007 <a href="http://www.gutenberg.org/etext/3201" class="newpage">http://www.gutenberg.org/etext/3201</a>.<br />
"ISO/IEC JTC1/SC22/WG21 - The C++ Standards Committee." Open Standards. 13 Nov. 2007 <a href="http://www.open-std.org/jtc1/sc22/wg21/" class="newpage">http://www.open-std.org/jtc1/sc22/wg21/</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2007/11/13/yet-another-word-puzzle/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Building the Convex Hull</title>
		<link>http://marknelson.us/2007/08/22/convex/</link>
		<comments>http://marknelson.us/2007/08/22/convex/#comments</comments>
		<pubDate>Wed, 22 Aug 2007 19:37:50 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Web Articles]]></category>

		<guid isPermaLink="false">http://marknelson.us/2007/08/22/convex/</guid>
		<description><![CDATA[Dr. Dobb's Portal, September 2007. Finding the Convex Hull of a set of points is an interesting problem in computational geometry. It is useful as a building block for a diverse set of applications, including things such as: collision detection in video games, providing a useful replacement for bounding boxes; visual pattern [...]]]></description>
			<content:encoded><![CDATA[









<table border="0" width="500">
<tr>
<td width="300"><img alt="DDJ Portal Logo" src="http://marknelson.us/attachments/misc/logo_ddj.gif" /></td>
<td><strong>Dr. Dobb's Portal</strong><br/>September, 2007<br />
         <a href="http://www.ddj.com/architect/201806315" class="newpage">Article on DDJ site</a>
  </td>
</tr>
</table>
<p>Finding the <a href="http://en.wikipedia.org/wiki/Convex_hull" class="newpage">Convex Hull</a> of a set of points is an interesting problem in computational geometry. It is useful as a building block for a diverse set of applications, including things such as:</p>
<ul>
<li/>Collision detection in video games, providing a useful replacement for bounding boxes.
<li/>Visual pattern matching/object detection
<li/>Mapping
<li/>Path determination
</ul>
<p>Just as an example, if one of the Mars rovers has to chart a path around a boulder, the convex hull can be used to provide the shortest path past the obstacle, given a map that shows the points where the boulder abuts the ground.</p>
<p>This article will go over the definition of the 2D convex hull, describe Graham's efficient algorithm for finding the convex hull of a set of points, and present a sample C++ program that can be used to experiment with the algorithm.</p>
<h4>The Convex Hull</h4>
<p>There are many ways to draw a boundary around a set of points in a two-dimensional plane. One of the easiest to implement is a bounding-box, which is a rectangle that spans the set from its minimum to its maximum values along the X and Y axes.</p>
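<p>As a rough illustration of how little work the bounding-box takes, here is a minimal sketch. The <code>BoundingBox</code> struct and the <code>bounding_box()</code> function are names I made up for this example, and points are assumed to be stored as pairs with x in <code>first</code> and y in <code>second</code>.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<int, int> Point;   // x in first, y in second

// Hypothetical helper type: two opposite corners of the box.
struct BoundingBox {
    Point min_corner;   // (smallest x, smallest y)
    Point max_corner;   // (largest x, largest y)
};

// Compute the axis-aligned bounding box of a non-empty set of points
// with a single pass over the data.
BoundingBox bounding_box(const std::vector<Point> &points)
{
    assert(!points.empty());
    BoundingBox box = { points[0], points[0] };
    for (std::size_t i = 1; i < points.size(); i++) {
        box.min_corner.first  = std::min(box.min_corner.first,  points[i].first);
        box.min_corner.second = std::min(box.min_corner.second, points[i].second);
        box.max_corner.first  = std::max(box.max_corner.first,  points[i].first);
        box.max_corner.second = std::max(box.max_corner.second, points[i].second);
    }
    return box;
}
```

<p>The box is fully described by two corners, which is what makes it so cheap to compute and to test against.</p>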
<p>Creating a bounding-box is easy, but it doesn't form as tight a wrapper as we might like around a set of points. Consider the bounding box around the three points shown in Figure 1.<br />
<center><br />
<img src="http://marknelson.us/attachments/2007/convex/Figure1.png"><br />
Figure 1 - A standard bounding box around three points<br />
</center><br />
We can certainly wrap those points much more tightly using easy-to-compute straight lines, and Figure 2 shows an example that is significantly more compact:<br />
<center><br />
<img src="http://marknelson.us/attachments/2007/convex/Figure2.png"><br />
Figure 2 - A convex hull around the three points from Figure 1<br />
</center><br />
As it happens, Figure 2 is a convex hull.<br />
<span id="more-91"></span><br />
So what is the definition of a convex hull? The <a href="http://upload.wikimedia.org/wikipedia/commons/b/bc/ConvexHull.png" class="newpage">common visualization analogy</a> for a 2D convex hull is to imagine the set of points on the plane as nails pounded into a board. If you wrap the entire set in an appropriately sized rubber band, the band will snap into place, forming a convex hull, which is the minimum-energy wrapper that encloses all the points.</p>
<p>An informal definition that has a little more precision but is still easy to understand might say that the convex hull meets the following properties:</p>
<ol>
<li/>The hull is a cycle graph whose vertices are a subset of the points in set S.
<li/>No points in S lie outside the graph.
<li/>All interior angles in the graph are less than 180 degrees.
</ol>
<h4>Computing the Convex Hull</h4>
<p>So given a set of points, how do we compute the convex hull without benefit of hammer, nails, and rubber bands?</p>
<p>For some problems, a brute force solution is adequate. In the case of a convex hull, a reasonable brute force algorithm might look like this:</p>
<div class="igBar"><span id="lcode-25"><a href="#" onclick="javascript:showPlainTxt('code-25'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-25">
<div class="code">
<ol>
<li class="li1">
<div class="de1">for all points p in S</div>
</li>
<li class="li2">
<div class="de2">&nbsp; for all points q in S</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; if p != q</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; draw a line from p to q</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; if all points in S except p and q lie to the left of the line</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; add the directed vector pq to the solution set </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>After running this algorithm, you've got a list of point pairs that compose the solution set, and you simply have to put them together in the correct order.</p>
<p>This solution will work, but with just a quick look at the code you can see a big problem: the two explicit loops over S, combined with the test of every remaining point against each candidate line, form a triply nested loop over all <i>n</i> points. That makes this an O(n<sup>3</sup>) algorithm, which is not going to scale up as well as we might like.</p>
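<p>The brute force pseudocode above might be sketched in C++ as follows. This is only an illustration: <code>cross()</code> and <code>brute_force_hull_edges()</code> are names invented here, and the sketch assumes distinct points with no three of them collinear, since a point sitting exactly on a hull edge would cause that edge to be rejected.</p>

```cpp
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<int, int> Point;      // x in first, y in second
typedef std::pair<Point, Point> Edge;   // directed edge from first to second

// Positive when r lies strictly to the left of the directed line p -> q.
long long cross(const Point &p, const Point &q, const Point &r)
{
    return static_cast<long long>(q.first - p.first) * (r.second - p.second)
         - static_cast<long long>(q.second - p.second) * (r.first - p.first);
}

// Try every ordered pair (p, q) and keep it when all other points are
// to the left of p -> q. Three nested loops, so O(n^3) overall.
std::vector<Edge> brute_force_hull_edges(const std::vector<Point> &s)
{
    std::vector<Edge> edges;
    for (std::size_t i = 0; i < s.size(); i++) {
        for (std::size_t j = 0; j < s.size(); j++) {
            if (s[i] == s[j])
                continue;
            bool all_left = true;
            for (std::size_t k = 0; k < s.size(); k++) {
                if (k == i || k == j)
                    continue;
                if (cross(s[i], s[j], s[k]) <= 0) {
                    all_left = false;
                    break;
                }
            }
            if (all_left)
                edges.push_back(Edge(s[i], s[j]));
        }
    }
    return edges;
}
```

<p>The edges come back in discovery order, not hull order, so a caller would still need to chain them together by matching each edge's end point to the next edge's start point.</p>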
<p>Fortunately, a little searching will show you that there are algorithms that calculate a convex hull in a 2D space considerably faster - in O(<i>n&middot;lgn</i>) time, as a matter of fact.</p>
<h4>The Graham Scan</h4>
<p>The algorithm I'll demonstrate here is referred to as Andrew's variant of the Graham scan. Ronald Graham's 1972 paper [1] proposed a convex hull construction algorithm that ran in O(<i>n&middot;lgn</i>) time, and Andrew's variation is a simplification that requires a bit less computation. First I'll give the terse definition of the algorithm, then explain each step in more detail.</p>
<pre>
Algorithm ConvexHull( <i>S</i> )
Sort all points in <i>S</i> based on their position on the X axis
Designate point <i>left</i> as the leftmost point
Designate point <i>right</i> as the rightmost point
Remove <i>left</i> and <i>right</i> from <i>S</i>
While there are still points in <i>S</i>
   remove <i>Point</i> from <i>S</i>
   if <i>Point</i> is above the line from <i>left</i> to <i>right</i>
       add <i>Point</i> to the end of array <i>upper</i>
   else
       add <i>Point</i> to the end of array <i>lower</i>
//
// Construct the lower hull
//
Add <i>left</i> to <i>lower_hull</i>
Add <i>right</i> to the end of <i>lower</i>
While <i>lower</i> is not empty
  add <i>lower[0]</i> to the end of <i>lower_hull</i>
  remove <i>lower[0]</i> from <i>lower</i>
  while size(<i>lower_hull</i>) &gt;= 3 and the last 3 points of <i>lower_hull</i> are not convex
    remove the next to last element from <i>lower_hull</i>
//
// Construct the upper hull
//
Add <i>left</i> to <i>upper_hull</i>
Add <i>right</i> to the end of <i>upper</i>
While <i>upper</i> is not empty
  add <i>upper[0]</i> to the end of <i>upper_hull</i>
  remove <i>upper[0]</i> from <i>upper</i>
  while size(<i>upper_hull</i>) &gt;= 3 and the last 3 points of <i>upper_hull</i> are not convex
    remove the next to last element from <i>upper_hull</i>
//
Merge <i>upper_hull</i> and <i>lower_hull</i> to form <i>hull</i>
return <i>hull</i>
</pre>
<h4>The details</h4>
<p>The algorithm starts with an unordered set of points defined by Cartesian coordinates: each point has a position on the X-axis and Y-axis. To illustrate the algorithm we'll start with the points shown in Figure 3.<br />
<center><br />
<img src="http://marknelson.us/attachments/2007/convex/RawPoints.gif"><br />
Figure 3 - The raw points used as input to the algorithm<br />
</center><br />
Before construction of the upper and lower hull can take place, we have to first sort the input data based on its X-axis value, then partition the resulting set into a leftmost point, a rightmost point, and the sets of points above and below the line between the leftmost and rightmost point.</p>
<p>Obviously once the data is sorted it's trivial to find the leftmost and rightmost points, and remove them from the set of points - these are just the first and last members of the sorted array.</p>
<p>Sorting the remaining points into the upper and lower sets requires a function that determines whether a point is above or below a line. I accomplish this using the following strategy. Given three points p0, p1, and p2, where p0 and p1 define the line and p2 is the point being tested, I first perform a coordinate translation so that p1 is at 0,0. I then take the determinant of the 2x2 matrix whose columns are the translated p0 and p2. The resulting value will be negative if p2 lies to the left of the line directed from p0 to p1, positive if it lies to the right, and 0 if it is collinear with the first two points.</p>
<p>The matrix that is used to get this determinant looks like this:</p>
<table border="0">
<tr>
<td rowspan="2" align="right">A=</td>
<td width = "20%">(p0<sub>x</sub>-p1<sub>x</sub>)</td>
<td width="20%">(p2<sub>x</sub>-p1<sub>x</sub>)</td>
</tr>
<tr>
<td width = "20%">(p0<sub>y</sub>-p1<sub>y</sub>)</td>
<td width="20%">(p2<sub>y</sub>-p1<sub>y</sub>)</td>
</tr>
</table>
<p>For this 2x2 matrix, the formula for the determinant will be:</p>
<table border="0">
<tr>
<td width="15%">&nbsp;</td>
<td width="70%">
det = ((p0<sub>x</sub>-p1<sub>x</sub>) &times; (p2<sub>y</sub>-p1<sub>y</sub>)) - ((p2<sub>x</sub>-p1<sub>x</sub>) &times; (p0<sub>y</sub>-p1<sub>y</sub>))</td>
</tr>
</table>
<p>Partitioning set <i>S</i> into <i>upper</i> and <i>lower</i> is simply a matter of iterating over each point, calculating the determinant, and moving it into <i>lower</i> for a value &ge; 0, or <i>upper</i> for a value &lt; 0.</p>
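<p>To make the determinant test concrete, here is a minimal sketch of a function that evaluates it. The function in graham.cpp may well differ in its details; in particular, returning only the sign and widening to <code>long long</code> to avoid overflow are my assumptions for this example.</p>

```cpp
#include <utility>

typedef std::pair<int, int> Point;   // x in first, y in second

// Evaluate the 2x2 determinant after translating p1 to the origin.
// Returns -1 when p2 is left of the directed line p0 -> p1 (above it when
// p0 is the leftmost point), +1 when it is to the right (below), and 0
// when the three points are collinear.
int direction(const Point &p0, const Point &p1, const Point &p2)
{
    long long det =
        static_cast<long long>(p0.first - p1.first) * (p2.second - p1.second)
      - static_cast<long long>(p2.first - p1.first) * (p0.second - p1.second);
    return (det > 0) - (det < 0);
}
```

<p>With <i>left</i> at (16,21) and <i>right</i> at (97,90), the point (59,37) gives a determinant of 1671, so it belongs in <i>lower</i>, while (18,90) gives -5451 and belongs in <i>upper</i>.</p>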
<p>Once this is complete, the resulting partitions will look like the ones shown in Figure 4.<br />
<center><br />
<img src="http://marknelson.us/attachments/2007/convex/Partitioned.gif"><br />
Figure 4 - The set of points after partitioning<br />
Green points are in the upper partition, red in the lower<br />
</center><br />
Now that things are partitioned, the actual construction of the hull can begin. From the algorithm description, you can see that the upper and lower hull construction steps are symmetrical. Both proceed from left to right, starting at the leftmost point and moving to the rightmost point. The only differences are the source of their input points and the direction they check to ensure convexity.</p>
<p>The upper or lower hull is started by simply adding <i>left</i> to the output hull. Points are then added from the correct input source. As each point is added, if the number of points in the working hull is equal to 3 or more, a test is made to see if the last three points have created a convex angle.</p>
<p>Testing for the convex angle is done using the same determinant formula as shown above. If the hull has <i>n</i> points, we simply test to see if p<sub>n-1</sub> is above or below the line formed by p<sub>n-2</sub> and p<sub>n</sub>. When constructing the lower hull, if we see that the point is above the line, we have violated convexity, and the middle point is removed from the hull. The opposite test is made when constructing the upper hull. This test-and-remove process is repeated until the last three points are convex, or there are fewer than 3 points in the working hull.</p>
<p>An animation of this process is shown in Figure 5.<br />
<center><br />
<img src="http://marknelson.us/attachments/2007/convex/BuildHull.gif"><br />
Figure 5 - An animation of the hull being built<br />
</center><br />
The final step, merging the upper and lower hulls, is a simple matter of appending one hull to the other in reverse order, and removing the extra copy of <i>right</i>.  Once that is done, the actual convex hull definition can be given as a list of points, starting with <i>left</i> and moving counter-clockwise around the hull. Figure 6 is the last frame in the animation, which shows the complete hull.<br />
<center><br />
<img src="http://marknelson.us/attachments/2007/convex/Complete.gif"><br />
Figure 6 - The finished convex hull<br />
</center></p>
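<p>The merge step could be sketched like this. The name <code>merge_hulls()</code> is hypothetical, and the sketch assumes both half hulls run from <i>left</i> to <i>right</i> and include both end points. It keeps the closing copy of <i>left</i> so that the result describes a closed polygon.</p>

```cpp
#include <cassert>
#include <utility>
#include <vector>

typedef std::pair<int, int> Point;   // x in first, y in second

// Join the two half hulls into one counter-clockwise sequence:
// the lower hull left to right, then the upper hull right to left.
std::vector<Point> merge_hulls(const std::vector<Point> &lower_hull,
                               const std::vector<Point> &upper_hull)
{
    assert(!lower_hull.empty() && !upper_hull.empty());
    assert(lower_hull.front() == upper_hull.front());   // both start at left
    assert(lower_hull.back() == upper_hull.back());     // both end at right
    std::vector<Point> hull(lower_hull);
    // Walk the upper hull backwards, skipping its last element because
    // right is already present at the end of the lower hull.
    hull.insert(hull.end(), upper_hull.rbegin() + 1, upper_hull.rend());
    return hull;
}
```

<p>If you would rather not repeat <i>left</i> at the end of the list, drop the final element of the result.</p>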
<h4>Test Code</h4>
<p>A small demo program called graham.cpp implements this algorithm fairly faithfully. In order to make the experimentation a little more interesting, graham provides two forms of output showing the results of the program:</p>
<ul>
<li/>Standard text output listing the various data sets used in the program
<li/>A command file (gnuplot.cmd) that can be used with <a href="http://www.gnuplot.info/" class ="newpage">gnuplot</a> to visualize the process
</ul>
<p>The images shown in this article were collected using gnuplot 4, which is a fully featured 2D and 3D plotting program, with excellent multiplatform support. Seeing the algorithm operate visually in real time is very helpful in gaining a good understanding of how it works.</p>
<p>The C++ program utilizes a class called <code>GrahamScan</code> that takes care of all these details. By creating the object and then calling its methods, creation and display of the convex hull is easy to follow. In my test program, <code>main()</code> executes the entire operation by creating the object and then calling its methods:</p>
<div class="igBar"><span id="lcpp-26"><a href="#" onclick="javascript:showPlainTxt('cpp-26'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-26">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw4">int</span> main<span class="br0">&#40;</span><span class="kw4">int</span> argc, <span class="kw4">char</span>* argv<span class="br0">&#91;</span><span class="br0">&#93;</span><span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="me2">ofstream</span> gnuplot_file<span class="br0">&#40;</span> <span class="st0">"gnuplot.cmd"</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">const</span> <span class="kw4">int</span> N = <span class="nu0">20</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; GrahamScan g<span class="br0">&#40;</span> N, <span class="nu0">0</span>, <span class="nu0">100</span>, <span class="nu0">0</span>, <span class="nu0">100</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; g.<span class="me1">log_raw_points</span><span class="br0">&#40;</span> std::<span class="kw3">cout</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; g.<span class="me1">plot_raw_points</span><span class="br0">&#40;</span> gnuplot_file <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; g.<span class="me1">partition_points</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; g.<span class="me1">log_partitioned_points</span><span class="br0">&#40;</span> std::<span class="kw3">cout</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; g.<span class="me1">plot_partitioned_points</span><span class="br0">&#40;</span> gnuplot_file <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="co1">// Build the hull</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; g.<span class="me1">build_hull</span><span class="br0">&#40;</span> gnuplot_file <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; g.<span class="me1">log_hull</span><span class="br0">&#40;</span> std::<span class="kw3">cout</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; g.<span class="me1">plot_hull</span><span class="br0">&#40;</span> gnuplot_file, <span class="st0">"complete"</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">return</span> <span class="nu0">0</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
If you ignore method calls that start with <i>log_</i> or <i>plot_</i>, the real work takes place in only three steps:</p>
<ol>
<li/>Construct the <code>GrahamScan</code> object. This also creates the random set of points.
<li/>Partition the points into the upper and lower hull sets.
<li/>Build the convex hull.
</ol>
<h4>The constructor</h4>
<p>When calling the <code>GrahamScan</code> constructor, you will pass in five numbers: the number of points, and the min and max values for the X and Y axes. The min and max values not only bound the range of the randomly generated points, they also determine the scope of the axes that will be displayed when the values are shown in gnuplot.</p>
<p>The bulk of the work in the constructor is in these few lines of code:</p>
<div class="igBar"><span id="lcpp-27"><a href="#" onclick="javascript:showPlainTxt('cpp-27'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-27">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw3">srand</span><span class="br0">&#40;</span> static_cast&lt;unsigned int&gt;<span class="br0">&#40;</span> <span class="kw3">time</span><span class="br0">&#40;</span><span class="kw2">NULL</span><span class="br0">&#41;</span> <span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">size_t</span> i = <span class="nu0">0</span> ; i &lt;N ; i++ <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw4">int</span> x = <span class="br0">&#40;</span> <span class="kw3">rand</span><span class="br0">&#40;</span><span class="br0">&#41;</span> % <span class="br0">&#40;</span> x_range.<span class="me1">second</span> - x_range.<span class="me1">first</span> + <span class="nu0">1</span> <span class="br0">&#41;</span> <span class="br0">&#41;</span> + x_range.<span class="me1">first</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">int</span> y = <span class="br0">&#40;</span> <span class="kw3">rand</span><span class="br0">&#40;</span><span class="br0">&#41;</span> % <span class="br0">&#40;</span> y_range.<span class="me1">second</span> - y_range.<span class="me1">first</span> + <span class="nu0">1</span> <span class="br0">&#41;</span> <span class="br0">&#41;</span> + y_range.<span class="me1">first</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; raw_points.<span class="me1">push_back</span><span class="br0">&#40;</span> std::<span class="me2">make_pair</span><span class="br0">&#40;</span> x, y <span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>The data set is stored as <code>std::pair&lt;int,int&gt;</code> objects in a vector aptly called <code>raw_points</code>. After calling the constructor, the <code>log_raw_points()</code> method can be called, which will produce output like this:</p>
<table border="0">
<tr>
<td width="10%"></td>
<td width="80%">
<pre>
Creating raw points:
(97,90) (27,10) (59,8) (58,19) (85,90) (62,91) (94,42) (84,68)
(16,21) (49,14) (31,84) (40,25) (59,95) (55,89) (81,95) (22,46)
(27,80) (18,90) (59,37) (38,45)
</pre>
</td>
</tr>
</table>
<p>Calling <code>plot_raw_points()</code> will create a gnuplot command file that produces output like that shown in Figure 3.</p>
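<p>Reducing <code>rand()</code> with the <code>%</code> operator is fine for a demo, although it introduces a slight bias whenever the requested range does not evenly divide the generator's output range. If your compiler supports the C++11 <code>&lt;random&gt;</code> header, a sketch like the following is one alternative. The function name <code>make_random_points()</code> and the explicit <code>seed</code> parameter are my own inventions here, not part of graham.cpp.</p>

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

typedef std::pair<int, int> Point;   // x in first, y in second

// Generate n points with x in [x_range.first, x_range.second] and
// y in [y_range.first, y_range.second], both ends inclusive.
// Passing the seed explicitly makes a run repeatable.
std::vector<Point> make_random_points(std::size_t n,
                                      std::pair<int, int> x_range,
                                      std::pair<int, int> y_range,
                                      unsigned int seed)
{
    std::mt19937 engine(seed);
    std::uniform_int_distribution<int> x_dist(x_range.first, x_range.second);
    std::uniform_int_distribution<int> y_dist(y_range.first, y_range.second);
    std::vector<Point> points;
    points.reserve(n);
    for (std::size_t i = 0; i < n; i++) {
        int x = x_dist(engine);   // draw x first, then y, in a fixed order
        int y = y_dist(engine);
        points.push_back(std::make_pair(x, y));
    }
    return points;
}
```

<p>A fixed seed is handy when you want to replay an interesting set of points in gnuplot; seeding from the clock, as the constructor does, gives a fresh set on each run.</p>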
<h4>The Partitioning Code</h4>
<p>The algorithm definition tells us that the partition step needs to identify the leftmost point, the rightmost point, and the two sets of points above and below the line between leftmost and rightmost. </p>
<p>This is all accomplished in a straightforward manner:</p>
<div class="igBar"><span id="lcpp-28"><a href="#" onclick="javascript:showPlainTxt('cpp-28'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-28">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw4">void</span> partition_points<span class="br0">&#40;</span><span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="co1">// Step one in partitioning the points is to sort the raw data</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; std::<span class="me2">sort</span><span class="br0">&#40;</span> raw_points.<span class="me1">begin</span><span class="br0">&#40;</span><span class="br0">&#41;</span>, raw_points.<span class="me1">end</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="co1">// The the far left and far right points, remove them from the</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">// sorted sequence and store them in special members</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; left = raw_points.<span class="me1">front</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; raw_points.<span class="me1">erase</span><span class="br0">&#40;</span> raw_points.<span class="me1">begin</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; right = raw_points.<span class="me1">back</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; raw_points.<span class="me1">pop_back</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="co1">// Now put the remaining points in one of the two output sequences</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">size_t</span> i = <span class="nu0">0</span> ; i &lt;raw_points.<span class="me1">size</span><span class="br0">&#40;</span><span class="br0">&#41;</span> ; i++ <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw4">int</span> dir = direction<span class="br0">&#40;</span> left, right, raw_points<span class="br0">&#91;</span> i <span class="br0">&#93;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> dir &lt;<span class="nu0">0</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; upper_partition_points.<span class="me1">push_back</span><span class="br0">&#40;</span> raw_points<span class="br0">&#91;</span> i <span class="br0">&#93;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">else</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; lower_partition_points.<span class="me1">push_back</span><span class="br0">&#40;</span> raw_points<span class="br0">&#91;</span> i <span class="br0">&#93;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
Note that by storing the points in an <code>std::pair&lt;int,int&gt;</code>, with x in the <code>first</code> member and y in the <code>second</code> member, we can use the standard library <code>sort()</code> routine to order the array, because pairs compare by <code>first</code> and then by <code>second</code>. </p>
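<p>A small sketch shows the ordering that <code>sort()</code> produces for pairs. The helper name <code>sorted_by_x_then_y()</code> is made up for this example.</p>

```cpp
#include <algorithm>
#include <utility>
#include <vector>

typedef std::pair<int, int> Point;   // x in first, y in second

// Return a copy of the points ordered by x, with ties on x broken by y.
// No comparison function is needed: operator< for std::pair already
// compares first, and only looks at second when the firsts are equal.
std::vector<Point> sorted_by_x_then_y(std::vector<Point> points)
{
    std::sort(points.begin(), points.end());
    return points;
}
```

<p>Given (59,37), (16,21), (59,8), and (97,90), the result is (16,21), (59,8), (59,37), (97,90). The two points that share x = 59 end up ordered by their y values.</p>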
<p>After sorting and then moving each point into one of the two partitions, we have the following members of <code>GrahamScan</code> defined and ready to use in building the hull:</p>
<ul>
<li/><code>left</code>, the leftmost point in  the set of points
<li/><code>right</code>, the rightmost point in the set of points
<li/><code>upper_partition_points</code>, the points above the line between <code>left</code> and <code>right</code>.
<li/><code>lower_partition_points</code>, the points below or on the line between <code>left</code> and <code>right</code>.
</ul>
<p>If you call the <code>log_partitioned_points()</code> method, you'll get output that looks something like this after partitioning:</p>
<table border="0" cellpadding="10">
<tr>
<td width="10%"></td>
<td width="80%">
<pre>
Partitioned set:
Left : (16,21)
Right : (97,90)
Lower partition: (27,10)(40,25)(49,14)(58,19)(59,8)(59,37)(84,68)
                 (94,42)
Upper partition: (18,90)(22,46)(27,80)(31,84)(38,45)(55,89)(59,95)
                 (62,91)(81,95)(85,90)</pre>
</td>
</tr>
</table>
<p>Calling <code>plot_partitioned_points()</code> creates a gnuplot command sequence that will display a plot like that shown in Figure 4.</p>
<h4>Creating the Hull</h4>
<p>The actual creation of the hull is done by method <code>build_hull</code>, which calls method <code>build_half_hull</code> twice, once with the points on the lower hull, and once with the points on the upper hull. The output of <code>build_half_hull</code> is sent on the first call to array <code>lower_hull</code> and next to array <code>upper_hull</code>:</p>
<div class="igBar"><span id="lcpp-29"><a href="#" onclick="javascript:showPlainTxt('cpp-29'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-29">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw4">void</span> build_hull<span class="br0">&#40;</span> std::<span class="me2">ofstream</span> &amp;f <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; build_half_hull<span class="br0">&#40;</span> f, lower_partition_points, lower_hull, <span class="nu0">1</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; build_half_hull<span class="br0">&#40;</span> f, upper_partition_points, upper_hull, -<span class="nu0">1</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
So the bulk of the work is actually done in <code>build_half_hull</code>, which looks like this:</p>
<div class="igBar"><span id="lcpp-30"><a href="#" onclick="javascript:showPlainTxt('cpp-30'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-30">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw4">void</span> build_half_hull<span class="br0">&#40;</span> std::<span class="me2">ostream</span> &amp;f, </div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; std::<span class="me2">vector</span>&lt;std::<span class="me2">pair</span>&lt;int,int&gt;&gt; input,</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; std::<span class="me2">vector</span>&lt;std::<span class="me2">pair</span>&lt;int,int&gt;&gt; &amp;output,</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw4">int</span> factor <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="co1">// The hull will always start with the left point, and end with the right</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">// point. Initialize input and output accordingly</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; input.<span class="me1">push_back</span><span class="br0">&#40;</span> right <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; output.<span class="me1">push_back</span><span class="br0">&#40;</span> left <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">while</span> <span class="br0">&#40;</span> input.<span class="me1">size</span><span class="br0">&#40;</span><span class="br0">&#41;</span> != <span class="nu0">0</span> <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">// Repeatedly add the leftmost point to the null, then test to see </span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">// if a convexity violation has occurred. If it has, fix things up</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">// by removing the next-to-last point in the output sequence until </span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">// convexity is restored.</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">//</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; output.<span class="me1">push_back</span><span class="br0">&#40;</span> input.<span class="me1">front</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; input.<span class="me1">erase</span><span class="br0">&#40;</span> input.<span class="me1">begin</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">while</span> <span class="br0">&#40;</span> output.<span class="me1">size</span><span class="br0">&#40;</span><span class="br0">&#41;</span>&gt;= <span class="nu0">3</span> <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw4">size_t</span> end = output.<span class="me1">size</span><span class="br0">&#40;</span><span class="br0">&#41;</span> - <span class="nu0">1</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> factor * direction<span class="br0">&#40;</span> output<span class="br0">&#91;</span> end - <span class="nu0">2</span> <span class="br0">&#93;</span>, </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;output<span class="br0">&#91;</span> end <span class="br0">&#93;</span>, </div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;output<span class="br0">&#91;</span> end - <span class="nu0">1</span> <span class="br0">&#93;</span> <span class="br0">&#41;</span> &lt;= <span class="nu0">0</span> <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; output.<span class="me1">erase</span><span class="br0">&#40;</span> output.<span class="me1">begin</span><span class="br0">&#40;</span><span class="br0">&#41;</span> + end - <span class="nu0">1</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span> <span class="kw1">else</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">break</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
The main loop in this routine simply pulls points out of the <code>input</code> array, adds them to the <code>output</code> array, and then performs the check to make sure that convexity has not been violated. If it has, points are removed until it is again correct. Figure 5 gives a nice animated view of the process.</p>
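<p>For readers who want to experiment with this step in isolation, here is a minimal, self-contained sketch of the same push-then-prune loop. The <code>Point</code> alias, the <code>cross()</code> helper, and <code>build_lower_hull()</code> are hypothetical names chosen for illustration. They are not the article's <code>direction()</code> routine or class members, and the sketch assumes the input is already sorted from left to right.</p>

```cpp
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<int, int> Point;  // (x, y); hypothetical stand-in type

// Cross product of (b - a) and (c - a): positive for a left (counter-clockwise)
// turn, negative for a right turn, zero when the three points are collinear.
long long cross(const Point& a, const Point& b, const Point& c)
{
    return static_cast<long long>(b.first - a.first) * (c.second - a.second) -
           static_cast<long long>(b.second - a.second) * (c.first - a.first);
}

// Builds the lower hull of points that are already sorted by x.
std::vector<Point> build_lower_hull(const std::vector<Point>& sorted)
{
    std::vector<Point> output;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        output.push_back(sorted[i]);
        // Drop the next-to-last point while the last three fail to turn left.
        while (output.size() >= 3) {
            std::size_t end = output.size() - 1;
            if (cross(output[end - 2], output[end - 1], output[end]) <= 0)
                output.erase(output.begin() + end - 1);
            else
                break;
        }
    }
    return output;
}
```

<p>For the sorted input (0,0), (1,2), (2,-1), (3,1), (4,0), this sketch keeps (0,0), (2,-1), and (4,0). The points (1,2) and (3,1) are pruned because each would create a right turn in the lower hull.</p>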
<p>Once this is done, calling the <code>log_hull</code> routine produces output that looks like this:</p>
<table border="0" cellpadding="10">
<tr width="10%">
<td>
<pre>Lower hull: (16,21)(27,10)(59,8)(94,42)(97,90)
Upper hull: (16,21)(18,90)(59,95)(81,95)(97,90)
Convex hull: (16,21) (27,10) (59,8) (94,42) (97,90) (81,95) (59,95)
             (18,90) (16,21)</pre>
</td>
</tr>
</table>
<p>The <code>plot_hull()</code> method can then be called to create a gnuplot command file that will display the convex hull, as shown in Figure 6.</p>
<p>There are quite a few variations on the 2-D convex hull building process, and this program ought to be amenable to trying out many of them. If you use the existing data structures and just change the algorithms, you can use the existing gnuplot routines to animate your work and get a good feel for how it is working. Enjoy!</p>
<h4>References and Links</h4>
<p>Source code: <a href="http://marknelson.us/attachments/2007/convex/graham.zip">graham.zip</a>, which contains VS 2003 and 2005 solutions, plus a g++ Makefile.</p>
<p>[1] R.L. Graham, <a href="http://www.math.ucsd.edu/~ronspubs/72_10_convex_hull.pdf" class="newpage">An efficient algorithm for determining the convex hull of a finite planar set</a>, Info. Proc. Lett. 1, 132-133 (1972).</p>
<p>[2] A. M. Andrew. Another efficient algorithm for convex hulls in two dimensions. Inform. Process. Lett., 9(5):216-219, 1979. (A note about this algorithm can be found <a href="http://www.mpi-inf.mpg.de/~stschirr/exact/cost_of_geometric_computing/algorithms.html" class="newpage">here</a>.)</p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2007/08/22/convex/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>C++ Hash Table Memoization: Simplifying Dynamic Programming</title>
		<link>http://marknelson.us/2007/08/01/memoization/</link>
		<comments>http://marknelson.us/2007/08/01/memoization/#comments</comments>
		<pubDate>Thu, 02 Aug 2007 03:43:52 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Web Articles]]></category>

		<guid isPermaLink="false">http://marknelson.us/2007/08/01/memoization/</guid>
		<description><![CDATA[This article discusses the use of C++ hash containers to improve storage of subproblem results when using dynamic programming (DP). <i>Memoization</i> is a key part of dynamic programming, which is conventionally done by storing subproblem results in simple tables or lists. Using hash tables instead of these simpler structures will allow you to use dynamic programming while retaining your algorithm's natural recursive structure, simplifying design and making your code easier to follow. I'll provide a fully-developed example of an algorithm, and show how it can be adapted to use hash table memoization.]]></description>
			<content:encoded><![CDATA[





















<table border="0" width="500">
<tr>
<td width="300"><img alt="DDJ Portal Logo" src="http://marknelson.us/attachments/misc/logo_ddj.gif" /></td>
<td><strong>Dr. Dobb's Portal</strong><br/>October, 2007<br />
         <a href="http://ddj.com/cpp/202405227" class="newpage">Article on DDJ site</a>
  </td>
</tr>
</table>
<p>This article discusses the use of C++ hash containers to improve storage of subproblem results when using dynamic programming (DP). <i>Memoization</i> is a key part of dynamic programming, which is conventionally done by storing subproblem results in simple tables or lists. Using hash tables instead of these simpler structures will allow you to use dynamic programming while retaining your algorithm's natural recursive structure, simplifying design and making your code easier to follow. I'll provide a fully-developed example of an algorithm, and show how it can be adapted to use hash table memoization.<br />
<span id="more-101"></span></p>
<h4>Background</h4>
<p><i>Dynamic Programming</i> (DP) is a useful technique for algorithm development that is saddled with an unfortunate name. When we refer to <i>greedy algorithms</i>, or the use of <i>divide and conquer</i> techniques, the name provides excellent semantic clues as to what is going on. With dynamic programming, no such luck, but I'm afraid we're stuck with the name for <a href="http://en.wikipedia.org/wiki/Richard_Bellman" class="newpage">historical reasons</a>.</p>
<p>Dynamic programming is frequently used to solve optimization problems - that is, problems where a number of choices can be made to lead to some maximum or minimum value. Optimization problems can often be solved efficiently using straightforward iterative or divide and conquer techniques, but dynamic programming becomes useful when those other techniques lead to <i>overlapping subproblems</i>.</p>
<p>The <a href="http://en.wikipedia.org/wiki/Dynamic_programming" class="newpage">Wikipedia article on dynamic programming</a> gives a good example of what is meant by overlapping subproblems. Imagine that you are a C programmer who decides to use a recursive function call to calculate the <i>n<sup>th</sup></i> Fibonacci number:</p>
<div class="igBar"><span id="lc-31"><a href="#" onclick="javascript:showPlainTxt('c-31'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C:</span>
<div id="c-31">
<div class="c">
<ol>
<li class="li1">
<div class="de1"><span class="kw4">int</span> fib<span class="br0">&#40;</span> <span class="kw4">int</span> n <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> n &lt; <span class="nu0">2</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> <span class="nu0">1</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">else</span> </div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> fib<span class="br0">&#40;</span> n - <span class="nu0">1</span> <span class="br0">&#41;</span> + fib<span class="br0">&#40;</span> n - <span class="nu0">2</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
This code is correct, but it leads to a lot of extra work. Figure 1 shows a map of the recursive calls made using this algorithm:</p>
<center>
<table border="0">
<tr>
<td><img src="http://marknelson.us/attachments/2007/memoization/fibcalls.gif"></td>
</tr>
<tr>
<td><center>Figure 1<br />The Call Map for fib(6)</center></td>
</tr>
</table>
</center>
<p>For example, <span class="inline_code">fib(4)</span> is called twice, which leads to a large duplicated tree of recursive calls. All in all, a huge amount of wasted effort.</p>
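<p>To put a number on that waste, the following sketch instruments the same recursive function with a call counter. The <code>counted_fib()</code> name and the global <code>call_count</code> variable are additions made for this illustration only.</p>

```cpp
// Global counter, incremented on every entry to counted_fib().
// (Hypothetical instrumentation, not part of the original routine.)
long call_count = 0;

int counted_fib(int n)
{
    ++call_count;
    if (n < 2)
        return 1;
    return counted_fib(n - 1) + counted_fib(n - 2);
}
```

<p>Calling <code>counted_fib(6)</code> returns 13 and leaves <code>call_count</code> at 25, so 25 calls are made to produce a value that depends on only seven distinct subproblems, fib(0) through fib(6).</p>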
<p>Dynamic programming provides a good way to avoid duplicate work done solving overlapping subproblems, but for the results to be useful, the problem has to adhere to the <i>optimal substructure</i> property.</p>
<p>Saying that a problem has this property is just a way of saying that we can find an optimal solution for problem A by breaking it down into problems B and C, both of which are optimal as well. Chapter 15 of <a href="http://www.amazon.com/gp/redirect.html?ie=UTF8&#038;location=http%3A%2F%2Fwww.amazon.com%2FIntroduction-Algorithms-Thomas-H-Cormen%2Fdp%2F0262032937&#038;tag=theinternetdatac&#038;linkCode=ur2&#038;camp=1789&#038;creative=9325" class="newpage">Introduction to Algorithms</a><img src="http://www.assoc-amazon.com/e/ir?t=theinternetdatac&amp;l=ur2&amp;o=1" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> gives nice examples of cases where this property holds, and where it doesn't.</p>
<h4>Defeating Overlapping Subproblems</h4>
<p>Dynamic programming aims to defeat this problem of repeatedly solving overlapping subproblems. Conventionally, this is done by converting the algorithm to one that uses a <i>bottom-up</i> approach, combined with some global storage for intermediate results.</p>
<p>As an example, we could solve the Fibonacci problem using a bottom-up approach with C++ code that looks something like this:</p>
<div class="igBar"><span id="lcpp-32"><a href="#" onclick="javascript:showPlainTxt('cpp-32'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-32">
<div class="cpp">
<ol>
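<li class="li1">
<div class="de1"><span class="co2">#include &lt;vector&gt;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw4">int</span> fib<span class="br0">&#40;</span><span class="kw4">int</span> n<span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> n &lt; <span class="nu0">2</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> <span class="nu0">1</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="me2">vector</span>&lt;int&gt; results<span class="br0">&#40;</span> n <span class="br0">&#41;</span>;</div>
</li>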
<li class="li2">
<div class="de2">&nbsp; &nbsp; results<span class="br0">&#91;</span> <span class="nu0">0</span> <span class="br0">&#93;</span> = results<span class="br0">&#91;</span> <span class="nu0">1</span> <span class="br0">&#93;</span> = <span class="nu0">1</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">int</span> i = <span class="nu0">2</span> ; i &lt; n ; i++ <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; results<span class="br0">&#91;</span> i <span class="br0">&#93;</span> = results<span class="br0">&#91;</span> i - <span class="nu0">1</span> <span class="br0">&#93;</span> + results<span class="br0">&#91;</span> i - <span class="nu0">2</span> <span class="br0">&#93;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">return</span> results<span class="br0">&#91;</span> n - <span class="nu0">1</span> <span class="br0">&#93;</span> + results<span class="br0">&#91;</span> n - <span class="nu0">2</span> <span class="br0">&#93;</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
In this case we only calculate the value of each <span class="inline_code">fib(n)</span> once, completely eliminating the repeated calculation of overlapping subproblems.</p>
<p>Most examples of DP that you find in textbooks or on the web will use a table or array to hold the intermediate results from problems. Examples of algorithms that use this type of memoization include <a href="http://www.avatar.se/molbioinfo2001/dynprog/dynamic.html" class="newpage">Sequence Alignment</a>, <a href="http://ww3.algorithmdesign.net/handouts/DynamicProgramming.pdf" class="newpage">Matrix Chain Multiplication</a>, and <a href="http://linneus20.ethz.ch:8080/4_7_1.html" class="newpage">Optimal Binary Search Trees</a>.</p>
<p>Using the tabular approach for storing subproblems often works well with a bottom-up implementation of the algorithm. DP algorithms typically start by solving the smallest subproblems, storing the results, combining some of those, storing the results in a new level of the table, and so on, until the top case is reached. Again, most of the tutorial DP examples you see will combine a bottom-up approach with tabular subproblem storage.</p>
<h4>DP Problems</h4>
<p>From my perspective, the traditional approach to DP suffers from a couple of problems.</p>
<p>First, it seems to me that people feel more comfortable attacking problems in a top-down fashion as opposed to bottom-up. This is certainly a matter of psychology, not computer science, but it is relevant to consider human factors when designing for maintenance, testability, and review. In many cases the expression of the algorithm seems more natural when given in the original top-down mode, and requires fewer mental gymnastics on the part of whoever is studying the algorithm.</p>
<p>The second problem is the traditional use of array structures to hold subproblems. The problem starts to creep into play when you look at algorithms like Matrix Chain Multiplication, in which half of the storage space is not even used. Things get even worse when you tackle a problem like the one described later in this article, in which subproblems just don't fit into a row/column organizational format.</p>
<p>Both of these problems can be addressed effectively by using standard C++ hash tables. The procedure is as follows:</p>
<ol>
<li>Implement your algorithm using the basic, inefficient recursive implementation.</li>
<li>Create a C++ hash table with global scope that will hold your subproblem results. The key for the hash table should be a concatenation of the input parameters to your function, and the value type should be your function's return type.</li>
<li>At the entry point to your function, check the hash table for an existing value for your subproblem, using the input parameters to form the key. If the value is present, return it immediately instead of executing the rest of your function.</li>
<li>At the exit from your function, store the value you are about to return in the hash table, using the input parameters as the key.</li>
</ol>
<p>With these minor changes, which can often be accomplished by changing as few as three lines of code, you can convert your inefficient recursive algorithm to use dynamic programming, without having to refactor to a bottom-up implementation, and without having to shoehorn your results into tabular format.</p>
<h4>A Simple Example</h4>
<p>Applying this technique to our Fibonacci example, we get a routine that looks like this:</p>
<div class="igBar"><span id="lcpp-33"><a href="#" onclick="javascript:showPlainTxt('cpp-33'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-33">
<div class="cpp">
<ol>
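<li class="li1">
<div class="de1"><span class="co2">#include &lt;map&gt;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">std::<span class="me2">map</span>&lt;int,int&gt; results;</div>
</li>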
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw4">int</span> fib<span class="br0">&#40;</span> <span class="kw4">int</span> n <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> n == <span class="nu0">0</span> || n == <span class="nu0">1</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> <span class="nu0">1</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="me2">map</span>&lt;int,int&gt;::<span class="me2">iterator</span> ii = results.<span class="me1">find</span><span class="br0">&#40;</span> n <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> ii != results.<span class="me1">end</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> ii-&gt;<span class="me1">second</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">else</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> results<span class="br0">&#91;</span> n <span class="br0">&#93;</span> = fib<span class="br0">&#40;</span> n - <span class="nu0">1</span> <span class="br0">&#41;</span> + fib<span class="br0">&#40;</span> n - <span class="nu0">2</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>So the first time we encounter <span class="inline_code">fib(4)</span>, we have to make the recursive calls to <span class="inline_code">fib(3)</span> and <span class="inline_code">fib(2)</span>. But the second time around, the result has been stored in the map, and it is returned without any additional calculations taking place.</p>
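<p>The same recipe carries over to functions of more than one parameter. The only change is that the key has to combine all of the inputs. The sketch below is an illustration and not code from this article: it memoizes a hypothetical <code>grid_paths()</code> function, which counts the right-or-down paths across a grid, using a <code>std::pair</code> as the combined key. It sticks with <code>std::map</code>, as in the listing above, so that it compiles without any non-standard headers.</p>

```cpp
#include <map>
#include <utility>

// Step 2: global table keyed on both input parameters.
std::map<std::pair<int, int>, long long> path_results;

// Number of paths from the top-left to the bottom-right corner of a grid
// that is `rows` cells high and `cols` cells wide, moving only right or down.
long long grid_paths(int rows, int cols)
{
    if (rows == 0 || cols == 0)
        return 1;  // only one straight-line path remains
    std::pair<int, int> key(rows, cols);
    // Step 3: return the stored result if this subproblem was already solved.
    std::map<std::pair<int, int>, long long>::iterator ii = path_results.find(key);
    if (ii != path_results.end())
        return ii->second;
    // Step 4: store the result on the way out.
    return path_results[key] = grid_paths(rows - 1, cols) + grid_paths(rows, cols - 1);
}
```

<p>With the table in place, <code>grid_paths(16, 16)</code> solves each (rows, cols) pair at most once, instead of making hundreds of millions of recursive calls.</p>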
<h4>Hash Table Refresher</h4>
<p>Hash tables were part of the <a href="http://www.sgi.com/tech/stl/" class="newpage">Standard Template Library</a> when the language standard was ratified in 1998. And while most of the Standard Template Library was incorporated into the standard library at that time, the <span class="inline_code">hash_set</span> and <span class="inline_code">hash_map</span> classes were excluded. According to Bjarne Stroustrup <a href="http://research.att.com/~bs/hopl-almost-final.pdf" class="newpage">[1]</a>:</p>
<blockquote><p>
they would have been in C++98 had we had the time to do a proper detailed design and specification job. There was no doubt that a hash_map was needed as an alternative to map for large tables where the key was a character string and we could design a good hash function. In 1995, Javier Barreirro, Robert Fraley and David Musser tried to get a proposal ready in time for the standard and their work became the basis for many of the later hash_maps [8]. The committee didn’t have the time, though.
</p></blockquote>
<p>While these containers will be added to the standard library under the awkward names <span class="inline_code">unordered_map</span> and <span class="inline_code">unordered_set</span> when <a href="http://en.wikipedia.org/wiki/Technical_Report_1" class="newpage">TR1</a> is adopted, there is no reason we have to wait that long. Despite the fact that they are missing from the standard, virtually every modern C++ compiler has adopted a reasonable variant of <span class="inline_code">hash_map</span> and <span class="inline_code">hash_set</span>.<br />
The <span class="inline_code">hash_map</span> container does a great job of memoization as described in this article, with a caveat. As I said earlier, you'll need the key to your hash table to be built from the input parameters. That's all fine if your key is a single value of a commonly used type, such as an <span class="inline_code">int</span>, a <span class="inline_code">std::string</span>, or a pointer. These key values are supported implicitly by most implementations of <span class="inline_code">hash_map</span>.</p>
<p>Things get a little more complicated if you are using multiple values or your own structures as a key into the map. Owing to the lack of a standard, implementing non-standard keys requires slightly different techniques, depending on your compiler.</p>
<p>I'll give you examples here that work with the two compilers you are most likely to encounter: Visual C++ .NET 2003/2005, or gcc 3.x.</p>
<p>Let's say you're doing a study tracking people's reading habits by age, and you want to implement the following code:</p>
<div class="igBar"><span id="lcpp-34"><a href="#" onclick="javascript:showPlainTxt('cpp-34'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-34">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1">class Sample <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="kw2">public</span> :</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; Sample<span class="br0">&#40;</span> <span class="kw4">int</span> age=-<span class="nu0">1</span>, <span class="kw4">const</span> std::<span class="me2">string</span> &amp;genre=<span class="st0">""</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; : mAge<span class="br0">&#40;</span> age <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; , mGenre<span class="br0">&#40;</span> genre <span class="br0">&#41;</span><span class="br0">&#123;</span><span class="br0">&#125;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">int</span> mAge;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="me2">string</span> mGenre;</div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw4">int</span> main<span class="br0">&#40;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;hash_map&lt;Sample,int&gt; AnnualBooksReadByAgeAndGenre;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">15</span>, <span class="st0">"Fantasy"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">15</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">18</span>, <span class="st0">"Sports"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">12</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">35</span>, <span class="st0">"Mystery"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">17</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">42</span>, <span class="st0">"Romance"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">125</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;<span class="kw1">return</span> <span class="nu0">0</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>You won't be able to compile this code as-is, for at least three reasons:</p>
<ol>
<li>You need to include the correct header file, which unfortunately differs depending on which compiler you are using.</li>
<li>You need to hoist the <span class="inline_code">hash_map</span> name into your namespace.</li>
<li>You need to define the hash function and at least one comparison function for the key class, <span class="inline_code">Sample</span>.</li>
</ol>
<p>Item 3 is only needed if you are using something other than a basic type for your key, and it's the trickiest. Defining the comparison is necessary so that the hashing library code can verify key equality during lookups or collisions.</p>
<p>In both cases, the function is usually pretty easy to write - you just compose it using existing functions. Examples for Microsoft and gcc compilers circa 2007 are given below.</p>
<h4>Sample Code for gcc</h4>
<p>gcc puts the hashing header files in the <code>ext</code> folder, and uses the <span class="inline_code">__gnu_cxx</span> namespace, isolating the non-standard library extensions into a ghetto of sorts. Both of these are dealt with in the first two lines of the sample code below.</p>
<p>The gcc implementation of <span class="inline_code">hash_map</span> needs two template parameters to handle keys that aren't built-in types: one class with an operator that returns a hash index for a given key, and a second with an operator that returns a boolean result for an equality test. I pack both of these into a single class, <span class="inline_code">SampleTraits</span>. The traits class is then passed to the <span class="inline_code">hash_map</span> type definition as the third and fourth template parameters.</p>
<p>When you are defining the hash function, you need to somehow combine hash values for each of the members of your structure. Creating hash keys is somewhat of an art, so I try to use the ones provided by the library writer as the basis for my hash function. In this case I take the hash values for the two members of <span class="inline_code">class Sample</span> and simply XOR them together, which should provide a decent value.</p>
<p>If you are using g++ 4.x, you may be better off using the TR1 <span class="inline_code">unordered_map</span> class. While it is not quite part of the standard yet, it should be soon, and that will ensure its future support and compatibility.</p>
<div class="igBar"><span id="lcpp-35"><a href="#" onclick="javascript:showPlainTxt('cpp-35'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-35">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="co2">#include &lt;ext/hash_map&gt;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="kw2">using</span> <span class="kw2">namespace</span> __gnu_cxx;</div>
</li>
<li class="li1">
<div class="de1"><span class="co2">#include &lt;string&gt;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">class Sample <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="kw2">public</span> :</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; Sample<span class="br0">&#40;</span> <span class="kw4">int</span> age=-<span class="nu0">1</span>, <span class="kw4">const</span> std::<span class="me2">string</span> &amp;genre=<span class="st0">""</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; : mAge<span class="br0">&#40;</span> age <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; , mGenre<span class="br0">&#40;</span> genre <span class="br0">&#41;</span><span class="br0">&#123;</span><span class="br0">&#125;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">int</span> mAge;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="me2">string</span> mGenre;</div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw4">struct</span> SampleTraits</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">size_t</span> operator<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#40;</span> <span class="kw4">const</span> Sample&amp; that <span class="br0">&#41;</span> <span class="kw4">const</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> hash&lt;int&gt;<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#40;</span> that.<span class="me1">mAge</span> <span class="br0">&#41;</span> ^ </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;hash&lt;const <span class="kw4">char</span> *&gt;<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#40;</span> that.<span class="me1">mGenre</span>.<span class="me1">c_str</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw4">bool</span> operator<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#40;</span> <span class="kw4">const</span> Sample &amp;that1, <span class="kw4">const</span> Sample&amp; that2 <span class="br0">&#41;</span> <span class="kw4">const</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> that1.<span class="me1">mAge</span> == that2.<span class="me1">mAge</span> &amp;&amp; that1.<span class="me1">mGenre</span> == that2.<span class="me1">mGenre</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw4">int</span> main<span class="br0">&#40;</span><span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;hash_map&lt;Sample,</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="kw4">int</span>,</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;SampleTraits,</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;SampleTraits&gt; AnnualBooksReadByAgeAndGenre;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">15</span>, <span class="st0">"Fantasy"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">15</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">18</span>, <span class="st0">"Sports"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">12</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">35</span>, <span class="st0">"Mystery"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">17</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">42</span>, <span class="st0">"Romance"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">125</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;<span class="kw1">return</span> <span class="nu0">0</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<h4>Visual C++ .NET 2003/2005</h4>
<p>Microsoft's compilers have similar issues, but deal with them differently. (That's the problem with non-standard features.)</p>
<p>Although the <span class="inline_code">hash_map</span> header file is accessed from the normal C++ header folder, the class itself, as was the case with g++, is defined in a different namespace, <span class="inline_code">stdext</span>. Again, that is dealt with in the first two lines of the sample code.</p>
<p>The remaining two issues are resolved by defining a <span class="inline_code">hash_value</span> function in namespace <span class="inline_code">stdext</span> that returns a hash key for a given object of <span class="inline_code">class Sample</span>, and a global less-than operator that is used to test for equality of two objects of <span class="inline_code">class Sample</span>. (It is not unusual to use the less-than operator to test for equality: if neither <span class="inline_code">a &lt; b</span> nor <span class="inline_code">b &lt; a</span> holds, the two objects are treated as equal.)</p>
<div class="igBar"><span id="lcpp-36"><a href="#" onclick="javascript:showPlainTxt('cpp-36'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-36">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="co2">#include &lt;hash_map&gt;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="kw2">using</span> <span class="kw2">namespace</span> stdext;</div>
</li>
<li class="li1">
<div class="de1"><span class="co2">#include &lt;string&gt;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">class Sample <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="kw2">public</span> :</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; Sample<span class="br0">&#40;</span> <span class="kw4">int</span> age=-<span class="nu0">1</span>, <span class="kw4">const</span> std::<span class="me2">string</span> &amp;genre=<span class="st0">""</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; : mAge<span class="br0">&#40;</span> age <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; , mGenre<span class="br0">&#40;</span> genre <span class="br0">&#41;</span><span class="br0">&#123;</span><span class="br0">&#125;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">int</span> mAge;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="me2">string</span> mGenre;</div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw2">namespace</span> stdext <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw4">size_t</span> hash_value<span class="br0">&#40;</span> <span class="kw4">const</span> Sample&amp; that <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp;<span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">return</span> hash_value<span class="br0">&#40;</span> that.<span class="me1">mAge</span> <span class="br0">&#41;</span> ^ hash_value<span class="br0">&#40;</span> that.<span class="me1">mGenre</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw4">bool</span> operator&lt;<span class="br0">&#40;</span> <span class="kw4">const</span> Sample&amp; that1, <span class="kw4">const</span> Sample&amp; that2<span class="br0">&#41;</span> </div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
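<div class="de1">&nbsp; &nbsp; <span class="kw1">return</span> that1.<span class="me1">mAge</span> &lt; that2.<span class="me1">mAge</span> || <span class="br0">&#40;</span> that1.<span class="me1">mAge</span> == that2.<span class="me1">mAge</span> &amp;&amp; that1.<span class="me1">mGenre</span> &lt; that2.<span class="me1">mGenre</span> <span class="br0">&#41;</span>;</div>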
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw4">int</span> main<span class="br0">&#40;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;hash_map&lt;Sample,int&gt; AnnualBooksReadByAgeAndGenre;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">15</span>, <span class="st0">"Fantasy"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">15</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">18</span>, <span class="st0">"Sports"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">12</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">35</span>, <span class="st0">"Mystery"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">17</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;AnnualBooksReadByAgeAndGenre<span class="br0">&#91;</span> Sample<span class="br0">&#40;</span> <span class="nu0">42</span>, <span class="st0">"Romance"</span> <span class="br0">&#41;</span> <span class="br0">&#93;</span> = <span class="nu0">125</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp;<span class="kw1">return</span> <span class="nu0">0</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
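<p>For the less-than trick to work, the comparison has to order keys consistently (a strict weak ordering); otherwise two different keys can be mistaken for the same one. Here is a minimal sketch of one way to write such a comparison, checking the age first and the genre only on ties. The <span class="inline_code">Sample</span> struct is a simplified stand-in, and the names <span class="inline_code">sample_less</span> and <span class="inline_code">sample_equivalent</span> are hypothetical helpers invented for this illustration:</p>

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the article's Sample class (illustration only).
struct Sample {
    int mAge;
    std::string mGenre;
    Sample( int age, const std::string &genre ) : mAge( age ), mGenre( genre ) {}
};

// Hypothetical helper: lexicographic ordering, age first, genre on ties.
bool sample_less( const Sample &a, const Sample &b )
{
    if ( a.mAge != b.mAge )
        return a.mAge < b.mAge;
    return a.mGenre < b.mGenre;
}

// Hypothetical helper: two keys are equivalent when neither orders
// before the other.
bool sample_equivalent( const Sample &a, const Sample &b )
{
    return !sample_less( a, b ) && !sample_less( b, a );
}
```

<p>With this ordering, two samples come out equivalent only when both the age and the genre match, which is the behavior a keyed container needs.</p>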
<h4>Some Notes on Usage</h4>
<p>The member functions of most implementations of <span class="inline_code">hash_map</span> will be identical to those of the standard <span class="inline_code">std::map</span> container. Storing an entry in a <span class="inline_code">hash_map</span> can be done using an overloaded operator that makes the container look like an associative array:</p>
<div class="igBar"><span id="lcpp-37"><a href="#" onclick="javascript:showPlainTxt('cpp-37'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-37">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1">hash_map&lt;std::<span class="me2">string</span>,int&gt; my_map;</div>
</li>
<li class="li2">
<div class="de2">.</div>
</li>
<li class="li1">
<div class="de1">.</div>
</li>
<li class="li2">
<div class="de2">.</div>
</li>
<li class="li1">
<div class="de1"><span class="me1">my_map</span><span class="br0">&#91;</span> <span class="st0">"foo"</span> <span class="br0">&#93;</span> = <span class="nu0">42</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p>
Looking up a value from the table is done using a call to <span class="inline_code">hash_map::find()</span>. This method takes a reference to a key as its argument, and returns an iterator which points to the end of the container on failure, or to a key/value pair on success. The pair is stored in an <span class="inline_code">std::pair&lt;key type,value type&gt;</span> object, which allows you to access the elements with the <span class="inline_code">first</span> and <span class="inline_code">second</span> members of that object:</p>
<div class="igBar"><span id="lcpp-38"><a href="#" onclick="javascript:showPlainTxt('cpp-38'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-38">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1">hash_map&lt;std::<span class="me2">string</span>,int&gt;::<span class="me2">iterator</span> ii = my_map.<span class="me1">find</span><span class="br0">&#40;</span> <span class="st0">"bar"</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw1">if</span> <span class="br0">&#40;</span> ii == my_map.<span class="me1">end</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;<span class="st0">"Couldn't find an entry for 'bar'<span class="es0">\n</span>"</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw1">else</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;<span class="st0">"Entry in map for key '"</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;&lt;ii-&gt;first</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;&lt;<span class="st0">" is "</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;&lt;ii-&gt;second</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;&lt;<span class="st0">"<span class="es0">\n</span>"</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
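<p>One practical difference between the two access styles is worth a small sketch: the subscript operator creates a default-valued entry when the key is missing, while <span class="inline_code">find()</span> leaves the container untouched. Because <span class="inline_code">hash_map</span> lives in different headers on different compilers, this sketch uses <span class="inline_code">std::map</span> as a stand-in, on the assumption that your <span class="inline_code">hash_map</span> mirrors its interface. The helper name <span class="inline_code">lookup_or</span> is hypothetical:</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: returns the stored value for key, or fallback when
// the key is absent. Uses find(), so it never inserts into the container.
int lookup_or( const std::map<std::string,int> &table,
               const std::string &key,
               int fallback )
{
    std::map<std::string,int>::const_iterator ii = table.find( key );
    if ( ii == table.end() )
        return fallback;
    return ii->second;
}
```

<p>A read through <span class="inline_code">my_map[ "bar" ]</span>, by contrast, would quietly add a <span class="inline_code">"bar"</span> entry with a value of zero, which is usually not what you want from a lookup.</p>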
<h4>A Weightier Example</h4>
<p>Demonstrating the value of a technique with a problem like Fibonacci numbers can be less than convincing - it's pretty easy to devise reasonable solutions to the problem in an ad hoc manner.</p>
<p>For a more substantial example, I've slightly revised a problem from the Cormen, Leiserson, Rivest, and Stein book:</p>
<table border="0">
<tr>
<td width="5%">&nbsp;</td>
<td>
<i>Before leaving Microsoft to work full-time on philanthropic ventures, Bill Gates has one problem left to solve: his employees are overweight. After a company health fair with a mandatory weigh-in, Bill had HR prepare a standard org chart which includes the number of extra pounds each employee is carrying around.</p>
<p/>
Bill decides the best approach to the problem is to create a regular exercise class for his overweight employees. But with a twist - he decides that he doesn't want any employee to be in a class with his or her direct superior, so as to avoid any hint of coercion.</p>
<p/>
Your task is to take that org chart, plus Bill's no-supervisor constraint, and invite as many employees as you can, maximizing the total excess weight in the class.<br />
</i>
</td>
<td width="5%">&nbsp;</td>
</tr>
</table>
<p>A sample version of the org chart is shown below in Figure 2. Keep in mind that this sample is fairly small; Bill's real chart would have almost 80,000 employees.<br />
<center></p>
<table border="0">
<tr>
<td><img src="http://marknelson.us/attachments/2007/memoization/org-chart-1.gif"></td>
</tr>
<tr>
<td><center>Figure 2<br />The Excess Weight Org Chart</center></td>
</tr>
</table>
<p></center><br />
The employee names are not too imaginative, but you can see that employee 'a' is carrying 1 extra pound, employee 'b' 2, and so on.</p>
<p>Because of the constraints on the problem, we know we can't invite all employees. We need to decide who to invite, but still obey the rule that no employees are there with their supervisors.</p>
<p>This problem is very amenable to a conventional recursive solution. To determine the maximum weight that can be achieved at a given node, we can use something like the pseudocode shown here:</p>
<div class="igBar"><span id="lc-39"><a href="#" onclick="javascript:showPlainTxt('c-39'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C:</span>
<div id="c-39">
<div class="c">
<ol>
<li class="li1">
<div class="de1">GET-MAX-WEIGHT<span class="br0">&#40;</span> node <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp;<span class="kw1">if</span> node.<span class="me1">number_of_children</span> is <span class="nu0">0</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">return</span> node.<span class="me1">weight</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp;weight1 = <span class="nu0">0</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">for</span> i = <span class="nu0">1</span> to node.<span class="me1">number_of_children</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; weight1 = weight1 + GET-MAX-WEIGHT<span class="br0">&#40;</span> node.<span class="me1">child</span><span class="br0">&#91;</span> i <span class="br0">&#93;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;weight2 = node.<span class="me1">weight</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp;<span class="kw1">for</span> i = <span class="nu0">1</span> to node.<span class="me1">number_of_children</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">for</span> j = <span class="nu0">1</span> to node.<span class="me1">child</span><span class="br0">&#91;</span> i <span class="br0">&#93;</span>.<span class="me1">number_of_children</span> </div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp;weight2 = weight2 + GET-MAX-WEIGHT<span class="br0">&#40;</span> node.<span class="me1">child</span><span class="br0">&#91;</span> i <span class="br0">&#93;</span>.<span class="me1">child</span><span class="br0">&#91;</span> j <span class="br0">&#93;</span> <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> max<span class="br0">&#40;</span> weight1, weight2 <span class="br0">&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
The logic is straightforward. At each node we calculate two possible values for the maximum weight. The first value, <span class="inline_code">weight1</span>, is the maximum weight if the node does not attend. Since the node is not attending, its children are free to attend, so the weight is the sum of the maximum weights found for each of that node's immediate children.</p>
<p>The second value, <span class="inline_code">weight2</span>, is the maximum weight when the node does attend. Since the node is attending, none of its immediate children can attend, which means the total weight will be the node's own weight, plus the sum of the maximum weights found for each of its children's children.</p>
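<p>To make the two cases concrete, here is a minimal C++ sketch of the same recursion, with no caching at all. The <span class="inline_code">TreeNode</span> struct and the <span class="inline_code">get_max_weight</span> function are hypothetical stand-ins invented for this illustration, not the classes used later in this article:</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tree node: an excess weight plus a list of direct reports.
struct TreeNode {
    int weight;
    std::vector<TreeNode> children;
    explicit TreeNode( int w ) : weight( w ) {}
};

// Naive recursion following the GET-MAX-WEIGHT pseudocode.
int get_max_weight( const TreeNode &node )
{
    // A leaf can only contribute its own weight.
    if ( node.children.empty() )
        return node.weight;

    // Case 1: node stays home, so each child subtree is solved on its own.
    int weight1 = 0;
    for ( std::size_t i = 0 ; i < node.children.size() ; i++ )
        weight1 += get_max_weight( node.children[ i ] );

    // Case 2: node attends, children are excluded, grandchildren are free.
    int weight2 = node.weight;
    for ( std::size_t i = 0 ; i < node.children.size() ; i++ ) {
        const TreeNode &child = node.children[ i ];
        for ( std::size_t j = 0 ; j < child.children.size() ; j++ )
            weight2 += get_max_weight( child.children[ j ] );
    }
    return std::max( weight1, weight2 );
}
```

<p>Note that a grandchild is reached both directly from case 2 and again through its parent in case 1, which is where the repeated work comes from.</p>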
<p>You can immediately see that this algorithm shares a characteristic with the Fibonacci algorithm. When evaluating a node, we make recursive calls at two different depths in the calling tree, and this results in overlapping subproblem evaluation. For example, when implementing this pseudocode on the organizational chart shown in Figure 2, the list of nodes passed in to GET-MAX-WEIGHT will be:<br />
<center></p>
<table border="1">
<tr>
<td>Call Path</td>
<td>Weight 1</td>
<td>Weight 2</td>
</tr>
<tr>
<td>root</td>
<td>a</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>a</td>
<td>b c</td>
<td>d e f g</td>
</tr>
<tr>
<td>a->b</td>
<td>d e</td>
<td>h i j k</td>
</tr>
<tr>
<td>a->b->d</td>
<td>h i</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>a->b->e</td>
<td>j k</td>
<td>q r</td>
</tr>
<tr>
<td>a->b->e->j</td>
<td>q r</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>a->c</td>
<td>f g</td>
<td>l m n p</td>
</tr>
<tr>
<td>a->c->f</td>
<td>l m</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>a->c->g</td>
<td>n p</td>
<td>&nbsp;</td>
</tr>
</table>
<p></center><br />
You'll note that there are 17 nodes, but because of duplicates, we call GET-MAX-WEIGHT 31 times, which is a lot of extra work. That extra work becomes seriously excessive when faced with the entire 80,000-person Microsoft organizational chart.</p>
<p>So naturally, Dynamic Programming comes to the rescue. It turns out that this particular problem fits into the category well. We have overlapping subproblems, we have optimal substructure, so we can employ DP.</p>
<h4>The Dynamic Programming Solution</h4>
<p>A true bottom-up approach is probably impractical for this problem. </p>
<p>First, it's a lot of work to traverse a tree one level at a time, starting at the lowest level. Second, there are problems with the tabular subproblem storage scheme usually favored in dynamic programming. It's not entirely clear where in a table we would store the result from <span class="inline_code">GET-MAX-WEIGHT( q )</span>. </p>
<p>An approximation of the bottom-up approach could be achieved by traversing the tree in postorder, visiting all the children of a node first, then visiting the node itself. In this scenario we would visit the nodes from Figure 2 in the following order:<br />
<center>h i d q r j k e b l m f n p g c a</center><br />
At every non-leaf node, we would have all the information we needed to calculate <span class="inline_code">GET-MAX-WEIGHT(&nbsp;node&nbsp;)</span> without actually recalculating any results - the subproblem results will have already been calculated and stored.</p>
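<p>A postorder walk is only a few lines of recursive code. The sketch below records the visiting order as a space-separated string; the <span class="inline_code">NamedNode</span> struct and the <span class="inline_code">postorder</span> function are hypothetical names used only for this illustration:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tree node carrying just a name and its children.
struct NamedNode {
    std::string name;
    std::vector<NamedNode> children;
    explicit NamedNode( const std::string &n ) : name( n ) {}
};

// Appends node names to 'out' in postorder: all children first,
// then the node itself.
void postorder( const NamedNode &node, std::string &out )
{
    for ( std::size_t i = 0 ; i < node.children.size() ; i++ )
        postorder( node.children[ i ], out );
    if ( !out.empty() )
        out += ' ';
    out += node.name;
}
```

<p>Run on the subtree rooted at a node with children <span class="inline_code">d</span> (which has children <span class="inline_code">h</span> and <span class="inline_code">i</span>) and <span class="inline_code">e</span>, the leaves come out before their parents, so every node's inputs are ready by the time the node itself is visited.</p>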
<p>Of course, traversing the tree in postorder is exactly what we're already doing in the recursive implementation. Seeing this, my solution to the problem was to use a C++ hash table to perform memoization of the subproblems. This has a few advantages:</p>
<ul>
<li/>My implementation looks just like the recursive definition of the algorithm. It just has a few additional lines needed to check for stored subproblems.
<li/>I don't have to worry about finding a tabular storage method - I store the results based on a hash of the node pointer, making lookup and storage nice and simple.
<li/>I don't have to worry about how to do a bottom-up traversal of the org chart. My recursive definition visits the nodes in the order I want without any changes.
</ul>
<p>I'll start by simply presenting my modified version of <span class="inline_code">GetMax( node )</span>, then present the infrastructure that goes with it. Finally, I'll discuss the last step in any dynamic programming problem: using the stored subproblem results to get a solution.</p>
<h4>The Memoized GetMax()</h4>
<p>The C++ code to solve this problem is completely defined in a class called Memo, which relies on a tree whose nodes are defined by <span class="inline_code">class Node</span>. The two class definitions are shown here:</p>
<div class="igBar"><span id="lcpp-40"><a href="#" onclick="javascript:showPlainTxt('cpp-40'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-40">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1">class Node <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="kw2">public</span> :</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; Node<span class="br0">&#40;</span> std::<span class="me2">istream</span> &amp;input <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">const</span> std::<span class="me2">string</span> &amp;Name<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="kw4">const</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw4">int</span> Weight<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="kw4">const</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">const</span> Node &amp;Child<span class="br0">&#40;</span> <span class="kw4">size_t</span> i <span class="br0">&#41;</span> <span class="kw4">const</span>; <span class="co1">//return ref to child node</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw4">size_t</span> Children<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="kw4">const</span>; <span class="co1">//return count of child nodes</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#125;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li2">
<div class="de2">class Memo <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">public</span> :</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">int</span> GetMax<span class="br0">&#40;</span> <span class="kw4">const</span> Node &amp;node <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw4">void</span> MarkIncluded<span class="br0">&#40;</span> <span class="kw4">const</span> Node &amp;node <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">void</span> PrintTree<span class="br0">&#40;</span> std::<span class="me2">ostream</span> &amp;s, <span class="kw4">const</span> Node &amp;node, <span class="kw4">bool</span> init=<span class="kw2">true</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">private</span> :</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">struct</span> Solution <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; Solution<span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw4">bool</span> mAttending;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw4">bool</span> mIncluded;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw4">int</span> mMaxWeight;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; std::<span class="me2">map</span>&lt;const Node*,Solution&gt; mSolutions;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p>
To run an instance of this problem, I first construct a tree by having the <span class="inline_code">Node</span> constructor read in a definition file. The resulting root node is then passed in to <span class="inline_code">Memo::GetMax()</span>, which works its way through all the subproblems and calculates the maximum total weight for this problem.</p>
<p>The code in <span class="inline_code">GetMax()</span> should appear nearly identical to that in the pseudocode version of <span class="inline_code">GET-MAX-WEIGHT()</span>, with one critical difference: upon entry to <span class="inline_code">GetMax()</span>, I check to see if a result for this node has already been saved, and if it has, I short-circuit execution and return that value immediately.</p>
<p>Likewise, at the end of the routine, I store the newly calculated maximum value in the hash table.</p>
<p>The key to this hash table is a pointer to a node - it could just as easily have been the name of the node, but using the pointer has the advantage of making the program a little more robust against input errors - duplicate node names won't break the algorithm.</p>
<p>The use of a pointer as a key into a <span class="inline_code">hash_map</span> is a technique that can be used to work around some of the issues I discussed earlier in the article. Because the hash containers offer built-in support for pointers of any type, use of a pointer as a key allows you to avoid having to define your own hash function and comparison function, which is nice.</p>
<div class="igBar"><span id="lcpp-41"><a href="#" onclick="javascript:showPlainTxt('cpp-41'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-41">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw4">int</span> Memo::<span class="me2">GetMax</span><span class="br0">&#40;</span> <span class="kw4">const</span> Node &amp;node <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="me2">map</span>&lt;const Node*,Solution&gt;::<span class="me2">const_iterator</span> ii;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; ii = mSolutions.<span class="me1">find</span><span class="br0">&#40;</span> &amp;node <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> ii != mSolutions.<span class="me1">end</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> ii-&gt;second.<span class="me1">mMaxWeight</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw4">int</span> weight_not_attending = <span class="nu0">0</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">size_t</span> i = <span class="nu0">0</span> ; i &lt;node.<span class="me1">Children</span><span class="br0">&#40;</span><span class="br0">&#41;</span> ; i++ <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; weight_not_attending += GetMax<span class="br0">&#40;</span> node.<span class="me1">Child</span><span class="br0">&#40;</span> i <span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw4">int</span> weight_attending = node.<span class="me1">Weight</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">size_t</span> i = <span class="nu0">0</span> ; i &lt;node.<span class="me1">Children</span><span class="br0">&#40;</span><span class="br0">&#41;</span> ; i++ <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">size_t</span> j = <span class="nu0">0</span> ; j &lt;node.<span class="me1">Child</span><span class="br0">&#40;</span> i <span class="br0">&#41;</span>.<span class="me1">Children</span><span class="br0">&#40;</span><span class="br0">&#41;</span> ; j++ <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; weight_attending += GetMax<span class="br0">&#40;</span> node.<span class="me1">Child</span><span class="br0">&#40;</span> i <span class="br0">&#41;</span>.<span class="me1">Child</span><span class="br0">&#40;</span> j <span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;<span class="st0">"GetMax("</span> &lt;&lt;node.<span class="me1">Name</span><span class="br0">&#40;</span><span class="br0">&#41;</span> &lt;&lt;<span class="st0">")<span class="es0">\n</span>"</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> weight_attending&gt;= weight_not_attending <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp;mSolutions<span class="br0">&#91;</span> &amp;node <span class="br0">&#93;</span>.<span class="me1">mAttending</span> = <span class="kw2">true</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">return</span> mSolutions<span class="br0">&#91;</span> &amp;node <span class="br0">&#93;</span>.<span class="me1">mMaxWeight</span> = weight_attending;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="br0">&#125;</span> <span class="kw1">else</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">return</span> mSolutions<span class="br0">&#91;</span> &amp;node <span class="br0">&#93;</span>.<span class="me1">mMaxWeight</span> = weight_not_attending;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
This routine is conceptually pretty simple. The first four lines check to see if I've cached a previously seen solution to the problem in the <span class="inline_code">mSolutions</span> table. If I have, the <span class="inline_code">mMaxWeight</span> member of that solution information is returned immediately, avoiding a duplicated subproblem.</p>
<p>If the solution is not already stored off, I have to do the real work. This means evaluating both possible solutions - the weight if this node attends, and thus excludes all its immediate children, and the weight if this node does not attend, and thus includes the children. Those two values are calculated and stored in local variables <span class="inline_code">weight_attending</span> and <span class="inline_code">weight_not_attending</span>.</p>
<p>Given those two values, it's a simple matter to decide which of the two is larger, and return that one. However, there are a couple of issues to deal with before returning.</p>
<p>First, I want to make sure that we store the solved subproblem in the solution table. Both <span class="inline_code">return</span> statements do that by writing the value to the <span class="inline_code">mMaxWeight</span> member of the correct entry in <span class="inline_code">mSolutions</span>.</p>
<p>That would be enough to get the maximum value, but I may also want to create a solution for the problem - which means not only knowing the maximum weight achieved, but a list of attendees. In order to get that list of attendees, I need to also keep track of whether the maximum value at this node was achieved by attending or not attending. I do that by updating a member called <span class="inline_code">mAttending</span> in the <span class="inline_code">Solution</span> object. (Note that the constructor sets <span class="inline_code">mAttending</span> to <span class="inline_code">false</span> upon creation.)</p>
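<p>If you are on a compiler that supports C++11 or later, the same caching pattern can be sketched with <span class="inline_code">std::unordered_map</span>, the standardized hash container, which also hashes pointer keys without any extra code. This is a stripped-down illustration rather than the class described above: <span class="inline_code">TreeNode</span>, <span class="inline_code">MemoSketch</span>, and the <span class="inline_code">mEvaluations</span> counter are hypothetical names, and only the maximum weight is cached:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical tree node: an excess weight plus a list of direct reports.
struct TreeNode {
    int weight;
    std::vector<TreeNode> children;
    explicit TreeNode( int w ) : weight( w ) {}
};

// Hypothetical memoizing solver keyed on the node's address. The tree must
// not be modified while a MemoSketch is in use, or the cached pointers
// will go stale.
class MemoSketch {
public :
    MemoSketch() : mEvaluations( 0 ) {}

    int GetMax( const TreeNode &node )
    {
        // Return the cached answer if this subproblem was already solved.
        std::unordered_map<const TreeNode*,int>::const_iterator ii;
        ii = mCache.find( &node );
        if ( ii != mCache.end() )
            return ii->second;
        mEvaluations++;  // counts subproblems that were actually computed

        int weight_not_attending = 0;
        for ( std::size_t i = 0 ; i < node.children.size() ; i++ )
            weight_not_attending += GetMax( node.children[ i ] );

        int weight_attending = node.weight;
        for ( std::size_t i = 0 ; i < node.children.size() ; i++ ) {
            const TreeNode &child = node.children[ i ];
            for ( std::size_t j = 0 ; j < child.children.size() ; j++ )
                weight_attending += GetMax( child.children[ j ] );
        }

        int best = weight_attending >= weight_not_attending
                       ? weight_attending
                       : weight_not_attending;
        mCache[ &node ] = best;  // store the solved subproblem
        return best;
    }

    std::size_t mEvaluations;

private :
    std::unordered_map<const TreeNode*,int> mCache;
};
```

<p>The counter makes the payoff easy to check: after solving a tree, <span class="inline_code">mEvaluations</span> equals the number of nodes, and asking for the root a second time does no new work.</p>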
<h4>Getting The Max Value</h4>
<p>If all I wanted to know was the maximum value that could be achieved, this single method in <span class="inline_code">class Memo</span> would be enough. Running a program that calls this routine for the tree shown in Figure 2 produces the following output:</p>
<pre>GetMax(h)
GetMax(i)
GetMax(d)
GetMax(q)
GetMax(r)
GetMax(j)
GetMax(k)
GetMax(e)
GetMax(b)
GetMax(l)
GetMax(m)
GetMax(f)
GetMax(n)
GetMax(p)
GetMax(g)
GetMax(c)
GetMax(a)
Max = 31</pre>
<p><span class="inline_code">GetMax()</span> prints out each subproblem it's trying to solve, and you can see that each node is evaluated only once. The listing also shows you that the nodes are evaluated in postorder, which is the order we would have needed to simulate with a pure bottom-up implementation.</p>
<h4>Getting the Solution</h4>
<p>As I said, the implementation as is returns the maximum value, but to be really useful we need to actually know which of the people on the org chart are going to be invited to attend Bill's exercise class. </p>
<p>To come up with this list, we need to traverse the tree, only looking at nodes that are included as part of the solution. Nodes that are not included are those nodes that are skipped over because their parent is attending.</p>
<p>Just knowing that a node is included as part of the solution is not quite enough. A node could be included, yet not be attending because the maximum value for that node is achieved by not attending. </p>
<p>The code that actually develops the solution is called <span class="inline_code">MarkIncluded()</span>. It is called with the root node as the target, then successively walks down through the tree, marking child nodes that are part of the solution. As each node is marked, I check whether it is attending, and if so, print its name:</p>
<div class="igBar"><span id="lcpp-42"><a href="#" onclick="javascript:showPlainTxt('cpp-42'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-42">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw4">void</span> Memo::<span class="me2">MarkIncluded</span><span class="br0">&#40;</span> <span class="kw4">const</span> Node &amp;node <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; mSolutions<span class="br0">&#91;</span> &amp;node <span class="br0">&#93;</span>.<span class="me1">mIncluded</span> = <span class="kw2">true</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> mSolutions<span class="br0">&#91;</span> &amp;node <span class="br0">&#93;</span>.<span class="me1">mAttending</span> <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;node.<span class="me1">Name</span><span class="br0">&#40;</span><span class="br0">&#41;</span> &lt;&lt;<span class="st0">" "</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">size_t</span> i = <span class="nu0">0</span> ; i &lt;node.<span class="me1">Children</span><span class="br0">&#40;</span><span class="br0">&#41;</span> ; i++ <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> mSolutions<span class="br0">&#91;</span> &amp;node <span class="br0">&#93;</span>.<span class="me1">mAttending</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">size_t</span> j = <span class="nu0">0</span> ; j &lt;node.<span class="me1">Child</span><span class="br0">&#40;</span> i <span class="br0">&#41;</span>.<span class="me1">Children</span><span class="br0">&#40;</span><span class="br0">&#41;</span> ; j++ <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; MarkIncluded<span class="br0">&#40;</span> node.<span class="me1">Child</span><span class="br0">&#40;</span> i <span class="br0">&#41;</span>.<span class="me1">Child</span><span class="br0">&#40;</span> j <span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">else</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; MarkIncluded<span class="br0">&#40;</span> node.<span class="me1">Child</span><span class="br0">&#40;</span> i <span class="br0">&#41;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
The logic is pretty simple. When this method is called for a node, that node is automatically marked as included in the solution. If the node is attending, its children are excluded, so I skip them and mark all of its children's children as included. If it is not attending, I mark all of its children as included.</p>
<p>Adding this code to the program lets me produce this output for the tree shown in Figure 2:</p>
<pre>Attendees: d e q r c l m n p</pre>
<p>A quick check of the weights of those nodes shows that their sum is 31, which agrees with the output shown above. </p>
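<p>For readers who want to experiment without the full source, here is a minimal, self-contained sketch of the two steps described above. It is not the code from the download: the <span class="inline_code">Node</span> and <span class="inline_code">Solution</span> structs, the <span class="inline_code">CollectAttendees()</span> method (which returns names instead of printing them), and the tiny sample tree are all assumptions made for illustration, and <span class="inline_code">std::unordered_map</span> stands in for the article's hash map.</p>

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the article's Node class (assumes C++17,
// because Node holds a std::vector of its own type).
struct Node {
    std::string name;
    int weight;
    std::vector<Node> children;
};

// Cached answer for one subproblem.
struct Solution {
    int maxWeight = 0;
    bool attending = false;
};

class Memo {
public:
    // Returns the best total weight for the subtree rooted at node,
    // solving each node only once.
    int GetMax(const Node &node) {
        auto cached = solutions_.find(&node);
        if (cached != solutions_.end())
            return cached->second.maxWeight;

        int weightAttending = node.weight;  // node attends: skip children
        int weightNotAttending = 0;         // node stays home: use children
        for (const Node &child : node.children) {
            weightNotAttending += GetMax(child);
            for (const Node &grandchild : child.children)
                weightAttending += GetMax(grandchild);
        }

        Solution solution;
        solution.attending = weightAttending > weightNotAttending;
        solution.maxWeight =
            solution.attending ? weightAttending : weightNotAttending;
        solutions_[&node] = solution;
        return solution.maxWeight;
    }

    // Walks only the nodes that are part of the solution and appends the
    // names of those that attend. GetMax() must have been called first.
    void CollectAttendees(const Node &node,
                          std::vector<std::string> &names) const {
        auto found = solutions_.find(&node);
        assert(found != solutions_.end());
        const bool attending = found->second.attending;
        if (attending)
            names.push_back(node.name);
        for (const Node &child : node.children) {
            if (attending) {
                for (const Node &grandchild : child.children)
                    CollectAttendees(grandchild, names);
            } else {
                CollectAttendees(child, names);
            }
        }
    }

private:
    std::unordered_map<const Node *, Solution> solutions_;
};

// Made-up six-node org chart: a(1) -> b(4), c(2); b -> d(3), e(3); c -> f(5).
Node MakeSampleTree() {
    return Node{"a", 1, {
        Node{"b", 4, {Node{"d", 3, {}}, Node{"e", 3, {}}}},
        Node{"c", 2, {Node{"f", 5, {}}}},
    }};
}
```

<p>On this sample tree, calling <span class="inline_code">GetMax()</span> on the root returns 12, and <span class="inline_code">CollectAttendees()</span> then yields a, d, e and f, whose weights (1 + 3 + 3 + 5) add up to the same 12. The map is keyed on node addresses, so the tree must not be modified between the two calls.</p>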
<h4>Graphical Output</h4>
<p>I like being able to see the results of my algorithm in a graphical format, so I added one additional method, <span class="inline_code">PrintTree()</span>, to <span class="inline_code">class Memo</span>. This method takes the final tree and prints it out in a format that can be processed into a nice graph using <em>dot</em>, one of the programs in the <a href="http://www.graphviz.org/" class="newpage">Graphviz</a> package, a free set of tools from AT&#038;T.</p>
<p>The final result of running the data set shown in Figure 2 through my test program is shown in Figure 3.<br />
<center></p>
<table border="0">
<tr>
<td><img src="http://marknelson.us/attachments/2007/memoization/figure-3.gif"></td>
</tr>
<tr>
<td><center>Figure 3<br />Program Output Using GraphViz</center></td>
</tr>
</table>
<p></center><br />
In Figure 3, nodes that are not attending are outlined with a dotted border. This means that either their parent node has chosen to attend, excluding them, or their optimal solution calls for them to not attend.<br />
The individual node weight and name are shown in the two cells in the top half of the node. The bottom half has two cells that show the calculated value of <span class="inline_code">GetMax()</span> for that node, and the decision made as to whether that node attends or not in order to reach that maximum value.</p>
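<p>To give a feel for what producing <em>dot</em> input involves, here is a rough sketch of a tree-to-DOT writer. It is not the <span class="inline_code">PrintTree()</span> from the download: the <span class="inline_code">ChartNode</span> struct and the <span class="inline_code">WriteNode()</span> and <span class="inline_code">ToDot()</span> functions are hypothetical, the label is a single line rather than the four-cell layout of Figure 3, and node names are assumed to be plain alphanumeric identifiers that need no quoting.</p>

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, already-solved tree node (assumes C++17 for the
// self-referential std::vector member).
struct ChartNode {
    std::string name;
    int weight;
    bool attending;
    std::vector<ChartNode> children;
};

// Emits one node statement, then an edge and a recursive call per child.
// Assumes names are plain alphanumeric DOT identifiers.
void WriteNode(std::ostream &out, const ChartNode &node) {
    out << "    " << node.name << " [label=\"" << node.name << " ("
        << node.weight << ")\"";
    if (!node.attending)
        out << ", style=dotted";  // dotted border for non-attendees
    out << "];\n";
    for (const ChartNode &child : node.children) {
        out << "    " << node.name << " -> " << child.name << ";\n";
        WriteNode(out, child);
    }
}

// Wraps the node and edge statements in a directed graph definition.
std::string ToDot(const ChartNode &root) {
    std::ostringstream out;
    out << "digraph orgchart {\n";
    out << "    node [shape=box];\n";
    WriteNode(out, root);
    out << "}\n";
    return out.str();
}
```

<p>Writing the returned text to a file and running something like <span class="inline_code">dot -Tpng tree01.dot -o tree01.png</span> should render it as an image.</p>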
<h4>Running the Test Program</h4>
<p>The test program I used to create this output is a simple program that reads in a tree definition file, outputs the program data to the console, and creates a GraphViz format output file. It takes two arguments, an input definition file (one sample is included) and the desired output file name:</p>
<pre>memo tree01.def tree01.dot</pre>
<p>Project and/or make files are included for Visual Studio 2003/2005 and gcc. </p>
<p>Note that the format of the tree definition file is documented in source code. </p>
<p>The <span class="inline_code">main()</span> routine for the test program (minus some error handling) looks like this:</p>
<div class="igBar"><span id="lcpp-43"><a href="#" onclick="javascript:showPlainTxt('cpp-43'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-43">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw4">int</span> main<span class="br0">&#40;</span> <span class="kw4">int</span> argc, <span class="kw4">char</span> *argv<span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="me2">ifstream</span> input<span class="br0">&#40;</span> argv<span class="br0">&#91;</span> <span class="nu0">1</span> <span class="br0">&#93;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; Node root<span class="br0">&#40;</span> input <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; Memo m;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; std::<span class="me2">ofstream</span> output<span class="br0">&#40;</span> argv<span class="br0">&#91;</span> <span class="nu0">2</span> <span class="br0">&#93;</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;<span class="st0">"Max = "</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;&lt;m.<span class="me1">GetMax</span><span class="br0">&#40;</span> root <span class="br0">&#41;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;&lt;<span class="st0">"<span class="es0">\n</span>"</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;<span class="st0">"Attendees: "</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; m.<span class="me1">MarkIncluded</span><span class="br0">&#40;</span> root <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;<span class="st0">"<span class="es0">\n</span>"</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; m.<span class="me1">PrintTree</span><span class="br0">&#40;</span> output, root <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">return</span> <span class="nu0">0</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>
This program should give you an idea of how to harness the power of Dynamic Programming in order to solve some fairly difficult problems with relatively simple algorithms. Dynamic Programming is the key to this, and memoization with C++ hash maps helps make it possible.</p>
<h4>References</h4>
<p>[1] <a href="http://research.att.com/~bs/hopl-almost-final.pdf" class="newpage">Evolving a language in and for the real world: C++ 1991-2006</a>, Bjarne Stroustrup, ACM HOPL-III. June 2007. http://research.att.com/~bs/hopl-almost-final.pdf<br />
[2] <a href="http://www.graphviz.org/" class="newpage">Graphviz</a>. The free project from AT&#038;T that supports graph visualization.<br />
[3] <a href="http://marknelson.us/attachments/2007/memoization/source.zip">Source Code for sample program</a><br />
[4] Cormen, Thomas, <i>et al.</i> <a href="http://www.amazon.com/Introduction-Algorithms-Thomas-H-Cormen/dp/0262032937" class="newpage">Introduction to Algorithms</a>. Cambridge: MIT Press, 2001.</p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2007/08/01/memoization/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Puzzling</title>
		<link>http://marknelson.us/2007/04/01/puzzling/</link>
		<comments>http://marknelson.us/2007/04/01/puzzling/#comments</comments>
		<pubDate>Mon, 02 Apr 2007 03:21:03 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Web Articles]]></category>

		<guid isPermaLink="false">/2007/04/01/puzzling/</guid>
		<description><![CDATA[
<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2007/04/01/puzzling/' addthis:title='Puzzling' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div>Dr. Dobb's Portal April, 2007 Article on DDJ site I'm an inveterate fan of wordplay of all sorts - puzzles, anagrams, crosswords. I've been known online by my anagrammatic name, SnorkelMan, all the way back to the ancient days of the text mode BBS. My continual hectoring of the staff at the Dallas Morning News over [...]]]></description>
			<content:encoded><![CDATA[
<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2007/04/01/puzzling/' addthis:title='Puzzling' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div><table border="0" width="500">
<tr>
<td width="300"><img alt="DDJ Portal Logo" src="http://marknelson.us/attachments/misc/logo_ddj.gif" /></td>
<td><strong>Dr. Dobb's Portal</strong><br/>April, 2007<br />
         <a href="http://www.ddj.com/architect/198701685" class="newpage">Article on DDJ site</a>
  </td>
</tr>
</table>
<p>I'm an inveterate fan of wordplay of all sorts - puzzles, anagrams, crosswords. I've been known online by my anagrammatic name, SnorkelMan, all the way back to the ancient days of the text mode BBS. My continual hectoring of the staff at the Dallas Morning News over errors in their print version of the New York Times crossword puzzle led them to finally just give me the job of proofreading it. I spend way too much time on the crosswords and other puzzles, both online and in print. In other words, I'm a sucker for a good word puzzle.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/2007/puzzling/WordplayArticle.jpg" alt="Wordplay Article from Dallas Morning News" /><br />
Figure 1 - Dallas Morning News Article<br />
</center></p>
<p>My sense of wordplay was naturally piqued this weekend when I heard the <a href="http://www.npr.org/templates/story/story.php?storyId=9264290">latest weekly puzzle challenge</a> from Will Shortz on NPR Weekend Edition. The challenge, from contributor David Edelheit, read as follows:</p>
<blockquote><p>
Take the names of two U.S. States, mix them all together, then rearrange the letters to form the names of two other U.S. States. What states are these?
</p></blockquote>
<p>As sometimes happens, when I heard this puzzle, and the answer didn't click immediately, my first thought was "I could write a program to solve this faster than I can figure it out myself."</p>
<p>That's a treacherous thought for a puzzler, because it immediately diverts that little thread in the back of your mind that is supposed to be solving the puzzle, instead putting it on the task of writing the program.</p>
<p>But it turned out to be an interesting problem in efficiency, and so I'm glad I went down that path. </p>
<h4>First Pass</h4>
<p>Most of my work these days is in C++, and while C++ doesn't have the world's best string manipulation facilities, I thought it had enough to do the job on this puzzle. Figuring that the problem was small enough to solve via brute force, I decided that the general course of the program would be to work my way through all 50*49/2 combinations of states, and test them against all 48*47/2 remaining combinations. That's just a little more than a million operations, which ought to be child's play. Thus, the basic program loop was going to look like this:</p>
<div class="igBar"><span id="lcpp-54"><a href="#" onclick="javascript:showPlainTxt('cpp-54'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-54">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">int</span> i = <span class="nu0">0</span> ; i &lt;<span class="nu0">49</span> ; i++ <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">int</span> j = i&nbsp; + <span class="nu0">1</span>; j &lt;<span class="nu0">50</span> ; j++ <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">int</span> m = <span class="nu0">0</span> ; m &lt;<span class="nu0">49</span> ; m++ <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> m != i &amp;&amp; m != j <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">int</span> n = m + <span class="nu0">1</span> ; n &lt;<span class="nu0">50</span> ; n++ <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> n != i &amp;&amp; n != j <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">//compare state i and j against state m and n </span></div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
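<p>As a sanity check on the "little more than a million" estimate, the loop nest above can be run with a counter in place of the comparison. The <span class="inline_code">CountComparisons()</span> function below is a hypothetical helper written for this check, not part of usa.cpp.</p>

```cpp
// Counts how many times the innermost comparison would run for `count`
// states, using the same four-deep loop structure as the listing above.
long CountComparisons(int count) {
    long comparisons = 0;
    for (int i = 0; i < count - 1; i++)
        for (int j = i + 1; j < count; j++)
            for (int m = 0; m < count - 1; m++)
                if (m != i && m != j)
                    for (int n = m + 1; n < count; n++)
                        if (n != i && n != j)
                            ++comparisons;
    return comparisons;
}
```

<p>For 50 states this comes to (50*49/2) * (48*47/2) = 1225 * 1128 = 1,381,800 comparisons. Note that every match is visited twice, once from each side.</p>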
<p>Still thinking brute force, I was looking for the simplest way to store the data so that it would be easy to compare, and turned to <code>std::multiset</code>. I knew that if I stored all the characters from states <code>i</code> and <code>j</code> in one <code>std::multiset&lt;char&gt;</code> object, and all the characters from states <code>m</code> and <code>n</code> in another, I could quickly compare one against the other with a simple equality operator.</p>
<p>So in the above loop, I inserted these lines after the first two <code>for</code> statements:</p>
<div class="igBar"><span id="lcpp-55"><a href="#" onclick="javascript:showPlainTxt('cpp-55'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-55">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1">std::<span class="me2">multiset</span>&lt;char&gt; label1;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw4">char</span> *p = states<span class="br0">&#91;</span> i <span class="br0">&#93;</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">while</span> <span class="br0">&#40;</span>*p<span class="br0">&#41;</span> </div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; label1.<span class="me1">insert</span><span class="br0">&#40;</span> *p++ <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">p = states<span class="br0">&#91;</span> j <span class="br0">&#93;</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw1">while</span> <span class="br0">&#40;</span>*p<span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; label1.<span class="me1">insert</span><span class="br0">&#40;</span> *p++ <span class="br0">&#41;</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p>
(Note that I had snagged the names of the states from one of the first sites in a Google search, and inserted them into an array of character pointers called <code>states</code>.) </p>
<p>I inserted a similar definition for <code>label2</code> inside the second set of two <code>for</code> statements, which means all I had left to do was a simple comparison of <code>label1</code> against <code>label2</code>:</p>
<div class="igBar"><span id="lcpp-56"><a href="#" onclick="javascript:showPlainTxt('cpp-56'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-56">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">if</span> <span class="br0">&#40;</span> label1 == label2 <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;states<span class="br0">&#91;</span> i <span class="br0">&#93;</span> &lt;&lt;<span class="st0">", "</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;&lt;states<span class="br0">&#91;</span> j <span class="br0">&#93;</span> &lt;&lt;<span class="st0">", "</span> </div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;&lt;states<span class="br0">&#91;</span> m <span class="br0">&#93;</span> &lt;&lt;<span class="st0">", "</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;&lt;states<span class="br0">&#91;</span> n <span class="br0">&#93;</span> &lt;&lt;<span class="st0">"<span class="es0">\n</span>"</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
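<p>Pulled together into one compilable piece, the multiset comparison might look like the sketch below. The helper names <span class="inline_code">LettersOf()</span> and <span class="inline_code">SameLetters()</span> are made up for illustration, and the sample words in the note that follows are deliberately not state names.</p>

```cpp
#include <set>
#include <string>

// Collects every character of both names into a single multiset.
// The multiset keeps its elements sorted and retains duplicates.
std::multiset<char> LettersOf(const std::string &first,
                              const std::string &second) {
    std::multiset<char> letters(first.begin(), first.end());
    letters.insert(second.begin(), second.end());
    return letters;
}

// True when the two pairs are built from exactly the same characters.
// The comparison is case-sensitive and counts spaces as characters.
bool SameLetters(const std::string &a, const std::string &b,
                 const std::string &c, const std::string &d) {
    return LettersOf(a, b) == LettersOf(c, d);
}
```

<p>With this in place, <span class="inline_code">SameLetters("listen", "art", "tar", "silent")</span> is true, while swapping any one letter on either side makes it false.</p>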
<p>I did a quick compile, tested the code, and sure enough, the contents of the multisets were indeed sorted concatenations of the letters of each pair of states. Time to run!</p>
<p>My first disappointment was seeing that, while the program was indeed running properly, it was going slow enough that it looked like it was going to take a sizable fraction of an hour to make it through the entire alphabet. I could just wait, but in this case I decided I could optimize faster than it would take to wait for the first results.</p>
<h4>Second Pass</h4>
<p>It's pretty obvious that calculating the concatenation of state <code>m</code> and <code>n</code> in the innermost loop is full of wasted cycles, since it is repeatedly calculating the same state values. I knew it was inefficient, but I didn't think it was going to matter too much.</p>
<p>Since it turned out that it did matter, I decided to precalculate all the values before entering the four-deep nested comparison loop, with code like this:</p>
<div class="igBar"><span id="lcpp-57"><a href="#" onclick="javascript:showPlainTxt('cpp-57'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-57">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1">std::<span class="me2">multiset</span>&lt;char&gt; letters<span class="br0">&#91;</span> <span class="nu0">50</span> <span class="br0">&#93;</span><span class="br0">&#91;</span> <span class="nu0">50</span> <span class="br0">&#93;</span>;</div>
</li>
<li class="li2">
<div class="de2"><span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">int</span> i = <span class="nu0">0</span> ; i &lt;<span class="nu0">49</span> ; i++ <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span> <span class="kw4">int</span> j = i + <span class="nu0">1</span> ; j &lt;<span class="nu0">50</span> ; j++ <span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw4">char</span> *p = states<span class="br0">&#91;</span> i <span class="br0">&#93;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">while</span> <span class="br0">&#40;</span>*p<span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; letters<span class="br0">&#91;</span> i <span class="br0">&#93;</span><span class="br0">&#91;</span> j <span class="br0">&#93;</span>.<span class="me1">insert</span><span class="br0">&#40;</span> *p++ <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; p = states<span class="br0">&#91;</span> j <span class="br0">&#93;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">while</span> <span class="br0">&#40;</span> *p <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;letters<span class="br0">&#91;</span> i <span class="br0">&#93;</span><span class="br0">&#91;</span> j <span class="br0">&#93;</span>.<span class="me1">insert</span><span class="br0">&#40;</span> *p++ <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Then I didn't have to do any computation in my main loop, I just had to modify the comparison line in the innermost loop:</p>
<div class="igBar"><span id="lcpp-58"><a href="#" onclick="javascript:showPlainTxt('cpp-58'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-58">
<div class="cpp">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">if</span> <span class="br0">&#40;</span> letters<span class="br0">&#91;</span>i<span class="br0">&#93;</span><span class="br0">&#91;</span>j<span class="br0">&#93;</span> == letters<span class="br0">&#91;</span>m<span class="br0">&#93;</span><span class="br0">&#91;</span>n<span class="br0">&#93;</span> <span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; std::<span class="kw3">cout</span> &lt;&lt;states<span class="br0">&#91;</span> i <span class="br0">&#93;</span> &lt;&lt;<span class="st0">", "</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; ... </div>
</li>
</ol>
</div>
</div>
</div>
<p>
This modified program did indeed speed things up considerably, bringing the run time down from a fraction of an hour to just a few seconds, even with a bit of progress tracing turned on.</p>
<h4>Third and Final Pass</h4>
<p>Although I could have stopped here, I thought it might be interesting to see how expensive the use of an associative container like <code>multiset</code> was compared to something simpler. I replaced the setup code so that it stored the data in a <code>std::vector&lt;char&gt;</code> instead of a <code>multiset</code>, on the theory that the comparison operator would run much faster on a <code>vector</code>. I had to add a call to sort the data after inserting it in the <code>vector</code>, which would be executed 50 times, but nowhere near as many times as the comparison operator.</p>
<p>The results were more or less as I expected. When run under Windows with default Release optimization, the <code>vector</code> version of the program ran about twice as fast as the <code>multiset</code> version. When compiled with g++ 3.3.5 with <code>-O2</code>, I saw roughly the same ratio of execution speeds. </p>
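<p>The <code>vector</code> variant is easy to sketch. The code below is an illustration rather than the contents of usa.cpp: <code>SortedLetters()</code>, <code>PairKey</code>, and <code>Precompute()</code> are hypothetical names, and here the keys are stored in a flat list of pairs rather than a 50 by 50 array.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Concatenates the two names and sorts the characters, so that any two
// pairs made of the same letters produce identical vectors.
std::vector<char> SortedLetters(const std::string &first,
                                const std::string &second) {
    std::vector<char> letters(first.begin(), first.end());
    letters.insert(letters.end(), second.begin(), second.end());
    std::sort(letters.begin(), letters.end());
    return letters;
}

// One precomputed entry: the indices of a pair and its sorted letters.
struct PairKey {
    std::size_t i;
    std::size_t j;
    std::vector<char> letters;
};

// Builds the key for every pair (i, j) with i < j once, up front, so the
// comparison loop only has to test vectors for equality.
std::vector<PairKey> Precompute(const std::vector<std::string> &names) {
    std::vector<PairKey> keys;
    for (std::size_t i = 0; i + 1 < names.size(); i++)
        for (std::size_t j = i + 1; j < names.size(); j++)
            keys.push_back({i, j, SortedLetters(names[i], names[j])});
    return keys;
}
```

<p>A sorted <code>vector</code> stores its characters contiguously, so comparing two of them is a simple linear scan, whereas comparing two <code>multiset</code> objects has to step through tree nodes. That difference is one plausible explanation for the speedup reported above.</p>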
<h4>No Spoilers</h4>
<p>I'm not going to spoil the puzzle for you by giving away the answer. Let's just say that it is a good word puzzle, and if you manage to arrive at the answer you'll see why. If you don't manage to solve it in your head, you can download the source, compile it, and get there by brute force, just like I did.</p>
<p><a href="http://marknelson.us/attachments/2007/puzzling/usa.cpp">usa.cpp</a></p>
<p>But I don't think it will count as a spoiler if I tell you that you don't need a computer program to solve this problem. </p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2007/04/01/puzzling/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>Windows MCE Replacement On a Budget</title>
		<link>http://marknelson.us/2006/08/14/windows-mce-replacement-on-a-budget/</link>
		<comments>http://marknelson.us/2006/08/14/windows-mce-replacement-on-a-budget/#comments</comments>
		<pubDate>Mon, 14 Aug 2006 14:06:23 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Audio]]></category>
		<category><![CDATA[Video]]></category>
		<category><![CDATA[Web Articles]]></category>

		<guid isPermaLink="false">/2006/08/14/windows-mce-replacement-on-a-budget/</guid>
		<description><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2006/08/14/windows-mce-replacement-on-a-budget/' addthis:title='Windows MCE Replacement On a Budget' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div>Dr. Dobb's Portal August, 2006 Article on DDJ site Over the years I've spent quite a bit of time and money experimenting with the best ways to implement networked music and video in my home. Back in 2004 I showed you how to build a home video jukebox, albeit with a few compromises. Today you [...]]]></description>
			<content:encoded><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2006/08/14/windows-mce-replacement-on-a-budget/' addthis:title='Windows MCE Replacement On a Budget' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div><table border="0" width="500">
<tr>
<td width="300"><img alt="DDJ Portal Logo" src="http://marknelson.us/attachments/misc/logo_ddj.gif" /></td>
<td><strong>Dr. Dobb's Portal</strong> August, 2006<br />
         <a href="http://www.ddj.com/dept/linux/191801355" class="newpage">Article on DDJ site</a>
  </td>
</tr>
</table>
<p>Over the years I've spent quite a bit of time and money experimenting with the best ways to implement networked music and video in my home. Back in 2004 I showed you how to <a href="http://www.byte.com/documents/s=9032/byt1076349464330/0209_nelson.html" title="build a home video jukebox" class="newpage">build a home video jukebox</a>, albeit with a few compromises. Today you might think the best way to get this done is with a <a href="http://www.microsoft.com/windowsxp/mediacenter/default.mspx" title="Windows Media Center Edition" class="newpage">Windows Media Center Edition</a> PC (or perhaps its recent competitor, <a href="http://www.apple.com/imac/frontrow.html" title="Front Row" class="newpage">Front Row</a>). I'd like to offer up an alternative solution with a feature set that competes with, and in some cases beats, the PC-based solution for a lot less money: the <a href="http://www.galaxymetalgear.com/Products/3500tvisto.html" title="Galaxy TVisto" class="newpage">GalaxyMetalGear TVisto</a> media player.</p>
<h4>The TVisto Concept</h4>
<p><img src="http://marknelson.us/attachments/tvisto-3500-review/tvisto1.gif" class="alignleft" />The TVisto media player is basically nothing but a souped-up hard drive enclosure. GMG was already in the enclosure business when it must have occurred to them that adding a lightweight Linux distribution plus a little more CPU power, a media processor, and a remote-control driven UI would give them a box that could:</p>
<ul>
<li>Play video files encoded in MPEG-2 or DivX formats</li>
<li>Display picture files</li>
<li>Act as a music jukebox</li>
<li>Fill in as a removable hard drive</li>
</ul>
<p>In other words, do everything that a Windows MCE PC can do with the exception of recording live TV.</p>
<p>GMG sells this box without a hard drive, and with a little shopping you can find it for under $130 - I paid $125 for mine. Add the 3.5" hard drive of your choice and you are in business. My choice was a 300G Maxtor drive purchased at Fry's for $80 - making my total investment just a little over a couple hundred dollars. Because this is a simple hard drive enclosure, you also don't have to worry about a lot of the negatives associated with a Media Center PC: there is no fan noise, the footprint is small, and boot time is instant - all big pluses in my book. Once I hooked it up and put it through its paces, I was sold on the value of this as a Media Center Edition replacement.</p>
<h4>Setup</h4>
<p>The hardware setup for the TVisto is identical to that for any hard drive enclosure: take out a few screws, open the case, connect the power and data cables to the drive, close the box up and you are done. I found the enclosure to be a bit cramped, and I was worried about damaging cables as I forced the drive in, but all went smoothly and that part of the setup took just 10 minutes. If you are comfortable with cracking the case on your PC to install drives or memory this will be a cake walk.</p>
<p>Once you have the drive installed, you connect it to your PC via a USB cable, power it up, and it should quickly appear as a removable drive. You can then format it as an NTFS, Mac OS Extended, or FAT32 drive (yes, this means you can use it with virtually any O/S.)</p>
<p>The next part of the setup gives you a hint about the simplistic nature of the software driving the TVisto. The instructions make it very clear that you have to create the folders shown in the figure below with exactly the correct names: Firmware, Movies, Music, and Pictures. As you will see when using the TVisto, there isn't a complicated database for music and video like you have with iTunes or Windows Media Player - the TVisto simply browses through folders, and it expects your media to be stored in the folder with the appropriate name.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/ScreenShot001.gif"/></p>
<p>Figure 1<br/>The mandatory folder layout<br />
</center></p>
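<p>If you'd rather script this step than click through a file manager, a minimal sketch might look like the following. The class name <em>TvistoLayout</em> and the idea of passing the drive root on the command line are my own assumptions for illustration - nothing here ships with the TVisto. Only the four folder names come from the instructions.</p>

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: creates the four top-level folders the TVisto expects.
class TvistoLayout {
    // Folder names exactly as the TVisto instructions spell them.
    static final String[] REQUIRED = { "Firmware", "Movies", "Music", "Pictures" };

    // Creates any missing folders under driveRoot and returns all four paths.
    static List<Path> createLayout(Path driveRoot) throws IOException {
        List<Path> folders = new ArrayList<>();
        for (String name : REQUIRED) {
            // createDirectories does nothing if the folder already exists
            folders.add(Files.createDirectories(driveRoot.resolve(name)));
        }
        return folders;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the mounted drive's root is the first argument;
        // with no argument the current directory is used instead.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : createLayout(root)) {
            System.out.println("Ready: " + p);
        }
    }
}
```

<p>Running it a second time is harmless, since existing folders are left alone.</p>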
<p>Once you have the folders created, copying media from your PC is simply a drag and drop project, copying media files of the appropriate type to the correct folder. As you'll see later, large collections need to be organized at this point using a system of nested folders, and that is entirely up to you to do.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/ScreenShot002.gif"/></p>
<p>Figure 2<br/>Copying media files to the TVisto<br />
</center></p>
<h4> The TV Connection </h4>
<p>After following safe ejection procedures from your PC, you can power down the TVisto and connect it to your TV. The unit ships with all the cables you need for NTSC viewing in stereo, and again, you should have no trouble connecting these standard inputs to your home video system.</p>
<p>One of the really great things about the TVisto was finding out that this little box also supports high quality audio and video. The standard unit ships with support for everything from Composite NTSC video up through 1080i (although you need to purchase an optional cable for component HDTV connections). It also supports 5.1 audio via an SPDIF connector. Combining these two features means you can watch your stored DVDs without compromising on audio or video quality.</p>
<p>I purchased the add-on component cable so I could get the benefit of higher resolution DVD viewing - an annoying $30 expense, but if you have an HDTV it is well worth it.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/component-cable.jpg"/></p>
<p>Figure 3<br/>The $30 component cable<br />
</center></p>
<p>Once you connect the TVisto to your TV, you may find that the default video settings aren't correct - this can be remedied by simply cycling through the various possibilities by repeatedly pressing the <em>TV Out</em> button on the remote. Eventually you'll see the menu shown in Figure 4. (Please note that these are low-res 640x480 screen captures - your HDTV viewing will be much crisper than what you see here.)</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/MainMenu.gif"/></p>
<p>Figure 4<br/>The intro menu for the TVisto 3500<br />
</center></p>
<p>A nice, simple interface. The place where you will spend most of your time is in <em>Movies</em>, <em>Music</em>, and <em>Pictures</em>. The only thing I've done in the <em>Settings</em> menu is pin down my video output settings, but you also have a few other options, such as <em>Languages</em>, <em>ScreenSaver</em>, and so on. </p>
<p>The three media buttons, <em>Movies</em>, <em>Music</em>, and <em>Pictures</em>, all have what amounts to nothing more than a folder browser interface. You plow through the files that you have loaded in the appropriate folder and select what you want to see, hear, or look at. Once you are in playback mode there are naturally quite a few more options, but the selection process is dead simple.</p>
<h4>Movies</h4>
<p>Even if you never use the TVisto 3500 for anything but a video jukebox, I think you'll find that it is worth the money. It supports multiple video formats, but I chose to go with the simplest and easiest format. I used the banned program <a href="http://fileforum.betanews.com/detail/DVD_Decrypter/1011845169/1" title="DVD Decrypter" class="newpage">DVD Decrypter</a>, and told it to rip my DVD to an ISO image. (By the way, if you want a good answer to the question "Is this legal" you might want to see what the Electronic Frontier Foundation has to <a href="http://www.eff.org/IP/fairuse/" class="newpage">say about it.</a>) I then simply copied those ISO images to my TVisto in the Movies folder, and was then able to browse them as shown in Figure 5:</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/movies-menu.gif" title="Figure 5"/></p>
<p>Figure 5<br/>Selecting a Movie<br />
</center></p>
<p>As you can see, selecting a movie is done simply by browsing file names, so you will want to be careful about naming your ISO files before installing them on the hard drive. There is no horizontal scrolling, either, so if your names are too long you will be in real trouble - the end of a title will simply be unreadable.</p>
<p>You can also see at the bottom of the list of files that I have put these six DVDs into a separate folder - I'm using a hierarchical storage system to make the navigation process a little easier to follow.</p>
<p>While all this may be a little crude, once you hit the play button on the remote to start a movie, you will begin enjoying exactly the same playback experience you get from your DVD player - the full set of menus and features you expect:</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/alias-menu.gif" title="Alias Menu"/></p>
<p>Figure 6<br/>The Main Menu for Alias, Season 3, Disc 2<br />
</center></p>
<p>Figure 7 shows you a playback in progress, looking just as you would expect at the given resolution, with the English subtitles being overlaid on the screen. (Looks like Sydney is in the middle of a big operation!)</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/alias-playback.gif" title="Alias Playback"/></p>
<p>Figure 7<br/> Full Feature playback<br/><br />
</center></p>
<p>I couldn't be happier with this DVD emulation mode of the TVisto 3500. It does exactly what I want, the way I'm used to doing it. The ripping process is one-button-click simple, and I don't have to give up anything in terms of features or performance.</p>
<h4>Music</h4>
<p>I have to confess that I actually purchased the TVisto first as a music player. I like to have access to my entire music collection in my media room, and despite the efforts of all the manufacturers out there, I have yet to have a satisfactory experience with a networked player. For example, the <a href="http://www.linksys.com/servlet/Satellite?c=L_Product_C2&amp;childpagename=US%2FLayout&amp;cid=1115416830950&amp;pagename=Linksys%2FCommon%2FVisitorWrapper" title="Wireless-B Media Adapter" class="newpage">Linksys Wireless-B Media Adapter</a> seems like a solid product from the spec sheet, but it suffers from such a litany of shortcomings that I found it completely unusable. (Disclosure: I work for Linksys - or at least I did before this article was published.) My Windows Media Center Edition experiences were even more frustrating, again using a Linksys product, the <a href="http://www.linksys.com/servlet/Satellite?c=L_Product_C2&amp;childpagename=US%2FLayout&amp;cid=1115416829578&amp;pagename=Linksys%2FCommon%2FVisitorWrapper" title="Dual-Band Wireless A/G Media Center Extender" class="newpage">Dual-Band Wireless A/G Media Center Extender</a> (which notably seems to have stopped shipping.) In this case I blame Microsoft, not Linksys, for most of the woes of the unfriendly music experience.</p>
<p>So instead of hassling with a remote box that has to talk to a server, (always including some required server software), the TVisto offers an easier to manage remote experience with less hassle on my part. All I have to do is copy files to my Music Folder, and I'm in business.</p>
<p>Figure 8 shows the music navigation system in place, and just as you might have expected, it is simply a folder browser. This is both a strength and a shortcoming of the TVisto. Since it doesn't have to maintain a big database of titles, it is much simpler and more reliable. At the same time, that database of titles that you get with iTunes or Windows Media Player does make it easier to navigate through your songs.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/music-nav.gif" title="Navigating through Music"/></p>
<p>Figure 8<br/>Navigating through the music collection<br />
</center></p>
<p>Note that again, in order to bring some help to the navigation problem of working through several hundred CDs, I've gone hierarchical, with each family member having their own folder, with artists under those, then albums under those. This is okay, but not optimal. If you want to play a specific CD it means you either have to know whose folder it is in or else spend a lot of time browsing - no search function available.</p>
<p>If your needs are restricted to listening to one CD at a time, you will find yourself in hog heaven. When you select a CD, you can either play individual songs of your selection or with one button press instantly play the entire CD.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/music-playback.gif" title="Music Playback"/></p>
<p>Figure 9<br/>Music playback in progress<br/><br />
</center></p>
<p>The one place where the GalaxyMetalGear team could really afford to work on firmware improvements would be here. Feature competition from Microsoft and Apple has really raised the bar on music playback, and a simple file-folder organized MP3 playback experience won't cut it for a lot of people. Things I'd like to see in updates to the current product would include:</p>
<ul>
<li>Big improvements in navigation, with an eye towards speed - pressing the page button 20 times to get to the end of a list is pretty annoying.</li>
<li>Support for playlists, ratings, and searching</li>
<li>Support for ID3V2 tags, including multiple Genres per song</li>
<li>Shuffle options</li>
<li>Display of album artwork and lyrics</li>
<li>Visualization options</li>
</ul>
<p>Despite these missing features, make no mistake about one thing, this is still an excellent replacement for the CD player in your media room. It performs the same function, but allows you to quickly access your entire collection and frees up the space that was occupied by that ugly CD rack you bought at Target 10 years ago.</p>
<h4>Pictures</h4>
<p>I consider the picture album feature almost a throwaway - it was easy to add, so it was included, but I don't know anyone who actually uses their TV to browse through their photos (my apologies to those of you out there who do!)</p>
<p>Basically, the photo album feature operates identically to the music feature, meaning you get to view one folder at a time. Figure 10 shows this in play on my TV, and it's certainly nice, but the lack of tagging and organization makes it a bit lame compared to a web site like <a href="http://flickr.com/" title="Flickr.com" class="newpage">Flickr.com</a>.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/tvisto-3500-review/Fig10.jpg" title="Viewing Photos"/></p>
<p>Figure 10<br/>Viewing Photos<br />
</center></p>
<h4>Conclusions</h4>
<p>If you're looking at price performance, I think the TVisto can't be beat. I won't be surprised if an iPod will be able to do all of these things soon (except for playing DVDs you've ripped yourself), but even then, the form factor will always mean you are paying a huge penalty for disk space. My $200 TVisto has 300G of disk space - an iPod with 1/5 that space costs twice as much.</p>
<p>The one thing that would take this product from second base to home would be network access. Given that it is running a standard Linux distribution, I have to believe that it would be fairly easy to support a USB network interface. If I could load new media on the TVisto without having to lug it from one room to another, I think I would then be able to say the TVisto 3500 is feature complete. (Opening this box up for community software development might be one way to get those new features in more quickly.)</p>
<p>A few notes on pros and cons to help you with your buying decision:</p>
<p><strong>Pro:</strong></p>
<ul>
<li>No fan</li>
<li>Small footprint</li>
<li>Very inexpensive</li>
<li>Plays ISO-ripped DVDs</li>
<li>Plays DivX compressed video</li>
<li>OS independent</li>
</ul>
<p><strong>Con:</strong></p>
<ul>
<li>No protected music</li>
<li>No network access</li>
<li>Lots of UI limitations - slow scan through music and photos</li>
<li>No playlists</li>
<li>No shuffling across folders</li>
<li>Doesn't read ID3V2 tags</li>
</ul>
<p>Look at it this way - in the worst case, you'll still get to use it as a portable hard drive!</p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2006/08/14/windows-mce-replacement-on-a-budget/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Free ITU Documents?</title>
		<link>http://marknelson.us/2006/08/10/free-itu-docs/</link>
		<comments>http://marknelson.us/2006/08/10/free-itu-docs/#comments</comments>
		<pubDate>Thu, 10 Aug 2006 17:18:37 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Standards]]></category>
		<category><![CDATA[Web Articles]]></category>

		<guid isPermaLink="false">/2006/08/10/free-itu-docs/</guid>
		<description><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2006/08/10/free-itu-docs/' addthis:title='Free ITU Documents?' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div>Dr. Dobb's Portal August, 2006 Article on DDJ site International standards are good, we all love them, sure. But if you aren't lucky enough to be employed by a big company or a university, getting your hands on these standards can be a real pain. Typically the standards bodies sell electronic copies of their work [...]]]></description>
			<content:encoded><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2006/08/10/free-itu-docs/' addthis:title='Free ITU Documents?' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div><table border="0" width="500">
<tr>
<td width="300"><img alt="DDJ Portal Logo" src="http://marknelson.us/attachments/misc/logo_ddj.gif" /></td>
<td><strong>Dr. Dobb's Portal</strong> August, 2006<br />
         <a href="http://www.ddj.com/architect/192201183" class="newpage">Article on DDJ site</a>
  </td>
</tr>
</table>
<p><img src="http://marknelson.us/attachments/free-itu-docs/itu.gif" class="alignleft"/><br />
International standards are good, we all love them, sure. But if you aren't lucky enough to be employed by a big company or a university, getting your hands on these standards can be a real pain. Typically the standards bodies sell electronic copies of their work for fairly high prices, justifying it as one of the only ways they can subsidize their activities.</p>
<p>As just one example, if you want to buy a hardcopy of the C++ standard, you can end up paying as much as <a href="http://www.techstreet.com/cgi-bin/detail?product_id=1143945">$US 175</a> for the privilege - which is a killer if you are just doing this for personal use.</p>
<p>This turns into a FAQ in public newsgroups and forums when the uninitiated start operating under the idea that they ought to be able to just find a copy on the web. <a href="http://groups.google.com/group/comp.compression/browse_thread/thread/fdbc21de7039363a">This thread</a> on comp.compression is a classic example, where Nils Haeck is looking for free copies of some of the JPEG documents.</p>
<p>But oddly enough, later on in the thread Nils sheds a piece of light on the discussion, disclosing a piece of news that I hadn't heard before. It turns out that the ITU has a program in which individuals can get up to three free electronic copies of their standards documents per year. This is exciting!</p>
<p>I thought I'd give it a try by first signing in at the <a href="https://ecs.itu.ch/cgi-bin/run/ebookshop?choice=">ITU Bookstore</a>. You'll note on this page there is a button that says: <em>I wish to REGISTER in order to download up to three (3) Recommendations free of charge</em>. Yes, I certainly do. I selected the button and went to the registration page. The registration form, shown below, is rather corporate-oriented, so I tried to steer it toward the notion that I am a company of one:</p>
<p><center><br />
<img src="http://marknelson.us/attachments/free-itu-docs/ScreenShot001.gif"><br />
Figure 1 - The registration form<br />
</center></p>
<p>All went well and the ITU blessed me immediately. My email was almost instantaneous, and I was ready to shop.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/free-itu-docs/ScreenShot002.gif"><br />
Figure 2 - The ITU loves me<br />
</center></p>
<p>Just to test the system, I first downloaded a spec I knew I would find handy, <a href="http://www.itu.int/rec/T-REC-T.800-200208-I/en">JPEG 2000 image coding system: Core coding system</a>. Sure enough, a few minutes later I had my personal copy in hand. Now I just have to decide where to spend my precious remaining two downloads:</p>
<ul>
<li><a href="http://www.itu.int/rec/T-REC-T.81-199209-I/en">JPEG - Digital compression and coding of continuous-tone still images</a>
<li><a href="http://www.itu.int/rec/T-REC-T.804-200208-I/en">JPEG 2000 image coding system: Reference software</a>
<li><a href="http://www.itu.int/rec/T-REC-T.851-200509-I/en">ITU-T T.81 (JPEG-1)-based still-image coding using an alternative arithmetic coder</a>
<li><a href="http://www.itu.int/rec/T-REC-V.24-200002-I/en">List of definitions for interchange circuits between data terminal equipment (DTE) and data circuit-terminating equipment (DCE) (aka RS-232)</a>
<li><a href="http://www.itu.int/rec/T-REC-H.323-200606-P/en">H.323 - Packet-based multimedia communications systems</a>
<li><a href="http://www.itu.int/rec/T-REC-G.722/en">G.722 : 7 kHz audio-coding within 64 kbit/s</a>
<li><a href="http://www.itu.int/rec/T-REC-G.729-199603-I/en">Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP)</a>
<li><a href="http://www.itu.int/rec/T-REC-G.191-200509-I/en">Software tools for speech and audio coding standardization</a>
</ul>
<p>It's nice to at least have a choice, isn't it? Now if we could get the ISO and ANSI to adopt similarly enlightened policies the world would be a much better place.</p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2006/08/10/free-itu-docs/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Java: Good World Citizen</title>
		<link>http://marknelson.us/2006/07/13/java-good-world-citizen/</link>
		<comments>http://marknelson.us/2006/07/13/java-good-world-citizen/#comments</comments>
		<pubDate>Thu, 13 Jul 2006 14:00:02 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Web Articles]]></category>

		<guid isPermaLink="false">/2001/09/01/java-good-world-citizen/</guid>
		<description><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2006/07/13/java-good-world-citizen/' addthis:title='Java: Good World Citizen' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div>This article gives an overview of how to handle internationalization in your Java program, with samples given for Chinese web pages.]]></description>
			<content:encoded><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2006/07/13/java-good-world-citizen/' addthis:title='Java: Good World Citizen' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div><table border="0" cellpadding="5">
<tr>
<td ><img alt="DDJ Portal Logo" src="http://marknelson.us/attachments/misc/logo_ddj.gif" /></td>
<td><strong>Dr. Dobb's Portal</strong> September, 2007<br />
         <a href="http://ddj.com/java/202100033" class="newpage">Article on DDJ site</a>
  </td>
</tr>
</table>
<p>In the new global economy, companies have the opportunity to market their wares to billions of customers who don't speak a word of English. Java was designed from the ground up to help programmers deploy internationalized software. In this article I'm going to show you how Java makes your woes in the area of character sets and encoding melt like cotton candy on your tongue.</p>
<h4>Dipping My Toes in the Global Pool</h4>
<p>For the past year I've been spending a steadily increasing amount of time at work dealing with internationalization of our products. My division of Cisco makes an IP-based phone system that interfaces with users and administrators at dozens of different points. Users have alphanumeric displays on their phones, perform personal and system  administration on web pages, and listen to voice prompts when collecting messages. From the user interface perspective there is a lot going on.</p>
<p>Like most new products made in a skunk-works atmosphere, Cisco CallManager was developed with little or no thought towards our international customers. The focus was on quickly developing a stable product with as many features as possible. Of course, our success at this strategy led to immediate discontent from our business partners in Europe and Asia. It turns out that telephone users in France really do want their instruction manuals to be written in French. </p>
<h4>The Four Problems of Text Internationalization</h4>
<p>On the surface, modifying your product for users of another language seems simple enough: translate everything and distribute the results.  Unfortunately it just isn't that easy. Translation is only the first of four big problems. Translating written material might present some logistical problems, but these are usually more budgetary than technical.</p>
<p>The other three problems are more technical in nature, and Java provides tools to deal with all three. In order, these problems are:</p>
<ol>
<li>Managing user-seen content once it has been translated into multiple languages.</li>
<li>Selecting an appropriate character set and rendering text that uses it.</li>
<li>Properly encoding text in a given character set so that it can be stored and transmitted in a world of eight-bit bytes.</li>
</ol>
<p>Java helps you deal effectively with all three of these problems. The first, management of translated content, is handled using Java Resource Bundles. This article is going to talk about the next two: character sets and encodings.<br />
<span id="more-27"></span></p>
<h4>Character Sets</h4>
<p>Cisco's business-class IP Phones have a nice LCD screen that presents call status information to users in a fairly friendly way. One of the  first problems we ran into when internationalizing the entire phone system was that this display only supported the 7-bit ASCII character set. This had the unfortunate effect of changing the name of Se&ntilde;or Nu&ntilde;ez to "Senor Nunez" when stored as a speed dial. That's because the character set we were using lacked the letters commonly seen in other countries using the Roman alphabet.</p>
<p>A short-sighted solution to this problem might be to expand the character set to a full eight bits, using the upper 128 character positions for the commonly missed characters. And in fact, there is a standard character set called ISO-8859-1 that does just that. Figure 1 shows how ISO-8859-1 populates the upper half of the character space with 96 characters used in Western Europe. (The first 32 positions in the upper half of the space are not used.)</p>
<p><center><br />
<img src="http://marknelson.us/attachments/java-good-world-citizen/figure1.gif"></p>
<p>Figure 1 - The Upper 112 Positions of the ISO-8859-1 Character Set<br />
</center></p>
<p>We quickly modified our phone to accept a new font, and soon found that we could properly render names of people in France, Germany, Italy, the Netherlands, and so on. As long as we confined our sales efforts to our friends in NATO all was well. Figure 2 shows our phone happily rendering most of the ISO-8859-1 character set, ready to march onto desktops anywhere the Euro is honored.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/java-good-world-citizen/figure2.gif"></p>
<p>Figure 2 - A Pan-European Phone<br />
</center></p>
<p>Even in today's global economy, many manufacturers of software and other products find this state of affairs to be just dandy. But as soon as you try to sell a phone in Greece or the Russian Federation, you're out of luck. Languages such as Russian and Greek just don't have enough characters in common with Western Europe to fit into the ISO-8859-1 character set. Not to fear, because ISO-8859-5 and ISO-8859-7 were created just to deal with this problem.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/java-good-world-citizen/figure3.gif"></p>
<p>Figure 3 - The Upper 112 Positions of ISO-8859-7, a Greek Character Set<br />
</center></p>
<p>If you take a look at Figure 3, you can see the seeds of a problem beginning.  Adding a new character set to the phone is a manageable problem - one that  operating systems like Windows already manage quite effectively. We simply have to make it possible for our telephone to download one font in France, and a different one in Greece.</p>
<p>But now the phone finds itself in a situation that can seem a bit baffling to mossback programmers such as yours truly. If a multinational company has a Se&ntilde;or Nu&ntilde;ez listed in its corporate directory, things will work just fine when a user in France or Italy looks up his name. But unfortunately, a user in Greece will see the name rendered as <em>Se&rho;or Nu&rho;ez</em>. We now have a fundamental problem: a given numeric value is rendered differently depending on the character set in use. A lower case 'n' with a tilde over it in ISO-8859-1 transforms itself into a lower case Greek rho when we switch to ISO-8859-7.</p>
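<p>You can watch this happen from Java itself. The sketch below is my own illustration (the class and method names are made up for this example): it decodes the single byte value 0xF1 twice, once under each character set, and prints the Unicode code point that comes out.</p>

```java
import java.nio.charset.Charset;

// Illustrative only: shows one byte value decoding to two different characters.
class SameByteTwoGlyphs {
    // Decodes a single byte using the named character set.
    static String decode(int byteValue, String charsetName) {
        byte[] raw = { (byte) byteValue };
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String western = decode(0xF1, "ISO-8859-1"); // n with tilde
        String greek = decode(0xF1, "ISO-8859-7");   // lower case rho
        // Print code points rather than glyphs so the console font doesn't matter.
        System.out.printf("ISO-8859-1: U+%04X%n", (int) western.charAt(0));
        System.out.printf("ISO-8859-7: U+%04X%n", (int) greek.charAt(0));
    }
}
```

<p>The first line of output reports U+00F1 and the second U+03C1: same byte, two different letters.</p>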
<p>This unpleasant situation causes a huge paradigm shift for programmers working on internationalization. There's no longer any such thing as "plain text." When we store a user's name in the database, we now have to also store the name of the character set that properly renders it. The same thing holds true for error messages, speed number labels, soft-keys, you name it.</p>
<p>Or does it?</p>
<h4>Java to the rescue - Part 1</h4>
<p>Java deals with this problem in an effective way - it coerces you rather firmly into using Unicode for all character strings. C++ developers have a choice between narrow strings and wide strings - which aren't necessarily Unicode. For better or worse, Java eliminates that choice.</p>
<p>The nice part about this is that in Unicode, U+00F1 is always the &ntilde; character, and the lower case rho, &rho;, is always U+03C1. Even if I don't have a rho character on my keyboard, I know that I can use Java's escaping mechanism to represent it as "\u03C1", inconvenient as that may be.</p>
<p>This is a nice feature, because it means that at least internally, a Java string is a string is a string. You don't have to worry about what character set it is from - it's Unicode.</p>
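<p>As a small sketch of what that looks like in practice (the class and helper names here are hypothetical, chosen just for this example), the troublesome name can be written entirely with escapes and then inspected without reference to any character set:</p>

```java
// Illustrative only: a Java string written with Unicode escapes.
class UnicodeEscapes {
    // The directory entry from the example, typed with escapes for the n-tilde.
    static final String NAME = "Se\u00F1or Nu\u00F1ez";

    // Formats the code point at the given index as U+XXXX.
    static String describe(String s, int index) {
        return String.format("U+%04X", s.codePointAt(index));
    }

    public static void main(String[] args) {
        System.out.println(NAME.length());          // 11 characters
        System.out.println(describe(NAME, 2));      // U+00F1
        System.out.println(describe("\u03C1", 0));  // U+03C1
    }
}
```

<p>Nothing in this code names a character set, and nothing needs to - the values are Unicode code points no matter where the program runs.</p>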
<h4>It's worse than you think</h4>
<p>In a perfect world, Java's insistence on Unicode would spill over to every file system, network packet type, and so on, and everything would be fine. But unfortunately, there are still billions of web browsers in the world configured to read text from an ISO-8859-X character set. And when our attention turns to Asia, things get even worse, for two reasons.</p>
<ol>
<li>China, Japan, and Korea have character sets composed of thousands of ideographs. To compound this problem, there are competing character sets used to create Chinese web pages. Taiwan and the PRC tend to use two different character sets, known as Traditional (or Big5), and Simplified (or GB2312.)
<li>These character sets don't fit in a single byte, and accordingly must be encoded in order to be written into byte-oriented files and networks. Unicode is most commonly encoded as UTF-8, in which a single 16-bit character is encoded as one, two, or three bytes. Other 16-bit character sets, including the Chinese, Japanese, and Korean ones, use different encoding schemes, usually a row/column value encoded as two bytes.
</ol>
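<p>The one, two, or three byte behavior of UTF-8 is easy to see for yourself. This is just a minimal sketch with names of my own choosing; it counts the bytes UTF-8 produces for an ASCII letter, an accented Western European letter, and a Chinese ideograph.</p>

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: counts UTF-8 bytes for characters from different ranges.
class Utf8Lengths {
    // Number of bytes UTF-8 needs to encode the given text.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));      // 1 byte: plain ASCII
        System.out.println(utf8Length("\u00F1")); // 2 bytes: n with tilde
        System.out.println(utf8Length("\u4E2D")); // 3 bytes: a Chinese ideograph
    }
}
```

<p>The variable width is the price UTF-8 pays for leaving plain ASCII text untouched.</p>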
<p>Naturally, the different character sets that I've mentioned here are incompatible with one another. Needless to say, the encoding schemes are incompatible as well.</p>
<p>Simply storing your data internally as Unicode doesn't solve the problem of incompatible character sets and encodings. But, the good news is, Java has built-in library support for converting to and from these encodings any time you convert to or from bytes during an I/O operation. </p>
<p>The <span class="inline_code">OutputStreamWriter</span> and <span class="inline_code">InputStreamReader</span> classes each have two constructors of interest here: one that takes just a reference to a stream object and uses the platform's default encoding, and a second that takes both a stream object and an encoding parameter.</p>
<p>If you search through the Java docs for "Supported Encodings", you'll see that Java has built-in support for a huge library of character sets and encodings. Converting one of these to or from Unicode is simply a matter of instantiating a class with the correct encoding parameter.</p>
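<p>Here is a minimal sketch of that idea, separate from the listing later in this article. The class and method names are hypothetical, and I'm assuming your Java runtime includes the Big5 and GB2312 encodings (the standard desktop distributions I've seen do). It pushes two ideographs through an <span class="inline_code">OutputStreamWriter</span> and reads them back with an <span class="inline_code">InputStreamReader</span>, using in-memory streams so no files are needed.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

// Illustrative only: round-trips text through a named encoding.
class EncodingRoundTrip {
    // Writes text through an OutputStreamWriter using the named encoding.
    static byte[] encode(String text, String encoding) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, encoding)) {
            out.write(text);
        }
        return bytes.toByteArray();
    }

    // Reads the bytes back through an InputStreamReader using the same encoding.
    static String decode(byte[] raw, String encoding) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(raw), encoding)) {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "\u4E2D\u6587"; // two Chinese ideographs
        for (String enc : new String[] { "UTF-8", "Big5", "GB2312" }) {
            byte[] raw = encode(text, enc);
            System.out.println(enc + ": " + raw.length + " bytes, round trip ok = "
                    + text.equals(decode(raw, enc)));
        }
    }
}
```

<p>The only thing that changes from one encoding to the next is the string handed to the constructor; the rest of the I/O code stays the same.</p>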
<h4>See the Code</h4>
<p>Figure 4 shows a sample Chinese language Web page that is written in Unicode and encoded with UTF-8. Users with the latest operating systems and browsers will usually be able to render this page properly.</p>
<p><center><br />
<img src="http://marknelson.us/attachments/java-good-world-citizen/figure4.gif"></p>
<p>Figure 4 - Welcome to the Classical Music Site<br />
</center></p>
<p>But not everyone has a Unicode capable computer, operating system, and browser. A user who browsed to this page with a browser set to use a Big5 character set would see the screen shown in Figure 5:</p>
<p><center><br />
<img src="http://marknelson.us/attachments/java-good-world-citizen/figure5.gif"></p>
<p>Figure 5 - Welcome to the Illegible Music Site<br />
</center></p>
<p>Solving this problem is easy with Java. All I have to do is develop my content in Unicode, then use Java's built-in classes to churn out localized versions suitable for users of whatever encodings are needed. </p>
<p>Listing 1 shows the WebWriter class that I used for this article. This program has a complete copy of a web page's content stored in an internal string. By using three different encodings, it creates web pages for browsers set to Unicode, Big5, and GB2312. As you can see, choosing the correct character set and encoding from Java is trivial.</p>
<div class="igBar"><span id="ljava-59"><a href="#" onclick="javascript:showPlainTxt('java-59'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">JAVA:</span>
<div id="java-59">
<div class="java">
<ol>
<li class="li1">
<div class="de1"><span class="co2">import java.io.*;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">public</span> <span class="kw2">class</span> WebWriter <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; <span class="kw4">static</span> <a href="http://www.google.com/search?q=allinurl%3AString+java.sun.com&#038;bntl=1"><span class="kw3">String</span></a> eol = <a href="http://www.google.com/search?q=allinurl%3ASystem+java.sun.com&#038;bntl=1"><span class="kw3">System</span></a>.<span class="me1">getProperty</span><span class="br0">&#40;</span> <span class="st0">"line.separator"</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw4">static</span> <a href="http://www.google.com/search?q=allinurl%3AString+java.sun.com&#038;bntl=1"><span class="kw3">String</span></a> s =</div>
</li>
<li class="li2">
<div class="de2">&nbsp; <span class="st0">"&lt;HTML&gt;"</span> + eol +</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="st0">"&lt;BODY&gt;"</span> + eol +</div>
</li>
<li class="li2">
<div class="de2">&nbsp; <span class="st0">"&lt;TABLE cellspacing=<span class="es0">\"</span>5<span class="es0">\"</span>&gt;"</span> + eol +</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="st0">" &lt;TR&gt;"</span> + eol +</div>
</li>
<li class="li2">
<div class="de2">&nbsp; <span class="st0">"&nbsp; &lt;TD&gt;&lt;img src=<span class="es0">\"</span>michael.jpg<span class="es0">\"</span>&gt;&lt;/TD&gt;"</span> + eol +</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="st0">"&nbsp; &lt;TD&gt;&lt;H2&gt;"</span> +</div>
</li>
<li class="li2">
<div class="de2">&nbsp; <span class="st0">"<span class="es0">\u</span>6b61<span class="es0">\u</span>8fce<span class="es0">\u</span>5149<span class="es0">\u</span>81e8<span class="es0">\u</span>53e4<span class="es0">\u</span>5178<span class="es0">\u</span>97f3<span class="es0">\u</span>6a02<span class="es0">\u</span>7a7a<span class="es0">\u</span>9593"</span> +</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="st0">"&lt;/H2&gt;&lt;/TD&gt;"</span> + eol +</div>
</li>
<li class="li2">
<div class="de2">&nbsp; <span class="st0">"&nbsp; &lt;TD&gt;&lt;img src=<span class="es0">\"</span>violin.jpg<span class="es0">\"</span>&gt;&lt;/TD&gt;"</span> + eol +</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="st0">" &lt;/TR&gt;"</span> + eol +</div>
</li>
<li class="li2">
<div class="de2">&nbsp; <span class="st0">"&lt;/BODY&gt;"</span> + eol +</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="st0">"&lt;/HTML&gt;"</span> + eol;</div>
</li>
<li class="li2">
<div class="de2">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">public</span> <span class="kw4">static</span> <span class="kw4">void</span> main<span class="br0">&#40;</span><a href="http://www.google.com/search?q=allinurl%3AString+java.sun.com&#038;bntl=1"><span class="kw3">String</span></a><span class="br0">&#91;</span><span class="br0">&#93;</span> args<span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp;<span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw2">try</span> <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; <a href="http://www.google.com/search?q=allinurl%3AFileOutputStream+java.sun.com&#038;bntl=1"><span class="kw3">FileOutputStream</span></a> fos = </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="kw2">new</span> <a href="http://www.google.com/search?q=allinurl%3AFileOutputStream+java.sun.com&#038;bntl=1"><span class="kw3">FileOutputStream</span></a><span class="br0">&#40;</span><span class="st0">"c:/temp/page_utf8.htm"</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; <a href="http://www.google.com/search?q=allinurl%3AWriter+java.sun.com&#038;bntl=1"><span class="kw3">Writer</span></a> out = <span class="kw2">new</span> <a href="http://www.google.com/search?q=allinurl%3AOutputStreamWriter+java.sun.com&#038;bntl=1"><span class="kw3">OutputStreamWriter</span></a><span class="br0">&#40;</span> fos, <span class="st0">"UTF8"</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; out.<span class="me1">write</span><span class="br0">&#40;</span> s <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; out.<span class="me1">close</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; fos = <span class="kw2">new</span> <a href="http://www.google.com/search?q=allinurl%3AFileOutputStream+java.sun.com&#038;bntl=1"><span class="kw3">FileOutputStream</span></a><span class="br0">&#40;</span> <span class="st0">"c:/temp/page_gb.htm"</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; out = <span class="kw2">new</span> <a href="http://www.google.com/search?q=allinurl%3AOutputStreamWriter+java.sun.com&#038;bntl=1"><span class="kw3">OutputStreamWriter</span></a><span class="br0">&#40;</span> fos, <span class="st0">"GBK"</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; out.<span class="me1">write</span><span class="br0">&#40;</span> s <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; out.<span class="me1">close</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; fos = <span class="kw2">new</span> <a href="http://www.google.com/search?q=allinurl%3AFileOutputStream+java.sun.com&#038;bntl=1"><span class="kw3">FileOutputStream</span></a><span class="br0">&#40;</span> <span class="st0">"c:/temp/page_big5.htm"</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; out = <span class="kw2">new</span> <a href="http://www.google.com/search?q=allinurl%3AOutputStreamWriter+java.sun.com&#038;bntl=1"><span class="kw3">OutputStreamWriter</span></a><span class="br0">&#40;</span> fos, <span class="st0">"BIG5"</span> <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; out.<span class="me1">write</span><span class="br0">&#40;</span> s <span class="br0">&#41;</span>;</div>
</li>
<li class="li2">
<div class="de2">&nbsp; out.<span class="me1">close</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; <span class="kw2">catch</span> <span class="br0">&#40;</span> <a href="http://www.google.com/search?q=allinurl%3AException+java.sun.com&#038;bntl=1"><span class="kw3">Exception</span></a> e <span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#123;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; <a href="http://www.google.com/search?q=allinurl%3ASystem+java.sun.com&#038;bntl=1"><span class="kw3">System</span></a>.<span class="me1">out</span>.<span class="me1">println</span><span class="br0">&#40;</span> <span class="st0">"Exception "</span> + e <span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p><center>Listing 1 - WebWriter.java</center></p>
<h4>Details</h4>
<p>You can see the actual HTML files created for this article using the links below. Note that well-written web pages use meta tags or HTTP headers to help a browser figure out what encoding and character set to use. (See <a href="http://www.w3.org/TR/html4/charset.html#h-5.2.2" class="newpage">section 5.2.2 of the HTML Spec</a> for details.) These web pages intentionally don't, which makes experimentation a bit easier.</p>
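<p>For a page that does want to label itself, one possible approach is to derive the label from the same encoding name that is passed to <span class="inline_code">OutputStreamWriter</span>. The sketch below is only an illustration with invented names. It relies on <span class="inline_code">Charset.name()</span> returning the canonical name of the charset, which for common encodings such as these matches the label browsers expect, although that is not guaranteed for every converter Java supports.</p>

```java
import java.nio.charset.Charset;

// Hypothetical helper, not used by the WebWriter program in Listing 1.
class CharsetLabel {
    // Build an HTML 4 style META declaration for the given Java encoding name.
    static String metaTag(String javaEncodingName) {
        // Resolve aliases such as "UTF8" to the charset's canonical name.
        String label = Charset.forName(javaEncodingName).name();
        return "<META http-equiv=\"Content-Type\" content=\"text/html; charset="
                + label + "\">";
    }

    public static void main(String[] args) {
        // The three encoding names used in Listing 1.
        for (String name : new String[] {"UTF8", "GBK", "BIG5"}) {
            System.out.println(metaTag(name));
        }
    }
}
```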
<p>To view the web pages in their correct encoding, you will need to change your browser encoding setting, unless it guesses right based on content. In Internet Explorer, you select this from the <i>View|Encoding</i> portion of the menu. Firefox is nearly the same: <i>View|Character Encoding</i>. If you are an English-speaking computer user, you will undoubtedly have to install Chinese or Unicode fonts as well. If you're lucky, this will be a semi-automatic process.</p>
<p>The web pages can be found here:</p>
<ul>
<li><a href="http://marknelson.us/attachments/java-good-world-citizen/page_utf8.htm" class="newpage">Unicode page</a></li>
<li><a href="http://marknelson.us/attachments/java-good-world-citizen/page_big5.htm" class="newpage">Big5 page</a></li>
<li><a href="http://marknelson.us/attachments/java-good-world-citizen/page_gb.htm" class="newpage">GB2312 page</a></li>
</ul>
<p>If you look at the source code, each Chinese ideograph is stored as two or more bytes, which your ANSI text editor will display as something like this:</p>
<p><H2>­¹âÅR¹ÅµäÒô˜·¿Õég</H2></p>
<p>The text shown above contains ten GB2312 characters encoded using twenty bytes. For non-Chinese speakers, the following table shows the translation of the individual characters, as well as the more meaningful translation of short phrases consisting of multiple characters.</p>
<p><center></p>
<table border=1>
<tr>
<th>Ideograph</th>
<th>Unicode</th>
<th>GB2312</th>
<th>Big5</th>
<th>Character Meaning</th>
<th>Phrase Meaning</th>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_1.gif"></td>
<td>U+6B61</td>
<td>2722</td>
<td>C577</td>
<td>happy, pleased, glad; joy; enjoy</td>
<td rowspan=2>Welcome</td>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_2.gif"></td>
<td>U+8FCE</td>
<td>5113</td>
<td>AAEF</td>
<td>receive, welcome, greet</td>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_3.gif"></td>
<td>U+5149</td>
<td>2566</td>
<td>A5FA</td>
<td>light, brilliant, shine; only</td>
<td rowspan=2>to</td>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_4.gif"></td>
<td>U+81E8</td>
<td>3357</td>
<td>C17B</td>
<td>draw near, approach; descend</td>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_5.gif"></td>
<td>U+53E4</td>
<td>2537</td>
<td>A56A</td>
<td>old, classic, ancient</td>
<td rowspan=2>classic</td>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_6.gif"></td>
<td>U+5178</td>
<td>2168</td>
<td>A8E5</td>
<td>law, canon; documentation; class</td>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_7.gif"></td>
<td>U+97F3</td>
<td>5084</td>
<td>ADB5</td>
<td>sound, tone, pitch, pronunciation</td>
<td rowspan=2>music</td>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_8.gif"></td>
<td>U+6A02</td>
<td>3254</td>
<td>BCD6</td>
<td>happy, glad; enjoyable; music</td>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_9.gif"></td>
<td>U+7A7A</td>
<td>3153</td>
<td>AAC5</td>
<td>empty, hollow, bare, deserted</td>
<td rowspan=2>space</td>
</tr>
<tr>
<td><img src="http://marknelson.us/attachments/java-good-world-citizen/ideo_a.gif"></td>
<td>U+9593</td>
<td>2868</td>
<td>B6A1</td>
<td>interval, space; place, between</td>
</tr>
</table>
<p>Table 1 - Translation of the ideographs in the sample web page<br />
</center></p>
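<p>The byte counts are easy to check for yourself. The sketch below takes the ten ideographs from Listing 1 and reports how many bytes each of the three encodings needs, along with a hex dump you can compare against your own copy of the files. The class name and helpers are hypothetical, and I'm again assuming the GBK and Big5 converters are present in your Java runtime.</p>

```java
import java.nio.charset.Charset;

// Hypothetical helper class for inspecting the encoded heading.
class HeadingBytes {
    // The ten ideographs used in the sample page's heading.
    static final String HEADING =
            "\u6b61\u8fce\u5149\u81e8\u53e4\u5178\u97f3\u6a02\u7a7a\u9593";

    // Number of bytes the text occupies in the named encoding.
    static int encodedLength(String text, String encodingName) {
        return text.getBytes(Charset.forName(encodingName)).length;
    }

    // Format bytes as space-separated, two-digit hex values.
    static String toHex(byte[] data) {
        StringBuilder hex = new StringBuilder();
        for (byte b : data) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        return hex.toString().trim();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF-8", "GBK", "Big5"}) {
            byte[] data = HEADING.getBytes(Charset.forName(name));
            System.out.println(name + ": " + data.length + " bytes");
            System.out.println("  " + toHex(data));
        }
    }
}
```

<p>Both of the legacy encodings use two bytes per ideograph here, while UTF-8 needs three bytes for each character in this range.</p>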
<h4>Conclusion</h4>
<p>Successful products today need to support customers all over the world. Using Unicode for your core content makes this much easier, and Java is ready to help you on this path. More importantly, Java makes it simple to continue talking to devices on the edges of your network that are still using old-school character sets and encodings. </p>
<p>Unfortunately we still have to use human beings to do the difficult work of translating our content from one language to another, but outside of that Java does everything we need.</p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2006/07/13/java-good-world-citizen/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Intellectual Property: A Modest Proposal</title>
		<link>http://marknelson.us/2005/10/03/ip-proposal/</link>
		<comments>http://marknelson.us/2005/10/03/ip-proposal/#comments</comments>
		<pubDate>Mon, 03 Oct 2005 09:14:28 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Audio]]></category>
		<category><![CDATA[Complaining]]></category>
		<category><![CDATA[Web Articles]]></category>

		<guid isPermaLink="false">http://marknelson.us/2005/10/03/ip-proposal/</guid>
		<description><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2005/10/03/ip-proposal/' addthis:title='Intellectual Property: A Modest Proposal' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div>My modest proposal could make the U.S. the first country to make a sensible adjustment of Intellectual Property laws to deal with realities in the 21st century.]]></description>
			<content:encoded><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2005/10/03/ip-proposal/' addthis:title='Intellectual Property: A Modest Proposal' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div><table border="0" width="500">
<tr>
<td width="300"><img alt="DDJ Portal Logo" src="http://marknelson.us/attachments/misc/logo_ddj.gif" /></td>
<td><strong>Dr. Dobb's Portal</strong><br/>October, 2005<br />
         <a href="http://www.ddj.com/architect/184407836" class="newpage">Article on DDJ site</a>
  </td>
</tr>
</table>
<p>How important is music in your life? Chances are good that the answer ranges from <em>somewhat</em> to <em>very</em>. People like music. I don't think Science can tell us why yet, but the relationship between people and their music goes way beyond normal affection.</p>
<p>How many hours a day does the average cube dweller spend listening to music? If you are a knowledge worker of any ilk, and you don't have to use your telephone, odds are you listen to music eight hours a day or more. And odds are this is very important to your ability to make it through the day.</p>
<p>How many times have you or a friend of yours latched on to a catchy new song, only to have it playing over and over in your head for days on end? Depending on whether you actually like the song, this either sours your mood or puts a little extra spring in your step. Judging by what I see in my humble domicile, I'd have to say the net effect on total happiness in the world is very, very, positive.</p>
<h4>A Serious Problem</h4>
<p>Given the importance that music plays in our lives, I think we have a serious problem. The industry that controls the flow of music into our brains is having trouble adjusting to the new world we live in. Instead of making it easier and cheaper to listen to music, they're letting new technologies make it more expensive and difficult.<br />
<span id="more-100"></span><br />
As an example, if I want to purchase digital music for the much-loved iPod, I pay a purchase price close to that I would pay for CD versions of the same music. (Assuming a $12 price at your favorite big box store.) But for that price, I am stuck with a reduced fidelity image that is locked down to a limited number of computers and other devices. And I am limited to playing these songs on a single vendor's portable player. And worse yet, should the vagaries of hard drive reliability play out the wrong way, a careless consumer can lose their entire digital investment via a single misfortune.</p>
<p>This is all wrong. Being a good American, I always want to see market forces solve our problems, but in this case market forces are not working. Instead of making things better for us, they are making things worse.</p>
<p>As a result, I'm calling on our nation's leaders to take a drastic step--nationalize the music industry.</p>
<h4>My Proposal - The Details</h4>
<p>My proposal is simple, economical, workable, and benefits both listeners and musicians. Let me spell it out for you and I think you will join me in my wish to see this become reality.</p>
<p>Step 1 is for the government to set up a music archive. By law, all copyrighted music must be published to this archive. The Library of Congress is well suited to management of this task, and really, all they need is a giant server farm, a high-speed Internet connection, and a copy of MySQL.</p>
<p>An important point to note about this publishing system is that publishing data to the Library of Congress music library would be a simple task. Any artist who can master the use of a web browser will now be able to publish their music by simply pushing a few buttons. There will no longer be a daunting distribution system standing between artists and listeners--and the majority of the revenue will no longer be siphoned off to what amounts to a parasitical business.</p>
<p>Step 2 is the development of a payment system. This is simplicity itself. Artists will be paid based on bandwidth, and nothing more. I can download 128Kbps MP3 versions of the latest Britney Spears CD to my iPod, and the pop idol will get a small payment. If I download the 5.1 CD quality image to burn to a DVD, she'll get a bit more. If I stream it to my PC over and over, she'll get even more. The accounting is simple, and doesn't require any technological advances to implement.</p>
<p>Where does the money come from? That's where Step 3 comes in. Every paycheck issued in the US from now until the end of time is going to have a $5 monthly deduction to fund music. Everyone pays, and everyone benefits. With no record companies in the way, that $5 is going to go straight into the artists' pockets, and I predict that the vast majority of performers will see an increase in their cash flow.</p>
<p>The final step: We must compensate the poor record companies who are going to lose this source of revenue. Basically, we buy them out using a discounted cash-flow analysis that properly reimburses them for the business they are giving up. For a company that does nothing but sell records, this should approximate their current stock price. And since the record companies tell us that business sucks, they ought to be happy to get out of it. And don't worry about the money; this is going to cost a lot less than the war in Iraq or Katrina rebuilding, and it's going to make everyone a lot happier.</p>
<h4>The Results</h4>
<p>The upshot? You now get to listen to any music you want. Promotion and distribution of music changes radically. And the U.S. becomes the first country to make a sensible adjustment of Intellectual Property laws to deal with realities in the 21st century.</p>
<p>Incidentally, this may sound like satire in the tradition of Jonathan Swift's Modest Proposal, but I assure you, this is a plan worth considering. Yes, it is a big change in the way we do things, but I believe it is the appropriate response to dealing with a system that is too clumsy and dumb to deal with new realities. We didn't ask the market to build our Interstate Highway system, we don't ask the market to defend our borders, and I don't really want the market to control my music. It's time for a change. </p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2005/10/03/ip-proposal/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Building Your Home Video Jukebox</title>
		<link>http://marknelson.us/2004/02/09/video-jukebox/</link>
		<comments>http://marknelson.us/2004/02/09/video-jukebox/#comments</comments>
		<pubDate>Mon, 09 Feb 2004 12:41:09 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Video]]></category>
		<category><![CDATA[Web Articles]]></category>

		<guid isPermaLink="false">http://marknelson.us/2004/02/09/video-jukebox/</guid>
		<description><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2004/02/09/video-jukebox/' addthis:title='Building Your Home Video Jukebox' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div>Byte.comFebruary 9, 2004 Article on Byte.com I've been an ardent proponent of server-based digital media for a long time. Way back in 1999, technology finally reached the point where I could migrate my music collection from CDs to a server, with the result chronicled here. Once that was accomplished, I naturally begin work on moving [...]]]></description>
			<content:encoded><![CDATA[<div class="addthis_toolbox addthis_default_style" addthis:url='http://marknelson.us/2004/02/09/video-jukebox/' addthis:title='Building Your Home Video Jukebox' ><a class="addthis_button_twitter"></a><a class="addthis_button_favorites"></a><a class="addthis_button_print"></a><a class="addthis_button_facebook_like"></a><a class="addthis_button_google_plusone"></a><a class="addthis_button_compact"></a></div><table border="0" width="500">
<tr>
<td width="300"><img alt="Byte.com Logo" src="http://marknelson.us/attachments/2004/video-jukebox/byte_logo.jpg" /></td>
<td><strong>Byte.com</strong><br/>February 9, 2004<br />
         <a href="http://www.byte.com/documents/s=9032/byt1076349464330/0209_nelson.html" class="newpage">Article on Byte.com</a>
  </td>
</tr>
</table>
<p>I've been an ardent proponent of server-based digital media for a long time. Way back in 1999, technology finally reached the point where I could migrate my music collection from CDs to a server, with the result chronicled <a href="http://marknelson.us/2000/01/01/the-ultimate-home-jukebox/" class="newpage">here</a>. Once that was accomplished, I naturally began work on moving my video collection to a home server as well.</p>
<p>In the millennial year 2000, I found that creating a video server was impossible on nearly every front. My home network couldn't handle the bandwidth. I didn't have decent client devices to play stored video. And my server didn't have the storage capacity to hold my entire video catalog.</p>
<p>I'm happy to report that at the outset of 2004, all of these problems have vanished, and it's now completely feasible to store your entire video collection on a home server. This article is going to tell you how to do this, step by step, in a simple and easy fashion.</p>
<p>In order, you'll learn how to rip your DVD content to your hard drive, how to convert that content to the super-efficient DivX format, and how to play those videos using off-the-shelf players, over either a wired or wireless network.</p>
<p>The methods I'm describing here aren't necessarily for everyone. This article intends to provide a simple and straightforward solution for folks who want the benefits of a jukebox, but don't want to invest days and weeks learning how to use complex tools. You might want more. Check the <a href="http://marknelson.us/2004/02/09/video-jukebox/9" class="newpage">disclaimer</a> to see if you should take the path less traveled.</p>
<p><span id="more-105"></span></p>
<h4>Grabbing the Content</h4>
<p>Those DVDs stacked up in your entertainment center have digitized video stored in an encrypted MPEG-2 format. The actual size of the movie varies depending on length, screen size, and compression settings, but most feature films clock in somewhere between 3 and 5 Gigabytes.</p>
<p>No doubt your Windows PC is able to see those files without any trouble. If you insert a standard title into your PC's DVD drive, you'll see something like the listing shown here. </p>
<p><center><br />
<img src="http://marknelson.us/attachments/2004/video-jukebox/ErinList.gif"><br />
<strong>Directory listing of <i>Erin Brockovich</i></strong><br />
</center><br />
By convention, the MPEG-2 video is stored in VOB files in the <code>VIDEO_TS</code> folder. Everything else on this particular DVD is software and other fluff for PC users. For the purposes of the home server, we're only interested in the video content.</p>
<p>Grabbing that data off of a DVD is a bit problematic. Yes, your PC can easily read those files, but unfortunately the MPEG-2 content has been encrypted so that it can only be played through a licensed piece of hardware.</p>
<p>Fortunately, there are a few excellent programs that are able to bypass this protection, allowing you to decrypt the data and copy it to your PC simultaneously. The program I recommend for this purpose is called DVD Decrypter. It's a completely free piece of software for Win32 PCs, and you can download it <a href="http://www.dvddecrypter.org.uk/" class="newpage">here</a>.</p>
<p>A couple of notes about DVD Decrypter. First, if your PC is in the Win9X train, you will also need to download ASPI drivers for your DVD-ROM. (You can get these from the <a href="http://www.adaptec.com/en-US/speed/software_pc/aspi/aspi_v470_exe.htm" class="newpage">Adaptec ASPI download</a> site.) Second, it is illegal to distribute this program in the United States! Yes, that's right, it is not legal to sell software that accesses the DVD content that you paid good money for. A more detailed discussion of this can be found in the <a href="http://marknelson.us/2004/02/09/video-jukebox/8" class="newpage">DMCA sidebar.</a> </p>
<h4>Ripping your DVD</h4>
<p>Installing DVD Decrypter is a straight-ahead process, and I won't go into any of the details here. Once you have the program installed, start it up, and select the Mode menu option. Set the mode to the IFO setting. This is the most straightforward way to pull the correct VOB files from the DVD.<br />
<center><br />
<img src="http://marknelson.us/attachments/2004/video-jukebox/SelectIFO.gif"><br />
<strong>Selecting IFO mode in DVD Decrypter</strong><br />
</center><br />
The real excitement comes when you are ready to rip the content from your first DVD. After you start DVD Decrypter, place your DVD in your DVD drive. Depending on the packaging of your DVD, you may be asked to install software that ships with the DVD - just say no! This software won't help you in the ripping process.<br />
<center><br />
<img src="http://marknelson.us/attachments/2004/video-jukebox/DVDSoftware.gif"><br />
<strong>DVD software installer</strong><br />
</center><br />
DVD Decrypter should automatically detect the presence of the DVD in your drive. If it doesn't, you can select the appropriate drive using the drop-down box labeled Source. After it finds the DVD content, DVD Decrypter looks through the various programs on the DVD, and selects the longest one. There are rare occasions when this is not the actual movie, but 99 times out of 100 this is what you should select. (I'll tell you how to verify that you got the correct program later in this section.)<br />
<center><br />
<img src="http://marknelson.us/attachments/2004/video-jukebox/LongestProgramSelect.gif"><br />
<strong>DVD Decrypter selects the longest program</strong><br />
</center><br />
At this point, I usually click on the Destination icon and change the destination directory to the location I prefer. All that's left after that is to click the bottom icon showing the <em>DVD to Hard Drive</em> picture, and the ripping process starts.</p>
<p>Ripping the disk is really not much more than just copying the files from your DVD drive to your hard drive, so the process will go pretty quickly, especially if you have a nice high-speed DVD-ROM. The figure below shows what things look like while the extraction is in process. If you love details, be sure to turn on the log window so you can get the works.<br />
<center><br />
<img src="http://marknelson.us/attachments/2004/video-jukebox/Ripping.gif"><br />
<strong>The DVD Rip in progress</strong></p>
<p><img src="http://marknelson.us/attachments/2004/video-jukebox/RipLog.gif"><br />
<strong>The optional log window</strong><br />
</center></p>
<h4>Verifying Your Rip</h4>
<p>Once DVD Decrypter finishes its work, you will have a nice collection of files in your destination directory. You should have one IFO file, a collection of VOB files that are each up to 1 GByte in size, and a text file. A sample of that is shown here.<br />
<center><br />
<img src="http://marknelson.us/attachments/2004/video-jukebox/VOBDir.gif"><br />
<strong>The results of the rip process</strong><br />
</center><br />
DVD Decrypter created the text file to describe what's in the various streams you've captured. The IFO file and the VOB files are decrypted copies of what was on the DVD. Before we move on to the encoding phase, you'll want to verify that these VOB files do actually contain the contents of the movie as you expect to see it.</p>
<p>To do this, I like to use another great piece of free software, the VLC media player from Videolan.org. Follow the links from the <a href="http://www.videolan.org/" class="newpage">Videolan.org home page</a>, download and execute the install package, and you're in business. Once you've done that, just start the player, and one by one drag the VOB files from their target directory right onto the VLC control panel. I usually do a quick check to make sure that I've got a series of VOB files that start at the beginning of the movie, end with the credits, and appear to have some reasonable content in the middle. A typical view of the process is shown below.<br />
<center><br />
<img src="http://marknelson.us/attachments/2004/video-jukebox/VLCMediaPlayer.gif"><br />
<strong>The VLC Media Player in use</strong><br />
</center></p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2004/02/09/video-jukebox/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

