<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Mark Nelson &#187; Puzzles</title>
	<atom:link href="http://marknelson.us/category/puzzles/feed/" rel="self" type="application/rss+xml" />
	<link>http://marknelson.us</link>
	<description>Programming, mostly.</description>
	<lastBuildDate>Fri, 13 Apr 2012 19:25:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>C++11: Range-based for and auto</title>
		<link>http://marknelson.us/2012/04/07/c11-range-based-for-and-auto/</link>
		<comments>http://marknelson.us/2012/04/07/c11-range-based-for-and-auto/#comments</comments>
		<pubDate>Sat, 07 Apr 2012 18:18:02 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[C/C++]]></category>
		<category><![CDATA[Puzzles]]></category>

		<guid isPermaLink="false">http://marknelson.us/?p=1511</guid>
		<description><![CDATA[Two really handy features in C++11 are the range-based for statement and the auto type specifier. The former allows you to iterate over collections using a much more compact form of expression, and the latter takes some of the headache out of the complex type declarations encountered in the standard library. Both of these features have [...]]]></description>
			<content:encoded><![CDATA[<p>Two really handy features in C++11 are the <i>range-based for statement</i> and the <i>auto type specifier</i>. The former allows you to iterate over collections using a much more compact form of expression, and the latter takes some of the headache out of the complex type declarations encountered in the standard library. Both of these features have been available in g++ since release 4.6, and are now present in Visual Studio 11, so you can start using them today. (auto typed variables are available in earlier versions of both compilers.) In this post I&#8217;ll give you a description of how these new features work, and show you a concrete example of the positive effects they can have on your programs.<br />
<span id="more-1511"></span></p>
<h4>The value of containers</h4>
<p>It&#8217;s hard to overstate the value of the containers in the C++ standard library. With the addition of the hash-based containers in TR1, I rarely if ever find myself tempted to roll my own, or use a third party library. The flexibility and power of the library created by Alexander Stepanov does everything I need.</p>
<p>Despite the technical merit of the container classes, newcomers are often hesitant about completely embracing them. One of the main reasons has to be the conceptual drag imposed by the use of iterators as the primary means of accessing the objects they contain. It&#8217;s not that there is anything complicated about the concept, but the syntax can be more than just a little annoying. Let me illustrate it with an example.</p>
<h4>Anagramania</h4>
<p>The listing below is a C++ program that reads through the Scrabble dictionary and determines which set of letters generates the most anagrams. I&#8217;m using C++ circa TR1, in which I have access to the unordered associative containers, but I don&#8217;t take any shortcuts to try to simplify the syntax. (The fact that I can write this program in one screen of simple code is a nice testament to the quality of the container library.)</p>
<p>The logic for the program is simple. I use an <code>unordered_map</code> called <code>counts</code> to hold the count of all anagram families in the dictionary, with its key being the sorted value of the Scrabble word. This means that all words that are anagrams of one another will have the same key. I use an <code>unordered_multimap</code> called <code>words</code> to hold the list of all words that are anagrams of that key. Each time I process a word, I increment a value in <code>counts</code> and I add a new value to <code>words</code>.</p>
<p>After the input processing is done, I can just iterate over <code>counts</code> from top to bottom, looking for the highest count. When I have gone through the entire map, I have the sorted key that generates the most anagrams. Using that key, I query <code>words</code> for a range of results. The call to <code>equal_range()</code> returns two iterators in a <code>pair&lt;T1,T2&gt;</code> object, which I then use to iterate over the result set.</p>
<p>Even if you are familiar with the type system used by the containers and don&#8217;t make too many mistakes, just the magnitude of how much you have to type to get this to work is a bit of a downer. And the length of those type definitions doesn&#8217;t help make the concepts being used any clearer.</p>
<pre>
#include &lt;iostream&gt;
#include &lt;fstream&gt;
#include &lt;string&gt;
#include &lt;iterator&gt;
#include &lt;algorithm&gt;
#include &lt;unordered_map&gt;

int main(int argc, char* argv[])
{
    std::ifstream data( &quot;sowpods.txt&quot; );
    std::unordered_map&lt;std::string,int&gt; counts;
    std::unordered_multimap&lt;std::string,std::string&gt; words;

    std::string s;
    while ( data &gt;&gt; s ) {
        std::string temp = s;
        std::sort(temp.begin(), temp.end() );
        counts[temp]++;
        words.insert( std::make_pair(temp,s) );
    }

    int max_count = -1;
    std::string max_string = &quot;&quot;;
    for ( std::unordered_map&lt;std::string,int&gt;::iterator ii = counts.begin();
          ii != counts.end();
          ii++ )
    {
        if ( ii-&gt;second &gt; max_count ) {
            max_count = ii-&gt;second;
            max_string = ii-&gt;first;
        }
    }
    std::cout &lt;&lt; &quot;The maximum anagram family has &quot; &lt;&lt; max_count &lt;&lt; &quot; members:\n&quot;;
    std::pair&lt; std::unordered_multimap&lt;std::string,std::string&gt;::iterator,
	       std::unordered_multimap&lt;std::string,std::string&gt;::iterator&gt; range;
    range = words.equal_range( max_string );
    for ( std::unordered_multimap&lt;std::string,std::string&gt;::iterator ii = range.first;
          ii != range.second;
          ii++ )
        std::cout &lt;&lt; ii-&gt;second &lt;&lt; &quot; &quot;;
    std::cout &lt;&lt; std::endl;
    return 0;
}
</pre>
<p><center>Anagram finder circa TR1</center><br />
Now let&#8217;s look at the two features that make major improvements to this program in C++11.</p>
<h4>The auto Type Specifier</h4>
<p>The hard working committee members who hammered out the standard last year clearly listened to the millions of C++ programmers out there. While they were charting new waters for the language with things like move semantics and rvalue references, they were also making a lot of small changes that simply make the language a lot easier to work with. Maybe even a little more fun. The two things I find at the top of my list are the auto type specifier and the range-based for statement.</p>
<p>The auto keyword can be used in a number of different contexts, but in general it means that you can declare variables without having to enter a complete type. This solves some tricky problems for template programming, and it provides a convenience for awkward variable declarations. Most notably, it allows you to replace these two wordy statements:</p>
<pre>
    std::pair&lt; std::unordered_multimap&lt;std::string,std::string&gt;::iterator,
	       std::unordered_multimap&lt;std::string,std::string&gt;::iterator&gt; range;
    range = words.equal_range( max_string );
</pre>
<p>with this much simpler single line:</p>
<pre>
    auto range = words.equal_range( max_string );
</pre>
<p>In both cases, the type of <code>range</code> is the same &#8211; but by using the auto type specifier, we let the compiler replace all that typing with a bit of simple hand waving.</p>
<p>Bjarne Stroustrup has a good, concise explanation of <a href="http://www2.research.att.com/~bs/C++0xFAQ.html#auto" class="newpage">auto</a> in his C++11 FAQ; I recommend you take the time to read it.</p>
<h4>The Range-based for Statement</h4>
<p>When working with standard library containers, one of the most common things we do is iterate over some or all of the container. This generally is done using a for or while loop with an iterator loop variable.</p>
<p>C++11 makes this type of iteration easier with new syntax injected into the <code>for</code> statement that has been around since 1969. The range-based for looks like this:</p>
<pre>
    for ( declaration : expression ) statement
</pre>
<p>In this new statement, <code>expression</code> can be an initializer list, an array, or an object that implements container semantics. This means that the object returns an iterator-like object from <code>begin()</code> and <code>end()</code> member functions, or from free <code>begin()</code> and <code>end()</code> functions that the compiler finds for its type (in the final standard, through argument-dependent lookup, with namespace std always considered).</p>
<p>The variable declaration is either a reference or value of the type of variable held in the container, array, or initializer list. The for loop is executed from the beginning of the container to the end, with <code>statement</code> executed once per value returned by the iterator.</p>
<p>Although this is a completely new language feature, I think most C++ programmers will be comfortable with it from the first time they use it &#8211; it makes those iterations over containers clean and concise.</p>
<h4>Putting it to Use</h4>
<p>Although it didn&#8217;t really cut down on my code size in a big way, I first made use of the range-based for in the loop that reads in the data from the Scrabble dictionary. My new version of the loop is shown here:</p>
<pre>
for ( const std::string &#038;s : std::istream_iterator&lt;std::string&gt;( data ) )
{
    std::string temp = s;
    std::sort(temp.begin(), temp.end() );
    counts[temp]++;
    words.insert( std::make_pair(temp,s) );
}
</pre>
<p>The only big improvement here is that I was able to declare my string variable on first use, which is always my preference.</p>
<p>However, looking at this code, you might be wondering how it compiles. After all, the <code>istream_iterator</code> doesn&#8217;t have <code>begin()</code> or <code>end()</code> member functions.</p>
<p>That&#8217;s correct, and the reason it works is that I added a couple of convenience functions to my program that enable the use of this iterator type with the range-based for:</p>
<pre>
template&lt;class T&gt;
std::istream_iterator&lt;T&gt; begin(std::istream_iterator&lt;T&gt; &amp;ii_stream)
{
    return ii_stream;
}

template&lt;class T&gt;
std::istream_iterator&lt;T&gt; end(std::istream_iterator&lt;T&gt; &amp;ii_stream)
{
    return std::istream_iterator&lt;T&gt;();
}
</pre>
<p>I made use of a similar set of template functions to enable the use of the new for statement in my final output statement. I now iterate over the discovered members of the anagram family with two easy-to-read lines:</p>
<pre>
for ( const auto &#038;map_entry : words.equal_range( ii-&gt;first ) )
    std::cout &lt;&lt; map_entry.second &lt;&lt; &quot; &quot;;
</pre>
<p>Compare this to the TR1 code that does the same thing, and I think you will see the real value of both auto and range-based for.</p>
<p>Iterating over the values returned from a multimap is a common task, and I enabled it with these convenient template functions:</p>
<pre>
template&lt;class ITERATOR&gt;
ITERATOR begin( std::pair&lt;ITERATOR,ITERATOR&gt; &amp;range )
{
    return range.first;
}

template&lt;class ITERATOR&gt;
ITERATOR end( std::pair&lt;ITERATOR,ITERATOR&gt; &amp;range )
{
    return range.second;
}
</pre>
<p>When I first implemented the functions for my C++11 program, I was halfway expecting to find that this functionality had already been added to the standard library &#8211; they really make a big improvement for a small investment. But no, I couldn&#8217;t find them, so we will be using our own versions for the time being.</p>
<h4>The Final Product</h4>
<p>My much improved anagram finder is shown below. In addition to the use of range-based for and auto type declarations, I changed the way I find the maximum element in the container. Now that lambdas are part of the language, there is no excuse for not using the standard library algorithms, and this code gives an illustration of how that works as well. </p>
<pre>
#include &lt;iostream&gt;
#include &lt;fstream&gt;
#include &lt;string&gt;
#include &lt;iterator&gt;
#include &lt;algorithm&gt;
#include &lt;unordered_map&gt;

template&lt;class ITERATOR&gt;
ITERATOR begin( std::pair&lt;ITERATOR,ITERATOR&gt; &amp;range )
{
    return range.first;
}

template&lt;class ITERATOR&gt;
ITERATOR end( std::pair&lt;ITERATOR,ITERATOR&gt; &amp;range )
{
    return range.second;
}

template&lt;class T&gt;
std::istream_iterator&lt;T&gt; begin(std::istream_iterator&lt;T&gt; &amp;ii_stream)
{
    return ii_stream;
}

template&lt;class T&gt;
std::istream_iterator&lt;T&gt; end(std::istream_iterator&lt;T&gt; &amp;ii_stream)
{
    return std::istream_iterator&lt;T&gt;();
}

int main(int argc, char* argv[])
{
    std::ifstream data( &quot;sowpods.txt&quot; );
    std::unordered_map&lt;std::string,int&gt; counts;
    std::unordered_multimap&lt;std::string,std::string&gt; words;

    for ( const std::string &amp;s : std::istream_iterator&lt;std::string&gt;( data ) )
    {
        std::string temp = s;
        std::sort(temp.begin(), temp.end() );
        counts[temp]++;
        words.insert( std::make_pair(temp,s) );
    }
    auto ii = std::max_element( counts.begin(),
                                counts.end(),
                                [](const std::pair&lt;std::string,int&gt; &amp;v1,
                                   const std::pair&lt;std::string,int&gt; &amp;v2)
                                {
                                    return v1.second &lt; v2.second;
                                }
                              );
    std::cout &lt;&lt; &quot;The maximum anagram family has &quot; &lt;&lt; ii-&gt;second &lt;&lt; &quot; members:\n&quot;;
    for ( const auto &#038;map_entry : words.equal_range( ii-&gt;first ) )
        std::cout &lt;&lt; map_entry.second &lt;&lt; &quot; &quot;;
    std::cout &lt;&lt; std::endl;
    return 0;
}
</pre>
<p><center>Anagram finder in C++11</center><br />
If I move the four convenience functions into a utility header file, I think you&#8217;ll agree that the new version of the code implements my algorithm in a very clean and concise way. The new language improvements make a huge difference in readability and convenience.</p>
<p>Of course, these two features are just one small part of a huge new standard, but for right now, they are the ones I turn to the most. How about you? Let me know!</p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2012/04/07/c11-range-based-for-and-auto/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Euler Mania</title>
		<link>http://marknelson.us/2011/04/10/euler-mania/</link>
		<comments>http://marknelson.us/2011/04/10/euler-mania/#comments</comments>
		<pubDate>Sun, 10 Apr 2011 21:46:51 +0000</pubDate>
		<dc:creator>Mark Nelson</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Puzzles]]></category>

		<guid isPermaLink="false">http://marknelson.us/?p=390</guid>
		<description><![CDATA[If I had been drawing a paycheck for every hour I spent working on Project Euler&#8217;s problem 328, I think my summer vacation would already be paid for. But instead, after a long ten days or so of distraction, I&#8217;ll have to settle for the satisfaction of being number 38 or 39 to solve it. [...]]]></description>
			<content:encoded><![CDATA[<p>If I had been drawing a paycheck for every hour I spent working on Project Euler&#8217;s <a href="http://projecteuler.net/index.php?section=problems&#038;id=328" class="newpage">problem 328</a>, I think my summer vacation would already be paid for. But instead, after a long ten days or so of distraction, I&#8217;ll have to settle for the satisfaction of being number 38 or 39 to solve it.<br />
<span id="more-390"></span><br />
In case you haven&#8217;t visited <a href="http://projecteuler.net/" class="newpage">Project Euler</a>, it is a site dedicated to &#8220;challenging mathematical/computer programming problems&#8221;. A prototypical Project Euler challenge is one with a simple definition that is easy to solve for simple cases, but requires some ingenuity to scale up to the requirements given in the problem.</p>
<h4>You&#8217;re Getting Warmer</h4>
<p>Problem 328 is a classic example of the genre. The basic setup is that of a number guessing game. Knowing that a number is some integer between 1 and N, your job is to make successive guesses until you get the answer. Each guess is answered with one of three conditions: low, high, or a match.</p>
<p>The twist in this problem is that your job is not to minimize the number of guesses you have to make, but rather, to minimize the sum of the guesses. For any range 1 to N, you have to select a path that minimizes the worst case cost.</p>
<p>As an example, if I were going to guess a number between 1 and 10, the best of the worst case strategies yields a value of 16. I get this score if the hidden number is either 8, 9, or 10. My first guess is 7, and in the worst case, my second guess of 9 nails it down to either 8, 9, or 10, for a total of 7 + 9 = 16. Of course if the number is 6 or less I&#8217;ll get a lower score. Change the first guess to some other number, and you will always have a case which results in a score of greater than 16.</p>
<p>Figure 1 shows what the choice graph looks like when choosing a number between 1 and 20. The first number in each node is the guess. When a second number is present, it represents the accumulated cost at that point, working up from the leaf nodes. The choice at the top of the graph, 13, shows the cost for that problem: 49.</p>
<table border="0" align="center">
<tr>
<td><img src="/attachments/2011/eulermania/Figure01.png" alt="Figure 1" /></td>
</tr>
<tr>
<td><center>Figure 1<br/>Choosing a number between 1 and 20</center></td>
</tr>
</table>
<h4>The Naive Approach</h4>
<p>Solving this problem in small cases is nice and easy. Using a recursive formulation you can implement it in a single screenful of code. My test implementation in C++ is shown here &#8211; it calculates both the optimal first choice and the cost for a given range. By calling itself recursively, the problem solution is tidy and compact:</p>
<pre>
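#include &lt;algorithm&gt;   // max
#include &lt;climits&gt;     // INT_MAX
#include &lt;utility&gt;     // pair
using namespace std;

// Returns (best first guess, worst-case cost) for the range low..high.
pair&lt;int,int&gt; get_best_path( int low, int high )
{
    // One candidate (or none): nothing left to guess, so no cost.
    if ( low &gt;= high )
        return pair&lt;int,int&gt;( low, 0 );
    // Two candidates: guess the lower one; a miss identifies the other.
    if ( low == ( high - 1 ) )
        return pair&lt;int,int&gt;( low, low );
    // Three candidates: guess the middle one.
    if ( low == ( high - 2 ) )
        return pair&lt;int,int&gt;( low + 1, low + 1 );
    int best_cost = INT_MAX;
    int best_choice = -1;
    // Try every interior guess, keeping the one with the cheapest worst case.
    for ( int choice = low + 1 ; choice &lt; high ; choice++ ) {
        int cost = choice + max( get_best_path( low, choice - 1 ).second,
                                 get_best_path( choice + 1, high ).second );
        if ( cost &lt; best_cost ) {
            best_cost = cost;
            best_choice = choice;
        }
    }
    return pair&lt;int,int&gt;( best_choice, best_cost );
}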
</pre>
<p>Like most recursive routines, it bails out early with one of three base cases which have trivial solutions. For all non-trivial cases, the routine simply iterates through all possible guesses, calculating the cost of that choice and using recursion to calculate the cost of the two subproblems it creates.</p>
<p>Although this algorithm is simple and has a certain elegance, it has one big problem. A little examination will show that the runtime of this routine is asymptotically proportional to k<sup>N</sup>. Running on my desktop Linux system I was able to calculate best choices pretty quickly when N was under 30, but after that the runtime started ramping up drastically.</p>
<h4>Getting There From Here</h4>
<p>Since the solution to this problem requires calculating the best choice for numbers up to 200,000, there is no way that an O(k<sup>N</sup>) algorithm is going to fly. And that, of course, is the essence of a good Project Euler problem. Developing a solution for the simple cases is just the start.</p>
<p>After realizing that the naive solution won&#8217;t do it, you have to start looking at the problem from all angles. Can some optimization reduce it to a tractable polynomial problem? Or do you need a completely different approach? Perhaps the problem has a closed form solution that just requires pumping some numbers into an equation?</p>
<p>Eventually I was able to develop a solution that calculated all 200,000 values in less than a second. And while that would make an interesting post all on its own, it would be the epitome of bad form to spill the beans on a Project Euler solution.</p>
<h4>My Path</h4>
<p>Without giving away the secrets, however, I can tell you what was the most important factor for me in nailing down this problem: visualization.</p>
<p>To try to make some sense out of these paths through the choice tree, I turned to an old friend: <a href="http://www.graphviz.org/" class="newpage">Graphviz</a>. This open source package makes visualization of data structures like binary trees a piece of cake.</p>
<p>Figure 1 is a simple graph created with graphviz. To really see the value of this package, examine the choice tree for N=100 in <a href="/attachments/2011/eulermania/1-100.pdf" class="newpage">PDF format</a> or <a href="/attachments/2011/eulermania/1-100.svg" class="newpage">SVG format</a>, if your browser supports it. I spent a long time inspecting these images, including some that had hundreds of nodes.</p>
<p>For this program, I didn&#8217;t even link to the graphviz library &#8211; I just created text files in the correct format, then made a <code>system()</code> call to the dot compiler program, which creates the graphics files.</p>
<p>For this particular problem, graphviz is what guided me to my solution, and I don&#8217;t know of any other package, free or commercial, that could have done as well. It was the perfect tool for the job.</p>
<h4>Up Next</h4>
<p>Now that I have put Problem 328 to bed, I have my eye on <a href="http://projecteuler.net/index.php?section=problems&#038;id=304" class="newpage">Problem 304</a>, known as Primonacci. This problem requires working with Fibonacci numbers with trillions of digits &#8211; numbers so big that they won&#8217;t fit in RAM on any computer I have access to.</p>
]]></content:encoded>
			<wfw:commentRss>http://marknelson.us/2011/04/10/euler-mania/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

