When I finally wrapped up the mathematical modeling of the k-consecutive coin toss problem I had a concise formula:
To calculate the chances of k heads in a row appearing in n tosses, I just had to calculate a k-step Fibonacci number, and then divide it by an exponent of 2, and subtract the result from 1. Not complicated at all - but there was one catch. After a million coin tosses, the Fibonacci number and the exponent of 2 were approximately 300,000 digits long - way outside the scope of normal computations in languages like C++ and Java.
To get the exact result I needed, I used the BigInteger and BigDecimal Java classes to handle those big numbers, and had to wait about an hour to get the result. As it turns out, there is a much faster and easier way to do things
Simplifying the Computation
To solve this problem in a more reasonable matter, take a look at the part of the equation where all the real work is done:
In this equation, the p we are solving for is the probability that no runs of k consecutive heads will occur in n tosses of a fair coin. Although we might be able to find a closed-form equation that defines the k-step Fibonacci number, it is going to be a complicated polynomial, and the chances of us simplifying this equation are effectively nil.
So at first blush I seem to be stuck with calculating the top and bottom parts of the equation independently. But a little inspection of the equation shows that I can actually calculate p in an iterative fashion, starting with n=1, and working my way up to the desired result.
Calculating the Fibonacci Number
The first step in the calculation is writing the code to calculate k-step Fibonacci numbers. Recall that the k-step Fibonacci number n is found by summing the values of the previous k numbers - a recursive definition. The base cases for all values of k are: Fibk(n) is 1 when n is 1, and 0 when n is less than 1.
To calculate the value of Fibk in an iterative fashion, I make use of the the handy
deque<T>. To calculate a k-step Fibonacci number, I make use of a
deque<double> of size k that contains the previous k Fibonacci numbers. With the data
structure set up like that, calculating the next Fibonacci number is simple:
Note that after this calculation, q[ 0 ] contains the next new Fibonacci number, and the
previous k-1 numbers are still stored in sequence in the
deque. The call to
discards Fibk(n-k), which is not going to be needed for the next calculation.
Add the code to initialize the
deque, and a loop, and we can print out Fibk(n) for
values 1 through n:
Using the familiar value of k=2, and n=20, I get the following output:
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
deque is perfect for this class - it lets us simulate a rolling window through the
Fibonacci sequence by pushing new values on one end and popping aged ones off the other end. All
this while providing constant time access to indexed elements. (Admittedly this code could be
written just as efficiently with a
Divide By Two
This is a nice simple calculation, but it only gets me halfway to the goal. I’m calculating the
top part of the fraction that computes p, but I need to divide by the bottom value as
well. As it turns out, I can accomplish this in a way that allows me to iterate to the next value:
before I add the values of the previous Fibonacci numbers to
q, I just divide all of them by
two. The resulting loop ends up looking like this:
Note that I’ve slightly changed the initialization - the
deque<double> is initialized with
Fibk(2) - this is because the top half of the fraction uses Fibk(n+2), and
the bottom uses 2n - the value of n for the Fibonacci calculation always has to be two
ahead of the exponent of 2.
This calculation allows me to calculate the probabilities for large numbers of tosses in a matter of seconds, instead of the hour-plus it took using BigInteger. Using this code I was able to get the final word on just when we could be certain that a certain number of coin tosses would produce a run of 20 heads. If we want to be modestly certain, with 90% probability I can say that 4,828,842 coin tosses will produce a run of 20 heads.
If my life depended on it, and I wanted to get out to five nines, I can say that 24,144,137 coin tosses will produce a run of 20 heads with 99.999% probability. The output of kfib.cpp is shown here:
There is a 90% probability of seeing 20 consecutive heads in 4828842 tosses of a fair coin There is a 99% probability of seeing 20 consecutive heads in 9657666 tosses of a fair coin There is a 99.9% probability of seeing 20 consecutive heads in 14486490 tosses of a fair coin There is a 99.99% probability of seeing 20 consecutive heads in 19315313 tosses of a fair coin There is a 99.999% probability of seeing 20 consecutive heads in 24144137 tosses of a fair coin
And that’s the last thing I am going to say about it.