I believe that I can give you the most exhaustive treatment you will ever find in these two articles, which were follow-ups to this one:
]]>Thanks – this helps validate the number I got with my quick Java Bignum program.
I actually have a nice formula for the solution, along with a good explanation of how one arrives at it. However, the formula for the solution incorporates a 20-step Fibonacci number.
Mathematics says that the recurrence that defines the 20-step Fibonacci number has a closed-form solution, which I could then use to give the exact result. But I don’t have the mental or computational horsepower to produce that solution.
- Mark
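For anyone who wants to reproduce the number without the closed form, here is a minimal Python sketch of the k-step Fibonacci recurrence done with exact big integers. It is not Mark's Java program; the function names `count_no_run` and `prob_of_run` are made up for illustration. The idea it assumes: the count of length-n sequences with no run of k heads equals the sum of the previous k counts, starting from 2^i for lengths i < k.

```python
from collections import deque
from fractions import Fraction


def count_no_run(n, k):
    """Count length-n head/tail sequences that contain no run of k heads."""
    if n < k:
        # Too short to hold a run of k heads, so every sequence qualifies.
        return 2 ** n
    # Counts for lengths 0 .. k-1, where every sequence qualifies.
    recent = deque(2 ** i for i in range(k))
    total = sum(recent)
    for _ in range(k, n + 1):
        # k-step Fibonacci step: the new count is the sum of the previous k.
        new = total
        total += new - recent.popleft()
        recent.append(new)
    return recent[-1]


def prob_of_run(n, k):
    """Exact probability of at least one run of k heads in n fair tosses."""
    return 1 - Fraction(count_no_run(n, k), 2 ** n)


if __name__ == "__main__":
    print(prob_of_run(3, 2))          # small case that can be checked by hand
    print(float(prob_of_run(30, 5)))  # a slightly larger case
```

Calling `prob_of_run(1_000_000, 20)` uses the same code, but the integers grow to roughly a million bits, so expect it to take a while.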
]]>agrees with 37.9% prob for 20 heads in a million flips.
You may be interested in Project Euler problem 316 which relates to this problem.
]]>My assumption was that p, the probability of a 20-head streak, was 1/2^20 at every position in the imaginary million tosses. This was incorrect. This value of p is only true for the toss starting at position 1.
For positions 2 through 999,981, the value of p declines based on a complicated function. The good news is that the value of p quickly converges to a value somewhere around 1/2097131.
Plugging those numbers in means that the chance of seeing the streak of 20 heads in a million tosses is more like 38%, considerably less than my previous solution of 62%.
Gory details to follow.
- Mark
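The arithmetic behind the two figures can be sketched in a few lines of Python. This assumes the per-position value p is combined as 1 - (1 - p)^positions over the 999,981 starting positions, and that the converged value of about 1/2097131 is used for every position (an approximation, since position 1 is different). The helper name `chance_of_streak` is hypothetical.

```python
TOSSES = 1_000_000
STREAK = 20
POSITIONS = TOSSES - STREAK + 1  # 999,981 possible starting positions


def chance_of_streak(p, positions):
    """Chance of at least one streak, treating every position as having probability p."""
    return 1 - (1 - p) ** positions


# Original assumption: p = 1/2^20 at every position.
naive = chance_of_streak(1 / 2 ** STREAK, POSITIONS)

# Revised assumption: p close to the converged value quoted in the comment above.
corrected = chance_of_streak(1 / 2097131, POSITIONS)

if __name__ == "__main__":
    print(f"naive:     {naive:.3f}")
    print(f"corrected: {corrected:.3f}")
```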
]]>As an aside, I think your response to my correction should be trumpeted as a model of how to graciously accept corrections to errors in internet posts! I have to confess that my first draft of my post to you was not nearly as admirable, and I'm relieved that I edited it before posting.
Let's hear it for raising the level of numeracy all around!
]]>You are correct about my algorithm being off. This is a good news/bad news story.
The good news (for me) is that my system actually overestimates the probability of the sequence. The main point I was making was that the NY Times should have sensed right away that the figure was way off base - and that is still the case.
The bad news is that my simple formula is no good - although it does provide an upper bound for the probability, it is not correct. Probability usually boils down to being able to count things, and many mistakes in probability problems come about when you either count something twice or miss it altogether. My mistake was counting twice, and I'll be working that out in another post!
The really bad news is that I should have tested my algorithm using a simple case, as you did, in which case I would have seen the error right away.
But truthfully, whenever I have a post about innumeracy, I'm happy when people find mistakes. It helps me clear up my thinking and ideally raises the level of numeracy all around.
- Mark
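A simple-case test of this kind can be sketched by brute force. The case below (a run of 2 heads in 3 tosses) is picked purely for illustration and is not necessarily the one used in the correction; `naive_estimate` assumes the simple formula was 1 - (1 - 1/2^k)^(n-k+1).

```python
from itertools import product


def exact_by_enumeration(n, k):
    """Exact chance of a run of k heads in n fair tosses, by listing all 2**n outcomes."""
    target = "H" * k
    hits = sum(target in "".join(seq) for seq in product("HT", repeat=n))
    return hits / 2 ** n


def naive_estimate(n, k):
    """Simple formula that treats each of the n-k+1 starting positions as independent."""
    return 1 - (1 - 0.5 ** k) ** (n - k + 1)


if __name__ == "__main__":
    n, k = 3, 2
    print("exact:", exact_by_enumeration(n, k))  # HHH, HHT, THH out of 8 outcomes
    print("naive:", naive_estimate(n, k))
```

For this case the enumeration gives 3/8 while the simple formula gives 7/16, so the overcount shows up right away.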
]]>The flaw in the reasoning is the assumption that the probability of coins 2 through n+1 being all heads is independent of whether coins 1 through n are all heads. You are implicitly making this independence assumption by multiplying the probabilities together.
Now, in the case of 20 in a row out of a million, your analysis may be approximately correct, because the events you are multiplying may be close to independent, but I'm not even sure of that. I certainly hope it's close, now that you're on record in the New York Times with this result ;-)
I think the correct math is actually relatively complex. See http://wizardofodds.com/askthewizard/images/streaks.pdf, which shows results of a discrete time Markov chain analysis of a similar problem.
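As a rough illustration of the Markov chain idea (a sketch, not a reproduction of the analysis in the linked PDF), the state can be taken to be the length of the current streak of heads, with "streak of k reached" as an absorbing state. The function name `prob_run_markov` and its parameters are made up for this example.

```python
def prob_run_markov(n, k, p_heads=0.5):
    """Probability of at least one run of k heads in n tosses, via a small Markov chain."""
    # state[j] = probability that no run of k heads has occurred so far
    #            and the current trailing streak is exactly j heads.
    state = [1.0] + [0.0] * (k - 1)
    done = 0.0  # probability mass already absorbed (a run of k heads has occurred)
    for _ in range(n):
        # A head from streak k-1 completes the run and is absorbed.
        done += state[-1] * p_heads
        # A tail from any live state resets the streak to 0.
        tails = (1 - p_heads) * sum(state)
        # A head from streak j (j < k-1) moves to streak j+1.
        state = [tails] + [s * p_heads for s in state[:-1]]
    return done


if __name__ == "__main__":
    print(prob_run_markov(3, 2))  # small case that can be checked by hand
```

The same function can be called with `n=1_000_000, k=20`; in pure Python that is about twenty million floating-point steps, so it is slow but workable.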
]]>I was fooling around with the numbers in a spreadsheet and I saw the rapid convergence you speak of. I wasn't familiar with that identity, however. Thanks!
- Mark
]]>