## Wednesday, March 6, 2019

### Midweek Probability

Another re-run today of a posting I did previously....
Marilyn vos Savant famously introduced the "Monty Hall problem" to the public and gave the correct answer even when many professional mathematicians initially labelled her "wrong."

Here’s another fun question she later posed for her readers:

Say you plan to roll a die 20 times. Which result is more likely:
(a) 11111111111111111111 or
(b) 66234441536125563152

Marilyn answered that both sequences were equally likely as outcomes from such a procedure… and there is little controversy over that answer: from a frequentist view of rolling a fair die, all sequence-outcomes are equally likely (that likelihood being very small, BTW: (1/6)^20 for any specific sequence). BUT, then Marilyn went on to note, “But let’s say you rolled a die out of my view and then said the results were one of those series. Which is more likely? It’s (b) because the roll has already occurred. It was far more likely to have been that mix than a series of ones.”
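
For the uncontroversial first half of the answer, here is a minimal sketch of the arithmetic, assuming a fair six-sided die and independent rolls. The function name `sequence_probability` is just an illustrative label, not anything from vos Savant's column.

```python
from fractions import Fraction

def sequence_probability(sequence: str, sides: int = 6) -> Fraction:
    """Probability that a fair die produces exactly this sequence of rolls."""
    # Each roll is independent and each face has probability 1/sides,
    # so only the length of the sequence matters, not its contents.
    return Fraction(1, sides) ** len(sequence)

p_a = sequence_probability("1" * 20)               # sequence (a): twenty ones
p_b = sequence_probability("66234441536125563152")  # sequence (b)

print(p_a == p_b)   # True
print(float(p_a))   # on the order of 1e-16
```

Using `Fraction` keeps the comparison exact rather than relying on floating-point rounding.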

At least one mathematician again took her to task, claiming the answer is "wrong" and the probabilities are still equal… that "rolling the die out of view" has no consequence. But clearly there is a difference between anticipating in advance a resultant sequence out of ALL the possible sequences that a procedure might produce, versus addressing just two given sequences AFTER a procedure has already taken place. Vos Savant has essentially altered the original question (in order to make an interesting/worthwhile point).

[One way to think about it is simply to make the sequence more ridiculously long: suppose I roll a FAIR die a million times; I record the results and tell you that the outcome was either a million ones, OR, some more-random-looking list of figures… prior to rolling the die both sequences would be equally likely, but with the task already completed, and ONE of the TWO given choices GUARANTEED to be the actual sequence, the second one is clearly more probable.]
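
How lopsided the "after the fact" answer is depends on how the two candidate sequences came to be offered, which the original question leaves open. The sketch below works through one possible model, and every detail of it is my assumption: the roller always offers the true result plus one decoy; if the true result is mixed, the decoy is always all-ones; if the true result really is all-ones, the decoy is drawn uniformly from the other possible sequences. The name `posterior_mixed` is hypothetical.

```python
from fractions import Fraction

def posterior_mixed(n_rolls: int, sides: int = 6) -> Fraction:
    """P(the mixed sequence is the real one | offered {all-ones, mixed}),
    under the assumed decoy model described in the text above."""
    total = sides ** n_rolls        # number of possible sequences
    prior = Fraction(1, total)      # every specific sequence is equally likely

    # Assumption: if the mixed sequence was rolled, the all-ones decoy
    # is offered alongside it with certainty.
    joint_mixed = prior * 1

    # Assumption: if all-ones was rolled, the decoy is picked uniformly
    # from the other (total - 1) sequences, so this particular mixed
    # sequence appears as the decoy with probability 1 / (total - 1).
    joint_ones = prior * Fraction(1, total - 1)

    # Bayes' rule over the two ways this pair could have been offered.
    return joint_mixed / (joint_mixed + joint_ones)

print(posterior_mixed(1))                                   # 5/6
print(posterior_mixed(20) == Fraction(6**20 - 1, 6**20))    # True
```

Under this model the mixed sequence wins by an overwhelming margin for 20 rolls. Under a different assumption, say both candidates were fixed before the die was ever rolled, the two joint terms would be equal and the posterior would be 1/2, which is presumably the reading the critic had in mind.
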
What might be a more interesting question to explore is at what point along "randomization" would two given sequences approach equal probability? For example, suppose I roll the die a million times and show a sequence of 250,000 ones, followed by 250,000 twos, followed by 250,000 threes, followed by 250,000 fours, versus a more genuinely-random-looking sequence. Still the second would be more probable... but I could keep altering the first sequence slowly, step-by-step, and at some point its probability would 'tip over' to being the same as the second sequence. But when does it happen?