Original article was published on Artificial Intelligence on Medium
Introduction to Probability
“It is likely that unlikely things should happen.” — Aristotle
Every year in high school, we organized a fun fair: a day off to relax while getting some hands-on learning, since the students were the organizers, the charm, and the business model of the exhibition. We used to put up booths with “pay-to-play” games and try to trick people into unfair play (which didn’t work).
I remember taking part in one such game called ‘7-up/7-down’. The rules were straightforward: the host rolls two dice, and you have to bet beforehand on whether the sum of the face values will be greater or less than 7. If your prediction is correct, you win X times (say twice) the money you bet; if it is incorrect, you lose the wagered money; and if the sum is exactly 7, you get your money back (break-even). It was a fair game in the sense that the probability of getting a sum greater than 7 is the same as the probability of getting a sum less than 7.
P(sum > 7) = 15/36
P(sum < 7) = 15/36
P(sum = 7) = 6/36
Let’s look at how we arrived at these numbers:
As seen from Fig 1, there are 36 outcomes for the sum of two dice [sum (1,1), …, sum (6,6)]. There are 15 combinations with a sum greater than 7, 15 combinations with a sum less than 7, and 6 combinations where the sum is exactly 7 (the purple diagonal). Hence, the probability of getting a sum greater than 7 is 15/36 (similarly for less than 7), and the probability of getting a sum equal to 7 is 6/36.
But you probably knew that much already (check references below if you didn’t). This essay tackles the question: what does it mean to get a probability of 41.67% (15/36)?
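The counts above can be checked by brute force. The following is a minimal sketch (not the code from the article's repository) that enumerates every ordered pair of faces and tallies the three cases; the variable names are illustrative.

```python
from itertools import product

# All 36 ordered outcomes of rolling two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Tally the three cases of the 7-up/7-down game.
greater = sum(1 for a, b in outcomes if a + b > 7)
less = sum(1 for a, b in outcomes if a + b < 7)
equal = sum(1 for a, b in outcomes if a + b == 7)

print(len(outcomes), greater, less, equal)      # 36 15 15 6
print(round(greater / len(outcomes) * 100, 2))  # 41.67
```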
Let’s flip a (fair) coin 10 times and record the face values.
In a coin toss, the probability of getting Heads is 1/2 (or 50%). Naively, then, flipping a coin 10 times should result in 5 Heads & 5 Tails (50–50). However, such is not the case in the experiment I performed above, where the ratio of H:T is 6:4. If I experiment again, I might get the proportion of Heads to be 10% or even 100%. In what sense, then, is the probability of getting Heads 50%?
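To see how often a short run actually lands on an even split, one can compute the exact binomial probabilities for 10 flips of a fair coin. This is a small sketch with a hypothetical helper, `prob_heads`; it shows that exactly 5 Heads turns up only about a quarter of the time (roughly 24.6%), that 6 Heads is nearly as likely (about 20.5%), and that 10 Heads is rare (about 0.1%) but possible.

```python
from math import comb

def prob_heads(k, n=10):
    """Probability of exactly k Heads in n flips of a fair coin."""
    return comb(n, k) / 2 ** n

# Print the chance (in percent) of 5, 6, and 10 Heads out of 10 flips.
for k in (5, 6, 10):
    print(k, round(prob_heads(k) * 100, 2))
```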
The probability of an event (P) is the extent to which the event is likely to occur “in the long run” of outcomes. Put differently, if I were to keep flipping the coin forever (the number of flips tends to infinity), the proportion of Heads would get closer and closer to 50%. Fig 2 shows the proportions of H (50.07%) & T (49.93%) after simulating the coin flip experiment 1 million times (code in references below). As you can see, the ratio of H:T is almost 50:50!
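The long-run behaviour is easy to try out. Below is a minimal simulation sketch using Python's standard `random` module; it is an illustration, not the code from the article's repository, and the function name `heads_fraction` and the fixed seed are assumptions made for reproducibility. The printed fractions depend on the seed, but they should drift toward 0.5 as the number of flips grows.

```python
import random

def heads_fraction(n_flips, seed=0):
    """Simulate n_flips fair coin tosses and return the fraction of Heads."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same result
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# Short runs can stray far from 0.5; long runs settle close to it.
for n in (10, 1_000, 1_000_000):
    print(n, heads_fraction(n))
```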
The relative frequency of occurrence of an event, observed over a large number of repetitions of the experiment, is a measure of the probability of that event. This is the core conception of probability in the frequentist interpretation.
So if I were to play the ‘7-up/7-down’ game indefinitely (with the payout at twice the bet), my average result per game would approach break-even. In reality, I remember winning enough money to boost my ego, which only helped me lose it all at the next booth.
The purpose of this essay was to explore what it means to attribute a probability to an event. In theory, the probability used in calculations and interpretations is the expected, long-run value (as in 0.5 for Tails in a coin toss), but in practice, the outcomes of a short run might not match that expectation (e.g. getting 10 Heads in 10 tosses). So it’s better to understand what a probability actually means and not confuse the likelihood of an outcome with what actually happens in a few trials.
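The break-even claim can be made concrete with an expected-value calculation. The sketch below assumes that "winning X times the money" means the player receives X times the stake in total (a net gain of X − 1 per unit bet); that reading, and the helper name `expected_net`, are assumptions rather than something the game's rules spell out. Under that reading, a payout of twice the bet gives an expected net gain of exactly zero per game.

```python
from fractions import Fraction
from itertools import product

def expected_net(payout_multiple=2):
    """Expected net gain per unit staked on '7-up' (sum > 7).

    Assumes a win returns payout_multiple times the stake in total,
    a loss forfeits the stake, and a sum of exactly 7 returns the stake.
    """
    outcomes = list(product(range(1, 7), repeat=2))
    total = Fraction(0)
    for a, b in outcomes:
        s = a + b
        if s > 7:
            total += payout_multiple - 1  # net winnings on a correct bet
        elif s < 7:
            total -= 1                    # stake lost on an incorrect bet
        # s == 7: stake returned, net change is zero
    return total / len(outcomes)

print(expected_net())   # 0
print(expected_net(3))  # 5/12
```

By symmetry, betting on ‘7-down’ gives the same expected value, since 15 outcomes fall on each side of 7.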
If you’d like to learn more and get a good intuitive sense of the basics of probability, I’d suggest the following resources:
All code and references are at https://github.com/specbug/DataScienceJournal/tree/master/probability_sense_check