Probability Learning: Monte Carlo Methods

Original article can be found here (source): Artificial Intelligence on Medium

Hello again friends! Welcome to Probability Learning! In this post we will see what Monte Carlo methods are, their different use cases, and how they are applied in the real world. Lets get to it!

Isn’t Monte Carlo a town? What is all this about?

Image of Monte Carlo from Unsplash

You are right asking that question. Monte Carlo is a part of Monaco that is famous for its amazing and World-class luxury casino. If you take a walk around the place you will see a ton of kilometric boats and amazing cars, along with some fantastic hotels and top-notch restaurants.

However, that is not the reason why we are here today. Monte Carlo methods or Monte Carlo experiments are tools that are used in econometrics, Machine Learning, and statistics in order to obtain numerical results for certain problems. For this, Monte Carlo methods use one key feature of our world: Randomness.

Huh? What do you mean by that? We will see it in a moment, but just keep that in mind for the rest of the article:

Monte Carlo methods obtain a numerical solution to a specific problem using randomness.

Awesome right? Lets see how they do this.

What is a Monte Carlo method?

It is a method of estimating the value of an unknown quantity using inferential statistics. The main elements of Monte Carlo are:

  • A population: the whole array of options or possibilities.
  • Samples: subsets of this population.

The main idea behind Monte Carlo is that if we choose a sample randomly, it will tend to exhibit the same properties as the population from which it was drawn. The key here, as mentioned before, is the randomness. If we do not take a random sample, there is no reason to assume that this small subset will have the same properties as the initial population.

Also, for it to represent the properties of the initial population, the sample needs to be big enough. Lets see what I mean by this with a quick example.

The Coin Toss Simulation

Icon from Flaticon.

Imagine we are explaining the game of head and tails to someone who has never seen a coin before. The goal of the game is to predict if when the coin lands it will show a head or a tail. However, there is one quick catch to the game: we will never show them the coin before it lands, only once it has landed.

He picks tails, we pick heads.

We flip the coin once and it lands on heads. We throw it again, and this time it is heads too. We throw it a third time and it shows heads again. By now, our unusual friend might think that the coin only has a head. We keep repeating this process, and by the 10th flip we have got 8 heads and 2 tails, like reflected in the following figure.

Evolution of the number of heads and tails after 10 throws. Coins from Shutterstock.

Our friend has seen that tails can show up too, but he is fairly bored as he thinks that coins have a greater chance of landing on their heads than on their tails. He does not want to keep playing. We tell him that it has just been a case of luck, that it is all random, but he walks away and leaves us alone. If only he had stayed for more tosses, lets say 990 more, then the number of tails and heads would have evened out, leaving him more content about the game…

To convince him to come back, we code a simulator for coin tossing, to show him that heads and tails have exactly the same probability, and that he was just unlucky.

We prepare the code, and run simulations for 2 ,5 ,10 ,20 ,50 ,100 ,500 ,1000 ,10000 ,100000 , and 1000000 tosses.

The results are shown in the following snippet:

Coin Simulation Notebook

For our simulation with 10 tosses, we had exactly the opposite result as we had with our friend in real life: 2 heads and 8 tails. 80% of our throws showed the same outcome. Does this mean that there is an 80% probability of having heads or tails if we throw 10 times? Of course not!

Just as I said before, the number of times we simulate an event (or the size of the sample from a population), has to be big enough in order for it to truly reflect reality. In the previous snippet we can see how as we increase the number of simulations the outcome of the experiment becomes closer and closer to what we know it should be: 50% of the times head and 50% of the times tails.

We go running with this code to our friend, show him the results and he agrees to play again. Saved by Monte Carlo!

The previous example highlights what Monte Carlo is all about: running lots of simulations of a certain situation to be able to predict its most possible outcome. Because our friend had never played heads or tails, or seen a coin before, he could have no idea that the true probabilities where 1/2 and 1/2, however, by exhaustively running simulations of head and tails we proved it to him!

Lets see another example: The famous roulette.

The Game of Roulette

Icon from Flaticon.

You ever heard the typical phrase ‘The Casino always wins’? In this example we will explain why this is true using Monte Carlo methods.

Our game of roulette will work the following way: we will pick a number from 1 to 36 (7 in our case) and simulate 3 different scenarios of constantly betting 1$ in each spin for an specific number of spins. We will do this for 2, 5 ,10,20,100,10000 and 1000000 spins. Our roulette has no 0 pocket and we never play to black or white, only to number 7.

Note: The outcome would have been the same if we also randomly generated the number we are betting at, but we have chosen to always bet on the same number for simplicity.

Again, we are using the strength of Monte Carlo for this explanation: randomly simulating the outcome of the game for a specific number of times to understand its nature. Lets see what happens with the following code snippet!

Roulette Simulation

This code implements what we have described above. It simulates us betting 1$ at number 7 in different scenarios. We start with 2 spins, then go to 5 spins, then to 10 and so on…We do this 3 times for each simulation.

As we can see, for a low number of spins, we almost always loose all of our money. However, for 10 spins we got lucky once and made a 260% return. This shows that for a small number of spins the outcomes are quite volatile: we can loose everything or win a lot. The results are highly variable.

This is one of the fun things about betting for a night, the results are totally unexpected, you can loose (which is what will probably happen), but you can win too!

However, the important thing to observe is that as we increase the number of spins, the expected return starts converging to 0. If we observe closely, we can see that the results almost don’t vary now. The results are not only closer to 0, but they are closer together.

This is what the Casino cares about. They don’t care about what happens with 20, 50 or even 100 spins. They care what happens with a million spins. They care about when every night lots of people go to play and bet their money.

The casino always wins. Icons from Flaticon

The important takeaway here is that for an individual player that throws for a small number of times, the results are highly variable: his return can go from -100% to really high numbers like 260% or even further. However, as lots of players get together and play, for hundred of millions of throws compounded into days and weeks and months, the expected return for the all the players would be highly predictable and very very close to zero: the casino knows exactly what is going to happen over the long run.

So even if a player wins, it’s OK. They know the compounded average return of all the players is very close to zero: they will not loose any money, and you are probably paying for an entrance, drinks, shows, and loosing your money on other games.

Lets see one last final example of how Monte Carlo can be used for estimating the area of a certain flat shape.

The Area of the Circumference:

The last thing we are going to use Monte Carlo methods for is for estimating the area inside a certain shape, a circumference in our case.

Icons from Flaticon.

Specifically, we are going to use it to calculate the relationship between the area of a square and of a circle. We’ve got the following figure:

We have a square of side L and a circumference of radius L/2

We know that the area of the square is L times L, however, we have absolutely no idea of the area of the circle. We have heard however about Monte Carlo methods, and design a procedure to calculate it using this statistical tool!

What we will do is randomly generate points inside the square, and at the end of the simulation count the total number of points we have got inside our circle, and the total generated points, and use this to calculate the area inside the circle. For simplicity we will use a square of size L = 1.

The following code shows the Monte Carlo simulation implemented to discover the area of the circle: we randomly generate points for the top right quadrant of our shape, and using the known area of the square, the number of points that fall inside the circle, and the number of total generated points, we accurately calculate the area of the circle.

Simulation to calculate the area of a circle

In the following figure you can see the generated images one by one. It shows the magic of Monte Carlo. Using randomly generated points we draw a circle, and as we run more simulations, we become more and more precise. The more random points we generate, the better our simulation becomes. We can see in the previous code that as we increase the number of simulations the estimated area for the circle becomes closer to the real area.