We all have to work with numbers we are not certain of, such as projected sales, costs or project durations. One of my favorite methods for making such projections is Monte Carlo simulation. But before we jump into that, let’s go over some fundamentals.
A random variable is a precise mathematical description of a number whose value is uncertain. It can be classified as continuous or discrete.
Continuous random variables can take any value between two extremes, so they have an infinite number of possible values. Examples are the angle at which a spinner stops or the duration of a phone call.
Discrete random variables can only take distinct, separate values. Good examples are the outcome of rolling a die or the number of people who fill out a form.
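To make the distinction concrete, here is a minimal Python sketch using only the standard library. The spinner is assumed to report an angle in degrees and the die is assumed to be six-sided; the seed and the variable names are mine, chosen for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Continuous: a spinner can stop at any angle between 0 and 360 degrees.
spinner_angles = [rng.uniform(0.0, 360.0) for _ in range(5)]

# Discrete: a six-sided die can only show 1, 2, 3, 4, 5 or 6.
die_rolls = [rng.randint(1, 6) for _ in range(5)]

print("spinner:", spinner_angles)
print("die:    ", die_rolls)
```

The spinner values come back with long decimal tails, while the die values are always whole numbers from the same small set.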
We can picture an uncertain variable as a shape known as a histogram, which shows how likely each of the different values is.
The histogram can have any shape as long as the bars total 100%. In Monte Carlo simulations, as more trials are run, the graph can be drawn with more bars, and they will still add up to 100%. More data gives a more accurate picture of the uncertain variable. The average, or mean, of the uncertain variable sits at the balance point of the graph, if you imagine the bars as blocks glued together.
As more data is simulated, the bars can be made narrower and the histogram comes to resemble a probability distribution, which shows all the possible outcomes of the uncertain variable and how likely each is. Each bar is also called a bin.
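As a rough illustration of these ideas, the sketch below simulates an uncertain number, groups the trials into bins, and checks that the bars total 100% and that the balance point matches the mean. The choice of "total of two dice" as the uncertain number, the trial count and the seed are my own assumptions.

```python
import random
from collections import Counter

rng = random.Random(0)
trials = 10_000

# Each trial: the total of two six-sided dice (an illustrative uncertain number).
outcomes = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(trials)]

counts = Counter(outcomes)
# Share of all trials that landed in each bin.
shares = {value: counts[value] / trials for value in sorted(counts)}

# Text histogram: one '#' per percentage point, rounded.
for value, share in shares.items():
    print(f"{value:2d} {share:6.1%} {'#' * round(share * 100)}")

total_share = sum(shares.values())
# Balance point: each bin's position weighted by its share of the trials.
balance_point = sum(value * share for value, share in shares.items())
sample_mean = sum(outcomes) / trials

print("bars total:", total_share)
print("balance point:", balance_point, "mean:", sample_mean)
```

The balance point and the plain average of the trials agree up to floating-point rounding, which is exactly the "blocks glued together" picture.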
Cumulative graphs let us read off the probability that the variable falls at or below a given value. In the chart to the right, you can see the histogram with a cumulative graph layered over it in green. It has a y axis to the right of the graph showing probability.
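A cumulative graph is just a running total of the histogram bars. Here is a small sketch of that, again using the total of two dice as an assumed example; the names and the seed are illustrative.

```python
import random
from collections import Counter
from itertools import accumulate

rng = random.Random(1)
trials = 10_000
outcomes = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(trials)]

counts = Counter(outcomes)
values = sorted(counts)
shares = [counts[v] / trials for v in values]

# Running total of the bin shares: the height of the cumulative graph.
cumulative = list(accumulate(shares))

for v, c in zip(values, cumulative):
    print(f"P(total <= {v:2d}) = {c:.3f}")

# Estimated chance that the total is 7 or less (the exact value is 21/36).
p_at_most_7 = cumulative[values.index(7)]
```

The curve only ever rises and ends at 100%, so any probability of the form "at or below x" can be read straight off it.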
Mean, Mode, Median
The mean of an uncertain variable is, again, the balance point of the histogram. The mode is the location of the tallest bar (bin). The median is the point with equal totals to its left and right, that is, 50% of the probability on each side.
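The three measures can differ quite a bit when the histogram is lopsided. The sketch below uses made-up phone call durations, rounded up to whole minutes and drawn from an exponential distribution with an assumed average of about five minutes, and computes all three with Python's statistics module.

```python
import random
import statistics

rng = random.Random(2)

# Hypothetical call durations in whole minutes (rounded up), skewed to the right.
durations = [int(rng.expovariate(1 / 5)) + 1 for _ in range(10_000)]

mean = statistics.mean(durations)      # balance point
median = statistics.median(durations)  # half of the calls on each side
mode = statistics.mode(durations)      # tallest bar

print("mean:", mean, "median:", median, "mode:", mode)
```

For data with a long tail to the right like this, the mode comes out lowest and the mean highest, with the median in between.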
Variance and Standard Deviation
Variance measures the degree of uncertainty. To find it, subtract the average from each value of the uncertain variable, square each difference, and then take the average of these squares. The square root of the variance is the standard deviation. When the variable is revenue or cost, the variance comes out in dollars squared, which is hard to interpret; taking the square root brings it back to dollars. Hence the need for standard deviation.
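The recipe above can be followed step by step in a few lines. The revenue figures here are invented purely for illustration, and the sketch divides by the number of values (the population variance), which matches "taking the average"; a sample variance would divide by one less.

```python
import math

# Hypothetical monthly revenue figures in dollars (made up for illustration).
revenue = [12_000, 15_000, 9_000, 14_000, 10_000]

average = sum(revenue) / len(revenue)

# Subtract the average from each value and square the difference.
squared_deviations = [(x - average) ** 2 for x in revenue]

# Average of the squared differences: variance, in dollars squared.
variance = sum(squared_deviations) / len(revenue)

# Square root of the variance: standard deviation, back in dollars.
std_dev = math.sqrt(variance)

print("average:", average)
print("variance (dollars squared):", variance)
print("standard deviation (dollars):", round(std_dev, 2))
```

With these numbers the variance is 5,200,000 dollars squared, while the standard deviation is roughly 2,280 dollars, a figure you can compare directly with the 12,000 dollar average.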
Diversification and Variance Reduction
When we take uncertain numbers and average them together, the histogram of the average takes on a shape that goes up in the middle and down at the ends. This is what we call the phenomenon of diversification. The width of the distribution defines the range of uncertainty: a wider distribution means greater variance, and hence greater standard deviation and uncertainty, while a narrower one means smaller variance and uncertainty. The narrowing is called variance reduction.
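A quick simulation shows the narrowing. This sketch compares a single uncertain number with the average of ten independent ones; the uniform 0 to 1 draws, the helper name average_of_draws, the trial count and the seed are all assumptions for the sake of the example.

```python
import random
import statistics

rng = random.Random(3)
trials = 10_000

def average_of_draws(n):
    """Average of n independent uniform draws between 0 and 1 (hypothetical helper)."""
    return sum(rng.random() for _ in range(n)) / n

singles = [average_of_draws(1) for _ in range(trials)]
averages_of_10 = [average_of_draws(10) for _ in range(trials)]

std_single = statistics.pstdev(singles)
std_avg_of_10 = statistics.pstdev(averages_of_10)

print("std dev of one draw:       ", round(std_single, 3))
print("std dev of average of ten: ", round(std_avg_of_10, 3))
```

The averages cluster much more tightly around 0.5 than the single draws do, and their histogram rises in the middle and falls off at the ends even though a single draw is flat.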
Next, I’ll post the fundamentals of some common graphical representations of uncertainty: the normal, binomial, Poisson, and exponential distributions. You probably already know the normal distribution, or bell curve, but we can also bring in the Central Limit Theorem and see how it applies to Monte Carlo simulations.