The Normal Distribution

2020 · Posted by Sebastien Lemieux-Codere

[Interactive chart: two normal distributions, each with an adjustable mean (\$\mu\$) and standard deviation (\$\sigma\$)]


One of the most important and commonly occurring probability distributions in the real world is the "bell-shaped" normal distribution. For example, test scores, blood pressures and physical measurement errors are often (approximately) normally distributed. In fact, there are fundamental mathematical reasons, like the Central Limit Theorem, that explain why the normal distribution appears so often (spoiler alert: sums or averages of many independent or nearly independent processes are surprisingly often approximately normally distributed). Furthermore, the normal distribution has some unique properties that make it very useful in mathematical and statistical modelling.
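
The Central Limit Theorem effect mentioned above can be seen in a small simulation. The sketch below is only an illustration under assumed settings (uniform random numbers, 12 terms per sum, 50,000 repetitions, a fixed seed; none of these come from the article). It compares single uniform draws with sums of 12 uniform draws by measuring how often each lands within 1 and 2 standard deviations of its mean. For a normal distribution those fractions are about 68% and 95%.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable (arbitrary choice)

def fractions_within(n_terms, n_trials=50_000):
    """Simulate sums of n_terms Uniform(0, 1) draws and return the fraction of
    sums within 1 and within 2 standard deviations of their theoretical mean."""
    mean = n_terms * 0.5             # each Uniform(0, 1) draw has mean 1/2
    std = (n_terms / 12.0) ** 0.5    # each draw has variance 1/12
    sums = [sum(rng.random() for _ in range(n_terms)) for _ in range(n_trials)]
    within_1 = sum(abs(s - mean) <= std for s in sums) / n_trials
    within_2 = sum(abs(s - mean) <= 2 * std for s in sums) / n_trials
    return within_1, within_2

single_1, single_2 = fractions_within(1)   # one uniform draw: not bell shaped
summed_1, summed_2 = fractions_within(12)  # sum of 12 draws: close to normal

print(f"1 term:   within 1 sd = {single_1:.3f}, within 2 sd = {single_2:.3f}")
print(f"12 terms: within 1 sd = {summed_1:.3f}, within 2 sd = {summed_2:.3f}")
```

A single uniform draw does not follow the 68%/95% pattern, while the sum of only 12 draws already comes close to it.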

Normal Distribution Parameters

The normal distribution has 2 parameters (\$\mu\$ and \$\sigma\$):

  • The mean or expected value (\$\mu\$), which corresponds to the center of the distribution. This is where the distribution has the highest probability density. In other words, when we take a random value that follows a normal distribution (a sample from the distribution), we expect it to be around the mean of the distribution (how close to the mean it typically is depends on the other parameter of the normal distribution). Furthermore, if we take many different random samples, we expect the average of the samples to be close to the mean of the distribution.
  • The standard deviation (\$\sigma\$), which corresponds to how spread out the distribution is. In a sense, it's a measure of how far samples tend to be from the mean. When the standard deviation is small, we get samples that are very near the mean. In contrast, when the standard deviation is large, the values, although still centered around the mean, tend to be much more spread out. In fact, if you sample a random variable from a normal distribution, your sample will be within 1 standard deviation of the mean ≈68% of the time and within 2 standard deviations of the mean ≈95% of the time. Note that \$\sigma^2\$, the standard deviation squared, is called the variance of the distribution.
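
The roles of the two parameters can be checked empirically. The following is a minimal sketch using only Python's standard library. The values \$\mu = 100\$ and \$\sigma = 15\$, the sample size and the seed are arbitrary assumptions chosen for illustration, and `fraction_within` is a hypothetical helper name.

```python
import random
import statistics

def fraction_within(samples, mu, sigma, k):
    """Fraction of samples lying within k standard deviations of the mean."""
    return sum(abs(x - mu) <= k * sigma for x in samples) / len(samples)

# Illustrative parameter values (assumptions, not taken from the article).
mu, sigma = 100.0, 15.0

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.gauss(mu, sigma) for _ in range(100_000)]

sample_mean = statistics.fmean(samples)   # should land close to mu
sample_std = statistics.pstdev(samples)   # should land close to sigma
within_1 = fraction_within(samples, mu, sigma, 1)  # expected to be near 0.68
within_2 = fraction_within(samples, mu, sigma, 2)  # expected to be near 0.95

print(f"sample mean: {sample_mean:.2f}, sample standard deviation: {sample_std:.2f}")
print(f"within 1 sd: {within_1:.3f}, within 2 sd: {within_2:.3f}")
```

Changing `sigma` while keeping `mu` fixed changes how spread out the samples are, but the two fractions stay roughly the same, because they are measured in units of the standard deviation.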

Normal Distribution Probability Density Function

We can plot the normal distribution as a Probability Density Function on a line chart, where the height of the line at a given horizontal location indicates how likely a sample from the distribution is to fall near that point. It's a function that takes a point \$x\$ and outputs the probability density at that point:

$$ f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} $$

The interactive chart at the top of this page displays 2 normal distributions and lets you adjust their \$\mu\$ and \$\sigma\$ parameters to see how they affect the shape of the probability density function.

To find the probability that a normally distributed random variable falls in the interval between 2 values, we have to find the area under the probability density function between those 2 values (its integral over the interval). There is no closed-form formula for this integral in terms of elementary functions, but standard normal distribution tables or numerical integration methods (computer programs) can be used.
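
As a rough illustration of both approaches, the sketch below implements the density formula directly and then computes an interval probability in two ways: with the error function `math.erf` (which Python itself evaluates numerically) and with a simple trapezoidal-rule integration of the density. The function names and the number of integration steps are assumptions made for this example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Probability density of a normal distribution at x (the formula above)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu, sigma):
    """P(X <= x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def interval_probability(a, b, mu, sigma):
    """P(a <= X <= b) as a difference of two cumulative probabilities."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

def trapezoid_probability(a, b, mu, sigma, steps=10_000):
    """P(a <= X <= b) by integrating the density with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

# Probability of landing within 1 standard deviation of the mean
# for a standard normal distribution (mu = 0, sigma = 1).
p_erf = interval_probability(-1.0, 1.0, 0.0, 1.0)
p_trapezoid = trapezoid_probability(-1.0, 1.0, 0.0, 1.0)

print(f"via erf: {p_erf:.6f}, via trapezoidal rule: {p_trapezoid:.6f}")
```

Both results should agree closely with the ≈68% figure for 1 standard deviation. In Python 3.8 and later, `statistics.NormalDist` in the standard library offers `pdf` and `cdf` methods that can be used instead of hand-written functions.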

The Author

Sebastien Lemieux-Codere

Sebastien is a Data Scientist and Software Developer.