Introduction to Probability (Part 3)
Expectation and variance explained clearly. Learn how probability distributions are summarised, how averages differ from spread, and why these concepts matter in statistics, ML, and probabilistic modelling.
Expectation and Variance
So far, probability distributions have helped us describe uncertainty in detail. They tell us how likely different outcomes are, and how probability is spread across possible values. In practice, however, we often want something simpler.
Instead of using the full distribution, we often use a few summary quantities. Expectation and variance are the most common.
Expectation
The expectation of a random variable is its probability-weighted average value. It tells us the long-run average outcome if we repeat the random process many times.
For a discrete random variable $X$ with distribution $p(x)$, the expectation is defined as

$$\mathbb{E}[X] = \sum_x x \, p(x).$$

Each value of $X$ contributes to the expectation according to its probability $p(x)$. Values with higher probability affect the average more.
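As a small illustration (the fair six-sided die is an assumed example, not from the text), the expectation of a discrete random variable can be computed directly from its outcomes and probabilities:

```python
# Expectation of an assumed fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

# Probability-weighted average: sum of x * p(x) over all outcomes.
expectation = sum(x * p for x, p in zip(outcomes, probs))
print(expectation)  # 3.5
```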
For a continuous random variable $X$ with density $f(x)$, we use an integral instead of a sum:

$$\mathbb{E}[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx.$$
In both cases, expectation is the long-run average value from repeated trials. It is not always the most likely value, but it is the average over many repetitions.
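A quick simulation sketch of this interpretation, again assuming a fair die: the sample average of many rolls settles near $\mathbb{E}[X] = 3.5$, even though 3.5 is never an individual outcome.

```python
# Long-run average of repeated rolls of an assumed fair die.
import random

random.seed(0)
trials = 100_000
total = sum(random.randint(1, 6) for _ in range(trials))
print(total / trials)  # close to 3.5
```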
Expectation of functions
We can also take the expectation of a function of a random variable. If $g$ is a function and $X$ is a random variable, then $g(X)$ has its own expectation.
For example, we can consider $\mathbb{E}[X^2]$ or $\mathbb{E}[aX + b]$. This idea will be important later, especially when working with variance.
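A minimal sketch, assuming the same fair die as above: $\mathbb{E}[g(X)]$ weights $g(x)$, rather than $x$, by each probability $p(x)$.

```python
# E[g(X)] for an assumed fair die, with g(x) = x**2.
outcomes = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

# Weight g(x) = x**2 by p(x); note E[X**2] != (E[X])**2 in general.
e_x2 = sum((x ** 2) * p for x, p in zip(outcomes, probs))
print(e_x2)  # ~15.167, while (E[X])**2 = 12.25
```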
Linearity of expectation
One of the most useful properties of expectation is linearity. For any random variables $X$ and $Y$, and constants $a$ and $b$,

$$\mathbb{E}[aX + bY] = a\,\mathbb{E}[X] + b\,\mathbb{E}[Y].$$

This holds even if $X$ and $Y$ are not independent. Linearity lets us break problems into smaller parts.
Note that expectation behaves very differently from probability here: the probability of a joint event depends on the dependence structure, but the expectation of a sum does not.
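A simulation sketch of this point, using an assumed, perfectly dependent pair (two identical die rolls, $Y = X$): linearity still holds.

```python
# Linearity of expectation with dependent variables: Y = X exactly.
import random

random.seed(0)
samples = [random.randint(1, 6) for _ in range(100_000)]

e_x = sum(samples) / len(samples)
# E[2X + 3Y] with Y = X; linearity predicts 2*E[X] + 3*E[Y].
e_combo = sum(2 * x + 3 * x for x in samples) / len(samples)
print(e_combo, 2 * e_x + 3 * e_x)  # the two values agree
```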
Variance
While expectation tells us the average value of a random variable, it does not tell us how variable the outcomes are. Two distributions can have the same expectation, but behave very differently. The variance of a random variable measures how far values typically deviate from the expectation.
Formally, the variance of $X$ is defined as

$$\mathrm{Var}(X) = \mathbb{E}\!\left[(X - \mathbb{E}[X])^2\right].$$

This definition can be read as determining how far $X$ is from its mean, squaring that deviation, and then averaging the squared deviations. Squaring ensures that positive and negative deviations do not cancel each other out, and that larger deviations count more heavily.
Another common formula for variance is

$$\mathrm{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2.$$

Both formulas describe the same quantity: expanding the square in the definition and applying linearity of expectation gives the second form.
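A short check, assuming the fair die again, that both formulas produce the same number:

```python
# Variance of an assumed fair die, computed with both formulas.
outcomes = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

e_x = sum(x * p for x, p in zip(outcomes, probs))
e_x2 = sum((x ** 2) * p for x, p in zip(outcomes, probs))

var_def = sum(((x - e_x) ** 2) * p for x, p in zip(outcomes, probs))  # E[(X - E[X])^2]
var_alt = e_x2 - e_x ** 2                                             # E[X^2] - (E[X])^2
print(var_def, var_alt)  # both ~2.9167
```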
Standard deviation
Variance uses squared units, which are often hard to interpret. We usually use the standard deviation instead, which is written as

$$\sigma = \sqrt{\mathrm{Var}(X)}.$$
The standard deviation has the same units as the random variable itself and can be interpreted as a typical scale of deviation from the mean.
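For the assumed die example above, this is a one-line computation:

```python
# Standard deviation of the assumed fair die: sqrt of its variance.
import math

print(math.sqrt(35 / 12))  # ~1.708, in the same units as the die roll
```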
Variance and independence
Variance does not behave as simply as expectation when combining variables. In general,

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y),$$

where the covariance term $\mathrm{Cov}(X, Y)$ captures how the two variables move together.
However, if $X$ and $Y$ are independent, then

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y).$$
This distinction is important in modelling and inference. Independence affects how uncertainty adds up.
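A sketch contrasting the two cases with assumed die rolls: the variances of an independent pair add, while those of a perfectly dependent pair ($Y = X$) do not.

```python
# Variance of a sum: independent vs. perfectly dependent pairs.
import random
from statistics import pvariance

random.seed(0)
n = 100_000
x = [random.randint(1, 6) for _ in range(n)]
y_indep = [random.randint(1, 6) for _ in range(n)]

var_x = pvariance(x)
# Independent: Var(X + Y) is close to Var(X) + Var(Y).
print(pvariance([a + b for a, b in zip(x, y_indep)]), 2 * var_x)
# Dependent (Y = X): Var(X + X) = 4 * Var(X), twice the naive sum.
print(pvariance([2 * a for a in x]), 2 * var_x)
```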
Why expectation and variance matter
Expectation and variance provide a compact summary of a distribution. Expectation captures where probability is centred, while variance captures how spread out it is.
These quantities help us compare distributions, reason about uncertainty, and make predictions without using the full probability model.

