Variance

The Formula

Population Variance (when you have data for the entire group):

\[ \sigma^2 = \frac{\sum_{i=1}^N (x_i - \mu)^2}{N} \]

Sample Variance (when you have a subset of the data):

\[ s^2 = \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n-1} \]
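As a minimal sketch, the two formulas translate into Python as follows, assuming the data is a plain list of numbers. The function names `population_variance` and `sample_variance` are illustrative and not part of any library.

```python
def population_variance(data):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance needs at least two data points")
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)
```

Python's standard library provides the same two quantities as `statistics.pvariance` and `statistics.variance`.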

What It Means

Variance measures how spread out a set of numbers is. It is the average squared distance of each point from the mean.

  • Zero Variance: All values are identical (e.g., \([5, 5, 5]\)). There is no spread.
  • High Variance: Values are far from the mean (e.g., \([0, 100]\)).
  • Low Variance: Values are clustered close to the mean (e.g., \([48, 52]\)).

Because it squares the differences, variance gives extra weight to outliers. A point that is twice as far away contributes four times as much to the variance.
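The three example lists above can be checked with a short sketch. It assumes each list is treated as a complete population, so it uses `statistics.pvariance`; the `examples` and `results` names are only for illustration.

```python
from statistics import pvariance

examples = {
    "zero spread": [5, 5, 5],
    "high spread": [0, 100],
    "low spread": [48, 52],
}

# Population variance of each example list
results = {label: pvariance(values) for label, values in examples.items()}

for label, value in results.items():
    print(label, value)
```

The list \([0, 100]\) has deviations of 50 and a variance of 2500, while \([48, 52]\) has deviations of 2 and a variance of 4: the deviations are 25 times larger, the variance 625 times larger.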

Why It Works — The Intuition

Why Square the Differences?

If we simply added up the deviations from the mean \((x_i - \mu)\), they would always sum to zero, because the negative deviations exactly cancel the positive ones. This follows directly from how the mean is defined.
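A quick check of this cancellation, sketched on an arbitrary small data set (the same one used in the worked example later on):

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)

deviations = [x - mean for x in data]

# Signed deviations cancel out; squared deviations do not
total_deviation = sum(deviations)
total_squared = sum(d ** 2 for d in deviations)

print(total_deviation, total_squared)
```

The signed total is zero no matter how spread out the data is, which is why it cannot serve as a measure of spread.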

To avoid this, we need to make everything positive. We could take the absolute value (Mean Absolute Deviation), but squaring has nice mathematical properties:

  1. Continuously differentiable: Easier to use in calculus and optimization (like Least Squares).
  2. Penalizes outliers: It highlights extreme values, which is often desirable in risk management and quality control.

Why Divide by \(n-1\)?

When calculating sample variance, we divide by \(n-1\) instead of \(n\). This is called Bessel's Correction.

  • The Problem: We don't know the true population mean \(\mu\), so we use the sample mean \(\bar{x}\).
  • The Bias: The sample mean \(\bar{x}\) is naturally "closer" to the sample data points than the true population mean \(\mu\) is (because \(\bar{x}\) is calculated from those specific points).
  • The Result: The sum of squared distances from \(\bar{x}\) is never larger than the sum of squared distances from \(\mu\), so using \(\bar{x}\) makes the sum of squares too small on average.
  • The Fix: Dividing by a slightly smaller number (\(n-1\)) inflates the result just enough to correct this bias on average.
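The bias described above can be seen in a small simulation. This is a sketch under stated assumptions: samples of size 5 are drawn from a normal distribution with true variance 1, and the seed, sample size, and trial count are arbitrary choices for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

n = 5            # small sample size, where the bias is most visible
trials = 20_000  # number of repeated samples

sum_div_n = 0.0
sum_div_n_minus_1 = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1
    x_bar = sum(sample) / n
    ss = sum((x - x_bar) ** 2 for x in sample)  # sum of squares about the sample mean
    sum_div_n += ss / n
    sum_div_n_minus_1 += ss / (n - 1)

# Average of each estimator over all trials
avg_div_n = sum_div_n / trials
avg_div_n_minus_1 = sum_div_n_minus_1 / trials

print(f"divide by n:     {avg_div_n:.3f}")
print(f"divide by n - 1: {avg_div_n_minus_1:.3f}")
```

With these settings, the average of the divide-by-\(n\) estimates should land near \((n-1)/n = 0.8\), while the divide-by-\((n-1)\) estimates should average close to the true value of 1.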

Derivation (Population)

Variance is defined as the Expected Value of the squared deviation from the mean:

\[ \text{Var}(X) = E[(X - \mu)^2] \]

Expanding the square:

\[ = E[X^2 - 2X\mu + \mu^2] \]

Using linearity of expectation (\(E[A+B] = E[A] + E[B]\) and \(E[cA] = cE[A]\) for a constant \(c\)), and noting that \(\mu\) is a constant:

\[ = E[X^2] - 2\mu E[X] + \mu^2 \]

Since \(E[X] = \mu\) by definition:

\[ = E[X^2] - 2\mu(\mu) + \mu^2 \]
\[ = E[X^2] - 2\mu^2 + \mu^2 \]
\[ = E[X^2] - \mu^2 \]

This leads to the common computational formula:

\[ \sigma^2 = \text{Mean of Squares} - (\text{Mean})^2 \]
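As a sanity check, the shortcut and the definition can be compared numerically. The sketch below uses an arbitrary small data set treated as a population; the variable names are illustrative.

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

mean = sum(data) / n
mean_of_squares = sum(x ** 2 for x in data) / n

# Computational formula: mean of squares minus square of the mean
shortcut = mean_of_squares - mean ** 2

# Definition: average squared deviation from the mean
definition = sum((x - mean) ** 2 for x in data) / n

print(shortcut, definition)
```

Both expressions agree here. In floating-point arithmetic the shortcut can lose precision when the mean is very large compared with the spread, so the definition form is often the safer choice in code.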

Variables Explained

| Symbol | Name | Description |
|---|---|---|
| \(\sigma^2\) | Population Variance | Variance of the entire population |
| \(s^2\) | Sample Variance | Estimated variance from a sample |
| \(x_i\) | Data Point | Individual value |
| \(\mu\) | Population Mean | Average of the population |
| \(\bar{x}\) | Sample Mean | Average of the sample |
| \(N\) | Population Size | Total number of items in population |
| \(n\) | Sample Size | Number of items in sample |
| \(n-1\) | Degrees of Freedom | Correction factor for sample variance |

Worked Example

Data (Sample): \([2, 4, 4, 4, 5, 5, 7, 9]\), so \(n = 8\)

  1. Calculate Mean (\(\bar{x}\)):
\[ \frac{2+4+4+4+5+5+7+9}{8} = \frac{40}{8} = 5 \]
  2. Calculate Squared Deviations \((x_i - 5)^2\):

    • \(2 \to (2-5)^2 = 9\)
    • \(4 \to (4-5)^2 = 1\)
    • \(4 \to (4-5)^2 = 1\)
    • \(4 \to (4-5)^2 = 1\)
    • \(5 \to (5-5)^2 = 0\)
    • \(5 \to (5-5)^2 = 0\)
    • \(7 \to (7-5)^2 = 4\)
    • \(9 \to (9-5)^2 = 16\)
  3. Sum of Squared Deviations:

\[ 9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32 \]
  4. Divide by \(n-1\):
\[ s^2 = \frac{32}{8-1} = \frac{32}{7} \approx 4.57 \]

(If this were a population, we would divide by 8, giving \(\sigma^2 = \frac{32}{8} = 4\).)
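The same example can be reproduced with Python's standard `statistics` module, shown here as a quick cross-check of the hand calculation:

```python
from statistics import mean, pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]

x_bar = mean(data)        # sample mean
s2 = variance(data)       # sample variance, divides by n - 1
sigma2 = pvariance(data)  # population variance, divides by n

print(x_bar, round(s2, 2), sigma2)
```

`variance` matches the \(32/7 \approx 4.57\) result above, and `pvariance` matches the population value of 4.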

Common Mistakes

  • Forgetting the Square Root: Variance is in "squared units" (e.g., \(\text{meters}^2\)), so it is not intuitive to read directly. Take the square root to get the Standard Deviation (\(\sigma\) or \(s\)) when you need a number in the original units.
  • Mixing up \(\sigma^2\) and \(s^2\): Dividing by \(n\) instead of \(n-1\) when the data is only a sample underestimates the true variability on average, and the gap is largest for small samples.
  • "Variance can be negative": Impossible. It's an average of squared values, and squares are never negative. If you get a negative number, check your math.
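To illustrate the first two points, here is a short sketch using the standard `statistics` module on an arbitrary data set; it shows the standard deviation in original units and the gap between the population and sample versions.

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]

sigma = pstdev(data)  # square root of the population variance (divides by n)
s = stdev(data)       # square root of the sample variance (divides by n - 1)

# The standard deviation is just the square root of the matching variance
sigma_from_variance = sqrt(pvariance(data))
s_from_variance = sqrt(variance(data))

print(sigma, s)
```

For the same data, the divide-by-\(n\) figure is always the smaller of the two, which is why using it on a sample understates the spread.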