Covariance

The Formula

For a sample:

\[ \text{Cov}(x,y) = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{n-1} \]

For a population:

\[ \sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)] \]
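The sample formula translates directly into code. Here is a minimal sketch in plain Python (the function name `sample_cov` is my own choice, not from the text):

```python
def sample_cov(x, y):
    """Sample covariance: sum of deviation products, divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

print(sample_cov([1, 2, 3], [2, 4, 6]))  # 2.0  (variables move together)
print(sample_cov([1, 2, 3], [6, 4, 2]))  # -2.0 (variables move oppositely)
```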

What It Means

Covariance measures the direction of the linear relationship between two variables.

  • Positive Covariance: When \(x\) is high, \(y\) tends to be high. When \(x\) is low, \(y\) tends to be low. They move together.
  • Negative Covariance: When \(x\) is high, \(y\) tends to be low. They move in opposite directions.
  • Zero Covariance: There is no linear pattern connecting the movements of \(x\) and \(y\).

Unlike correlation, covariance is not scale-free. If you rescale \(x\) (say, measure in centimeters instead of meters, multiplying every value by 100), the covariance is multiplied by the same factor, even though the strength of the relationship is unchanged. This makes it hard to compare covariances across datasets with different units.
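The scale-dependence is easy to demonstrate. A small sketch (the helper `sample_cov` is my own naming):

```python
def sample_cov(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

x_m  = [1.0, 2.0, 3.0]           # lengths in meters
x_cm = [100 * v for v in x_m]    # the same lengths in centimeters
y    = [2.0, 4.0, 6.0]

print(sample_cov(x_m, y))   # 2.0
print(sample_cov(x_cm, y))  # 200.0 -- same relationship, 100x the covariance
```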

Why It Works — The Intuition

Imagine drawing a crosshair at the mean of your data \((\bar{x}, \bar{y})\). This divides your scatter plot into four quadrants.

  • Top-Right (\(x > \bar{x}, y > \bar{y}\)): Both deviations are positive. Product is \((+) \cdot (+) = +\).
  • Bottom-Left (\(x < \bar{x}, y < \bar{y}\)): Both deviations are negative. Product is \((-) \cdot (-) = +\).
  • Top-Left (\(x < \bar{x}, y > \bar{y}\)): \(x\) deviation is \((-)\), \(y\) is \((+)\). Product is \(-\).
  • Bottom-Right (\(x > \bar{x}, y < \bar{y}\)): \(x\) deviation is \((+)\), \(y\) is \((-)\). Product is \(-\).

Covariance is just the average of these products:

  • If most points are in the Top-Right and Bottom-Left, the positive products dominate \(\to\) Positive Covariance.
  • If most points are in the Top-Left and Bottom-Right, the negative products dominate \(\to\) Negative Covariance.
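The quadrant picture can be checked numerically: each deviation product carries the sign of its quadrant. A sketch with a small made-up dataset (the data values here are illustrative, not from the text):

```python
x = [1, 2, 4, 5]
y = [1, 3, 4, 6]
x_bar = sum(x) / len(x)  # 3.0
y_bar = sum(y) / len(y)  # 3.5

# One deviation product per point; its sign tells you which quadrant
# the point falls in relative to the crosshair at (x_bar, y_bar).
products = [(xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)]
print(products)  # [5.0, 0.5, 0.5, 5.0] -- all positive: points sit top-right/bottom-left

print(sum(products) / (len(x) - 1))  # positive sum -> positive covariance
```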

Derivation

Covariance comes from the definition of Expected Value (\(E\)). The variance of a single variable \(X\) is the expected squared deviation from the mean:

\[ \text{Var}(X) = E[(X - \mu_x)^2] = E[(X - \mu_x)(X - \mu_x)] \]

Covariance simply generalizes this to two variables:

\[ \text{Cov}(X,Y) = E[(X - \mu_x)(Y - \mu_y)] \]

Expanding this expectation:

  1. Expand the product inside the expectation:

\[ E[XY - X\mu_y - Y\mu_x + \mu_x\mu_y] \]

  2. Use linearity of expectation (\(E[A + B] = E[A] + E[B]\) and \(E[cX] = cE[X]\)):

\[ E[XY] - \mu_y E[X] - \mu_x E[Y] + \mu_x \mu_y \]

  3. Substitute \(E[X] = \mu_x\) and \(E[Y] = \mu_y\):

\[ E[XY] - \mu_y \mu_x - \mu_x \mu_y + \mu_x \mu_y \]

  4. Simplify:

\[ \text{Cov}(X,Y) = E[XY] - \mu_x \mu_y \]

This alternate formula is computationally useful: "Mean of the product minus product of the means."
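Both forms can be verified against each other numerically. A sketch using the population version (divide by \(n\)), with illustrative data of my own choosing:

```python
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 4.0]
n = len(x)
mu_x, mu_y = sum(x) / n, sum(y) / n

# Definition: mean of the deviation products.
direct = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

# Shortcut: mean of the product minus product of the means.
shortcut = sum(a * b for a, b in zip(x, y)) / n - mu_x * mu_y

print(direct, shortcut)  # 1.0 1.0 -- the two forms agree
```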

Variables Explained

| Symbol | Name | Description |
| --- | --- | --- |
| \(\text{Cov}(x,y)\) | Sample Covariance | The measure of joint variability from a sample |
| \(\sigma_{xy}\) | Population Covariance | The theoretical measure for the whole population |
| \(x_i, y_i\) | Data Points | Individual observations |
| \(\bar{x}, \bar{y}\) | Sample Means | Averages of the \(x\) and \(y\) samples |
| \(\mu_x, \mu_y\) | Population Means | Theoretical averages of \(X\) and \(Y\) |
| \(n\) | Sample Size | Number of data pairs |
| \(E[\cdot]\) | Expected Value | The probability-weighted average |

Worked Example

Data: \(x = [1, 2, 3]\), \(y = [2, 4, 6]\). Means: \(\bar{x} = 2\), \(\bar{y} = 4\).

  1. Calculate Deviations:

    • \((1-2, 2-4) = (-1, -2)\)
    • \((2-2, 4-4) = (0, 0)\)
    • \((3-2, 6-4) = (1, 2)\)
  2. Multiply Deviations:

    • \((-1)(-2) = 2\)
    • \((0)(0) = 0\)
    • \((1)(2) = 2\)
  3. Sum and Divide:

    • Sum = \(2 + 0 + 2 = 4\)
    • \(\text{Cov}(x,y) = \frac{4}{3-1} = \frac{4}{2} = 2\)

The covariance is 2. The positive sign tells us the variables move together. The magnitude (2) depends on the units (e.g., if \(y\) were doubled to \([4, 8, 12]\), the covariance would become 4).
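The three steps above can be reproduced directly in code. A sketch that follows the worked example line by line:

```python
x = [1, 2, 3]
y = [2, 4, 6]

# Step 1-2: deviations from the means, multiplied pairwise.
x_bar, y_bar = sum(x) / 3, sum(y) / 3   # 2.0, 4.0
products = [(a - x_bar) * (b - y_bar) for a, b in zip(x, y)]
print(products)                         # [2.0, 0.0, 2.0]

# Step 3: sum and divide by n - 1.
cov = sum(products) / (len(x) - 1)
print(cov)                              # 2.0

# Doubling y doubles the covariance, as noted above.
y2 = [2 * b for b in y]
y2_bar = sum(y2) / 3
cov2 = sum((a - x_bar) * (b - y2_bar) for a, b in zip(x, y2)) / 2
print(cov2)                             # 4.0
```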

Common Mistakes

  • Interpreting the Magnitude: A covariance of 500 isn't necessarily "stronger" than a covariance of 0.5. It depends on the units. Always normalize to Correlation (\(r\)) to judge strength.
  • Confusing \(n\) and \(n-1\): For samples, divide by \(n-1\) (Bessel's correction) to get an unbiased estimator. For populations, divide by \(N\).

Related Concepts

  • Pearson's Correlation — The normalized version of covariance (\(r = \frac{\text{Cov}(x,y)}{s_x s_y}\)).
  • Variance — The covariance of a variable with itself (\(\text{Var}(X) = \text{Cov}(X,X)\)).
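Both relationships are easy to check numerically: normalizing by the standard deviations removes the scale-dependence, and the covariance of a variable with itself equals its variance. A sketch (`sample_cov` and `pearson_r` are my own helper names):

```python
import math

def sample_cov(x, y):
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    return sum((a - xb) * (b - yb) for a, b in zip(x, y)) / (n - 1)

def pearson_r(x, y):
    # Cov(x, y) / (s_x * s_y); note Cov(x, x) is the sample variance,
    # so the standard deviations come from sample_cov as well.
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))

x = [1, 2, 3]
y = [2, 4, 6]
print(sample_cov(x, y))                    # 2.0
print(sample_cov(x, [10 * b for b in y]))  # 20.0 -- magnitude changes with scale
print(pearson_r(x, y))                     # 1.0
print(pearson_r(x, [10 * b for b in y]))   # 1.0  -- strength does not
```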