
F-Test

The Question

You have two machines making screws. Machine A seems consistent. Machine B seems to be all over the place. How do you prove that Machine B is significantly more variable than Machine A?

Or, you are running an experiment with 3 different groups (A, B, C). You want to know if any of them are different. You could run t-tests for A vs B, B vs C, and A vs C, but each extra test raises your chance of a false alarm: at \(\alpha = 0.05\), three independent tests push the family-wise false-alarm rate to roughly \(1 - 0.95^3 \approx 14\%\). How do you test them all at once?

The answer to both is comparing variances.

The Formula

The F-statistic is simply a ratio of two variances:

\[ F = \frac{\text{Variance 1}}{\text{Variance 2}} = \frac{s_1^2}{s_2^2} \]
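For the two-machine question above, this ratio can be computed directly. A minimal sketch (the screw-length numbers below are made up for illustration):

```python
from statistics import variance
from scipy.stats import f

a = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0]   # Machine A: consistent
b = [10.5, 9.2, 10.9, 9.0, 11.1, 9.3]    # Machine B: all over the place

s2_a, s2_b = variance(a), variance(b)     # sample variances (n-1 denominator)

# Convention: put the larger variance on top so F >= 1
F = s2_b / s2_a
df1, df2 = len(b) - 1, len(a) - 1

# One-tailed p-value: P(F with (df1, df2) degrees of freedom >= observed F)
p = f.sf(F, df1, df2)
print(F, p)
```

A small p-value here says Machine B's variance is larger than Machine A's by more than sampling luck alone would explain.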

In the context of ANOVA (Analysis of Variance):

\[ F = \frac{\text{Variance Between Groups}}{\text{Variance Within Groups}} \]
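This decomposition can be computed by hand and checked against `scipy.stats.f_oneway`. A sketch with three invented groups:

```python
from statistics import mean
from scipy.stats import f_oneway

groups = [
    [5.1, 4.9, 5.0, 5.2],   # group A
    [5.6, 5.4, 5.5, 5.7],   # group B
    [4.5, 4.4, 4.6, 4.3],   # group C
]
all_values = [x for g in groups for x in g]
grand_mean = mean(all_values)

k = len(groups)        # number of groups
n = len(all_values)    # total observations

# Between-group sum of squares: how far each group mean sits from the grand mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: scatter of points around their own group mean
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

F = (ss_between / (k - 1)) / (ss_within / (n - k))

F_scipy, p = f_oneway(*groups)
print(F, F_scipy, p)   # the manual F matches scipy's
```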

The Story: Fisher and Farming

The F-test is named after Sir Ronald A. Fisher, the father of modern statistics (the "F" is for Fisher). In the 1920s, he was working at Rothamsted Experimental Station, analyzing agricultural data.

He dealt with messy crop data. Some fields got fertilizer A, some B, some C. He wanted to know if the fertilizer mattered. But crop yields vary naturally (sun, rain, soil quality).

Fisher realized he could decompose the total variation into two parts: 1. Explained Variation: The differences caused by the fertilizer (Between Groups). 2. Unexplained Variation: The random noise within each field (Within Groups).

If the "Explained" part was much bigger than the "Unexplained" part, the fertilizer worked. He needed a way to compare these two sources of variation. Hence, the F-ratio.

The Intuition

The Battle of Variances

Think of the F-test as a wrestling match between two sources of noise.

  • Top (Numerator): The variance you care about (e.g., the difference between your treatments).
  • Bottom (Denominator): The variance you don't care about (random error, background noise).

If \(F = 1\), it means your treatments are adding no more variation than you'd expect from random noise. If \(F = 10\), it means the variation between your groups is 10 times larger than the background noise. Something is happening.
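The "F hovers near 1 under the null" claim is easy to check by simulation. A quick sketch with three groups drawn from the same distribution (so the true treatment effect is zero):

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(1)

# 2000 experiments, each with 3 groups of 20 draws from the SAME distribution
fs = [f_oneway(*rng.normal(0, 1, (3, 20)))[0] for _ in range(2000)]

print(np.mean(fs))   # close to 1: no real effect, so signal ~ noise
```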

Why Squares?

Why do we divide variances (\(s^2\)) and not standard deviations (\(s\))? Because variances have beautiful mathematical properties. Variances are additive (for independent variables). Standard deviations are not. Fisher built his entire system (ANOVA) on the idea that you can add up sources of variance to get the total variance.
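The additivity claim is easy to verify numerically. A quick simulation (distribution parameters chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 3, 1_000_000)   # Var(X) = 9
y = rng.normal(0, 4, 1_000_000)   # Var(Y) = 16, independent of X

print(np.var(x + y))   # approximately 25 = 9 + 16: variances add
print(np.std(x + y))   # approximately 5, NOT 3 + 4 = 7: sds do not add
```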

Derivation: Ratio of Chi-Squareds

The F-distribution arises naturally when you divide one Chi-squared variable by another.

  1. Variance is Chi-Squared: Recall that sample variance \(s^2\) is related to the Chi-squared distribution (\(\chi^2\)):
\[ s^2 \sim \frac{\chi^2}{df} \]
(Roughly speaking. Technically \((n-1)s^2/\sigma^2 \sim \chi^2_{n-1}\).)
  2. The Ratio: If we have two independent samples with variances \(s_1^2\) and \(s_2^2\), and we assume the population variances are equal (\(\sigma_1^2 = \sigma_2^2\)), then their ratio is:
\[ F = \frac{s_1^2}{s_2^2} \]
Since each \(s^2\) is a "scaled" Chi-squared variable, the F-statistic is:
\[ F = \frac{\chi_1^2 / df_1}{\chi_2^2 / df_2} \]
  3. The F-Distribution: This specific ratio (a Chi-squared divided by its degrees of freedom, over another Chi-squared divided by its degrees of freedom) defines the F-distribution. It has two parameters: degrees of freedom for the numerator (\(df_1\)) and degrees of freedom for the denominator (\(df_2\)).
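The derivation above can be sanity-checked by simulation: the ratio of two independent Chi-squared draws, each divided by its df, should match the F-distribution. A sketch with arbitrary df choices:

```python
import numpy as np
from scipy.stats import chi2, f

rng = np.random.default_rng(42)
df1, df2 = 5, 10
n = 200_000

# Ratio of two independent chi-squared variables, each scaled by its df
ratio = (chi2.rvs(df1, size=n, random_state=rng) / df1) / \
        (chi2.rvs(df2, size=n, random_state=rng) / df2)

# Empirical 95th percentile vs. the theoretical F critical value
emp = np.quantile(ratio, 0.95)
theory = f.ppf(0.95, df1, df2)
print(emp, theory)   # the two should nearly agree
```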

Common Mistakes

  • One-Tailed vs Two-Tailed: F-tests in ANOVA are almost always one-tailed. We only care if the numerator is bigger than the denominator. If the numerator is smaller (\(F < 1\)), we usually don't care (it means the treatment did less than random noise, which is just... no effect).
  • Assuming Normality: Like the t-test, the F-test assumes the underlying data follow a normal distribution. The variance-ratio version in particular is notoriously sensitive to outliers and non-normality; for comparing variances, more robust alternatives such as Levene's test exist.
  • Confusing F with R-squared: In regression, F tells you if the whole model is significant. \(R^2\) tells you how much variance is explained. They are related, but answer different questions.
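How F and \(R^2\) relate in regression can be made concrete. For a fit with \(p\) predictors and \(n\) observations, \(F = \frac{R^2 / p}{(1 - R^2)/(n - p - 1)}\). A minimal sketch on simulated data (the data-generating line \(y = 2x + \varepsilon\) is invented):

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 50, 1
x = rng.normal(size=n)
y = 2 * x + rng.normal(size=n)

# Fit y = a + b*x by ordinary least squares
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

ss_res = np.sum(resid ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot                     # variance explained

F = (r2 / p) / ((1 - r2) / (n - p - 1))      # overall model significance
print(r2, F)
```

Same ingredients, different questions: \(R^2\) is an effect size, F (with its degrees of freedom) is the significance test.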