Definition
Statistical instability arises when an average is based on a small number of observations. The fewer the students, the greater the impact each individual has on the school's average.
In a school with 20 students in year 9, one student represents 5% of the sample. If that student's result is an outlier, high or low, the school average shifts noticeably.
| School size (year 9) | One student shifts a percentage-based indicator by |
|---|---|
| 15 students | ~7 percentage points |
| 30 students | ~3 percentage points |
| 100 students | ~1 percentage point |
| 300 students | ~0.3 percentage points |
At small schools, year-to-year changes can therefore be pure chance rather than an actual change in quality.
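The figures in the table follow from simple arithmetic. A minimal sketch in Python, assuming the indicator is a share of students (for example the percentage reaching a threshold), so that one student out of n moves it by 100/n percentage points. The second function applies the same idea to a mean such as the merit value. Both function names are illustrative and not part of any official tooling.

```python
def one_student_shift(cohort_size: int) -> float:
    """Percentage points by which one student moves a share-based indicator."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return 100.0 / cohort_size


def mean_shift(deviation: float, cohort_size: int) -> float:
    """Shift in a school mean caused by one student whose result lies
    `deviation` points from the average of the other students."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return deviation / cohort_size


for n in (15, 30, 100, 300):
    print(f"{n:>3} students: ~{one_student_shift(n):.1f} percentage points")

# A student 100 merit value points away from classmates, in a cohort of 20:
print(mean_shift(100, 20))  # 5.0 points
```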
How to interpret
Use multi-year averages (3–5 years). Single years are too uncertain to draw conclusions from. A three-year average smooths out random fluctuations and provides a more stable picture.
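One way to compute such a multi-year average is to pool the cohorts and weight each year by its number of students. The sketch below assumes that only yearly averages and cohort sizes are available. The figures are made up for illustration, and the function names are hypothetical.

```python
# Hypothetical data for one small school: (year, students, average merit value).
cohorts = [
    (2021, 18, 231.0),
    (2022, 22, 214.5),
    (2023, 20, 226.0),
    (2024, 17, 209.0),
]


def pooled_average(window):
    """Student-weighted average over several cohorts."""
    total_students = sum(n for _, n, _ in window)
    return sum(n * avg for _, n, avg in window) / total_students


def rolling_pooled(cohorts, years=3):
    """Return (first_year, last_year, pooled average) for each window."""
    results = []
    for start in range(len(cohorts) - years + 1):
        window = cohorts[start:start + years]
        results.append((window[0][0], window[-1][0], pooled_average(window)))
    return results


for first, last, value in rolling_pooled(cohorts):
    print(f"{first}-{last}: {value:.1f}")
```

With these invented figures, the single-year averages span 22 points (209.0 to 231.0), while the two three-year values differ by about 6.5 points.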
Do not compare single years between small schools. A 15-point gain in merit value at a 20-student school may simply reflect one unusually strong cohort, or an unusually weak cohort the year before.
Look at the trend rather than individual values. If the merit value declines three years in a row, that is more meaningful than a single year's drop.
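A trend can be summarised in more than one way. The sketch below shows two simple options: counting consecutive year-on-year declines at the end of a series, and fitting an ordinary least-squares slope. The data are hypothetical and the helper names are illustrative. For a small school, a slope fitted to a handful of years is itself uncertain, so it is best read as a rough indication.

```python
def trailing_declines(values):
    """Number of consecutive year-on-year declines at the end of the series."""
    count = 0
    pairs = zip(reversed(values[:-1]), reversed(values[1:]))
    for previous, current in pairs:
        if current < previous:
            count += 1
        else:
            break
    return count


def ols_slope(years, values):
    """Least-squares slope: average change in the value per year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


# Hypothetical merit value averages for one school.
years = [2020, 2021, 2022, 2023, 2024]
merit = [228.0, 231.0, 224.0, 219.0, 212.0]

print(trailing_declines(merit))           # declines in a row, most recent first
print(round(ols_slope(years, merit), 1))  # points per year
```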
Combine with other indicators. Survey results, teacher certification and the Schools Inspectorate's reports provide complementary information that is less affected by sampling variation.
Common mistakes
- Drawing conclusions from a single cohort. "Best school in the municipality" one year, "worst" the next — at a 15-student school, this can be sampling noise, not a change in quality.
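- Comparing a 20-student school with a 500-student school directly. The larger school's average is far more stable: if results vary independently between students, the random error shrinks with the square root of the cohort size, so it is roughly five times smaller at 500 students than at 20.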
- Interpreting a decline as deteriorating quality. A temporary decline may be entirely random — check the trend and the sample size.
- Publishing rankings where small schools end up high or low due to chance. "Best school" lists frequently overlook that, among small schools, differences are often dominated by sampling variation rather than quality.
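The effect behind these mistakes can be illustrated with a small simulation. In the sketch below, every student at every school is drawn from the same distribution, so all schools have identical underlying quality and any difference between school averages is chance. The normal distribution, the mean of 220 and the standard deviation of 60 are assumptions chosen for illustration, not official figures.

```python
import random
from statistics import mean, pstdev


def simulate_school_averages(cohort_size, n_schools, rng, mu=220.0, sigma=60.0):
    """School averages when all students come from the same distribution."""
    return [
        mean([rng.gauss(mu, sigma) for _ in range(cohort_size)])
        for _ in range(n_schools)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
small = simulate_school_averages(15, 200, rng)
large = simulate_school_averages(500, 200, rng)

for label, averages in (("15-student", small), ("500-student", large)):
    print(
        f"{label} schools: lowest {min(averages):.0f}, "
        f"highest {max(averages):.0f}, spread (SD) {pstdev(averages):.1f}"
    )
```

In a combined ranking of these simulated schools, the top and bottom positions would be expected to go mostly to the 15-student schools, even though no school is better than any other.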