The Deceptive Nature of Statistics

Statistics often produce results that seem counterintuitive or even contradictory. A classic example is Simpson’s paradox, in which a trend that appears within each of several groups of data disappears or reverses when the groups are combined.

The UC Berkeley Discrimination Case

In the 1970s, the University of California, Berkeley, faced a lawsuit alleging gender discrimination in its graduate admissions. Data showed that 44 percent of male applicants were admitted, compared to only 35 percent of female applicants, leading plaintiffs to claim bias against women.

However, when statistician Peter J. Bickel and his colleagues analyzed the data by department, the picture changed. In four of the six largest departments, women were actually admitted at higher rates than men. The lower overall admission rate for women was attributed to the fact that women tended to apply to more competitive departments with higher rejection rates, while men tended to apply to departments with more available spots.
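The arithmetic behind such a reversal can be sketched in a few lines of Python. The department names and counts below are hypothetical, chosen only to reproduce the effect; they are not the Berkeley figures.

```python
# Hypothetical (applicants, admitted) counts per department and gender.
# These numbers are illustrative assumptions, not the Berkeley data.
data = {
    "Dept X (less selective)": {"men": (800, 480), "women": (200, 130)},
    "Dept Y (more selective)": {"men": (200, 40), "women": (800, 200)},
}


def rate(applicants, admitted):
    """Admission rate for a single (applicants, admitted) pair."""
    return admitted / applicants


def overall_rate(data, group):
    """Pool every department's counts for one group, then take the rate."""
    applicants = sum(dept[group][0] for dept in data.values())
    admitted = sum(dept[group][1] for dept in data.values())
    return admitted / applicants


# Per-department rates: women come out ahead in both departments.
for name, dept in data.items():
    print(name, {g: round(rate(*counts), 3) for g, counts in dept.items()})

# Pooled rates: men come out ahead once the departments are combined.
print("Overall", {g: round(overall_rate(data, g), 3) for g in ("men", "women")})
```

In this made-up data, women are admitted at a higher rate in each department, yet their pooled rate is lower because most of them applied to the more selective department.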

Historical Context and Modern Examples

Simpson’s paradox was first described by mathematician Karl Pearson in 1899, followed by George Udny Yule in 1903. The concept was largely forgotten until Edward Simpson published a paper on the topic in 1951, which eventually gave the phenomenon its name.

The paradox remains relevant in modern data analysis. For instance, a 2021 analysis of early pandemic data found that the overall COVID-19 case fatality rate was higher in Italy than in China, even though the fatality rate in every individual age group was lower in Italy. The reversal arose because confirmed cases in Italy were concentrated among older patients, who face a higher risk of death. This illustrates how big-picture trends can obscure the reality of smaller subgroups.
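The underlying mechanism is that an overall rate is a weighted average of the subgroup rates, with weights given by each subgroup's share of cases. A minimal sketch follows, using hypothetical countries, age bands, and numbers rather than the published figures:

```python
# Hypothetical fatality rates and case shares by age band.
# All names and values are illustrative assumptions, not real data.
fatality = {
    "country_a": {"under_60": 0.010, "60_plus": 0.100},
    "country_b": {"under_60": 0.005, "60_plus": 0.080},
}
case_share = {
    "country_a": {"under_60": 0.80, "60_plus": 0.20},
    "country_b": {"under_60": 0.40, "60_plus": 0.60},
}


def overall_fatality(rates, shares):
    """Overall rate = sum over age bands of (band rate * band share of cases)."""
    return sum(rates[band] * shares[band] for band in rates)


for country in fatality:
    overall = overall_fatality(fatality[country], case_share[country])
    print(country, round(overall, 3))
```

Here country_b has the lower fatality rate in both age bands, but its cases skew older, so its overall rate ends up higher than country_a's.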

Navigating Statistical Ambiguity

Simpson’s paradox can complicate medical research, such as when a drug appears effective overall but fails to outperform a placebo when patients are divided by gender. In such cases, there is no universal rule for how to proceed.

Researchers suggest that the most effective approach is to conduct further studies to identify hidden variables and understand how factors like gender influence efficacy. Ultimately, careful analysis is required to distinguish between simple correlations and true causal relationships.