A well-known weakness of regression modeling based on observational data is that an observed association between two variables may arise because both are related to a third variable that has been omitted from the regression model. This phenomenon is commonly referred to as “spurious correlation.” The term dates back to at least Pearson (1897).
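The omitted-variable mechanism can be illustrated with a small simulation. The sketch below is not taken from the entry: the variable names, the coefficients 2.0 and 3.0, the sample size, and the helper functions `simulate` and `ols_slopes` are all assumptions chosen for illustration. Two variables `x` and `y` are generated so that each depends only on a common third variable `z`; regressing `y` on `x` alone then gives a clearly nonzero slope, whereas including `z` in the model should shrink the slope on `x` to roughly zero.

```python
import numpy as np


def simulate(n=5000, seed=0):
    """Generate hypothetical data in which x and y share a common cause z."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)               # the third variable, omitted below
    x = 2.0 * z + rng.normal(size=n)     # x depends on z only
    y = 3.0 * z + rng.normal(size=n)     # y depends on z only, not on x
    return x, y, z


def ols_slopes(response, *predictors):
    """Least-squares slopes (intercept fitted but dropped) of response on predictors."""
    design = np.column_stack([np.ones_like(response), *predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef[1:]


x, y, z = simulate()

# Marginal association between x and y, ignoring z.
marginal_r = np.corrcoef(x, y)[0, 1]

# Slope on x when z is omitted from the model versus when it is included.
slope_omitting_z = ols_slopes(y, x)[0]
slope_including_z = ols_slopes(y, x, z)[0]

print(f"correlation(x, y):         {marginal_r:.3f}")
print(f"slope on x, z omitted:     {slope_omitting_z:.3f}")
print(f"slope on x, z included:    {slope_including_z:.3f}")
```

In this construction the association between `x` and `y` is produced entirely by `z`, which is the situation the term spurious correlation describes; with real observational data the relevant third variable is usually not known in advance.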
Neyman (1952, pp. 143–154) provides an example, based on fictitious data, that dramatically illustrates spurious correlation. According to Kronmal (1993, p. 379), a fictitious friend of Neyman wanted to examine empirically the theory that storks bring babies and collected data on the numbers of women, babies born, and storks in each of 50 counties. This fictitious data set was reported in Kronmal (1993, p. 383) and can be found on the web page associated with Sheather (2009), namely http://www.stat.tamu.edu/~sheather/book.
Figure 1 shows scatter plots of all three variables from the stork data set along with the...
References and Further Reading
Cochrane AL, St. Leger AS, Moore F (1978) Health service “input” and mortality “output” in developed countries. J Epidemiol Community Health 32:200–205
Hinds MW (1974) Fewer doctors and infant survival. New Engl J Med 291:741
Jayachandran J, Jarvis GK (1986) Socioeconomic development, medical care and nutrition as determinants of infant mortality in less developed countries. Social Biol 33:301–315
Kronmal RA (1993) Spurious correlation and the fallacy of the ratio standard revisited. J R Stat Soc A 156:379–392
Neyman J (1952) Lectures and conferences on mathematical statistics and probability, 2nd edn. US Department of Agriculture, Washington DC, pp 143–154
Pearson K (1897) Mathematical contributions to the theory of evolution: on a form of spurious correlation which may arise when indices are used in the measurement of organs. Proc R Soc Lond 60:489–498
Sankrithi U, Emanuel I, Van Belle G (1991) Comparison of linear and exponential multivariate models for explaining national infant and child mortality. Int J Epidemiol 2:565–570
Sheather SJ (2009) A modern approach to regression with R. Springer, New York
St. Leger S (2001) The anomaly that finally went away? J Epidemiol Community Health 55:79
Stigler S (2005) Correlation and causation: a comment. Persp Biol Med 48(1 Suppl.):S88–S94
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this entry
Cite this entry
Sheather, S.J. (2011). Spurious Correlation. In: Lovric, M. (eds) International Encyclopedia of Statistical Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04898-2_534
DOI: https://doi.org/10.1007/978-3-642-04898-2_534
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04897-5
Online ISBN: 978-3-642-04898-2
eBook Packages: Mathematics and Statistics; Reference Module Computer Science and Engineering