
Spurious Correlation

  • Reference work entry
International Encyclopedia of Statistical Science

A well-known weakness of regression modeling based on observational data is that an observed association between two variables may arise only because both are related to a third variable that has been omitted from the regression model. This phenomenon is commonly referred to as “spurious correlation.” The term spurious correlation dates back to at least Pearson (1897).
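The mechanism described above can be illustrated with a small simulation. The sketch below uses purely synthetic data (not taken from any of the cited sources) and assumes NumPy is available; the variable names `x`, `y`, `z` and the helper `partial_corr` are hypothetical and chosen only for illustration. Two variables are generated so that each depends on a third variable `z` but not on each other. Their raw correlation is large, while the correlation that remains after adjusting both for `z` should be close to zero.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z.

    Each of x and y is regressed on z (with an intercept) by least squares,
    and the correlation of the two residual vectors is returned.
    """
    Z = np.column_stack([np.ones_like(z), z])
    resid_x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    resid_y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(resid_x, resid_y)[0, 1]


# Synthetic data: z plays the role of the omitted third variable.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)
x = 2.0 * z + rng.normal(size=n)  # x depends on z only
y = 1.5 * z + rng.normal(size=n)  # y depends on z only, not on x

raw_r = np.corrcoef(x, y)[0, 1]        # association when z is ignored
partial_r = partial_corr(x, y, z)      # association after adjusting for z

print(f"raw correlation of x and y:      {raw_r:.3f}")
print(f"partial correlation given z:     {partial_r:.3f}")
```

In a regression setting, the same adjustment corresponds to including the third variable as a predictor in the model rather than omitting it.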

Neyman (1952, pp. 143–154) provides an example based on fictitious data which dramatically illustrates spurious correlation. According to Kronmal (1993, p. 379), a fictitious friend of Neyman was interested in empirically examining the theory that storks bring babies and collected data on the number of women, babies born, and storks in each of 50 counties. This fictitious data set was reported in Kronmal (1993, p. 383), and it can be found on the web page associated with Sheather (2009), namely, http://www.stat.tamu.edu/~sheather/book.

Figure 1 shows scatter plots of all three variables from the stork data set along with the...


References and Further Reading

  • Cochrane AL, St. Leger AS, Moore F (1978) Health service “input” and mortality “output” in developed countries. J Epidemiol Community Health 32:200–205

  • Hinds MW (1974) Fewer doctors and infant survival. New Engl J Med 291:741

  • Jayachandran J, Jarvis GK (1986) Socioeconomic development, medical care and nutrition as determinants of infant mortality in less developed countries. Social Biol 33:301–315

  • Kronmal RA (1993) Spurious correlation and the fallacy of the ratio standard revisited. J R Stat Soc A 156:379–392

  • Neyman J (1952) Lectures and conferences on mathematical statistics and probability, 2nd edn. US Department of Agriculture, Washington DC, pp 143–154

  • Pearson K (1897) Mathematical contributions to the theory of evolution: on a form of spurious correlation which may arise when indices are used in the measurement of organs. Proc R Soc Lond 60:489–498

  • Sankrithi U, Emanuel I, Van Belle G (1991) Comparison of linear and exponential multivariate models for explaining national infant and child mortality. Int J Epidemiol 2:565–570

  • Sheather SJ (2009) A modern approach to regression with R. Springer, New York

  • St. Leger S (2001) The anomaly that finally went away? J Epidemiol Community Health 55:79

  • Stigler S (2005) Correlation and causation: a comment. Persp Biol Med 48(1 Suppl.):588–594


Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this entry

Cite this entry

Sheather, S.J. (2011). Spurious Correlation. In: Lovric, M. (eds) International Encyclopedia of Statistical Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04898-2_534
