
Data Mining

  • Reference work entry
International Encyclopedia of Statistical Science

In recent decades, technological innovation has made it simple and cheap to obtain large, sometimes huge, amounts of information on a phenomenon of interest. There are two main reasons: on one side, the development of automatic methods of data acquisition; on the other, progress in storage technology, which has driven down the related costs. This new environment affects all areas of human endeavor.

  • Every month a supermarket chain issues millions of receipts, one for each shopping trolley that checks out. The contents of each trolley summarize the needs, propensities, and economic behavior of the customer who filled it. Taken together, these shopping lists form an important information base for the supermarket when deciding its sales and purchasing policies. Such an analysis becomes even more interesting when each shopping list is linked to the customer's loyalty card, making it possible to follow an individual customer's behavior by recording the purchases...




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg


Cite this entry

Scarpa, B. (2011). Data Mining. In: Lovric, M. (eds) International Encyclopedia of Statistical Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04898-2_20
