Data Mining

Scarpa, Bruno

doi:10.1007/978-3-642-04898-2_20

In recent decades, technological innovation has made it simple and cheap to obtain large, sometimes huge, amounts of information on a phenomenon of interest. There are two main reasons: the development of automatic methods of data acquisition, and progress in storage technology, which has driven down the related costs. This new environment affects all areas of human endeavor.

Every month a supermarket chain issues millions of receipts, one for each shopping trolley that passes through the checkout. The content of each trolley summarizes the needs, the propensities and the economic behavior of the customer who filled it. Taken together, these shopping lists form an important information base from which the supermarket can decide its sales and purchasing policies. Such an analysis becomes even more interesting when each shopping list is linked to the customer's loyalty card, which makes it possible to follow the behavior of an individual customer by recording the purchases...

Author information

Authors and Affiliations

University of Padua, Padua, Italy
Bruno Scarpa

Editor information

Editors and Affiliations

Department of Statistics and Informatics, Faculty of Economics, University of Kragujevac, City of Kragujevac, Serbia
Miodrag Lovric

Cite this entry

Scarpa, B. (2011). Data Mining. In: Lovric, M. (eds) International Encyclopedia of Statistical Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04898-2_20

DOI: https://doi.org/10.1007/978-3-642-04898-2_20
Published: 02 December 2014
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04897-5
Online ISBN: 978-3-642-04898-2
eBook Packages: Mathematics and Statistics; Reference Module Computer Science and Engineering
