Bias-Variance Analysis of Local Classification Methods

Schiffner, Julia; Bischl, Bernd; Weihs, Claus

doi:10.1007/978-3-642-24466-7_6

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

In recent years an increasing number of so-called local classification methods have been developed. Local approaches to classification are not new; well-known examples are the k-nearest-neighbors method and classification trees (e.g. CART). However, the term ‘local’ is usually used without further explanation of its particular meaning, and we know neither which properties local methods have nor for which types of classification problems they may be beneficial. To address these problems we conduct a benchmark study. Based on 26 artificial and real-world data sets, selected local and global classification methods are analyzed in terms of the bias-variance decomposition of the misclassification rate. The results support our intuition that local methods exhibit lower bias than their global counterparts. This reduction comes at the price of only slightly increased variance, so that the total error rate may be improved.
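
To make the idea of a bias-variance decomposition of the misclassification rate concrete, the following is a minimal sketch, not the authors' benchmark setup. The abstract does not say which decomposition was used. This sketch assumes one common formulation for two-class 0-1 loss with noise-free labels: the "main prediction" at a point is the majority vote over models fitted to independent training sets, bias is the fraction of test points where the main prediction is wrong, and variance is the rate at which individual predictions deviate from the main prediction. The XOR-like target, the sample sizes, and all function names are hypothetical choices for illustration. A 5-nearest-neighbor rule stands in for a local method and a nearest-centroid rule for a global one.

```python
import random
from collections import Counter


def true_label(x):
    # Hypothetical noise-free XOR-like target on [-1, 1]^2.
    return 1 if x[0] * x[1] > 0 else 0


def sample(n, rng):
    # Draw n labelled points uniformly from the square.
    pts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [(p, true_label(p)) for p in pts]


def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def knn_fit(train, k=5):
    # Local method: majority vote among the k nearest training points.
    def predict(x):
        nearest = sorted(train, key=lambda t: sq_dist(t[0], x))[:k]
        votes = sum(label for _, label in nearest)
        return 1 if 2 * votes > k else 0
    return predict


def centroid_fit(train):
    # Global method: assign the class whose mean vector is closest.
    cents = {}
    for c in (0, 1):
        pts = [p for p, label in train if label == c]
        if pts:  # skip a class that happens to be absent
            cents[c] = (sum(p[0] for p in pts) / len(pts),
                        sum(p[1] for p in pts) / len(pts))

    def predict(x):
        return min(cents, key=lambda c: sq_dist(cents[c], x))
    return predict


def decompose(fit, test, n_train=100, n_rep=25, seed=0):
    # n_rep is odd, so the two-class majority vote never ties.
    rng = random.Random(seed)
    models = [fit(sample(n_train, rng)) for _ in range(n_rep)]
    err = bias = var_unbiased = var_biased = 0.0
    for x, y in test:
        preds = [m(x) for m in models]
        main = Counter(preds).most_common(1)[0][0]
        v = sum(p != main for p in preds) / n_rep
        err += sum(p != y for p in preds) / n_rep
        if main == y:
            var_unbiased += v   # variance at unbiased points adds to the error
        else:
            bias += 1
            var_biased += v     # variance at biased points reduces the error
    n = len(test)
    return {"error": err / n, "bias": bias / n,
            "var_unbiased": var_unbiased / n, "var_biased": var_biased / n}


test_set = sample(200, random.Random(1))
results = {name: decompose(fit, test_set)
           for name, fit in [("kNN (local)", knn_fit),
                             ("nearest centroid (global)", centroid_fit)]}
for name, r in results.items():
    print(name, {k: round(v, 3) for k, v in r.items()})
```

Under these assumptions the averaged quantities satisfy error = bias + var_unbiased - var_biased exactly, which gives a quick sanity check on any implementation. With noisy labels or more than two classes the decomposition needs additional terms, which this sketch does not cover.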

Author information

Authors and Affiliations

Department of Statistics, TU Dortmund University, 44221, Dortmund, Germany
Julia Schiffner, Bernd Bischl & Claus Weihs

Corresponding author

Correspondence to Julia Schiffner.

Editor information

Editors and Affiliations

Fak. Wirtschaftswissenschaften, Inst. Entscheidungstheorie und, Universität Karlsruhe (TH), Kaiserstr. 12, Karlsruhe, 76128, Germany
Wolfgang A. Gaul
Institute for Information Systems, and Management (IISM), Karlsruhe Institute of Technology (KIT), Kaiserstr. 12, Karlsruhe, 76131, Baden-Württemberg, Germany
Andreas Geyer-Schulz
Information Systems, University of Hildesheim, Marienburger Platz 22, Hildesheim, 31141, Germany
Lars Schmidt-Thieme
Institute for Information Systems, and Management (IISM), Karlsruhe Institute of Technology (KIT), Kaiserstraße 12, Karlsruhe, 76128, Germany
Jonas Kunze

About this paper

Cite this paper

Schiffner, J., Bischl, B., Weihs, C. (2012). Bias-Variance Analysis of Local Classification Methods. In: Gaul, W., Geyer-Schulz, A., Schmidt-Thieme, L., Kunze, J. (eds) Challenges at the Interface of Data Analysis, Computer Science, and Optimization. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24466-7_6

DOI: https://doi.org/10.1007/978-3-642-24466-7_6
Published: 05 January 2012
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24465-0
Online ISBN: 978-3-642-24466-7
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
