Cost-Based Classifier Evaluation for Imbalanced Problems

Landgrebe, Thomas; Paclík, Pavel; Tax, David M. J.; Verzakov, Serguei; Duin, Robert P. W.

doi:10.1007/978-3-540-27868-9_83


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3138)

Included in the following conference series:

Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)


Abstract

A common assumption in pattern recognition is that the class priors in the training set are representative of the true class distributions. However, this assumption does not always hold: the true class distributions may differ from those seen in training, and may vary significantly. As a result, the cost incurred by a given classifier may be worse than expected. In this paper we address this issue, discussing a theoretical framework and methodology for assessing the effect on cost for a classifier in imbalanced conditions. The methodology can be applied to many different types of costs. Experiments on artificial data show how the methodology can be used to assess and compare classifiers. It is observed that classifiers that model the underlying distributions well are more resilient to changes in the true class distribution than weaker classifiers.
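As a rough illustration of the kind of question the abstract raises, the sketch below evaluates the standard expected misclassification cost of a two-class classifier at a fixed operating point while the true positive-class prior is varied. This is not the paper's framework; the function names, the error rates, and the unit costs are all hypothetical values chosen for illustration.

```python
def expected_cost(fnr, fpr, prior_pos, cost_fn=1.0, cost_fp=1.0):
    """Expected cost per example for a fixed operating point.

    fnr, fpr  : false negative / false positive rates (assumed fixed,
                i.e. the class-conditional behaviour does not change)
    prior_pos : true prior probability of the positive class
    cost_fn, cost_fp : hypothetical per-error costs
    """
    if not 0.0 <= prior_pos <= 1.0:
        raise ValueError("prior_pos must lie in [0, 1]")
    return prior_pos * fnr * cost_fn + (1.0 - prior_pos) * fpr * cost_fp


def cost_under_prior_shift(fnr, fpr, priors, cost_fn=1.0, cost_fp=1.0):
    """Evaluate the expected cost over a sequence of candidate true priors."""
    return [expected_cost(fnr, fpr, p, cost_fn, cost_fp) for p in priors]


if __name__ == "__main__":
    priors = [0.5, 0.2, 0.05]
    # Two hypothetical classifiers, described only by their error rates.
    cost_a = cost_under_prior_shift(fnr=0.10, fpr=0.10, priors=priors)
    cost_b = cost_under_prior_shift(fnr=0.30, fpr=0.05, priors=priors)
    for p, a, b in zip(priors, cost_a, cost_b):
        print(f"prior={p:.2f}  cost A={a:.4f}  cost B={b:.4f}")
```

With these made-up rates, classifier A is cheaper at a balanced prior while classifier B becomes cheaper as the positive class grows rare, which is the sort of reversal that makes it worth evaluating cost across a range of priors and not only at the training prior.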



Author information

Authors and Affiliations

Elect. Eng., Maths and Comp. Sc., Delft University of Technology, The Netherlands
Thomas Landgrebe, Pavel Paclík, David M. J. Tax, Serguei Verzakov & Robert P. W. Duin


Editor information

Editors and Affiliations

Instituto Superior Técnico, Instituto de Telecomunicações, Lisbon, Portugal
Ana Fred
RSISE, the Australian National University, ACT 0200, Canberra, Australia
Terry M. Caelli
Information and Communication Theory Group, Delft University of Technology, P.O. Box 5031, 2600GA, Delft, The Netherlands
Robert P. W. Duin
FEUP - Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, 4200-465, Porto, Portugal
Aurélio C. Campilho
Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Information and Communication Theory Group, Delft, The Netherlands
Dick de Ridder


Cite this paper

Landgrebe, T., Paclík, P., Tax, D.M.J., Verzakov, S., Duin, R.P.W. (2004). Cost-Based Classifier Evaluation for Imbalanced Problems. In: Fred, A., Caelli, T.M., Duin, R.P.W., Campilho, A.C., de Ridder, D. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR /SPR 2004. Lecture Notes in Computer Science, vol 3138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27868-9_83


DOI: https://doi.org/10.1007/978-3-540-27868-9_83
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22570-6
Online ISBN: 978-3-540-27868-9
eBook Packages: Springer Book Archive
