Regularization in skewed binary classification

Lee, Sauchi Stephen

doi:10.1007/s001800050018

Published: 09 September 1999

Computational Statistics, Volume 14, pages 277–292 (1999)


Summary

Skewed binary classification concerns the assignment of a new, unknown object to one of two populations, 0 or 1, on the basis of a q-dimensional vector x = (x₁, …, x_q), where one of the populations, say population 0, is the prevalent class. Assignment rules are developed from learning samples of known objects, that is, objects known to come from each of the two populations. Because population 1 is the rare class, many classification models are prone to overfitting and generalize poorly. We propose an effective solution that assigns more weight to class 1: produce noisy replicates of the rare cases while keeping the dominant class 0 cases unchanged. The classification models considered are the nearest neighbor method, neural networks, classification trees, and quadratic discriminant analysis. Noisy replication of the rare cases was applied to three real-world and simulated data sets. Encouraging results were obtained for all the classification models considered.
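The idea of noisy replication described in the summary can be illustrated with a minimal sketch. The summary does not specify the noise distribution, its scale, or the number of replicates, so everything here is an assumption made for illustration: the function name `replicate_rare_with_noise`, the parameters `n_copies` and `noise_scale`, and the choice of Gaussian noise scaled by each feature's standard deviation within the rare class are hypothetical, not the paper's procedure.

```python
import random
import statistics

def replicate_rare_with_noise(X, y, n_copies=5, noise_scale=0.1, rare_label=1, seed=0):
    """Return (X_aug, y_aug): the original data plus noisy copies of rare-class rows."""
    rng = random.Random(seed)
    rare_rows = [row for row, label in zip(X, y) if label == rare_label]
    if not rare_rows:
        return [list(row) for row in X], list(y)
    # Assumption: Gaussian noise whose spread is a fraction (noise_scale) of each
    # feature's standard deviation within the rare class.
    sds = [statistics.pstdev(col) for col in zip(*rare_rows)]
    X_aug = [list(row) for row in X]  # all original rows, both classes, are kept unchanged
    y_aug = list(y)
    for row in rare_rows:
        for _ in range(n_copies):
            X_aug.append([v + rng.gauss(0.0, noise_scale * sd) for v, sd in zip(row, sds)])
            y_aug.append(rare_label)
    return X_aug, y_aug

# Toy data: 95 dominant-class rows and 5 rare-class rows in q = 3 dimensions.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(100)]
y = [0] * 95 + [1] * 5
X_aug, y_aug = replicate_rare_with_noise(X, y, n_copies=10)
print(len(X_aug), sum(y_aug))  # 150 55
```

The augmented learning sample could then be passed to any of the classifiers named in the summary. Replication is applied only to the learning sample, so that test cases stay untouched for evaluation.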

Author information

Authors and Affiliations

Division of Statistics, University of Idaho, Moscow, ID, 83844, USA
Sauchi Stephen Lee

About this article

Cite this article

Lee, S.S. Regularization in skewed binary classification. Computational Statistics 14, 277–292 (1999). https://doi.org/10.1007/s001800050018

Issue Date: July 1999

Keywords

ROC curve
