A Theoretical Analysis of the Peaking Phenomenon in Classification

Zollanvari, Amin; James, Alex Pappachen; Sameni, Reza

doi:10.1007/s00357-019-09327-3

Published: 11 July 2019

Journal of Classification, Volume 37, pages 421–434 (2020)

Abstract

In this work, we analytically study the peaking phenomenon in the context of linear discriminant analysis in the multivariate Gaussian model under the assumption of a common known covariance matrix. The focus is on the finite-sample setting, where the sample size and the observation dimension are comparable. To study the phenomenon in such a setting, we use an asymptotic technique whereby the number of sample points is kept comparable in magnitude to the dimensionality of observations. The analysis provides a more thorough picture of the phenomenon. In particular, it shows that as long as the Relative Cumulative Efficacy of an additional Feature set (RCEF) is greater (less) than the size of this set, the expected error of the classifier constructed using these additional features will be less (greater) than the expected error of the classifier constructed without them. Our result highlights underlying factors of the peaking phenomenon relative to the classifier used in this study and, at the same time, calls into question the classical wisdom around the peaking phenomenon.
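
The peaking behavior described above can be illustrated with a minimal Monte Carlo sketch. This is not the asymptotic analysis used in the article, and it does not compute RCEF, whose definition is not given in the abstract. It assumes two Gaussian classes with a known identity covariance, equal priors, n training points per class, and a hypothetical vector delta of per-feature mean differences that shrink with the feature index. The function name expected_errors and all numeric values are illustrative assumptions.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_errors(delta, n, trials=2000, seed=0):
    """Monte Carlo estimate of the expected true error of LDA with a known
    identity covariance, using the first p features for p = 1..len(delta).

    delta[j] is an assumed difference between the class means in feature j;
    n is the number of training points per class.
    """
    rng = random.Random(seed)
    p_max = len(delta)
    sd = 1.0 / math.sqrt(n)  # std. dev. of each sample-mean coordinate
    totals = [0.0] * p_max
    for _ in range(trials):
        a = b = s = 0.0  # running sums over the first p features
        for j in range(p_max):
            mu0, mu1 = delta[j] / 2.0, -delta[j] / 2.0
            m0 = rng.gauss(mu0, sd)  # sample mean of class 0, feature j
            m1 = rng.gauss(mu1, sd)  # sample mean of class 1, feature j
            d = m0 - m1
            mid = 0.5 * (m0 + m1)
            a += (mu0 - mid) * d
            b += (mu1 - mid) * d
            s += d * d
            norm = math.sqrt(s)
            # Discriminant W(x) = (x - mid)^T d; assign class 0 when W > 0.
            err0 = normal_cdf(-a / norm)  # class-0 point assigned to class 1
            err1 = normal_cdf(b / norm)   # class-1 point assigned to class 0
            totals[j] += 0.5 * (err0 + err1)
    return [t / trials for t in totals]


if __name__ == "__main__":
    # Assumed setup: 30 features with decaying usefulness, 5 points per class.
    delta = [1.0 / (j + 1) for j in range(30)]
    for p, err in enumerate(expected_errors(delta, n=5), start=1):
        print(p, round(err, 4))
```

Under these assumptions, the later features carry little class separation but still add estimation noise to the discriminant, so the estimated expected error with all 30 features ends up higher than the best error reached with a smaller subset.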

Acknowledgements

This material is based in part upon work supported by the Nazarbayev University Faculty Development Competitive Research Grant, under award number SOE2018008.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Nazarbayev University, Nur-Sultan, Kazakhstan
Amin Zollanvari & Alex Pappachen James
Department of Computer Science & Engineering and Information Technology, Shiraz University, Shiraz, Iran
Reza Sameni

Corresponding author

Correspondence to Amin Zollanvari.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zollanvari, A., James, A.P. & Sameni, R. A Theoretical Analysis of the Peaking Phenomenon in Classification. J Classif 37, 421–434 (2020). https://doi.org/10.1007/s00357-019-09327-3

Published: 11 July 2019
Issue Date: July 2020
DOI: https://doi.org/10.1007/s00357-019-09327-3
