Robustification of Gaussian Bayes Classifier by the Minimum β-Divergence Method

Rahaman, Md. Matiur; Mollah, Md. Nurul Haque

doi:10.1007/s00357-019-9306-1

Published: 26 April 2019

Volume 36, pages 113–139, (2019)
Journal of Classification

Abstract

The goal of classification is to assign new objects to one of several known populations. A common problem with most existing classifiers is that they are highly sensitive to outliers. To overcome this problem, several authors have attempted to robustify classifiers, including Gaussian Bayes classifiers, through robust estimation of the mean vectors and covariance matrices. However, these types of robust classifiers work well only when the training datasets alone are contaminated by outliers. Like the traditional classifiers, they produce misleading results when the test data vectors are also contaminated. Most of them also perform poorly as the number of variables in the dataset is gradually increased while the sample size is held fixed. As a remedy for these problems, we propose a highly robust Gaussian Bayes classifier based on the minimum β-divergence method. The performance of the proposed method depends on the value of the tuning parameter β, the initialization of the Gaussian parameters, the detection of outlying test vectors, and the detection of their variable-wise outlying components. We discuss techniques for improving the performance of the proposed method by addressing these issues. The proposed classifier reduces to the MLE-based Gaussian Bayes classifier as β → 0. The performance of the proposed method is investigated using both synthetic and real datasets. In the presence of outliers, the proposed method outperforms the traditional and other robust linear classifiers; in their absence, it performs equally well.
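To make the role of β concrete, the following is a minimal sketch of a minimum β-divergence estimator for a single Gaussian, reduced to one variable for brevity. It assumes the β-weight form exp(−β/2 · (x − μ)²/σ²) and the weighted fixed-point updates commonly given for this estimator in the cited β-divergence literature; the function names, the median/MAD initialization, and the toy data are illustrative assumptions, not the authors' implementation. Setting β = 0 makes every weight equal to 1, so the updates collapse to the ordinary MLE, which matches the limiting behaviour stated in the abstract.

```python
import math
import statistics


def beta_weights(xs, mu, var, beta):
    """Weight exp(-beta/2 * (x - mu)^2 / var) for each observation."""
    return [math.exp(-0.5 * beta * (x - mu) ** 2 / var) for x in xs]


def min_beta_gaussian(xs, beta, n_iter=200, tol=1e-12):
    """Fixed-point iteration for a univariate Gaussian (hypothetical helper)."""
    # Assumed robust initialization: median and scaled MAD.
    mu = statistics.median(xs)
    mad = statistics.median([abs(x - mu) for x in xs])
    var = (1.4826 * mad) ** 2
    for _ in range(n_iter):
        w = beta_weights(xs, mu, var, beta)
        total = sum(w)
        # Weighted mean; outliers receive weights close to zero when beta > 0.
        new_mu = sum(wi * x for wi, x in zip(w, xs)) / total
        # Weighted variance, rescaled by (1 + beta) to offset the down-weighting.
        new_var = (1.0 + beta) * sum(
            wi * (x - new_mu) ** 2 for wi, x in zip(w, xs)
        ) / total
        converged = abs(new_mu - mu) + abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if converged:
            break
    return mu, var


inliers = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
data = inliers + [50.0, 55.0]  # two gross outliers (toy data)

mle_mu, mle_var = min_beta_gaussian(data, beta=0.0)  # beta = 0: plain MLE
rob_mu, rob_var = min_beta_gaussian(data, beta=0.5)  # beta > 0: robust fit
print(mle_mu, mle_var)
print(rob_mu, rob_var)
```

On this toy sample the β = 0 fit is pulled far toward the two outliers, while the β = 0.5 fit stays near the centre of the inliers. The multivariate case replaces the squared standardized distance with the Mahalanobis distance and the scalar variance with a covariance matrix.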

References

Agostinelli, C., Leung, A., Yohai, V.J., Zamar, R.H. (2015). Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination. TEST, 24, 441–461. https://doi.org/10.1007/s11749-015-0450-6.
Alqallaf, F.A. (2003). A new contamination model for robust estimation with large high dimensional data sets. PhD thesis, University of British Columbia.
Anderson, T.W. (2003). An introduction to multivariate statistical analysis. Wiley Interscience.
Basu, A., Harris, I.R., Hjort, N.L., Jones, M.C. (1998). Robust and efficient estimation by minimising a density power divergence. Biometrika, 85, 549–559.
Chork, C., & Rousseeuw, P.J. (1992). Integrating a high breakdown option into discriminant analysis in exploration geochemistry. Journal of Geochemical Exploration, 43, 191–203.
Croux, C., & Dehon, C. (2001). Robust linear discriminant analysis using S-estimators. The Canadian Journal of Statistics, 29, 473–492.
Donoho, D.L., & Huber, P.J. (1983). The notion of breakdown point. In Bickel, P.J., Doksum, K., Hodges, Jr. J.L. (Eds.) A Festschrift for Erich L. Lehmann (pp. 157–184). Belmont: Wadsworth.
Hawkins, D.M., & McLachlan, G. (1997). High-breakdown linear discriminant analysis. Journal of the American Statistical Association, 92, 136–143.
He, X., & Fung, W. (2000). High breakdown estimation for multiple populations with applications to discriminant analysis. Journal of Multivariate Analysis, 72, 151–162.
Hubert, M., & Driessen, K.V. (2004). Fast and robust discriminant analysis. Computational Statistics and Data Analysis, 45, 301–320.
Hubert, M., & Debruyne, M. (2010). Minimum covariance determinant. Advanced review (Vol. 2). Wiley.
Johnson, R.A., & Wichern, D.W. (2007). Applied multivariate statistical analysis, 6th edn. Prentice-Hall.
Lopuhaä, H.P., & Rousseeuw, P.J. (1991). Breakdown points of affine equivariant estimators of multivariate location and covariance matrices. The Annals of Statistics, 19, 229–248.
Maronna, R.A. (1976). Robust M-estimators of multivariate location and scatter. The Annals of Statistics, 4, 51–67.
Maronna, R., & Zamar, R. (2002). Robust estimation of location and dispersion for high-dimensional datasets. Technometrics, 44, 307–317.
Minami, M., & Eguchi, S. (2002). Robust blind source separation by β-divergence. Neural Computation, 14, 1859–1886.
Mollah, M.N.H., Minami, M., Eguchi, S. (2006). Exploring latent structure of mixture ICA models by the minimum β-divergence method. Neural Computation, 18, 166–190.
Mollah, M.N.H., Minami, M., Eguchi, S. (2007). Robust prewhitening for ICA by minimizing β-divergence and its application to FastICA. Neural Processing Letters, 25(2), 91–110.
Mollah, M.N.H., Sultana, N., Minami, M., Eguchi, S. (2010). Robust extraction of local structures by the minimum β-divergence method. Neural Networks, 23, 226–238.
Puranen, J. (2006). Fish catch data set. http://www.amstat.org/publications/jse/datasets/fishcatch.txt.
Randles, R.H., Broffitt, J.D., Ramberg, J.R., Hogg, R.V. (1978). Generalized linear and quadratic discriminant functions using robust estimates. Journal of the American Statistical Association, 73, 564–568.
Todorov, V.K. (2006). rrcov, Scalable Robust Estimators with high breakdown point. R package.
Todorov, V., & Pires, A.M. (2007). Comparative performance of several robust linear discriminant analysis methods. REVSTAT-Statistical Journal, 5, 63–83.
Todorov, V., Neykov, N., Neytchev, P. (1990). Robust selection of variables in the discriminant analysis based on MVE and MCD estimators. In Proceedings in computational statistics COMPSTAT. Heidelberg: Physica.
Todorov, V., Neykov, N., Neytchev, P. (1994). Robust two-group discrimination by bounded influence regression. Computational Statistics and Data Analysis, 17, 289–302.

Author information

Authors and Affiliations

Department of Statistics, Faculty of Science, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj, 8100, Bangladesh
Md. Matiur Rahaman
Department of Statistics, University of Rajshahi, Rajshahi, 6205, Bangladesh
Md. Matiur Rahaman & Md. Nurul Haque Mollah

Corresponding author

Correspondence to Md. Nurul Haque Mollah.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Figures 3a–b and 4a–b show the performance of the proposed robust Bayesian LDA and QDA under different conditions. In each figure, training (left) and test (right) MER are plotted against the common difference between the components of the two mean vectors, for the data structure D_1B with p = 15 variables. Figure 3 covers equal covariance matrices (V₁ = V₂): Fig. 3a in the absence of outliers and Fig. 3b in the presence of 15% outlying data vectors. In both cases the proposed LDA shows smaller test MER than the proposed QDA. Figure 4 covers unequal covariance matrices (V₁ ≠ V₂): Fig. 4a in the absence of outliers and Fig. 4b in the presence of 15% outlying data vectors. In both cases the proposed QDA shows smaller test MER than the proposed LDA. Thus the proposed LDA performs better than the proposed QDA when the covariance matrices are equal, while the proposed QDA performs better than the proposed LDA when they are unequal.
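The LDA/QDA contrast described above can be illustrated with a small univariate sketch of the plug-in Gaussian Bayes rule. It is not the proposed robust classifier: the parameters here are plain MLE estimates, and the helper names and the toy data (two classes with clearly unequal variances, no outliers) are assumptions made for the example. LDA plugs a pooled variance into every class score, whereas QDA keeps one variance per class; MER is the fraction of misclassified observations.

```python
import math


def fit_gaussian(xs):
    """MLE mean and variance of one class (hypothetical helper)."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var


def bayes_predict(x, params, priors):
    """Index of the class with the largest Gaussian log-posterior score."""
    scores = [
        math.log(p) - 0.5 * math.log(v) - 0.5 * (x - m) ** 2 / v
        for (m, v), p in zip(params, priors)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def mer(samples, params, priors):
    """Misclassification error: share of points assigned to the wrong class."""
    errors = sum(
        bayes_predict(x, params, priors) != j
        for j, xs in enumerate(samples)
        for x in xs
    )
    return errors / sum(len(xs) for xs in samples)


# Toy data: class 0 is tight, class 1 is widely spread (unequal variances).
samples = [[-1.0, -0.25, 0.0, 0.25, 1.0], [-6.0, -4.0, 4.0, 5.0, 6.0]]
priors = [0.5, 0.5]

qda_params = [fit_gaussian(xs) for xs in samples]  # one variance per class
n = sum(len(xs) for xs in samples)
pooled = sum(len(xs) * v for xs, (_, v) in zip(samples, qda_params)) / n
lda_params = [(m, pooled) for m, _ in qda_params]  # common pooled variance

mer_lda = mer(samples, lda_params, priors)
mer_qda = mer(samples, qda_params, priors)
print(mer_lda, mer_qda)
```

With unequal variances the pooled-variance rule reduces to a single cut point between the two means and misclassifies points on the far side of the wide class, while the class-specific rule separates the tight class from the wide one. This is the same qualitative pattern the appendix reports for V₁ ≠ V₂.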

Table 4 Average misclassification error (MER) over 500 simulated datasets of type D₂ with p = 5, 15, 50, and 75 variables in the presence of outliers, computed by the classical, FSA, MCD-A, MCD-B, MCD-C, OGK, and proposed methods, where m = 3, n_j = 100 (j = 1, 2, 3), and the difference among the m mean vectors is τ

However, it should be mentioned that the performance of the proposed method also depends on the cutoff (threshold) value δ of the β_j-weights used for outlier detection, as discussed in Section 2.3.1. Figure 5 shows plots of MER (top) and ROC curves (bottom) comparing the proposed cutoff value δ of the β_j-weight function with the cutoff value used by Mollah et al. (2010) for outlier detection, for the data structure D_1B with p = 15 variables. We observe that the proposed cutoff value yields smaller MER and larger TPR (true positive rate) for the proposed classifiers.
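As a rough illustration of how a cutoff on β-weights turns into an outlier detector and a point on a ROC curve, the sketch below flags a test value when its largest class-wise β-weight falls below δ and then computes TPR and FPR against known outlier labels. The decision rule (maximum over classes), the class parameters, β = 0.5, the two δ values, and the test points are all assumptions chosen for the example; the cutoff actually proposed in Section 2.3.1 is not reproduced here.

```python
import math


def beta_weight(x, mu, var, beta):
    """Beta-weight of x under a univariate Gaussian with the given parameters."""
    return math.exp(-0.5 * beta * (x - mu) ** 2 / var)


def flag_outliers(xs, params, beta, delta):
    """Flag x when its largest class-wise beta-weight is below delta."""
    return [
        max(beta_weight(x, m, v, beta) for m, v in params) < delta
        for x in xs
    ]


def tpr_fpr(flags, truth):
    """True and false positive rates of the outlier flags."""
    tp = sum(f and t for f, t in zip(flags, truth))
    fp = sum(f and not t for f, t in zip(flags, truth))
    positives = sum(truth)
    negatives = len(truth) - positives
    return tp / positives, fp / negatives


# Assumed class parameters (mean, variance) and toy test points.
params = [(0.0, 1.0), (5.0, 1.0)]
test_xs = [0.2, -1.0, 2.5, 4.5, 6.0, 15.0, -10.0]
is_outlier = [False, False, False, False, False, True, True]

# Two candidate cutoffs give two operating points on the ROC curve.
tpr_small, fpr_small = tpr_fpr(
    flag_outliers(test_xs, params, beta=0.5, delta=0.1), is_outlier
)
tpr_large, fpr_large = tpr_fpr(
    flag_outliers(test_xs, params, beta=0.5, delta=0.5), is_outlier
)
print(tpr_small, fpr_small)
print(tpr_large, fpr_large)
```

Sweeping δ over a grid and plotting the resulting (FPR, TPR) pairs gives a ROC curve of the kind shown in Fig. 5. In this toy case the larger cutoff also flags the legitimate point lying midway between the two class means, which raises the false positive rate without improving the true positive rate.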

About this article

Cite this article

Rahaman, M.M., Mollah, M.N.H. Robustification of Gaussian Bayes Classifier by the Minimum β-Divergence Method. J Classif 36, 113–139 (2019). https://doi.org/10.1007/s00357-019-9306-1

Issue Date: 15 April 2019