Abstract
The Bayes classifier depends on the class-conditional densities and the prior probabilities. Among the many possible density functions, the Gaussian density has received the most attention, mainly because of its analytical tractability. The parameters of the Bayes classifier for Gaussian data are generally unknown, so estimates are computed for the mean vector \(\hat{\varvec{\mu}}\) and the covariance matrix \(\hat{\varSigma}\). When a pattern is inserted into the training set of class \(\omega_i\), the parameters \(\hat{\varvec{\mu}}_i\) and \(\hat{\varSigma}_i\) change by amounts \(\varDelta\hat{\varvec{\mu}}_i\) and \(\varDelta\hat{\varSigma}_i\), respectively. The insertion of a single pattern therefore causes a perturbation, and we claim that this perturbation can be exploited for supervised classification. Based on this observation, we propose a supervised classifier, the Perturbation-based Classifier (PerC), which assigns to a query pattern the class that exhibits the smallest perturbation after the query pattern is inserted into it. The rationale is that adding a pattern that belongs to a specific class should not substantially alter the distribution of that class. PerC uses only the perturbations (\(\varDelta\hat{\varvec{\mu}}_i\) and \(\varDelta\hat{\varSigma}_i\)) to decide the class of a query pattern, so it is a parameter-free classifier. The proposed method was assessed on 21 datasets from the UCI Machine Learning Repository, and its results were compared with classifiers from the literature. The results show that PerC obtains very competitive recognition rates.
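To make the decision rule concrete, the sketch below implements it in NumPy. The incremental updates \(\varDelta\hat{\varvec{\mu}}_i = ({\varvec{x}} - \hat{\varvec{\mu}}_i)/(n_i+1)\) and the corresponding rank-one covariance update are standard identities; however, the way the two perturbations are combined into a single score here (Euclidean norm of \(\varDelta\hat{\varvec{\mu}}_i\) plus Frobenius norm of \(\varDelta\hat{\varSigma}_i\)) is an illustrative assumption, not necessarily the criterion defined in the paper.

```python
import numpy as np

class PerC:
    """Minimal sketch of the Perturbation-based Classifier (PerC).

    A query pattern is tentatively inserted into each class, and the class
    whose estimated mean and covariance change the least is predicted. The
    scoring rule below is an assumption made for illustration.
    """

    def fit(self, X, y):
        self.stats_ = {}
        for c in np.unique(y):
            Xc = X[y == c]
            # Per-class sample size, mean vector, and (biased) covariance.
            self.stats_[c] = (len(Xc), Xc.mean(axis=0),
                              np.cov(Xc, rowvar=False, bias=True))
        return self

    def predict(self, X):
        return np.array([self._predict_one(x) for x in X])

    def _predict_one(self, x):
        scores = {}
        for c, (n, mu, cov) in self.stats_.items():
            d = x - mu
            # Standard incremental updates after inserting x into class c:
            # mu' = mu + d/(n+1) and Sigma' = n/(n+1) * (Sigma + d d^T/(n+1)).
            delta_mu = d / (n + 1)
            delta_cov = (n / (n + 1)) * (cov + np.outer(d, d) / (n + 1)) - cov
            # Assumed combination: ||delta_mu||_2 + ||delta_Sigma||_F.
            scores[c] = np.linalg.norm(delta_mu) + np.linalg.norm(delta_cov, "fro")
        return min(scores, key=scores.get)

# Toy usage: three well-separated Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, size=(50, 2)) for m in (0.0, 4.0, 8.0)])
y = np.repeat([0, 1, 2], 50)
print(PerC().fit(X, y).predict(np.array([[4.1, 3.9]])))  # -> [1]
```

Because only \(\varDelta\hat{\varvec{\mu}}_i\) and \(\varDelta\hat{\varSigma}_i\) enter the score, the sketch, like PerC itself, requires no tuning parameters.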
Notes
To be more precise, we should also require that \(\left\| \varDelta {\varvec{x}}\right\| \rightarrow 0\). However, we do not assume this because the query vector \({\varvec{x}}\) itself does not change.
References
Achieser NI (2013) Theory of approximation. Courier Corporation, North Chelmsford
Ade RR, Deshmukh PR (2013) Methods for incremental learning: a survey. Int J Data Min Knowl Manage Process 3(4):119–125
Bache K, Lichman M (2013) UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences. http://archive.ics.uci.edu/ml
Burges CJ (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2):121–167
Cheng B, Titterington DM (1994) Neural networks: a review from a statistical perspective. Stat Sci 9(1):2–54
Cooper GF, Herskovits E (1992) A Bayesian method for the induction of probabilistic networks from data. Mach Learn 9(4):309–347
Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27
de Jesús Rubio J (2017) A method with neural networks for the classification of fruits and vegetables. Soft Comput 21(23):7207–7220
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Springer, Berlin
Ding J, Wang H, Li C, Chai T, Wang J (2017) An online learning neural network ensembles with random weights for regression of sequential data stream. Soft Comput 21(20):5919–5937
Duda RO, Hart PE (1973) Pattern classification and scene analysis. Wiley, New York
Duda RO, Hart PE, Stork DG (2012) Pattern classification. Wiley, New York
Evgeniou T, Poggio T, Pontil M, Verri A (2002) Regularization and statistical learning theory for data analysis. Comput Stat Data Anal 38(4):421–432
Flores MJ, Gámez JA, Martínez AM, Puerta JM (2009) GAODE and HAODE: two proposals based on AODE to deal with continuous variables. In: Proceedings of the 26th annual international conference on machine learning, pp 313–320
Friedman N, Geiger D, Goldszmidt M (1997) Bayesian network classifiers. Mach Learn 29(2–3):131–163
Fukunaga K (1972) Introduction to statistical pattern recognition, 1st edn. Academic Press, Orlando
Hoffbeck JP, Landgrebe DA (1996) Covariance matrix estimation and classification with limited training data. IEEE Trans Pattern Anal Mach Intell 18(7):763–767
Iosifidis A, Tefas A, Pitas I (2013) On the optimal class representation in linear discriminant analysis. IEEE Trans Neural Netw Learn Syst 24(9):1491–1497
Jain AK, Duin RPW, Mao J (2000) Statistical pattern recognition: a review. IEEE Trans Pattern Anal Mach Intell 22(1):4–37
Kivinen J, Smola AJ, Williamson RC (2004) Online learning with kernels. IEEE Trans Signal Process 52(8):2165–2176
Kumar R, Srivastava S, Gupta J (2017) Modeling and adaptive control of nonlinear dynamical systems using radial basis function network. Soft Comput 21(15):4447–4463
Kuo BC, Landgrebe DA (2002) A covariance estimator for small sample size classification problems and its application to feature extraction. IEEE Trans Geosci Remote Sens 40(4):814–819
Ledoit O, Wolf M (2004) A well-conditioned estimator for large-dimensional covariance matrices. J Multivar Anal 88:365–411
Liu P, Choo KKR, Wang L, Huang F (2017) Svm or deep learning? A comparative study on remote sensing image classification. Soft Comput 21(23):7053–7065
Lutz A, Rodner E, Denzler J (2013) I want to know more—efficient multi-class incremental learning using Gaussian processes. Pattern Recognit Image Anal 23(3):402–407
Lutz A, Rodner E, Denzler J (2011) Efficient multi-class incremental learning using Gaussian processes. In: Open German-Russian workshop on pattern recognition and image understanding, pp 182–185
Mitchell TM (1997) Machine learning. McGraw-Hill, Boston
Pérez A, Larrañaga P, Inza I (2006) Supervised classification with conditional Gaussian networks: increasing the structure complexity from naive Bayes. Int J Approx Reason 43(1):1–25
Pérez A, Larrañaga P, Inza I (2009) Bayesian classifiers based on kernel density estimation: flexible classifiers. Int J Approx Reason 50(2):341–362
Perron F (1992) Minimax estimators of a covariance matrix. J Multivar Anal 43(1):16–28
Searle SR (1982) Matrix algebra useful for statistics. Wiley, New York
Tadjudin S, Landgrebe DA (1999) Covariance estimation with limited training samples. IEEE Trans Geosci Remote Sens 37(4):2113–2118
Theodoridis S, Koutroumbas K (2008) Pattern recognition, 4th edn. Academic Press, California
van Wieringen WN (2017) On the mean squared error of the ridge estimator of the covariance and precision matrix. Stat Probab Lett 123:88–92
Wu WB, Xiao H (2012) Covariance matrix estimation in time series. In: Rao TS, Rao SS, Rao CR (eds) Time series analysis: methods and applications. Handbook of statistics, vol 30. Elsevier, Amsterdam, pp 187–209
Zhu F, Yang J, Xu S, Gao C, Ye N, Yin T (2017) Incorporating neighbors distribution knowledge into support vector machines. Soft Comput 21(21):6407–6420
Acknowledgements
The authors would like to thank the Brazilian agencies CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior), CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) and FACEPE (Fundação de Amparo à Ciência e Tecnologia de Pernambuco).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Human and animal rights statement
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Araújo, E.L., Cavalcanti, G.D.C. & Ren, T.I. Perturbation-based classifier. Soft Comput 24, 16565–16576 (2020). https://doi.org/10.1007/s00500-020-04960-2