Abstract
The matrix-pattern-oriented Ho–Kashyap classifier (MatMHKS) uses two-sided weight vectors to constrain matrixized samples and can therefore handle both vectorized and matrixized samples. For vectorized samples, converting the vector representation into a matrix representation relieves the curse of dimensionality and enriches the ways a sample can be represented. Although MatMHKS has been shown to achieve strong classification performance, it spends considerable time alternately updating the two weight vectors in each iteration. Moreover, MatMHKS is not well suited to imbalanced problems. Finally, no effective analysis of the generalization risk of matrixized classifiers exists. To this end, this paper proposes an efficient matrixized Ho–Kashyap classifier (EMatMHKS), which updates the two-sided weight vectors separately to avoid repeatedly computing the inverse matrix required by MatMHKS, thereby significantly improving training speed. Moreover, by introducing a weight matrix, both balanced and imbalanced situations can be handled. Finally, the PAC-Bayes bound is used to characterize the upper bound on the error of matrixized and vectorized classifiers. Both balanced and imbalanced data sets are used in the experiments to validate the effectiveness and efficiency of the proposed EMatMHKS.
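To make the two-sided decision rule concrete, below is a minimal NumPy sketch of a matrixized (bilinear) classifier of the form f(A) = uᵀAv + b, fitted by alternating regularized least squares: fixing one weight vector reduces each matrix sample to an ordinary feature vector, so the other side can be solved in closed form. The function names, the ridge term, and the plain least-squares update are illustrative assumptions; they mirror the two-sided-weight idea of the MatMHKS family but are not the exact Ho–Kashyap margin updates or the weighted, imbalance-aware scheme of EMatMHKS.

```python
import numpy as np

def matrixized_decision(A, u, v, b):
    """Bilinear (matrixized) decision value f(A) = u^T A v + b."""
    return float(u @ A @ v + b)

def fit_bilinear_lsq(mats, labels, n_iter=20, ridge=1e-3, seed=0):
    """Illustrative alternating least-squares fit of (u, v, b).

    `mats` is a list of equally shaped 2-D arrays, `labels` are in {-1, +1}.
    With v fixed, each matrix sample A collapses to the vector A @ v, so
    solving for (u, b) is an ordinary regularized least-squares problem,
    and symmetrically for (v, b) with u fixed.  This mirrors the two-sided
    weight idea but omits the Ho-Kashyap margin updates of the paper.
    """
    rng = np.random.default_rng(seed)
    d1, d2 = mats[0].shape
    u = rng.normal(size=d1)
    v = rng.normal(size=d2)
    b = 0.0
    y = np.asarray(labels, dtype=float)
    for _ in range(n_iter):
        # Solve for (u, b) with v fixed: features are A @ v.
        X = np.stack([A @ v for A in mats])
        Xb = np.hstack([X, np.ones((len(mats), 1))])
        w = np.linalg.solve(Xb.T @ Xb + ridge * np.eye(d1 + 1), Xb.T @ y)
        u, b = w[:-1], w[-1]
        # Solve for (v, b) with u fixed: features are A^T @ u.
        Z = np.stack([A.T @ u for A in mats])
        Zb = np.hstack([Z, np.ones((len(mats), 1))])
        w = np.linalg.solve(Zb.T @ Zb + ridge * np.eye(d2 + 1), Zb.T @ y)
        v, b = w[:-1], w[-1]
    return u, v, b
```

Note the efficiency intuition the abstract points to: each half-step only solves a system of size (d1 + 1) or (d2 + 1) rather than one of size d1 × d2, which is why keeping the two weight vectors separate keeps the per-iteration cost low for matrix-shaped samples.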




Acknowledgements
This work is supported by the Natural Science Foundation of China under Grant No. 61672227, the 'Shuguang Program' supported by the Shanghai Education Development Foundation and the Shanghai Municipal Education Commission, the Natural Science Foundation of China under Grant No. 61806078, the National Science Foundation of China for Distinguished Young Scholars under Grant No. 61725301, the National Key R&D Program of China under Grant No. 2018YFC0910500, the National Major Scientific and Technological Special Project for "Significant New Drugs Development" under Grant No. 2019ZX09201004, and the Special Fund Project for Shanghai Informatization Development in Big Data under Grant No. 201901043.
Ethics declarations
Conflict of interest
The authors of this manuscript declare that there are no conflicts of interest between this manuscript and other published works.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Zhu, Z., Wang, Z., Li, D. et al. Efficient matrixized classification learning with separated solution process. Neural Comput & Applic 32, 10609–10632 (2020). https://doi.org/10.1007/s00521-019-04595-x