Abstract
Baseline kernel minimum squared error (KMSE) has two obvious limitations: its solution is not sparse, and the underlying problem is ill-posed. Previous sparse methods for KMSE address the second limitation with a regularization strategy, which adds the computational cost of determining the regularization parameter. This paper therefore proposes a constructive sparse algorithm for KMSE (CS-KMSE) and an improved version (ICS-KMSE) that address both limitations simultaneously. Using the Householder transformation, CS-KMSE selects as significant nodes the training samples that yield the largest reductions in the objective function. ICS-KMSE adds a replacement mechanism based on Givens rotation, which gives it better sparseness than CS-KMSE. Neither algorithm requires a regularization parameter before it begins choosing significant nodes, which saves model selection time. More importantly, both algorithms terminate with an early stopping strategy that acts as an implicit regularization term, which avoids overfitting and controls the sparsity level of the solution. Finally, in comparison with other algorithms, both ICS-KMSE and CS-KMSE achieve superior sparseness, and extensive comparisons confirm their effectiveness and feasibility.
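The selection idea summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' CS-KMSE implementation: it assumes an RBF kernel, omits any bias term, and uses a plain relative-gain threshold as the early stopping rule. The names `rbf_kernel`, `greedy_householder_select`, `max_nodes`, and `tol` are hypothetical. The Givens-rotation replacement step of ICS-KMSE is not shown.

```python
import numpy as np


def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian kernel matrix between the rows of X and the rows of Z (assumed kernel)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)


def greedy_householder_select(K, y, max_nodes=10, tol=1e-3):
    """Greedily pick kernel columns that most reduce ||K[:, S] @ beta - y||^2.

    Illustrative sketch only. Returns the selected column indices and the
    least-squares coefficients for those columns.
    """
    A = np.array(K, dtype=float)   # working copy, triangularized in place
    c = np.array(y, dtype=float)   # target vector, transformed alongside A
    n, m = A.shape
    total = float(c @ c)
    cols = np.arange(m)            # tracks the column permutation
    k = 0
    while k < min(max_nodes, n, m):
        sub = A[k:, k:]
        norms2 = (sub ** 2).sum(axis=0)
        proj = sub.T @ c[k:]
        # Reduction of the squared error if each remaining candidate were added.
        gains = np.where(norms2 > 1e-12, proj ** 2 / np.maximum(norms2, 1e-12), 0.0)
        j = int(np.argmax(gains))
        if gains[j] <= tol * total:
            break                  # early stopping: no candidate helps enough
        # Move the best candidate into position k.
        A[:, [k, k + j]] = A[:, [k + j, k]]
        cols[[k, k + j]] = cols[[k + j, k]]
        # Householder reflection that zeroes the column below the diagonal.
        v = A[k:, k].copy()
        alpha = -np.copysign(np.linalg.norm(v), v[0])
        v[0] -= alpha
        v /= np.linalg.norm(v)
        A[k:, k:] -= 2.0 * np.outer(v, v @ A[k:, k:])
        c[k:] -= 2.0 * v * (v @ c[k:])
        k += 1
    R = np.triu(A[:k, :k])         # triangular factor of the selected columns
    beta = np.linalg.solve(R, c[:k]) if k else np.zeros(0)
    return cols[:k], beta


# Toy usage on synthetic two-class data (labels in {-1, +1}).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.sign(X[:, 0] * X[:, 1])
K = rbf_kernel(X, X, gamma=1.0)
idx, beta = greedy_householder_select(K, y, max_nodes=8, tol=1e-3)
print("selected nodes:", idx)
```

Because the reflections are orthogonal, the squared error removed by each new node can be read directly from the transformed target vector, so no regularization parameter is needed during selection; in this sketch only the stopping threshold limits the model size.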
Acknowledgments
This research was supported by the Fundamental Research Funds for the Central Universities under Grant no. NJ20160021. The authors also wish to thank the anonymous reviewers for their constructive comments and great help in the writing process, which improved the manuscript significantly.
Cite this article
Zhao, YP., Xi, PP., Li, B. et al. Sparse kernel minimum squared error using Householder transformation and givens rotation. Appl Intell 48, 390–415 (2018). https://doi.org/10.1007/s10489-017-0978-0
DOI: https://doi.org/10.1007/s10489-017-0978-0