Abstract
In this paper, we revisit the classical technique of regularised least squares (RLS) for nonlinear classification. Specifically, we focus on a low-rank formulation of RLS whose time complexity is linear in the size of the data set alone, independent of both the number of classes and the number of features. This makes low-rank RLS particularly suitable for problems with large data sets and moderate feature dimensions. Moreover, we propose a general theorem that yields a closed-form estimate of the prediction values on a holdout validation set given the low-rank RLS classifier trained on the full training data. It is thus possible to obtain an error estimate for each parameter setting without retraining, which greatly accelerates cross-validation for parameter selection. Experimental results on several large-scale benchmark data sets show that low-rank RLS achieves classification performance comparable to standard kernel SVM for nonlinear classification while being much more efficient, with the efficiency gain more pronounced on data sets of higher dimension.
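The abstract does not spell out the low-rank formulation itself. As a rough illustration only, the following is a minimal sketch of one common Nyström-style (subset-of-regressors) low-rank RLS, assuming an RBF kernel, a one-hot label matrix Y for multi-class one-vs-all training, and m randomly chosen landmark instances; the function names are illustrative and the paper's exact formulation may differ.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian RBF kernel matrix between the rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def lowrank_rls_fit(X, Y, m, lam, gamma, seed=0):
    """Subset-of-regressors low-rank RLS (illustrative, not the paper's code).

    Solves (K_nm^T K_nm + lam * K_mm) alpha = K_nm^T Y, which costs
    O(n m^2) to assemble and O(m^3) to solve: linear in n for fixed m.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)  # random landmarks
    L = X[idx]
    K_nm = rbf_kernel(X, L, gamma)   # n x m cross-kernel
    K_mm = rbf_kernel(L, L, gamma)   # m x m landmark kernel
    A = K_nm.T @ K_nm + lam * K_mm
    alpha = np.linalg.solve(A + 1e-8 * np.eye(m), K_nm.T @ Y)
    return L, alpha

def lowrank_rls_predict(X_new, L, alpha, gamma):
    """Class scores for new points; argmax over columns gives the label."""
    return rbf_kernel(X_new, L, gamma) @ alpha
```

With Y an n-by-c one-hot matrix, all c one-vs-all classifiers come out of a single m-by-m linear solve, which is consistent with the abstract's claim that the cost does not grow with the number of classes.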
Notes
In practice, m instances are randomly chosen from the training data set. We can always rearrange the data matrix X to move the chosen instances to the front.
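A minimal sketch of this selection step, assuming a NumPy data matrix X with one instance per row (the function name is illustrative, not from the paper):

```python
import numpy as np

def move_landmarks_to_front(X, m, seed=0):
    """Randomly choose m training instances and permute the rows of X
    so the chosen instances occupy the first m positions."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[0])  # random order over all instances
    return X[perm], perm[:m]            # first m rows are the landmarks
```

Any label matrix accompanying X must of course be permuted with the same index vector.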
Acknowledgments
This work was done while Zhouyu Fu was with the Gippsland School of IT at Monash University. It was supported by the Australian Research Council under the Discovery Project (DP0986052) entitled "Automatic music feature extraction, classification and annotation".
Cite this article
Fu, Z., Lu, G., Ting, K.M. et al. Efficient nonlinear classification via low-rank regularised least squares. Neural Comput & Applic 22, 1279–1289 (2013). https://doi.org/10.1007/s00521-012-1076-1