
Efficient nonlinear classification via low-rank regularised least squares

  • ICONIP 2011
  • Published in: Neural Computing and Applications

Abstract

We revisit the classical technique of regularised least squares (RLS) for nonlinear classification. Specifically, we focus on a low-rank formulation of RLS whose time complexity is linear in the size of the data set and independent of both the number of classes and the number of features. This makes low-rank RLS particularly suitable for problems with large data sets and moderate feature dimensions. Moreover, we propose a general theorem that yields closed-form estimates of the prediction values on a holdout validation set from the low-rank RLS classifier trained on the whole training data. It is thus possible to obtain an error estimate for each parameter setting without retraining, which greatly accelerates cross-validation for parameter selection. Experimental results on several large-scale benchmark data sets show that low-rank RLS achieves classification performance comparable to standard kernel SVM for nonlinear classification while being much more efficient. The improvement in efficiency is more evident for data sets with higher dimensions.
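To make the formulation concrete, the following is a minimal sketch of a subset-of-regressors style low-rank RLS classifier. It illustrates the general idea only, under assumed choices (an RBF kernel, random landmark selection, and hypothetical function names such as lowrank_rls_fit), and is not the exact formulation or code from the paper. Fitting costs O(nm² + m³) for n training instances and rank m, i.e. linear in n for fixed m, which is what makes the approach attractive for large data sets.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def lowrank_rls_fit(X, Y, m=200, lam=1e-3, gamma=0.5, seed=0):
    """Fit a low-rank (subset-of-regressors) RLS classifier.

    X: (n, d) training inputs; Y: (n, c) one-hot label matrix.
    Returns the m landmark points Z and the (m, c) coefficient matrix A.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)   # random landmark subset
    Z = X[idx]
    Knm = rbf_kernel(X, Z, gamma)                     # (n, m) data-landmark kernel
    Kmm = rbf_kernel(Z, Z, gamma)                     # (m, m) landmark kernel
    # Solve (Knm^T Knm + lam * Kmm) A = Knm^T Y; O(n m^2 + m^3), linear in n.
    A = np.linalg.solve(Knm.T @ Knm + lam * Kmm + 1e-8 * np.eye(m), Knm.T @ Y)
    return Z, A

def lowrank_rls_predict(X_test, Z, A, gamma=0.5):
    """Predict class indices for X_test from landmarks Z and coefficients A."""
    return np.argmax(rbf_kernel(X_test, Z, gamma) @ A, axis=1)

if __name__ == "__main__":
    # Tiny synthetic demo: two Gaussian blobs with one-hot labels.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (500, 10)), rng.normal(2, 1, (500, 10))])
    y = np.repeat([0, 1], 500)
    Y = np.eye(2)[y]
    Z, A = lowrank_rls_fit(X, Y, m=100, lam=1e-2, gamma=0.1)
    print("training accuracy:", (lowrank_rls_predict(X, Z, A, gamma=0.1) == y).mean())
```

In spirit, the paper's holdout theorem plays the same role for this low-rank model that the classical closed-form leave-one-out shortcut plays for full-rank RLS: validation predictions are recovered from quantities already computed during training, so cross-validation does not require refitting the classifier for every parameter setting.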


Notes

  1. In practice, m instances are randomly chosen from the training data set. We can always rearrange the data matrix X to move the chosen instances to the front, as sketched below.
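A purely illustrative reading of this note, assuming NumPy and a hypothetical helper name (move_chosen_to_front): the rearrangement is a single index permutation applied to the rows of X and the corresponding labels.

```python
import numpy as np

def move_chosen_to_front(X, Y, m, seed=0):
    """Reorder (X, Y) so that m randomly chosen instances occupy the first m rows."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)   # the m randomly chosen instances
    rest = np.setdiff1d(np.arange(len(X)), idx)       # the remaining instances
    perm = np.concatenate([idx, rest])
    return X[perm], Y[perm]
```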


Acknowledgments

This work was done while Zhouyu Fu was with the Gippsland School of IT at Monash University. It was supported by the Australian Research Council under Discovery Project DP0986052, “Automatic music feature extraction, classification and annotation”.

Author information


Corresponding author

Correspondence to Zhouyu Fu.


About this article

Cite this article

Fu, Z., Lu, G., Ting, K.M. et al. Efficient nonlinear classification via low-rank regularised least squares. Neural Comput & Applic 22, 1279–1289 (2013). https://doi.org/10.1007/s00521-012-1076-1

