Abstract
Kernel logistic regression (KLR) is a classical nonlinear classifier in machine learning. With the explosive growth of data, storing and computing large dense kernel matrices is a major obstacle to scaling KLR. Even when the Nyström approximation is applied, the resulting method has time complexity \(O(nc^{2})\) and space complexity \(O(nc)\), where \(n\) is the number of training instances and \(c\) is the sample size. We propose a fast Newton method that solves large-scale KLR problems efficiently by exploiting the storage and computational advantages of multilevel circulant matrices (MCMs). Approximating the kernel matrix by an MCM reduces the storage requirement to \(O(n)\), and further approximating the coefficient matrix of the Newton equation by an MCM reduces the computational cost of each Newton iteration to \(O(n \log n)\). The proposed method runs in log-linear time per iteration because the product of an MCM (or its inverse) with a vector can be computed via the multidimensional fast Fourier transform (mFFT). Experimental results on several large-scale binary- and multi-class classification problems show that the proposed method scales KLR to large problems with less memory consumption and less training time, without sacrificing test accuracy.
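To illustrate why the MCM structure yields these savings, consider the one-level (ordinary) circulant case; the multilevel case of the paper applies the multidimensional FFT analogously. A circulant matrix is diagonalized by the discrete Fourier transform, so both matrix-vector products and linear solves need only the first column and \(O(n \log n)\) work. The following NumPy sketch is purely illustrative and is not the authors' implementation:

```python
import numpy as np

def circulant_matvec(c, x):
    """Compute C @ x in O(n log n), where C is the circulant matrix
    with first column c, i.e. C[i, j] = c[(i - j) % n]."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def circulant_solve(c, b):
    """Solve C y = b in O(n log n): divide in the Fourier domain,
    since the DFT of c gives the eigenvalues of C."""
    return np.real(np.fft.ifft(np.fft.fft(b) / np.fft.fft(c)))

rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)
x = rng.standard_normal(n)

# Dense reference construction with O(n^2) storage -- exactly the
# cost that the circulant representation avoids.
C = np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])

assert np.allclose(C @ x, circulant_matvec(c, x))
assert np.allclose(circulant_solve(c, C @ x), x)
```

Only the first column `c` is ever stored (hence \(O(n)\) space), and each product or solve costs a constant number of FFTs. In the paper's setting, the coefficient matrix of each Newton equation is approximated by an MCM so the Newton step can be applied this way.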
Data Availability
Some or all data, models, or code generated or used during the study are available from the corresponding author upon request.
Notes
Code is available at https://github.com/cnmusco/recursive-nystrom.
Code is available at http://www.lfhsgre.org.
Acknowledgements
This work was supported by the National Natural Science Foundation of China [Grant number 61772020].
Funding
This study was funded by the National Natural Science Foundation of China.
Ethics declarations
Conflict of interest/Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, J., Zhou, S., Fu, C. et al. Fast Newton method to solve KLR based on multilevel circulant matrix with log-linear complexity. Appl Intell 53, 21407–21421 (2023). https://doi.org/10.1007/s10489-023-04606-4