Abstract
Stochastic quasi-Newton methods have attracted considerable attention in large-scale machine learning optimization. However, when a stochastic gradient happens to be zero, the quasi-Newton matrix cannot be updated reliably, which undermines the stability of the algorithm. To address this issue, a checkpoint mechanism is introduced: the value of \(\textbf{s}_k\) is checked before the quasi-Newton matrix is updated, which prevents zero increments in the optimization variable and enhances algorithmic stability across iterations. Meanwhile, a novel gradient-increment formulation is introduced to satisfy the curvature condition, facilitating convergence for non-convex objectives. Additionally, limited-memory techniques are employed to reduce storage requirements in large-scale machine learning tasks. Last-iterate convergence of the proposed algorithm is proven in the non-convex setting, which is a stronger guarantee than average- or minimum-iterate convergence. Finally, experiments are conducted on benchmark datasets to compare the proposed RSLBFGS algorithm with other popular first- and second-order methods, demonstrating the effectiveness and robustness of RSLBFGS.
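The abstract does not spell out the exact checkpoint threshold or the precise gradient-increment formula, so the sketch below is purely illustrative: it assumes a simple norm-based check on \(\textbf{s}_k\) before the limited-memory update and a Powell/Li-Fukushima-style damping of \(\textbf{y}_k\) to enforce a curvature-type condition. The function and parameter names (`sketch_rslbfgs`, `checkpoint_tol`, `theta`) are hypothetical and not taken from the paper.

```python
import numpy as np

def two_loop_direction(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion: approximates -H_k @ grad."""
    q = grad.copy()
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alphas = []
    for s, y, rho in reversed(list(zip(s_list, y_list, rhos))):
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    if s_list:  # initial scaling gamma_k = s^T y / y^T y
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        b = rho * (y @ q)
        q += (a - b) * s
    return -q

def sketch_rslbfgs(stoch_grad, x0, lr=0.1, memory=10, n_iters=100,
                   checkpoint_tol=1e-10, theta=1e-4, seed=0):
    """Illustrative stochastic L-BFGS loop with a checkpoint on s_k
    and a damped y_k (assumed formulation, not the paper's exact one)."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    s_list, y_list = [], []
    g = stoch_grad(x, rng)
    for _ in range(n_iters):
        d = two_loop_direction(g, s_list, y_list)
        x_new = x + lr * d
        g_new = stoch_grad(x_new, rng)
        s = x_new - x
        # checkpoint: skip the memory update if s_k is numerically zero,
        # so a vanishing stochastic gradient cannot corrupt the matrix
        if np.linalg.norm(s) > checkpoint_tol:
            y = g_new - g
            # illustrative damping so that s^T y >= theta * ||s||^2
            correction = max(0.0, theta - (s @ y) / (s @ s))
            y = y + correction * s
            s_list.append(s); y_list.append(y)
            if len(s_list) > memory:  # limited-memory: keep the newest pairs
                s_list.pop(0); y_list.pop(0)
        x, g = x_new, g_new
    return x
```

The design point the sketch tries to convey is the separation of concerns: the checkpoint guards against a zero step \(\textbf{s}_k\), the damping of \(\textbf{y}_k\) keeps the curvature pairs well defined for non-convex objectives, and the bounded memory keeps per-iteration storage linear in the problem dimension.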









Availability of Data and Materials
Only publicly available datasets from the LIBSVM repository (www.csie.ntu.edu.tw/~cjlin/libsvmtools/) are used.
Acknowledgements
The authors would like to thank the anonymous reviewers for their insightful and helpful comments and suggestions, which have led to important improvements. The authors also thank Dr. Kasper Karlsson for polishing the manuscript.
Funding
This work was funded in part by the National Key Research and Development Program of China (2021YFA1003400), in part by the National Natural Science Foundation of China (62176051), and in part by the Scientific Research Program of Jilin Provincial Department of Education.
Author information
Authors and Affiliations
Contributions
Hanger Liu: writing - original draft preparation, conceptualization; Yuqing Liang: writing - review and editing, validation; Jinlan Liu: writing - review and editing, supervision; Dongpo Xu: funding acquisition, supervision, writing - review and editing. All authors have reviewed the manuscript.
Corresponding authors
Ethics declarations
Ethical Approval and Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Competing Interests
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, H., Liang, Y., Liu, J. et al. A robust stochastic quasi-Newton algorithm for non-convex machine learning. Appl Intell 55, 569 (2025). https://doi.org/10.1007/s10489-025-06475-5