In this paper, we consider the problem of minimizing the sum of a nonsmooth function and a smooth one in the nonconvex setting, which arises in many contemporary applications such as machine learning, statistics, and signal/image processing. To solve this problem, we propose a new nonmonotone accelerated proximal gradient method with a variable stepsize strategy. Incorporating an inertial term into the proximal gradient method is a simple and efficient acceleration technique, but it sacrifices the descent property of the proximal gradient algorithm. In our algorithm, the iterates generated by the inertial proximal gradient scheme are accepted when the objective function values decrease or increase appropriately; otherwise, the iterate is generated by the proximal gradient scheme, which ensures that the function values are decreasing on a subset of the iterates. We also introduce a variable stepsize strategy that requires neither a line search nor knowledge of the Lipschitz constant, which makes the algorithm easy to implement. We show that the sequence of iterates generated by the algorithm converges to a critical point of the objective function. Further, under the assumption that the objective function satisfies the Kurdyka–Łojasiewicz inequality, we establish convergence rates for the objective function values and the iterates. Moreover, numerical results on both convex and nonconvex problems are reported to demonstrate the effectiveness and superiority of the proposed method and stepsize strategy.

Data availability
Data sharing is not applicable to this article as no new data were created or analyzed in this study.
This research is supported by the National Science Foundation of China (No. 12261019), Shaanxi Provincial Science and Technology Projects (No. 2024JC-YBQN-0048) and Guizhou Provincial Science and Technology Projects (No. QKHJC-ZK[2022]YB084).
Appendix A: Proof of Lemma 3.1
By the adaptive non-monotone stepsize strategy, if \({\lambda _{k + 1}}\) is generated by (7), then,
Hence, for any \(i \ge 1\) we have
Denote that
we have
which implies that \(\sum \nolimits _{i = 1}^\infty {{{\left( {{\lambda _{i + 1}} - {\lambda _i}} \right) }^ + }} \) converges, since \(\sum \nolimits _{i = 1}^\infty {\textrm{E}\left( i \right) } \) is a convergent positive series.
The convergence of \(\sum \nolimits _{i = 1}^\infty {{{\left( {{\lambda _{i + 1}} - {\lambda _i}} \right) }^ - }} \) can be proved as follows.
Assume, for contradiction, that \(\sum \nolimits _{i = 1}^\infty {{{\left( {{\lambda _{i + 1}} - {\lambda _i}} \right) }^ - }} = + \infty .\) Based on the convergence of \(\sum \nolimits _{i = 1}^\infty {{{\left( {{\lambda _{i + 1}} - {\lambda _i}} \right) }^ + }} \) and the equality
we can easily deduce \(\mathop {\lim }\nolimits _{k \rightarrow \infty } {\lambda _k} = - \infty ,\) which contradicts \({\lambda _{k}} > 0,\) \(\forall k \ge 1.\) Therefore, \(\sum \nolimits _{i = 1}^\infty {{{\left( {{\lambda _{i + 1}} - {\lambda _i}} \right) }^ - }} \) is a convergent series. Then, in view of (91), we obtain that the sequence \(\left\{ {{\lambda _k}} \right\} \) is convergent.
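The argument above rests on the standard splitting of an increment into its positive and negative parts. As a sketch, assuming the usual conventions \(a^+ = \max \{a, 0\}\) and \(a^- = \max \{-a, 0\}\) (so that \(a = a^+ - a^-\)), telescoping gives, for every \(k \ge 1,\)
\[
\lambda _{k + 1} - \lambda _1 = \sum _{i = 1}^{k} \left( \lambda _{i + 1} - \lambda _i \right) = \sum _{i = 1}^{k} \left( \lambda _{i + 1} - \lambda _i \right) ^+ - \sum _{i = 1}^{k} \left( \lambda _{i + 1} - \lambda _i \right) ^- .
\]
If the first sum on the right stays bounded while the second diverges to \(+\infty ,\) the right-hand side tends to \(-\infty ,\) which is incompatible with \(\lambda _{k + 1} > 0.\) Once both sums converge, the right-hand side has a finite limit, and hence so does \(\left\{ \lambda _k \right\} .\)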
Moreover, it is easy to prove by induction that \( {\lambda _k} \ge \min \left\{ {{\lambda _1},\frac{{{\mu _1}}}{{{L_f}}}} \right\} \) holds for all \(k \ge 1.\) \(\square \)
Appendix B: Proof of Lemma 3.2
Suppose that the conclusion is not true. Then there exists a subsequence \(\left\{ {{k_j}} \right\} \) with \({k_j} \rightarrow \infty \) such that
holds. Then, by the adaptive nonmonotone stepsize scheme, we have
From the above two formulas, it is easy to obtain
Since \({\mu _1} < {\mu _0}\) and the sequence \(\left\{ {{\lambda _k}} \right\} \) is convergent, we get
which contradicts (94). Therefore, (15) holds for all iterations after a finite index \({\hat{k}}.\) \(\square \)
