1 Introduction

In this paper, we are interested in the following optimization problem

$$\begin{aligned} \min _{x\in \mathcal {R}^n} f(x)+g(x), \end{aligned}$$
(1.1)

where f is a strongly convex function with constant \(\beta >0\) and g is a hypoconvex (also called weakly convex or semiconvex) function with constant \(\omega >0\); see Sect. 3 for the precise assumptions. Problem (1.1) is one of the most studied models in modern optimization, with a huge body of literature (see, for instance, [6] and the references therein).

A well known algorithm for solving problem (1.1) is the so-called Douglas–Rachford splitting method (DRSM), which traces back to [10, 17] and has been well studied from various perspectives in the literature, see, e.g., [3, 4, 11, 12, 14]. The iterative process of the Douglas–Rachford splitting method only involves the evaluation of the proximal mappings of f and g, which is simple in many applications. We recall that for a proper lower semicontinuous function \(h:\mathcal {R}^n\rightarrow \mathcal {R}\cup \{+\,\infty \}\) and a parameter \(\nu >0\), the proximal mapping \(\text{ prox }_{\nu h}\) is defined as

$$\begin{aligned} \text{ prox }_{\nu h}(x):=\arg \min _{y\in \mathcal {R}^n}\left\{ h(y)+\frac{1}{2\nu }\Vert y-x\Vert ^2\right\} . \end{aligned}$$
(1.2)

If \(\inf \nolimits _{y\in \mathcal {R}^n}h(y)>-\,\infty \), then for every \(\nu \in (0, +\,\infty )\), the set \(\text{ prox }_{\nu h}(x)\) is nonempty and compact, and \(\text{ prox }_{\nu h}(x)\) is single-valued if h is further assumed to be convex; see, e.g., [20, Theorems 1.25 and 2.26]. More precisely, for an arbitrary starting point \(z_{1}\in \mathcal {R}^{n}\), the DRSM iteratively generates a sequence \(\{z_{k}\}_{k\in N}\) via the following rule

$$\begin{aligned} z_{k+1}=\widetilde{T}_{DR}(z_k):= ((1-\alpha )I+\alpha R_{\lambda f}R_{\lambda g})(z_k), \end{aligned}$$
(1.3)

where \(\alpha \in (0,1)\) is a parameter; I is the identity operator; \(\lambda >0\) is the proximal parameter, and \(R_{\lambda f}:=2\text{ prox }_{\lambda f}-I\) and \(R_{\lambda g}:=2\text{ prox }_{\lambda g}-I\) are the reflection operators of f and g, respectively.
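For illustration, one pass of the recursion (1.3) can be sketched in a few lines of code. The sketch below is ours and assumes that the proximal mappings \(\text{ prox }_{\lambda f}\) and \(\text{ prox }_{\lambda g}\) are supplied by the user (the names drsm_step, prox_f and prox_g are illustrative only).

```python
def drsm_step(z, prox_f, prox_g, alpha):
    """One iteration of the DRSM scheme (1.3):
    z_{k+1} = ((1 - alpha) I + alpha R_{lambda f} R_{lambda g})(z_k),
    where prox_f and prox_g evaluate prox_{lambda f} and prox_{lambda g}."""
    r_g = 2.0 * prox_g(z) - z          # R_{lambda g}(z) = 2 prox_{lambda g}(z) - z
    r_f_r_g = 2.0 * prox_f(r_g) - r_g  # R_{lambda f}(R_{lambda g}(z))
    return (1.0 - alpha) * z + alpha * r_f_r_g


# Toy usage with f(x) = g(x) = x^2/2 in one dimension and lambda = 1, so that
# prox_{lambda f}(x) = prox_{lambda g}(x) = x/2; the iterates contract to 0,
# the minimizer of f + g.
z = 4.0
for _ in range(20):
    z = drsm_step(z, lambda x: x / 2.0, lambda x: x / 2.0, alpha=0.5)
```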

For the case where both f and g are convex, the convergence of DRSM has been extensively studied, see, e.g., [11, 17]. Indeed, its convergence is an immediate consequence of the well known Krasnosel’skiĭ-Mann theorem [16] if we regard the scheme (1.3) as a convex combination of the nonexpansive operator \(R_{\lambda f}R_{\lambda g}\) and the identity operator. However, the research on the convergence of the DRSM for optimization problems involving nonconvex functions is still in its infancy, and only a few results are available for the “strongly + weakly” convex problem (1.1), i.e., the case where the function f is strongly convex with constant \(\beta >0\) and g is weakly convex with constant \(\omega >0\).

The first effort seems to be the work [5], in which the convergence of (1.3) was established under the additional conditions that f is twice differentiable, its gradient \({\nabla } f\) is Lipschitz continuous with constant \(L>0\), \(\beta = \omega \), \(0<\lambda \le 1/\sqrt{L\beta }\) and \(\alpha \in (0,1)\). Essentially, these additional assumptions serve to ensure the contraction property of the operator \(R_{\lambda f}\) and the nonexpansiveness of the operator \(R_{\lambda f}R_{\lambda g}\), because the analysis in [5] still follows the framework of the classical Krasnosel’skiĭ-Mann theorem. Very recently, the authors in [13] proved the convergence of the DRSM scheme (1.3) for the “strongly + weakly” convex problem (1.1) without any differentiability assumption on the strongly convex function f. Different from the technique in [5], which is based upon the nonexpansiveness of the operator \(R_{\lambda f}R_{\lambda g}\), their technique is based on the Fejér monotonicity of the sequence \(\{z_{k}\}_{k\in N}\) generated by (1.3) with respect to the fixed point set of \(\widetilde{T}_{DR}\), and it does not require the nonexpansiveness of the operator \(R_{\lambda f}R_{\lambda g}\). Here, for an operator \(M:\mathcal {R}^{n}\rightarrow \mathcal {R}^{n}\), the fixed point set of M is defined as \(\text{ Fix }(M):=\{z\in \mathcal {R}^{n}: M(z)=z \}\). Meanwhile, without any differentiability assumption on f, they alternatively require the condition \(\beta >\omega \), which is slightly stronger than the condition \(\beta = \omega \) in [5].

In many applications, the objective function \(f+g\) is merely convex, but not strongly convex. It is thus desirable to establish the convergence of DRSM under this mild assumption. In this paper, our principal purpose is to complete this task. To make our analysis more realistic, we allow variable combination parameters and inexact evaluation of the reflection operators. That is, we consider the more general inexact DRSM (GIDRSM) scheme

$$\begin{aligned} z_{k+1}:=\alpha _{k}z_{k}+\beta _{k}(R_{\lambda f}R_{\lambda g}(z_{k})+e_{k}), \end{aligned}$$
(1.4)

where \(\alpha _{k}, \beta _{k}\in [0,1]\) are suitable parameters satisfying \(\alpha _{k}+\beta _{k}\le 1\), and \(e_{k}\) represents an error in the evaluation of \(R_{\lambda f}R_{\lambda g}(z_{k})\). Note that the DRSM (1.3) corresponds to the case \(e_{k}\equiv 0\), \(\alpha _{k}\equiv 1-\alpha \), and \(\beta _{k}\equiv \alpha \). Our aim is to prove the convergence of GIDRSM (1.4) for the case \(\beta =\omega \) under the additional assumption that f is continuously differentiable with \({\nabla }f\) Lipschitz continuous, which is weaker than the assumptions in [5].
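Before proceeding, we note that the recursion (1.4) admits an equally short sketch; the parameter and error sequences below are placeholders only, and the conditions actually imposed on them are stated in Sect. 3 (Assumption 3.2).

```python
def gidrsm(z0, prox_f, prox_g, alphas, betas, errors):
    """Sketch of the GIDRSM recursion (1.4):
    z_{k+1} = alpha_k z_k + beta_k (R_{lambda f} R_{lambda g}(z_k) + e_k).
    alphas, betas and errors are sequences of equal length; prox_f and
    prox_g evaluate prox_{lambda f} and prox_{lambda g}."""
    z = z0
    for a_k, b_k, e_k in zip(alphas, betas, errors):
        r_g = 2.0 * prox_g(z) - z          # reflection of g at z
        r_f_r_g = 2.0 * prox_f(r_g) - r_g  # reflection of f at R_{lambda g}(z)
        z = a_k * z + b_k * (r_f_r_g + e_k)
    return z
```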

The paper is organized in the following way. In Sect. 2, we recall some definitions and known results for further analysis. In Sect. 3, we present our main results, i.e., the convergence of GIDRSM (1.4) under weaker conditions than those required in the existing work for (1.3). Finally, some concluding remarks are given in Sect. 4.

2 Preliminaries

In this section, we recall some definitions and known results that will be used in our analysis later.

Definition 2.1

[2, Definition 4.4] Let D be a nonempty subset of \(\mathcal {R}^n\), let \(M: D\rightarrow \mathcal {R}^n\), and let \(\kappa >0\). Then M is said to be \(\kappa \)-cocoercive if

$$\begin{aligned} \langle M(x)-M(y),x-y\rangle \ge \kappa \Vert M(x)-M(y)\Vert ^{2},~~\forall x,y\in D. \end{aligned}$$

Lemma 2.1

[15, Theorem 3.1] Let D be a nonempty closed convex subset of \(\mathcal {R}^{n}\). Suppose that \(M:\mathcal {R}^{n}\rightarrow D\) is a nonexpansive mapping such that its set of fixed points \(\text{ Fix }(M)\) is nonempty. Let the sequence \(\{z_{k}\}_{k\in N}\) in \(\mathcal {R}^{n}\) be generated by choosing \(z_{1}\in \mathcal {R}^{n}\) and using the recursion

$$\begin{aligned} z_{k+1}:=\alpha _{k}z_{k}+\beta _{k} M(z_{k})+r_{k},~~\forall k\ge 1, \end{aligned}$$
(2.1)

where \(r_{k}\) denotes the residual vector. Here we assume that \(\{\alpha _{k}\}_{k\in N}\) and \(\{\beta _{k}\}_{k\in N}\) are real sequences in [0,1] such that \(\alpha _{k}+\beta _{k}\le 1\) for all \(k\ge 1\) and the following conditions hold: (a) \(\sum _{k=1}^{\infty }\alpha _{k}\beta _{k}=\infty \); (b) \(\sum _{k=1}^{\infty }\Vert r_{k}\Vert <\infty \); and (c) \(\sum _{k=1}^{\infty }(1-\alpha _{k}-\beta _{k})<\infty \). Then the sequence \(\{z_{k}\}_{k\in N}\) generated by (2.1) converges to a fixed point of M.
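As a concrete illustration (our example, not taken from [15]), the choice

$$\begin{aligned} \alpha _{k}\equiv \frac{1}{2},\qquad \beta _{k}\equiv \frac{1}{2},\qquad \Vert r_{k}\Vert \le \frac{1}{k^{2}},~~\forall k\ge 1, \end{aligned}$$

satisfies (a)–(c): indeed, \(\sum _{k=1}^{\infty }\alpha _{k}\beta _{k}=\sum _{k=1}^{\infty }\frac{1}{4}=\infty \), \(\sum _{k=1}^{\infty }\Vert r_{k}\Vert \le \sum _{k=1}^{\infty }\frac{1}{k^{2}}<\infty \), and \(1-\alpha _{k}-\beta _{k}=0\) for all \(k\ge 1\).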

Definition 2.2

[20, Definition 12.58] A function \(f:\mathcal {R}^n\rightarrow \mathcal {R}\cup \{+\,\infty \}\) is strongly convex with constant \(\beta >0\) if for any \(x,y\in \mathcal {R}^n\) and for any \(\theta \in (0,1)\), we have

$$\begin{aligned} f((1-\theta )x+\theta y)\le (1-\theta )f(x) + \theta f(y)-\frac{\beta \theta (1-\theta )}{2}\Vert x-y\Vert ^2. \end{aligned}$$

Moreover, if the above inequality holds with \(\beta =0\), then f is called a convex function.
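For instance, for \(f(x)=\frac{\beta }{2}\Vert x\Vert ^{2}\) the above inequality holds with equality, since a direct computation gives

$$\begin{aligned} (1-\theta )f(x)+\theta f(y)-f((1-\theta )x+\theta y)=\frac{\beta \theta (1-\theta )}{2}\Vert x-y\Vert ^{2}, \end{aligned}$$

so \(\frac{\beta }{2}\Vert \cdot \Vert ^{2}\) is strongly convex with constant \(\beta \) (and with no larger constant).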

Lemma 2.2

[2, Proposition 12.27] Let \(h:\mathcal {R}^n\rightarrow \mathcal {R}\cup \{+\,\infty \}\) be a proper lower semicontinuous convex function and \(\nu >0\), then the proximal operator \(\text{ prox }_{\nu h}\) given in (1.2) is firmly nonexpansive.

Definition 2.3

[21, Definition 3.10] A proper lower semicontinuous function \(g: \mathcal {R}^n\rightarrow \mathcal {R}\cup \{+\,\infty \}\) is called hypoconvex (weakly convex or semiconvex) if for some \(\omega >0\), the function \(g(\cdot )+\frac{\omega }{2}\Vert \cdot \Vert ^{2}\) is convex.

Remark 2.1

It is well-known that the set of hypoconvex functions contains several important classes of (nonsmooth) functions as special cases, for example, \(\varphi \)-convex functions [9] and primal-lower-nice functions [18]. Moreover, any twice continuously differentiable function with a bounded second-order derivative is hypoconvex, see, e.g., [7]. We refer to, e.g., [7, 8, 21], for more properties of hypoconvex functions.

Definition 2.4

[20, Definition 8.3] Consider a function \(f:\mathcal {R}^n\rightarrow \mathcal {R}\cup \{+\,\infty \}\) and \(\bar{x}\in \text{ dom }~f\).

  1. (i)

    The regular subdifferential of f at \(\bar{x}\), written \(\hat{\partial }f(\bar{x})\), is the set of vectors \(x^*\in \mathcal {R}^n\) that satisfy

    $$\begin{aligned} \liminf _{\begin{array}{c} y\rightarrow \bar{x} \\ y\ne \bar{x} \end{array}} \frac{f(y)-f(\bar{x})-\langle x^*, y-\bar{x}\rangle }{\Vert y-\bar{x}\Vert }\ge 0. \end{aligned}$$
  2. (ii)

    The subdifferential of f at \(\bar{x}\), written \(\partial f(\bar{x})\), is defined as follows:

    $$\begin{aligned} \partial f(\bar{x}):=\left\{ x^*\in \mathcal {R}^n: \exists x_k\rightarrow \bar{x}, f(x_k)\rightarrow f(\bar{x}), x^*_k\in \hat{\partial }f(x_k), \text{ with }~x^*_k\rightarrow x^* \right\} . \end{aligned}$$

Remark 2.2

It follows from Definition 2.4 that the following conclusions hold (see, e.g., [20]).

  1. (i)

    If \(h:\mathcal {R}^n\rightarrow \mathcal {R}\cup \{+\,\infty \}\) is a proper function and \(f:\mathcal {R}^n\rightarrow \mathcal {R}\) is continuously differentiable, then \(\partial (f+h)(x)={\nabla } f(x)+\partial h(x)\) for any \(x\in \text{ dom }~h\).

  2. (ii)

    For any proper convex function \(f:\mathcal {R}^n\rightarrow \mathcal {R}\cup \{+\,\infty \}\) and any \(\bar{x}\in \text{ dom }~f\), the convex subdifferential of f at \(\bar{x}\) is defined as \(\bar{\partial }f(\bar{x}):=\{v\in \mathcal {R}^{n}|~f(x)\ge f(\bar{x})+\langle v, x-\bar{x}\rangle ~\text {for all}~x\}\). For a convex function f, these notions coincide: \(\partial f(\bar{x})=\hat{\partial } f(\bar{x})=\bar{\partial } f(\bar{x})\) for any \(\bar{x}\in \text{ dom }~f\).

The next lemma is known as the Baillon-Haddad theorem in the literature.

Lemma 2.3

[1, Corollaire 10] Let \(f:\mathcal {R}^{n}\rightarrow \mathcal {R}\) be differentiable and convex on \(\mathcal {R}^{n}\), and such that \({\nabla } f\) is Lipschitz continuous with constant \(L>0\). Then \({\nabla } f\) is \(1/L\)-cocoercive.

The following interesting lemma is from [22].

Lemma 2.4

Let \(h:=h_{1}-h_{2}\), where \(h_{1}\) is a convex function with \({\nabla } h_{1}\) Lipschitz continuous with constant \(L>0\) and \(h_{2}\) is a convex function with \({\nabla } h_{2}\) Lipschitz continuous with constant \(l>0\). If \(L\ge l\), then \({\nabla } h\) is Lipschitz continuous with constant L.

3 Convergence Analysis

As recalled in the introduction, the first effort seems to be the work [5], in which the convergence of the DRSM (1.3) for the case \(\beta =\omega \) was established under the additional assumptions that f is twice differentiable, its gradient \({\nabla } f\) is Lipschitz continuous with constant \(L>0\) and \(0<\lambda \le 1/\sqrt{L\beta }\). In the absence of differentiability, [13] established the convergence under the stronger assumption that \(\beta >\omega \). In this section, we prove the convergence of the more general iterative process GIDRSM (1.4) for the case \(\beta =\omega \) under the following assumptions.

Assumption 3.1

Assume the following conditions are satisfied:

  1. (A1)

    \(f:\mathcal {R}^n\rightarrow \mathcal {R}\) is strongly convex with constant \(\beta >0\) and continuously differentiable, and \({\nabla } f\) is Lipschitz continuous with constant \(L>0\);

  2. (A2)

    \(g: \mathcal {R}^n\rightarrow \mathcal {R}\cup \{+\,\infty \}\) is proper, lower semicontinuous and hypoconvex with constant \(\beta >0\) (i.e., \(\omega =\beta \));

  3. (A3)

    The set \(X^{*}\) of all optimal solutions of problem (1.1) is nonempty, that is, \(X^{*}\ne \emptyset \).

Remark 3.1

Compared with the assumptions made in [5], we only require the strongly convex function f to be continuously differentiable, not necessarily twice differentiable.

Recall that g is assumed to be hypoconvex with constant \(\beta >0\), i.e.,

$$\begin{aligned} \tilde{g}(x):=g(x)+\frac{\beta }{2}\Vert x\Vert ^{2} \end{aligned}$$
(3.1)

is convex. According to [13, Section 3], for \(0<\lambda <1/\beta \), it follows that \(\text{ prox }_{\lambda g}\) is single-valued on \(\mathcal {R}^{n}\) and

$$\begin{aligned} \text{ prox }_{\lambda g}(x)= (I+\lambda \partial g)^{-1}(x). \end{aligned}$$

Here, \(\partial g\) is the subdifferential of g defined in (ii) of Definition 2.4. Moreover, based on (A3) in Assumption 3.1, it follows from the assertions (i) and (ii) in Remark 2.2 that \(x^*\in X^{*}\) if and only if

$$\begin{aligned} 0\in \bar{\partial } (f+g)(x^*) = {\nabla } f(x^*) + \partial g(x^*). \end{aligned}$$
(3.2)
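As a simple one-dimensional illustration (not needed in the sequel), take \(g(t)=|t|-\frac{\beta }{2}t^{2}\), which is hypoconvex with constant \(\beta \). For \(0<\lambda <1/\beta \), the objective in (1.2) is strictly convex and a direct computation gives

$$\begin{aligned} \text{ prox }_{\lambda g}(t)=\frac{\text{ sign }(t)\max \{|t|-\lambda ,\,0\}}{1-\lambda \beta }, \end{aligned}$$

so \(\text{ prox }_{\lambda g}\) is indeed single-valued; for \(\lambda >1/\beta \), in contrast, the objective in (1.2) is unbounded below and the proximal mapping is no longer well defined.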

Proposition 3.1

Suppose that Assumption 3.1 holds and \(0\!<\!\lambda \!<\!1/\beta \). Then, \(\text{ Fix }(R_{\lambda f}R_{\lambda g})\!\ne \!\emptyset \). Moreover, for any \(z^{*} \in \text{ Fix }(R_{\lambda f}R_{\lambda g})\), the point \(\text{ prox }_{\lambda g}(z^*)\) is a solution of (1.1).

Proof

The proof is similar to that of [13, Proposition 4.1]; however, for the convenience of the reader, we sketch it here. Indeed, let \(x^*\in X^{*}\), which is nonempty by (A3). It follows from (3.2) that

$$\begin{aligned} 2x^*\in (I+\lambda {\nabla }f)(x^*)+(I+\lambda \partial g)(x^*). \end{aligned}$$

By setting \(z^{*}:=2x^{*}-(I+\lambda {\nabla } f)(x^{*})\), we obtain \(z^*\in (I+\lambda \partial g)(x^*)\), i.e., \(x^*=(I+\lambda \partial g)^{-1}(z^*)=\text{ prox }_{\lambda g}(z^*)\). By means of this, we have

$$\begin{aligned} R_{\lambda g}(z^*)=2\text{ prox }_{\lambda g}(z^*)-z^*=2x^{*}-z^{*}=(I+\lambda {\nabla } f)(x^{*}) \end{aligned}$$
(3.3)

Since \(\text{ prox }_{\lambda f}=(I+\lambda {\nabla } f)^{-1}\) is single-valued, it follows from (3.3) that

$$\begin{aligned} { x^*=\text{ prox }_{\lambda f}R_{\lambda g}(z^*).} \end{aligned}$$

Thus,

$$\begin{aligned} z^*=2x^*-R_{\lambda g}(z^*)=2\text{ prox }_{\lambda f}R_{\lambda g}(z^*)-R_{\lambda g}(z^*)=R_{\lambda f}R_{\lambda g}(z^*). \end{aligned}$$

Hence, \(\text{ Fix }(R_{\lambda f}R_{\lambda g})\ne \emptyset \). Moreover, for any \(z^{*}\in \text{ Fix }(R_{\lambda f}R_{\lambda g})\), we have

$$\begin{aligned} z^{*}=R_{\lambda f}R_{\lambda g}(z^*)=2\text{ prox }_{\lambda f}R_{\lambda g}(z^{*})-R_{\lambda g}(z^{*}). \end{aligned}$$
(3.4)

Setting \(y^{*}:=\text{ prox }_{\lambda f}R_{\lambda g}(z^{*})\) in (3.4), we get

$$\begin{aligned} 2y^{*}-z^{*}=R_{\lambda g}(z^{*})=2\text{ prox }_{\lambda g}(z^{*})-z^{*}. \end{aligned}$$
(3.5)

This means \(y^{*}=\text{ prox }_{\lambda g}(z^{*})\), i.e.,

$$\begin{aligned} z^{*}\in (I+\lambda \partial g)(y^{*}). \end{aligned}$$
(3.6)

On the other hand, it follows from the definition of \(y^{*}\) and (3.5) that \(y^{*}=\text{ prox }_{\lambda f}(2y^{*}-z^{*})\). This means

$$\begin{aligned} 2y^{*}-z^{*}= (I+\lambda {\nabla } f)(y^{*}). \end{aligned}$$
(3.7)

Adding (3.6) and (3.7), and invoking (3.2), we know that \(y^{*}=\text{ prox }_{\lambda g}(z^{*})\) is a solution of problem (1.1). The proof is complete. \(\square \)

To prove the convergence of the GIDRSM (1.4), we need to bound the terms \(\Vert R_{\lambda f}(x)-R_{\lambda f}(y)\Vert \) and \(\Vert R_{\lambda g}(x)-R_{\lambda g}(y)\Vert \). The following lemma deals with \(\Vert R_{\lambda f}(x)-R_{\lambda f}(y)\Vert \).

Lemma 3.1

Let \(f:\mathcal {R}^n\rightarrow \mathcal {R}\) be strongly convex with constant \(\beta >0\) and continuously differentiable, and let \({\nabla } f\) be Lipschitz continuous with constant \(L>0\). Then for any \(x,y\in \mathcal {R}^n\) and \(\lambda >0\), we have

$$\begin{aligned} \Vert R_{\lambda f}(x)-R_{\lambda f}(y)\Vert \le \frac{|1-\lambda \beta |}{1+\lambda \beta }\cdot \Vert x-y\Vert . \end{aligned}$$

Proof

First, we claim that \(\beta \le L\). To see this, since f is strongly convex with constant \(\beta >0\), by [20, Exercise 12.59] we know that \({\nabla } f\) is strongly monotone with constant \(\beta \), that is, for any \(x,y\in \mathcal {R}^{n}\),

$$\begin{aligned} \beta \Vert x-y\Vert ^{2}\le & {} \langle {\nabla } f(x)-{\nabla } f(y), x-y\rangle \nonumber \\\le & {} \Vert {\nabla } f(x)-{\nabla } f(y)\Vert \cdot \Vert x-y\Vert \nonumber \\\le & {} L\Vert x-y\Vert ^{2}, \end{aligned}$$
(3.8)

where the second inequality follows from the Cauchy-Schwarz inequality and the third inequality follows from the Lipschitz continuity of \({\nabla } f\). Thus, dividing (3.8) by \(\Vert x-y\Vert ^{2}\) for \(x\ne y\) yields the claim \(\beta \le L\). By setting

$$\begin{aligned} \tilde{f}(x):=f(x)-\frac{\beta }{2}\Vert x\Vert ^{2}, \end{aligned}$$
(3.9)

it follows from [20, Exercise 12.59] that \(\tilde{f}\) is convex because of the strong convexity of f.

Set \(h_{1}(x):=f(x)\) and \(h_{2}(x):=\frac{\beta }{2}\Vert x\Vert ^{2}\); then \(h_{1}\) is convex with \({\nabla } h_{1}\) Lipschitz continuous with constant \(L>0\), and \(h_{2}\) is convex with \({\nabla } h_{2}\) Lipschitz continuous with constant \(\beta >0\). In view of \(L\ge \beta \), it follows from Lemma 2.4 that \({\nabla } \tilde{f}\) is Lipschitz continuous with constant \(L>0\). Recalling the definition of the proximal mapping, we have

$$\begin{aligned} \text{ prox }_{\lambda f}(x)= & {} \arg \min _{y\in \mathcal {R}^{n}}\left\{ f(y)+\frac{1}{2\lambda }\Vert y-x\Vert ^{2} \right\} \nonumber \\= & {} \arg \min _{y\in \mathcal {R}^{n}}\left\{ \tilde{f}(y)+\frac{1}{2\lambda }\Vert y-x\Vert ^{2}+\frac{\beta }{2}\Vert y\Vert ^{2}\right\} \nonumber \\= & {} \text{ prox }_{\frac{\lambda }{1+\lambda \beta }\tilde{f}}\left( \frac{1}{1+\lambda \beta }\cdot x\right) . \end{aligned}$$
(3.10)
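The last equality in (3.10) follows by completing the square: writing \(f(y)=\tilde{f}(y)+\frac{\beta }{2}\Vert y\Vert ^{2}\), we have

$$\begin{aligned} \frac{\beta }{2}\Vert y\Vert ^{2}+\frac{1}{2\lambda }\Vert y-x\Vert ^{2} =\frac{1+\lambda \beta }{2\lambda }\left\| y-\frac{1}{1+\lambda \beta }\cdot x\right\| ^{2} +\frac{\beta }{2(1+\lambda \beta )}\Vert x\Vert ^{2}, \end{aligned}$$

where the last term does not depend on y, and \(\frac{1+\lambda \beta }{2\lambda }=\frac{1}{2\nu }\) with \(\nu =\frac{\lambda }{1+\lambda \beta }\), in accordance with (1.2).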

By the optimality condition of (3.10), we know

$$\begin{aligned} 0={\nabla } \tilde{f}(\text{ prox }_{\lambda f}(x))+\frac{1}{\lambda }(\text{ prox }_{\lambda f}(x)-x)+\beta \text{ prox }_{\lambda f}(x), \end{aligned}$$

which means that

$$\begin{aligned} {\nabla } \tilde{f}(\text{ prox }_{\lambda f}(x))=\frac{1}{\lambda }x-\left( \frac{1}{\lambda }+\beta \right) \text{ prox }_{\lambda f}(x). \end{aligned}$$
(3.11)

Since \(\tilde{f}\) is convex and \({\nabla } \tilde{f}\) is Lipschitz continuous with constant \(L>0\), by Lemma 2.3 we know that \({\nabla } \tilde{f}\) is \(1/L\)-cocoercive, that is,

$$\begin{aligned} \left\langle {\nabla } \tilde{f}(\text{ prox }_{\lambda f}(x))-{\nabla } \tilde{f}(\text{ prox }_{\lambda f}(y)), \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\right\rangle \ge \frac{1}{L}\left\| {\nabla } \tilde{f}(\text{ prox }_{\lambda f}(x))-{\nabla } \tilde{f}(\text{ prox }_{\lambda f}(y))\right\| ^{2}. \end{aligned}$$
(3.12)

Substituting (3.11) into (3.12), we have

$$\begin{aligned}&\left\langle \frac{1}{\lambda }x-\left( \frac{1}{\lambda }+\beta \right) \text{ prox }_{\lambda f}(x)-\frac{1}{\lambda }y+\left( \frac{1}{\lambda }+\beta \right) \text{ prox }_{\lambda f}(y), \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\right\rangle \nonumber \\&\quad \, \ge \frac{1}{L}\left\| \frac{1}{\lambda }x-\left( \frac{1}{\lambda }+\beta \right) \text{ prox }_{\lambda f}(x)-\frac{1}{\lambda }y+\left( \frac{1}{\lambda }+\beta \right) \text{ prox }_{\lambda f}(y)\right\| ^{2}, \end{aligned}$$
(3.13)

Notice that the left-hand side of (3.13) can be rewritten as

$$\begin{aligned}&\frac{1}{\lambda }\cdot \left\langle x-y, \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\right\rangle \nonumber \\&\quad -\,\frac{1+\lambda \beta }{\lambda }\cdot \left\| \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\right\| ^{2}, \end{aligned}$$
(3.14)

while the right-hand side of (3.13) can be rewritten as

$$\begin{aligned}&\frac{1}{L\lambda ^{2}}\cdot \Vert x-y\Vert ^{2}-\frac{2+2\lambda \beta }{L\lambda ^{2}}\cdot \langle x-y, \text{ prox }_{\lambda f}(x)\nonumber \\&\quad -\,\text{ prox }_{\lambda f}(y)\rangle +\frac{(1+\lambda \beta )^{2}}{L\lambda ^{2}}\cdot \Vert \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\Vert ^{2}. \end{aligned}$$
(3.15)

Substituting (3.14) and (3.15) into (3.13) and rearranging terms, it follows that

$$\begin{aligned}&\left\langle x-y, \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\right\rangle \nonumber \\&\quad \ge \, \frac{1}{(2\beta +L)\lambda +2}\cdot \Vert x-y\Vert ^{2}\nonumber \\&\quad +\,\frac{(1+\lambda \beta )^{2}+(1+\lambda \beta )\lambda L}{(2\beta +L)\lambda +2}\cdot \left\| \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\right\| ^{2}. \end{aligned}$$
(3.16)

Note that

$$\begin{aligned}&\Vert R_{\lambda f}(x)-R_{\lambda f}(y)\Vert ^{2} \nonumber \\&\quad =\,\Vert x-y\Vert ^{2}-4\left\langle x-y, \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\right\rangle \nonumber \\&\qquad +\, 4\left\| \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\right\| ^{2}. \end{aligned}$$
(3.17)

Substituting (3.16) into (3.17), we get that

$$\begin{aligned}&\Vert R_{\lambda f}(x)-R_{\lambda f}(y)\Vert ^{2} \nonumber \\&\quad \le \, \left( 1-\frac{4}{(2\beta +L)\lambda +2}\right) \cdot \Vert x-y\Vert ^{2}\nonumber \\&\quad +\,\left( 4-\frac{4(1+\lambda \beta )^{2}+4(1+\lambda \beta )\lambda L}{(2\beta +L)\lambda +2}\right) \cdot \left\| \text{ prox }_{\lambda f}(x) -\text{ prox }_{\lambda f}(y)\right\| ^{2} \nonumber \\&\quad =\,\frac{(2\beta +L)\lambda -2}{(2\beta +L)\lambda +2}\cdot \left\| x-y\right\| ^{2} +\frac{4-4\beta ^{2}\lambda ^{2}-4\beta L\lambda ^{2}}{(2\beta +L)\lambda +2}\cdot \left\| \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\right\| ^{2}.\nonumber \\ \end{aligned}$$
(3.18)

Recall that \(\text{ prox }_{\lambda f}(x)=\text{ prox }_{\frac{\lambda }{1+\lambda \beta }\tilde{f}}\left( \frac{1}{1+\lambda \beta }\cdot x\right) \). According to Lemma 2.2, \(\text{ prox }_{\frac{\lambda }{1+\lambda \beta }\tilde{f}}\) is firmly nonexpansive and hence nonexpansive, and thus we have

$$\begin{aligned} \Vert \text{ prox }_{\lambda f}(x)-\text{ prox }_{\lambda f}(y)\Vert= & {} \left\| \text{ prox }_{\frac{\lambda }{1+\lambda \beta }\tilde{f}}\left( \frac{1}{1+\lambda \beta }\cdot x\right) -\text{ prox }_{\frac{\lambda }{1+\lambda \beta }\tilde{f}}\left( \frac{1}{1+\lambda \beta }\cdot y\right) \right\| \nonumber \\\le & {} \frac{1}{1+\lambda \beta }\cdot \Vert x-y\Vert . \end{aligned}$$
(3.19)

Substituting (3.19) back into (3.18), we get

$$\begin{aligned} \Vert R_{\lambda f}(x)-R_{\lambda f}(y)\Vert ^{2}\le & {} \frac{(2\beta +L)\lambda -2}{(2\beta +L)\lambda +2}\cdot \left\| x-y\right\| ^{2}\\&+\,\frac{4-4\beta ^{2}\lambda ^{2}-4\beta L\lambda ^{2}}{(2\beta +L)\lambda +2}\cdot \frac{1}{(1+\lambda \beta )^{2}}\cdot \Vert x-y\Vert ^{2} \\= & {} \left( \frac{(2\beta +L)\lambda -2}{(2\beta +L)\lambda +2}+\frac{4-4\beta ^{2}\lambda ^{2}-4\beta L\lambda ^{2}}{((2\beta +L)\lambda +2)(1+\lambda \beta )^{2}}\right) \cdot \Vert x-y\Vert ^{2} \\= & {} \frac{((2\beta +L)\lambda -2)(1+\lambda \beta )^{2}+4-4\beta ^{2}\lambda ^{2}-4\beta L\lambda ^{2}}{(2(1+\lambda \beta )+\lambda L)(1+\lambda \beta )^{2}}\cdot \Vert x-y\Vert ^{2} \\= & {} \frac{(2\beta +L)\beta ^{2}\lambda ^{3}-(2\beta ^{2}+2\beta L)\lambda ^{2}+(L-2\beta )\lambda +2}{((2\beta +L)\lambda +2)(1+\lambda \beta )^{2}}\cdot \Vert x-y\Vert ^{2} \\= & {} \frac{(1-\lambda \beta )^{2}}{(1+\lambda \beta )^{2}}\cdot \Vert x-y\Vert ^{2}, \end{aligned}$$

where the last equality follows from the following observation

$$\begin{aligned}&(2\beta +L)\beta ^{2}\lambda ^{3}-(2\beta ^{2}+2\beta L)\lambda ^{2}+(L-2\beta )\lambda +2\\&\quad =((2\beta +L)\lambda +2)(1-\lambda \beta )^{2}. \end{aligned}$$

The proof is complete. \(\square \)
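We remark that the bound in Lemma 3.1 is attained, for example, by \(f(x)=\frac{\beta }{2}\Vert x\Vert ^{2}\) (for which \(L=\beta \)): in this case

$$\begin{aligned} \text{ prox }_{\lambda f}(x)=\frac{1}{1+\lambda \beta }\cdot x,\qquad R_{\lambda f}(x)=\frac{1-\lambda \beta }{1+\lambda \beta }\cdot x, \end{aligned}$$

so that \(\Vert R_{\lambda f}(x)-R_{\lambda f}(y)\Vert =\frac{|1-\lambda \beta |}{1+\lambda \beta }\cdot \Vert x-y\Vert \) for all \(x,y\in \mathcal {R}^{n}\).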

Next, we estimate \(\Vert R_{\lambda g}(x)-R_{\lambda g}(y)\Vert \) for the hypoconvex function g. The following lemma was mentioned in [5]; for completeness, we give a slightly simpler proof here.

Lemma 3.2

Let \(g: \mathcal {R}^n\rightarrow \mathcal {R}\cup \{+\,\infty \}\) be a proper lower semicontinuous hypoconvex function with constant \(\beta >0\). Then for any \(x,y\in \mathcal {R}^n\) and \(0<\lambda <1/\beta \), we have

$$\begin{aligned} \Vert R_{\lambda g}(x)-R_{\lambda g}(y)\Vert \le \frac{1+\lambda \beta }{1-\lambda \beta }\cdot \Vert x-y\Vert . \end{aligned}$$
(3.20)

Proof

Actually, for any \(0<\lambda <1/\beta \), it follows from [13, Theorem 4.2] that

$$\begin{aligned} \Vert R_{\lambda g}(x)-R_{\lambda g}(y)\Vert ^{2} \le \Vert x-y\Vert ^{2}+4\beta \lambda \Vert \text{ prox }_{\lambda g}(x)-\text{ prox }_{\lambda g}(y)\Vert ^{2}. \end{aligned}$$
(3.21)

Recalling the definition of \(\tilde{g}\) in (3.1) and arguing as in (3.10), for any \(0<\lambda <1/\beta \), we get

$$\begin{aligned} \text{ prox }_{\lambda g}(x)=\text{ prox }_{\frac{\lambda }{1-\lambda \beta }\tilde{g}}\left( \frac{1}{1-\lambda \beta }\cdot x\right) . \end{aligned}$$

By the convexity of \(\tilde{g}\), it follows from Lemma 2.2 that \(\text{ prox }_{\frac{\lambda }{1-\lambda \beta }\tilde{g}}\) is firmly nonexpansive and hence nonexpansive. Thus, the mapping \(x\mapsto \text{ prox }_{\frac{\lambda }{1-\lambda \beta }\tilde{g}}\left( \frac{1}{1-\lambda \beta }\cdot x\right) \), which coincides with \(\text{ prox }_{\lambda g}\), is Lipschitz continuous with constant \(\frac{1}{1-\lambda \beta }\), i.e.,

$$\begin{aligned} \Vert \text{ prox }_{\lambda g}(x)-\text{ prox }_{\lambda g}(y)\Vert \le \frac{1}{1-\beta \lambda }\cdot \Vert x-y\Vert . \end{aligned}$$
(3.22)

Substituting (3.22) into (3.21), it follows that (3.20) holds. The proof is complete. \(\square \)
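Analogously, the bound (3.20) is attained by \(g(x)=-\frac{\beta }{2}\Vert x\Vert ^{2}\), which is hypoconvex with constant \(\beta \): for \(0<\lambda <1/\beta \),

$$\begin{aligned} \text{ prox }_{\lambda g}(x)=\frac{1}{1-\lambda \beta }\cdot x,\qquad R_{\lambda g}(x)=\frac{1+\lambda \beta }{1-\lambda \beta }\cdot x, \end{aligned}$$

so \(R_{\lambda g}\) is expansive with the exact factor \(\frac{1+\lambda \beta }{1-\lambda \beta }\).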

By virtue of Lemmas 3.1 and 3.2, we can prove the convergence of the GIDRSM (1.4) immediately. To this end, we make some assumptions on the parameters \(\{\alpha _{k}\}_{k\in N}\), \(\{\beta _{k}\}_{k\in N}\) and \(\{e_{k}\}_{k\in N}\).

Assumption 3.2

Assume that \(\{\alpha _{k}\}_{k\in N}\) and \(\{\beta _{k}\}_{k\in N}\) are real sequences in [0, 1] such that \(\alpha _{k}+\beta _{k}\le 1\) for all \(k\ge 1\) and the following conditions hold: (a) \(\sum _{k=1}^{\infty }\alpha _{k}\beta _{k}=\,\infty \); (b) \(\sum _{k=1}^{\infty }\beta _{k}\Vert e_{k}\Vert <\,\infty \); and (c) \(\sum _{k=1}^{\infty }(1-\alpha _{k}-\beta _{k})<\,\infty \).

Now we are in a position to present the main result of this paper, i.e., the convergence of the GIDRSM (1.4) under suitable conditions.

Theorem 3.1

Let \(\{z_{k}\}_{k\in N}\) be a sequence generated by the GIDRSM (1.4) and suppose that Assumptions 3.1 and 3.2 hold and \(0<\lambda <1/\beta \). Then \(\{z_{k}\}_{k\in N}\) converges to a fixed point of \(R_{\lambda f}R_{\lambda g}\). Moreover, \(\{\text{ prox }_{\lambda g}(z_{k})\}_{k\in N}\) converges to a solution of problem (1.1).

Proof

By Proposition 3.1 we know that \(\text{ Fix }(R_{\lambda f}R_{\lambda g})\ne \emptyset \). For any \(x,y\in \mathcal {R}^{n}\) and \(0<\lambda <1/\beta \), it follows from Lemmas 3.1 and 3.2 that

$$\begin{aligned} \left\| R_{\lambda f}R_{\lambda g}(x)-R_{\lambda f}R_{\lambda g}(y)\right\|\le & {} \frac{1-\lambda \beta }{1+\lambda \beta }\cdot \Vert R_{\lambda g}(x)-R_{\lambda g}(y)\Vert \\\le & {} \frac{1-\lambda \beta }{1+\lambda \beta }\cdot \frac{1+\lambda \beta }{1-\lambda \beta }\cdot \Vert x-y\Vert \\= & {} \Vert x-y\Vert . \end{aligned}$$

This means that \(R_{\lambda f}R_{\lambda g}\) is nonexpansive. Thus, by letting \(M:=R_{\lambda f}R_{\lambda g}\) and \(r_{k}:=\beta _{k}e_{k}\) in Lemma 2.1, it follows from Assumption 3.2 that the sequence \(\{z_{k}\}\) generated by the GIDRSM (1.4) converges to a point in \(\text{ Fix }(R_{\lambda f}R_{\lambda g})\), say \(z^{*}\). Setting \(x:=z_{k}\) and \(y:=z^{*}\) in (3.22), we have \(\text{ prox }_{\lambda g}(z_{k})\rightarrow \text{ prox }_{\lambda g}(z^{*})\). Furthermore, \( \text{ prox }_{\lambda g}(z^{*})\) is a solution of problem (1.1) in view of Proposition 3.1. The proof is complete. \(\square \)
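To illustrate Theorem 3.1 numerically, the following self-contained sketch applies the GIDRSM (1.4) to a toy instance of (1.1); the particular choices of f, g, \(\lambda \), the parameter sequences and the error model are ours and serve only as one example satisfying Assumptions 3.1 and 3.2 with \(\beta =\omega =1\).

```python
import numpy as np

beta = 1.0                 # f strongly convex with constant beta; g hypoconvex with the same constant
lam = 0.5                  # proximal parameter with 0 < lam < 1/beta
a = np.array([0.3, -0.5])  # |a_i| < 1, so the unique minimizer of f + g is x* = 0

def prox_f(z):
    # f(x) = (beta/2)||x - a||^2:  prox_{lam f}(z) = (z + lam*beta*a) / (1 + lam*beta)
    return (z + lam * beta * a) / (1.0 + lam * beta)

def prox_g(z):
    # g(x) = ||x||_1 - (beta/2)||x||^2 (hypoconvex with constant beta): for lam < 1/beta,
    # prox_{lam g} is componentwise soft-thresholding rescaled by 1/(1 - lam*beta)
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0) / (1.0 - lam * beta)

rng = np.random.default_rng(0)
z = np.array([5.0, -7.0])          # arbitrary starting point z_1
for k in range(1, 201):
    alpha_k, beta_k = 0.5, 0.5     # sum of alpha_k*beta_k diverges, 1 - alpha_k - beta_k = 0
    e_k = 0.5 ** k * rng.standard_normal(2)   # errors with summable beta_k*||e_k||
    r_g = 2.0 * prox_g(z) - z
    r_f_r_g = 2.0 * prox_f(r_g) - r_g
    z = alpha_k * z + beta_k * (r_f_r_g + e_k)

print(prox_g(z))   # approximately the solution x* = (0, 0) of (1.1)
```

In this example \((f+g)(x)=\Vert x\Vert _{1}-\langle a,x\rangle +\frac{1}{2}\Vert a\Vert ^{2}\) is convex but not strongly convex, and the iterates behave as predicted by Theorem 3.1.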

Remark 3.2

To ensure convergence, the authors in [5] assume \(0<\lambda \le 1/\sqrt{\beta L}\), while we only assume \(0<\lambda <1/\beta \). Since \(\beta \le L\) (see the proof of Lemma 3.1), it holds that \(1/\sqrt{L\beta }\le 1/\beta \). Thus, we obtain the same global convergence result as in [5], not only under a weaker assumption on the strongly convex function f, but also with a larger range of the proximal parameter \(\lambda \).

Remark 3.3

Since g is hypoconvex, there is a fundamental difference between \(R_{\lambda f}\) and \(R_{\lambda g}\): \(R_{\lambda f}\) is a contraction while \(R_{\lambda g}\) is expansive. Thus, we can also consider the convergence of the following scheme, which swaps the order of \(R_{\lambda f}\) and \(R_{\lambda g}\) in (1.4):

$$\begin{aligned} z_{k+1}:=\alpha _{k}z_{k}+\beta _{k} (R_{\lambda g}R_{\lambda f}(z_{k})+e_{k}). \end{aligned}$$
(3.23)

Similarly to \(R_{\lambda f}R_{\lambda g}\), we can show that \(R_{\lambda g}R_{\lambda f}\) is a nonexpansive mapping by means of Lemmas 3.1 and 3.2. Thus, the convergence analysis of the scheme (3.23) is analogous to that of the GIDRSM (1.4). For succinctness, we skip the details.

As an immediate consequence of Theorem 3.1, we obtain the following corollary for the DRSM scheme (1.3).

Corollary 3.1

Let \(\{z_{k}\}_{k\in N}\) be a sequence generated by the DRSM (1.3) and suppose that Assumption 3.1 holds and \(0<\lambda <1/\beta \). Then \(\{z_{k}\}_{k\in N}\) converges to a fixed point of \(R_{\lambda f}R_{\lambda g}\). Moreover, \(\{\text{ prox }_{\lambda g}(z_{k})\}_{k\in N}\) converges to a solution of problem (1.1).

4 Conclusions

In this paper, we analyzed the general inexact Douglas–Rachford splitting method (GIDRSM) for the minimization of the sum of a strongly convex function and a hypoconvex function. We focused on the case where the sum of the two functions is convex (\(\beta =\omega \)). Under assumptions that are relatively mild compared with existing work in the literature, we proved its global convergence.

Note that for the case where the sum is strongly convex (\(\beta >\omega \)), the convergence of DRSM was recently established without any differentiability assumption on the functions [13]. Hence, it is natural to ask whether, in the case \(\beta =\omega \) considered here, convergence still holds in the absence of the differentiability assumption, or to give a counterexample if it does not. We leave this as one of our future research topics.