
Complexity Analysis of a Stochastic Variant of Generalized Alternating Direction Method of Multipliers

Theory and Applications of Models of Computation (TAMC 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13571)


Abstract

The alternating direction method of multipliers (ADMM) has received considerable attention in optimization, computer science, and related fields. The generalized ADMM (G-ADMM) proposed by Eckstein and Bertsekas incorporates an acceleration factor and is more efficient than the original ADMM. However, G-ADMM is not applicable to models in which the objective function value (or its gradient) is computationally costly or even impossible to evaluate. In this paper, we consider the two-block separable convex optimization problem with linear constraints, where only noisy estimates of the gradient of the objective function are accessible. In this setting, we propose a stochastic linearized generalized ADMM (called SLG-ADMM), in which the two subproblems are approximated by linearization strategies. By properly choosing the algorithm parameters, we establish, for the objective function value gap and the constraint violation, worst-case convergence rates in expectation, measured by iteration complexity, of \(\mathcal {O}\left( {1}/{\sqrt{k}}\right) \) for general convex problems and \(\mathcal {O}\left( {\ln k}/{k}\right) \) for strongly convex problems (k denotes the iteration counter). For the strongly convex case, we also obtain convergence of the ergodic iterates generated by the proposed SLG-ADMM.

This work was supported by Science and Technology Project of SGCC (5700-202055486A-0-0-00).


Notes

  1. We sometimes use \(\left( {x,y,\lambda } \right) \) to denote \({\left( {{x^T},{y^T},{\lambda ^T}} \right) ^T}\).

References

  1. Glowinski, R., Marroco, A.: Sur l’approximation, par éléments finis d’ordre un, et la résolution, par pénalisation-dualité d’une classe de problèmes de Dirichlet non linéaires. Revue française d’automatique, informatique, recherche opérationnelle. Analyse numérique 9(R2), 41–76 (1975)

  2. Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. Math. Appl. 2(1), 17–40 (1976)

  3. Glowinski, R.: On alternating direction methods of multipliers: a historical perspective. In: Fitzgibbon, W., Kuznetsov, Y.A., Neittaanmäki, P., Pironneau, O. (eds.) Modeling, Simulation and Optimization for Science and Technology. CMAS, vol. 34, pp. 59–82. Springer, Dordrecht (2014). https://doi.org/10.1007/978-94-017-9054-3_4

  4. Eckstein, J., Bertsekas, D.P.: On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1), 293–318 (1992)

  5. He, B., Yuan, X.: On the \(O(1/n)\) convergence rate of the Douglas-Rachford alternating direction method. SIAM J. Numer. Anal. 50(2), 700–709 (2012)

  6. Monteiro, R.D.C., Svaiter, B.F.: Iteration-complexity of block-decomposition algorithms and the alternating direction method of multipliers. SIAM J. Optim. 23(1), 475–507 (2013)

  7. He, B., Yuan, X.: On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers. Numer. Math. 130(3), 567–577 (2015)

  8. Deng, W., Yin, W.: On the global and linear convergence of the generalized alternating direction method of multipliers. J. Sci. Comput. 66(3), 889–916 (2016)

  9. Yang, W.H., Han, D.: Linear convergence of the alternating direction method of multipliers for a class of convex optimization problems. SIAM J. Numer. Anal. 54(2), 625–640 (2016)

  10. Han, D., Sun, D., Zhang, L.: Linear rate convergence of the alternating direction method of multipliers for convex composite programming. Math. Oper. Res. 43(2), 622–637 (2018)

  11. Li, G., Pong, T.K.: Global convergence of splitting methods for nonconvex composite optimization. SIAM J. Optim. 25(4), 2434–2460 (2015)

  12. Wang, Y., Yin, W., Zeng, J.: Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78(1), 29–63 (2019)

  13. Jiang, B., Lin, T., Ma, S., et al.: Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis. Comput. Optim. Appl. 72(1), 115–157 (2019)

  14. Zhang, J., Luo, Z.Q.: A proximal alternating direction method of multiplier for linearly constrained nonconvex minimization. SIAM J. Optim. 30(3), 2272–2302 (2020)

  15. Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407 (1951)

  16. Nemirovski, A., Juditsky, A., Lan, G., et al.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)

  17. Ghadimi, S., Lan, G.: Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM J. Optim. 23(4), 2341–2368 (2013)

  18. Ghadimi, S., Lan, G.: Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Program. 156(1), 59–99 (2016)

  19. Lan, G.: An optimal method for stochastic composite optimization. Math. Program. 133(1), 365–397 (2012)

  20. Ghadimi, S., Lan, G., Zhang, H.: Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Math. Program. 155(1), 267–305 (2016)

  21. Ouyang, H., He, N., Tran, L., et al.: Stochastic alternating direction method of multipliers. In: Proceedings of the 30th International Conference on Machine Learning, pp. 80–88. PMLR, Atlanta (2013)

  22. Suzuki, T.: Dual averaging and proximal gradient descent for online alternating direction multiplier method. In: Proceedings of the 30th International Conference on Machine Learning, pp. 392–400. PMLR, Atlanta (2013)

  23. Suzuki, T.: Stochastic dual coordinate ascent with alternating direction method of multipliers. In: Proceedings of the 31st International Conference on Machine Learning, pp. 736–744. PMLR, Beijing (2014)

  24. Zhao, P., Yang, J., Zhang, T., et al.: Adaptive stochastic alternating direction method of multipliers. In: Proceedings of the 32nd International Conference on Machine Learning, pp. 69–77. PMLR, Lille (2014)

  25. Gao, X., Jiang, B., Zhang, S.: On the information-adaptive variants of the ADMM: an iteration complexity perspective. J. Sci. Comput. 76(1), 327–363 (2018)

  26. Fang, E.X., He, B., Liu, H., Yuan, X.: Generalized alternating direction method of multipliers: new theoretical insights and applications. Math. Program. Comput. 7(2), 149–187 (2015). https://doi.org/10.1007/s12532-015-0078-2


Author information


Correspondence to Congying Han.


Appendix


Proof of Lemma 1

Proof

Since the gradient of f is L-Lipschitz continuous, for any y, z we have

$$\begin{aligned} f\left( y \right) \le f\left( z \right) + {\left( {y - z} \right) ^T}\nabla f\left( z \right) + \frac{L}{2}{\left\| {y - z} \right\| ^2}. \end{aligned}$$

Also, due to the convexity of f, we have for any x, z

$$\begin{aligned} f\left( x \right) \ge f\left( z \right) + {\left( {x - z} \right) ^T}\nabla f\left( z \right) . \end{aligned}$$

Adding the above two inequalities, we get the conclusion. If f is \(\mu \)-strongly convex, then for any x, z

$$\begin{aligned} f\left( x \right) \ge f\left( z \right) + {\left( {x - z} \right) ^T}\nabla f\left( z \right) + \frac{\mu }{2}{\left\| {x - z} \right\| ^2}. \end{aligned}$$

Then combine this inequality with

$$\begin{aligned} f\left( y \right) \le f\left( z \right) + {\left( {y - z} \right) ^T}\nabla f\left( z \right) + \frac{L}{2}{\left\| {y - z} \right\| ^2}, \end{aligned}$$

and the proof is completed.    \(\square \)
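For ease of reference, adding the two displays above gives the inequality that is invoked as Lemma 1 below (the lemma itself is stated in the main text; we record here only the form consistent with how it is used in (14)):

$$\begin{aligned} f\left( y \right) - f\left( x \right) \le {\left( {y - x} \right) ^T}\nabla f\left( z \right) + \frac{L}{2}{\left\| {y - z} \right\| ^2}, \end{aligned}$$

and, when f is in addition \(\mu \)-strongly convex,

$$\begin{aligned} f\left( y \right) - f\left( x \right) \le {\left( {y - x} \right) ^T}\nabla f\left( z \right) + \frac{L}{2}{\left\| {y - z} \right\| ^2} - \frac{\mu }{2}{\left\| {x - z} \right\| ^2}. \end{aligned}$$

Indeed, with \(f = \theta _1\), \(y = \tilde{x}^k\), and \(z = x^k\), the first inequality is exactly (14).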

Proof of Lemma 2

Proof

The optimality condition of the x-subproblem in SLG-ADMM is

$$\begin{aligned} \begin{aligned}&{\left( {x - {x^{k + 1}}} \right) ^T}\left( {G\left( {{x^k},\xi } \right) - {A^T}{\lambda ^k} + \beta {A^T}\left( {A{x^{k + 1}} + B{y^k} - b} \right) + \frac{1}{{{\eta _k}}}G_{1,k}\left( {{x^{k + 1}} - {x^k}} \right) } \right) \\&\ge 0,\forall x \in \mathcal {X}. \end{aligned} \end{aligned}$$
(12)

Using \(\tilde{x}^k\) and \(\tilde{\lambda }^k\) defined in (2) and the notation \(\delta ^k\), we can rewrite (12) as

$$\begin{aligned} \begin{aligned} {\left( {x - {{\tilde{x}}^k}} \right) ^T}\left( {\nabla {\theta _1}\left( {{x^k}} \right) + {\delta ^k} - {A^T}{{\tilde{\lambda }}^k} + \frac{1}{{{\eta _k}}}G_{1,k}\left( {{{\tilde{x}}^k} - {x^k}} \right) } \right) \ge 0,\forall x \in \mathcal {X}. \end{aligned} \end{aligned}$$
(13)

Letting \(y = \tilde{x}^k\), \(z = x^k\), and \(f = \theta _1\) in Lemma 1, we get

$$\begin{aligned} {\left( {x - {{\tilde{x}}^k}} \right) ^T}\nabla {\theta _1}\left( {{x^k}} \right) \le {\theta _1}\left( x \right) - {\theta _1}\left( {{{\tilde{x}}^k}} \right) + \frac{L}{2}{\left\| {{x^k} - {{\tilde{x}}^k}} \right\| ^2}. \end{aligned}$$
(14)

Combining (13) and (14), we obtain

$$\begin{aligned} \begin{aligned}&{\theta _1}\left( x \right) - {\theta _1}\left( {{{\tilde{x}}^k}} \right) + {\left( {x - {{\tilde{x}}^k}} \right) ^T}\left( { - {A^T}{{\tilde{\lambda }}^k}} \right) \\&\ge \frac{1}{{{\eta _k}}}{\left( {x - {{\tilde{x}}^k}} \right) ^T}G_{1,k}\left( {{x^k} - {{\tilde{x}}^k}} \right) - {\left( {x - {{\tilde{x}}^k}} \right) ^T}{\delta ^k} - \frac{L}{2}{\left\| {{x^k} - {{\tilde{x}}^k}} \right\| ^2}. \end{aligned} \end{aligned}$$
(15)

Similarly, the optimality condition of y-subproblem is

$$\begin{aligned} \begin{aligned} {\theta _2}\left( y \right) - {\theta _2}\left( {{{\tilde{y}}^k}} \right) + {\left( {y - {{\tilde{y}}^k}} \right) ^T}\left( { - {B^T}{{\lambda }^{k+1}} + {G_{2,k}}\left( {{{\tilde{y}}^k} - {y^k}} \right) } \right) \ge 0,\forall y \in \mathcal {Y}. \end{aligned} \end{aligned}$$
(16)

Substituting (3) into (16), we obtain that

$$\begin{aligned} \begin{aligned}&{\theta _2}\left( y \right) - {\theta _2}\left( {{{\tilde{y}}^k}} \right) + {\left( {y - {{\tilde{y}}^k}} \right) ^T}\left( { - {B^T}{{\tilde{\lambda }}^k}} \right) \\&\ge \left( {1 - \alpha } \right) {\left( {y - {{\tilde{y}}^k}} \right) ^T}{B^T}\left( {{\lambda ^k} - {{\tilde{\lambda }}^k}} \right) + {\left( {y - {{\tilde{y}}^k}} \right) ^T}\left( {\beta {B^T}B + {G_{2,k}}} \right) \left( {{y^k} - {{\tilde{y}}^k}} \right) ,\forall y \in \mathcal {Y}. \end{aligned} \end{aligned}$$
(17)

At the same time,

$$\begin{aligned} {{\tilde{\lambda }}^k}&= {\lambda ^k} - \beta \left( {A{x^{k + 1}} + B{y^{k + 1}} - b} \right) + \beta B\left( {{y^{k + 1}} - {y^k}} \right) \\&= {\lambda ^k} - \beta \left( {A{{\tilde{x}}^k} + B{{\tilde{y}}^k} - b} \right) + \beta B\left( {{{\tilde{y}}^k} - {y^k}} \right) . \end{aligned}$$

That is

$$\begin{aligned} {\left( {\lambda - {{\tilde{\lambda }}^k}} \right) ^T}\left( {A{{\tilde{x}}^k} + B{{\tilde{y}}^k} - b} \right) = \frac{1}{\beta }{\left( {\lambda - {{\tilde{\lambda }}^k}} \right) ^T}\left( {{\lambda ^k} - {{\tilde{\lambda }}^k}} \right) + {\left( {\lambda - {{\tilde{\lambda }}^k}} \right) ^T}B\left( {{{\tilde{y}}^k} - {y^k}} \right) . \end{aligned}$$
(18)

Combining (15), (17), and (18), we get

$$\begin{aligned} \begin{aligned} \theta&\left( u \right) - \theta \left( {{{\tilde{u}}^k}} \right) + \begin{pmatrix} {x - {{\tilde{x}}^k}} \\ {y - {{\tilde{y}}^k}} \\ {\lambda - {{\tilde{\lambda }}^k}} \end{pmatrix} ^T \begin{pmatrix} { - {A^T}{{\tilde{\lambda }}^k}} \\ { - {B^T}{{\tilde{\lambda }}^k}} \\ {A{{\tilde{x}}^k} + B{{\tilde{y}}^k} - b} \end{pmatrix} \\ \ge&\frac{1}{{{\eta _k}}}{\left( {x - {{\tilde{x}}^k}} \right) ^T}G_{1,k}\left( {{x^k} - {{\tilde{x}}^k}} \right) - {\left( {x - {{\tilde{x}}^k}} \right) ^T}{\delta ^k} - \frac{L}{2}{\left\| {{x^k} - {{\tilde{x}}^k}} \right\| ^2} \\&+ \left( {1 - \alpha } \right) {\left( {y - {{\tilde{y}}^k}} \right) ^T}{B^T}\left( {{\lambda ^k} - {{\tilde{\lambda }}^k}} \right) + {\left( {y - {{\tilde{y}}^k}} \right) ^T}\left( {\beta {B^T}B + {G_{2,k}}} \right) \left( {{y^k} - {{\tilde{y}}^k}} \right) \\&+ \frac{1}{\beta }{\left( {\lambda - {{\tilde{\lambda }}^k}} \right) ^T}\left( {{\lambda ^k} - {{\tilde{\lambda }}^k}} \right) + {\left( {\lambda - {{\tilde{\lambda }}^k}} \right) ^T}B\left( {{{\tilde{y}}^k} - {y^k}} \right) ,\forall w \in \varOmega . \end{aligned} \end{aligned}$$
(19)

Finally, by the definition of F and \(Q_k\), we come to the conclusion.    \(\square \)

Proof of Lemma 3

Proof

Using \(Q_k = H_k M\) and \({w^k} - {w^{k + 1}} = M\left( {{w^k} - {{\tilde{w}}^k}} \right) \) in (5), we have

$$\begin{aligned} \begin{aligned} {\left( {w - {{\tilde{w}}^k}} \right) ^T}{Q_k}\left( {{w^k} - {{\tilde{w}}^k}} \right) =&{\left( {w - {{\tilde{w}}^k}} \right) ^T}{H_k}M\left( {{w^k} - {{\tilde{w}}^k}} \right) \\ =&{\left( {w - {{\tilde{w}}^k}} \right) ^T}{H_k}\left( {{w^k} - {w^{k + 1}}} \right) . \end{aligned} \end{aligned}$$
(20)

Now we apply the following identity: for vectors a, b, c, d and a symmetric matrix H of appropriate dimensions,

$${\left( {a - b} \right) ^T}H\left( {c - d} \right) = \frac{1}{2}\left( {\left\| {a - d} \right\| _H^2 - \left\| {a - c} \right\| _H^2} \right) + \frac{1}{2}\left( {\left\| {c - b} \right\| _H^2 - \left\| {d - b} \right\| _H^2} \right) .$$
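This identity can be verified by expanding the quadratic forms; as a quick sketch (assuming H symmetric, as is the case for the matrices \(H_k\) that define the norms \(\left\| \cdot \right\| _{H_k}\)),

$$\begin{aligned} \left\| {a - d} \right\| _H^2 - \left\| {a - c} \right\| _H^2 = 2{a^T}H\left( {c - d} \right) + {d^T}Hd - {c^T}Hc, \qquad \left\| {c - b} \right\| _H^2 - \left\| {d - b} \right\| _H^2 = - 2{b^T}H\left( {c - d} \right) + {c^T}Hc - {d^T}Hd, \end{aligned}$$

and adding the two equations and dividing by two yields \({\left( {a - b} \right) ^T}H\left( {c - d} \right) \).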

In this identity, letting \(a = w\), \(b = \tilde{w}^k\), \(c = w^k\), \(d = w^{k+1}\), and \(H = H_k\), we have

$$\begin{aligned} \begin{aligned} {\left( {w - {{\tilde{w}}^k}} \right) ^T}{H_k}\left( {{w^k} - {w^{k + 1}}} \right) =&\frac{1}{2}\left( {\left\| {w - {w^{k + 1}}} \right\| _{{H_k}}^2 - \left\| {w - {w^k}} \right\| _{{H_k}}^2} \right) \\&+ \frac{1}{2}\left( {\left\| {{w^k} - {{\tilde{w}}^k}} \right\| _{{H_k}}^2 - \left\| {{w^{k + 1}} - {{\tilde{w}}^k}} \right\| _{{H_k}}^2} \right) . \end{aligned} \end{aligned}$$

Next we simplify the term \({\left\| {{w^k} - {{\tilde{w}}^k}} \right\| _{{H_k}}^2 - \left\| {{w^{k + 1}} - {{\tilde{w}}^k}} \right\| _{{H_k}}^2}\).

$$\begin{aligned} \begin{aligned}&\left\| {{w^k} - {{\tilde{w}}^k}} \right\| _{{H_k}}^2 - \left\| {{w^{k + 1}} - {{\tilde{w}}^k}} \right\| _{{H_k}}^2 \\&= \left\| {{w^k} - {{\tilde{w}}^k}} \right\| _{{H_k}}^2 - \left\| {{w^{k + 1}} - {w^k} + {w^k} - {{\tilde{w}}^k}} \right\| _{{H_k}}^2 \\&= \left\| {{w^k} - {{\tilde{w}}^k}} \right\| _{{H_k}}^2 - \left\| {\left( {{I_{{n_1} + {n_2} + n}} - M} \right) \left( {{w^k} - {{\tilde{w}}^k}} \right) } \right\| _{{H_k}}^2 \\&= {\left( {{w^k} - {{\tilde{w}}^k}} \right) ^T}\left( {{H_k} - {{\left( {{I_{{n_1} + {n_2} + n}} - M} \right) }^T}{H_k}\left( {{I_{{n_1} + {n_2} + n}} - M} \right) } \right) \left( {{w^k} - {{\tilde{w}}^k}} \right) \\&= {\left( {{w^k} - {{\tilde{w}}^k}} \right) ^T}\left( {{H_k}M + {M^T}{H_k} - {M^T}{H_k}M} \right) \left( {{w^k} - {{\tilde{w}}^k}} \right) \\&= {\left( {{w^k} - {{\tilde{w}}^k}} \right) ^T}\left( {\left( {2I_{n_1+n_2+n} - {M^T}} \right) {Q_k}} \right) \left( {{w^k} - {{\tilde{w}}^k}} \right) , \end{aligned} \end{aligned}$$

where the second equality uses \({w^k} - {w^{k + 1}} = M\left( {{w^k} - {{\tilde{w}}^k}} \right) \) in (5), and the last equality holds since the transpose of \(M^TH_k\) is \(H_kM\) and hence

$$\begin{aligned} \begin{aligned} {\left( {{w^k} - {{\tilde{w}}^k}} \right) ^T}{H_k}M\left( {{w^k} - {{\tilde{w}}^k}} \right) =&{\left( {{w^k} - {{\tilde{w}}^k}} \right) ^T}{M^T}{H_k}\left( {{w^k} - {{\tilde{w}}^k}} \right) \\ =&{\left( {{w^k} - {{\tilde{w}}^k}} \right) ^T}{Q_k}\left( {{w^k} - {{\tilde{w}}^k}} \right) . \end{aligned} \end{aligned}$$

The remaining task is to prove

$$\begin{aligned} \begin{aligned}&{\left( {{w^k} - {{\tilde{w}}^k}} \right) ^T}\left( {\left( {2I_{n_1+n_2+n} - {M^T}} \right) {Q_k}} \right) \left( {{w^k} - {{\tilde{w}}^k}} \right) \\&= \frac{1}{{{\eta _k}}}\left\| {{x^k} - {{\tilde{x}}^k}} \right\| _{{G_{1,k}}}^2 + \left\| {{y^k} - {{\tilde{y}}^k}} \right\| _{{G_{2,k}}}^2 - \frac{{\alpha - 2}}{\beta }{\left\| {{\lambda ^k} - {{\tilde{\lambda }}^k}} \right\| ^2}. \end{aligned} \end{aligned}$$
(21)

By a direct algebraic computation,

$$\begin{aligned} \left( {2I_{n_1+n_2+n} - {M^T}} \right) {Q_k} = \begin{pmatrix} \frac{1}{\eta _k}G_{1,k} &{} 0 &{} 0 \\ 0 &{} G_{2,k} &{} \left( 2 - \alpha \right) B^T \\ 0 &{} \left( \alpha - 2\right) B &{} \frac{2-\alpha }{\beta }I_n \end{pmatrix}. \end{aligned}$$
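To see why (21) then follows, note that the off-diagonal blocks involving B contribute nothing when the quadratic form is evaluated at a single vector: for any \(v_2\), \(v_3\) of matching dimensions,

$$\begin{aligned} \left( {2 - \alpha } \right) v_2^T{B^T}{v_3} + \left( {\alpha - 2} \right) v_3^TB{v_2} = \left( {2 - \alpha } \right) \left( v_2^T{B^T}{v_3} - v_3^TB{v_2}\right) = 0, \end{aligned}$$

so only the block-diagonal part contributes, giving \(\frac{1}{\eta _k}\left\| x^k - \tilde{x}^k\right\| _{G_{1,k}}^2 + \left\| y^k - \tilde{y}^k\right\| _{G_{2,k}}^2 + \frac{2-\alpha }{\beta }\left\| \lambda ^k - \tilde{\lambda }^k\right\| ^2\), which is the right-hand side of (21).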

With this result, (21) holds and the proof is completed.    \(\square \)

Proof of Theorem 1

Proof

Combining Lemmas 2 and 3, we get

$$\begin{aligned} \begin{aligned}&\theta \left( {{{\tilde{u}}^t}} \right) - \theta \left( {u} \right) + {\left( {{{\tilde{w}}^t} - w} \right) ^T}F\left( {{{\tilde{w}}^t}} \right) \\ \le&\frac{1}{2}\left( {\left\| {{w^t} - w} \right\| _{{H_t}}^2 - \left\| {{w^{t + 1}} - w} \right\| _{{H_t}}^2} \right) - \frac{1}{{2{\eta _t}}}\left\| {{x^t} - {{\tilde{x}}^t}} \right\| _{{G_{1,t}}}^2 - \frac{1}{2}\left\| {{y^t} - {{\tilde{y}}^t}} \right\| _{{G_{2,t}}}^2 \\&+ \frac{{\alpha - 2}}{{2\beta }}{\left\| {{\lambda ^t} - {{\tilde{\lambda }}^t}} \right\| ^2} + {\left( {x - {{\tilde{x}}^t}} \right) ^T}{\delta ^t} + \frac{L}{2}{\left\| {{x^t} - {{\tilde{x}}^t}} \right\| ^2} \\ =&\frac{1}{2}\left( {\left\| {{w^t} - w} \right\| _{{H_t}}^2 - \left\| {{w^{t + 1}} - w} \right\| _{{H_t}}^2} \right) + {\left( {x - {x^t}} \right) ^T}{\delta ^t} + {\left( {{x^t} - {{\tilde{x}}^t}} \right) ^T}{\delta ^t} \\&+ \frac{1}{2}{\left( {{x^t} - {{\tilde{x}}^t}} \right) ^T}\left( {L{I_{{n_1}}} - \frac{1}{{{\eta _t}}}{G_{1,t}}} \right) \left( {{x^t} - {{\tilde{x}}^t}} \right) - \frac{1}{2}\left\| {{y^t} - {{\tilde{y}}^t}} \right\| _{{G_{2,t}}}^2 + \frac{{\alpha - 2}}{{2\beta }}{\left\| {{\lambda ^t} - {{\tilde{\lambda }}^t}} \right\| ^2} \\ \le&\frac{1}{2}\left( {\left\| {{w^t} - w} \right\| _{{H_t}}^2 - \left\| {{w^{t + 1}} - w} \right\| _{{H_t}}^2} \right) + {\left( {x - {x^t}} \right) ^T}{\delta ^t} + \frac{{{\alpha _t}}}{2}{\left\| {{\delta ^t}} \right\| ^2} \\&+ \frac{1}{2}{\left( {{x^t} - {{\tilde{x}}^t}} \right) ^T}\left( {\left( {\frac{1}{{{\alpha _t}}} + L} \right) {I_{{n_1}}} - \frac{1}{{{\eta _t}}}{G_{1,t}}} \right) \left( {{x^t} - {{\tilde{x}}^t}} \right) \\ \le&\frac{1}{2}\left( {\left\| {{w^t} - w} \right\| _{{H_t}}^2 - \left\| {{w^{t + 1}} - w} \right\| _{{H_t}}^2} \right) + {\left( {x - {x^t}} \right) ^T}{\delta ^t} + \frac{{{\alpha _t}}}{2}{\left\| {{\delta ^t}} \right\| ^2}, \end{aligned} \end{aligned}$$
(22)

where the second inequality holds owing to Young’s inequality and \(\alpha \in \left( 0,2\right) \) (this step is spelled out in the remark following this proof). Meanwhile,

$$\begin{aligned} \begin{aligned}&\frac{1}{{k + 1}}\sum \limits _{t = 0}^k {\theta \left( {{{\tilde{u}}^t}} \right) - \theta \left( {u} \right) + {{\left( {{{\tilde{w}}^t} - w} \right) }^T}F\left( {{{\tilde{w}}^t}} \right) } \\&= \frac{1}{{k + 1}}\sum \limits _{t = 0}^k {\theta \left( {{{\tilde{u}}^t}} \right) - \theta \left( {u} \right) + {{\left( {{{\tilde{w}}^t} - w} \right) }^T}F\left( {w} \right) } \\&\ge \theta \left( {{{\bar{u}}_k}} \right) - \theta \left( {u} \right) + {\left( {{{\bar{w}}_k} - w} \right) ^T}F\left( {w} \right) , \end{aligned} \end{aligned}$$
(23)

where the equality holds since for any \(w_1\) and \(w_2\),

$$ {\left( {{w_1} - {w_2}} \right) ^T}\left( {F\left( {{w_1}} \right) - F\left( {{w_2}} \right) } \right) = 0,$$

and the inequality follows from the convexity of \(\theta \). Summing both sides of (22) over \(t = 0,\ldots ,k\), taking the average, and using (23), the assertion of this theorem follows directly.    \(\square \)
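Two of the steps above may deserve more detail; the following remarks are only a sketch of ingredients that are specified precisely in the main text. First, the Young’s inequality step in (22): for any \(\alpha _t > 0\),

$$\begin{aligned} {\left( {{x^t} - {{\tilde{x}}^t}} \right) ^T}{\delta ^t} \le \frac{1}{{2{\alpha _t}}}{\left\| {{x^t} - {{\tilde{x}}^t}} \right\| ^2} + \frac{{{\alpha _t}}}{2}{\left\| {{\delta ^t}} \right\| ^2}, \end{aligned}$$

which is what produces the matrix \(\left( \frac{1}{\alpha _t} + L\right) I_{n_1} - \frac{1}{\eta _t}G_{1,t}\) in (22); dropping this quadratic term in the last inequality of (22) implicitly uses a parameter choice for which this matrix is negative semidefinite (the precise choice is given in the main text and not reproduced here). Second, the equality in (23) holds because F is an affine mapping whose linear part is skew-symmetric: from the form of F used in (19),

$$\begin{aligned} F\left( {{w_1}} \right) - F\left( {{w_2}} \right) = \begin{pmatrix} 0 &{} 0 &{} - {A^T} \\ 0 &{} 0 &{} - {B^T} \\ A &{} B &{} 0 \end{pmatrix}\left( {{w_1} - {w_2}} \right) , \end{aligned}$$

and a quadratic form induced by a skew-symmetric matrix vanishes identically, so \({\left( {{w_1} - {w_2}} \right) ^T}\left( {F\left( {{w_1}} \right) - F\left( {{w_2}} \right) } \right) = 0\), which justifies replacing \(F\left( \tilde{w}^t\right) \) by \(F\left( w\right) \).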

Proof of Corollary 1

Proof

In (9), let \(w = \left( {{x^ * },{y^ * },\lambda } \right) \) and \(k = N\), where \(\lambda = {\lambda ^ * } + e\) and e is a vector with \(\left\| e \right\| = 1\) satisfying \( - {e^T}\left( {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right) = \left\| {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right\| \). Then the left-hand side of (9) is

$$\begin{aligned} \theta \left( {{{\bar{u}}_N}} \right) - \theta \left( {{u^ * }} \right) - {\left( {{\lambda ^ * }} \right) ^T}\left( {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right) + \left\| {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right\| . \end{aligned}$$
(24)

This follows from

$$\begin{aligned} \begin{aligned}&{\left( {{{\bar{w}}_N} - w} \right) ^T}F\left( w \right) \\ =&{\left( {{{\bar{x}}_N} - {x^ * }} \right) ^T}\left( { - {A^T}\lambda } \right) + {\left( {{{\bar{y}}_N} - {y^ * }} \right) ^T}\left( { - {B^T}\lambda } \right) + {\left( {{{\bar{\lambda }}_N} - \lambda } \right) ^T}\left( {A{x^ * } + B{y^ * } - b} \right) \\ =&{\lambda ^T}\left( {A{x^ * } + B{y^ * } - b} \right) - \left( {{\lambda ^T}\left( {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right) } \right) \\ =&- {\left( {{\lambda ^ * }} \right) ^T}\left( {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right) + \left\| {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right\| , \end{aligned} \end{aligned}$$

where the first equality follows from the definition of F, and the second and last equalities hold due to \({A{x^ * } + B{y^ * } - b} = 0\) and the choice of \(\lambda \). On the other hand, substituting \(w = \bar{w}_N\) into the variational inequality associated with (1), we get

$$\begin{aligned} \theta \left( {{{\bar{u}}_N}} \right) - \theta \left( {{u^ * }} \right) - {\left( {{\lambda ^ * }} \right) ^T}\left( {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right) \ge 0. \end{aligned}$$
(25)

Combining (24) and (25), we obtain that the left-hand side of (9) is no less than \(\left\| {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right\| \) for this choice of \(w = \left( x^*, y^*, \lambda \right) \) and \(k = N\). Hence,

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \left\| A \bar{x}_N+B \bar{y}_N-b\right\| \right] \\ \le&\frac{1}{2(N+1)} \sum _{t=0}^N\left( \left\| w^t-w^*\right\| _{H_t}^2-\left\| w^{t+1}- w^*\right\| _{H_t}^2\right) +\frac{1}{N+1} \sum _{t=0}^N \frac{\alpha _t}{2} \sigma ^2 \\ \le&\frac{1}{2(N+1)}\left( M\left\| x^0-x^*\right\| _{G_{1,0}}^2+\left\| y^0- y^*\right\| _{\frac{\beta }{\alpha } B^T B+G_2}^2+\frac{2}{\beta \alpha }\left( \left\| \lambda ^0- \lambda ^*\right\| ^2+1\right) \right) \\&+\frac{1}{2 \sqrt{N}}\left( \sigma ^2+\left\| x^0-x^*\right\| _{G_{1,0}}^2\right) +\frac{1-\alpha }{(N+1) \alpha }\left( \lambda ^0-\lambda ^*\right) ^T B\left( y^0-y^*\right) , \end{aligned} \end{aligned}$$
(26)

where in the first inequality we use \(\mathbb {E}\left[ {{\delta ^k}} \right] = 0\) and \(\mathbb {E}\left[ {{{\left\| {{\delta ^k}} \right\| }^2}} \right] \le {\sigma ^2}\). The first part of this corollary is proved. Next we prove the second part. Substituting \(w = \bar{w}_N\) into the variational inequality associated with (1), we get

$$\begin{aligned} \begin{aligned}&\theta \left( {{{\bar{u}}_N}} \right) - \theta \left( {{u^ * }} \right) + {\left( {{{\bar{w}}_N} - {w^ * }} \right) ^T}F\left( {{w^ * }} \right) \\ =&\theta \left( {{{\bar{u}}_N}} \right) - \theta \left( {{u^ * }} \right) - {\left( {{\lambda ^ * }} \right) ^T}\left( {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right) \\ \ge&\theta \left( {{{\bar{u}}_N}} \right) - \theta \left( {{u^ * }} \right) - \left\| {{\lambda ^ * }} \right\| \left\| {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right\| , \end{aligned} \end{aligned}$$

i.e.,

$$\begin{aligned} \begin{aligned} \theta \left( {{{\bar{u}}_N}} \right) - \theta \left( {{u^ * }} \right) \le \theta \left( {{{\bar{u}}_N}} \right) - \theta \left( {{u^ * }} \right) + {\left( {{{\bar{w}}_N} - {w^ * }} \right) ^T}F\left( {{w^ * }} \right) + \left\| {{\lambda ^ * }} \right\| \left\| {A{{\bar{x}}_N} + B{{\bar{y}}_N} - b} \right\| . \end{aligned} \end{aligned}$$
(27)

Taking expectation on both sides of (27) completes the proof.    \(\square \)

Proof of Corollary 2

Proof

The proof of this corollary is similar to that of Corollary 1, except for the estimation of \(\mathbb {E}\left[ {\left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| } \right] \), which proceeds as follows.

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \left\| A \bar{x}_k+B \bar{y}_k-b\right\| \right] \\ \le&\frac{1}{2(k+1)} \sum _{t=0}^k\left( \left\| w^t-w^*\right\| _{H_t}^2-\left\| w^{t+1}-w^*\right\| _{H_t}^2\right) +\frac{1}{k+1} \sum _{t=0}^k \frac{\alpha _t}{2} \sigma ^2 \\ \le&\frac{1}{2(k+1)}\left( \frac{1}{\eta _0}\left\| w^0-w^*\right\| _{G_{1,0}}^2+\sum _{i=0}^{k-1}\left( \frac{1}{\eta _{i+1}}-\frac{1}{\eta _i}\right) \mathbb {E}\left\| w^{i+1}-w^*\right\| _{G_{1, i}}^2\right. \\&\left. -\frac{1}{\eta _k} \mathbb {E}\left\| w^{k+1}-w^*\right\| _{G_{1, k}}^2+\left\| y^0-y^*\right\| _{\frac{\beta }{\alpha } B^T B+G_2}^2+\frac{2}{\beta \alpha }\left( \left\| \lambda ^0-\lambda ^*\right\| ^2+1\right) \right) \\&+\frac{1}{k+1} \sum _{t=0}^k \frac{1}{2 \sqrt{t}} \sigma ^2 + \frac{\left( 1-\alpha \right) \left( \left\| \lambda ^* \right\| + 1 \right) }{(k+1) \alpha }\left( \lambda ^0-\lambda ^*\right) ^T B\left( y^0-y^*\right) \\ \le&\frac{1}{2(k+1)}\left( \frac{R^2}{\eta _0}+\sum _{i=0}^{k-1}\left( \frac{1}{\eta _{i+1}}-\frac{1}{\eta _i}\right) R^2+\frac{2}{\beta \alpha }\left( \left\| \lambda ^0-\lambda ^*\right\| ^2+1\right) \right. \\&\left. \quad +\left\| y^0-y^*\right\| _{\frac{\beta }{\alpha } B^T B+G_2}^2\right) +\frac{1}{\sqrt{k}} \sigma ^2 + \frac{\left( 1-\alpha \right) \left( \left\| \lambda ^* \right\| + 1 \right) }{(k+1) \alpha }\left( \lambda ^0-\lambda ^*\right) ^T B\left( y^0-y^*\right) \\ \le&\frac{1}{2(k+1)}\left( \left\| y^0-y^*\right\| _{\frac{\beta }{\alpha } B^T B+G_2}^2+\frac{2}{\beta \alpha }\left( \left\| \lambda ^0-\lambda ^*\right\| ^2+1\right) +M R^2\right) \\&+\frac{1}{2 \sqrt{k}}\left( 2 \sigma ^2+R^2\right) + \frac{\left( 1-\alpha \right) \left( \left\| \lambda ^* \right\| + 1 \right) }{(k+1) \alpha }\left( \lambda ^0-\lambda ^*\right) ^T B\left( y^0-y^*\right) . \end{aligned} \end{aligned}$$

Proof of Theorem 2

Proof

First, proceeding as in the proof of Lemma 2 but using the \(\mu \)-strong convexity of \(\theta _1\) in Lemma 1, we conclude that for any \(w \in \varOmega \)

$$\begin{aligned} \begin{aligned}&\theta \left( u \right) - \theta \left( {{{\tilde{u}}^k}} \right) + {\left( {w - {{\tilde{w}}^k}} \right) ^T}F\left( {{{\tilde{w}}^k}} \right) \\ \ge&{\left( {w - {{\tilde{w}}^k}} \right) ^T}{Q_k}\left( {{w^k} - {{\tilde{w}}^k}} \right) - {\left( {x - {{\tilde{x}}^k}} \right) ^T}{\delta ^k} - \frac{L}{2}{\left\| {{x^k} - {{\tilde{x}}^k}} \right\| ^2} + \frac{\mu }{2}{\left\| {x - {x^k}} \right\| ^2}, \end{aligned} \end{aligned}$$
(28)

where \(Q_k\) is defined in (7). Then, using the result in Lemma 3,

$$\begin{aligned} \begin{aligned}&{\left( {w - {{\tilde{w}}^k}} \right) ^T}{Q_k}\left( {{w^k} - {{\tilde{w}}^k}} \right) \\ =&\frac{1}{2}\left( {\left\| {w - {w^{k + 1}}} \right\| _{{H_k}}^2 - \left\| {w - {w^k}} \right\| _{{H_k}}^2} \right) + \frac{1}{{2{\eta _k}}}\left\| {{x^k} - {{\tilde{x}}^k}} \right\| _{{G_{1,k}}}^2 + \frac{1}{2}\left\| {{y^k} - {{\tilde{y}}^k}} \right\| _{{G_2}}^2 \\&- \frac{{\alpha - 2}}{{2\beta }}{\left\| {{\lambda ^k} - {{\tilde{\lambda }}^k}} \right\| ^2}. \end{aligned} \end{aligned}$$
(29)

Combining (28) and (29), we get

$$\begin{aligned} \begin{aligned}&\theta \left( {{{\tilde{u}}^t}} \right) - \theta \left( {u} \right) + {\left( {{{\tilde{w}}^t} - w} \right) ^T}F\left( {{{\tilde{w}}^t}} \right) \\ \le&\frac{1}{2}\left( \left\| {{w^t} - w} \right\| _{{H_t}}^2 - \left\| {{w^{t + 1}} - w} \right\| _{{H_t}}^2 - {\mu }{\left\| {x^t - x} \right\| ^2} \right) + {\left( {x - {x^t}} \right) ^T}{\delta ^t} + \frac{{{\alpha _t}}}{2}{\left\| {{\delta ^t}} \right\| ^2} . \end{aligned} \end{aligned}$$
(30)

Now using (23) and (30), we have

$$\begin{aligned} \begin{aligned}&\theta \left( \bar{u}_k\right) -\theta (u)+\left( \bar{w}_k-w\right) ^T F(w) \\ \le&\frac{1}{k+1} \sum _{t=0}^k \theta \left( \tilde{u}^t\right) -\theta (u)+\left( \tilde{w}^t-w\right) ^T F\left( \tilde{w}^t\right) \\ \le&\frac{1}{2(k+1)} \sum _{t=0}^k\left( \frac{1}{\eta _t}\left\| x^t-x\right\| _{G_{1, t}}^2-\frac{1}{\eta _t}\left\| x^{t+1}-x\right\| _{G_{1, t}}^2-\mu \left\| x^t-x\right\| ^2\right) +\frac{1}{k+1} \sum _{t=0}^k \frac{\alpha _t}{2}\left\| \delta ^t\right\| ^2 \\&+\frac{1}{2(k+1)}\left( \left\| y^0-y\right\| _{\frac{\beta }{\alpha } B^T B+G_2}^2+\frac{1}{\beta \alpha }\left\| \lambda ^0-\lambda \right\| ^2\right) +\frac{1}{k+1} \sum _{t=0}^k\left( x-x^t\right) ^T \delta ^t \\&+ \frac{1-\alpha }{(k+1) \alpha }\left( \lambda ^0-\lambda \right) ^T B\left( y^0-y\right) \\ \le&\frac{1}{2(k+1)} \sum _{t=0}^k\left( (\mu t+M)\left\| x^t-x\right\| ^2-(\mu (t+1)+M)\left\| x^{t+1}-x\right\| ^2\right) \\&+\frac{1}{2(k+1)}\left( \left\| y^0-y\right\| _{\frac{\beta }{\alpha } B^T B+G_2}^2+\frac{1}{\beta \alpha }\left\| \lambda ^0-\lambda \right\| ^2+\left\| x^0-x\right\| _{\tau I_{n_1}-\beta A^T A}^2\right) \\&+\frac{1}{k+1} \sum _{t=0}^k\left( x-x^t\right) ^T \delta ^t+\frac{1}{k+1} \sum _{t=0}^k \frac{\alpha _t}{2}\left\| \delta ^t\right\| ^2 + \frac{1-\alpha }{(k+1) \alpha }\left( \lambda ^0-\lambda \right) ^T B\left( y^0-y\right) \\ \le&\frac{1}{2(k+1)}\left( \left\| x^0-x\right\| _{(\tau +M) I_{n_1}-\beta A^T A}^2+\left\| y^0-y\right\| _{\frac{\beta }{\alpha } B^T B+G_2}^2+\frac{1}{\beta \alpha }\left\| \lambda ^0-\lambda \right\| ^2\right) \\&+\frac{1}{k+1} \sum _{t=0}^k\left( x-x^t\right) ^T \delta ^t+\frac{1}{k+1} \sum _{t=0}^k \frac{\alpha _t}{2}\left\| \delta ^t\right\| ^2 + \frac{1-\alpha }{(k+1) \alpha }\left( \lambda ^0-\lambda \right) ^T B\left( y^0-y\right) . \end{aligned} \end{aligned}$$
(31)
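The last inequality in (31) uses the telescoping structure of the weighted sum:

$$\begin{aligned} \sum \limits _{t = 0}^k \left( \left( \mu t + M\right) \left\| x^t-x\right\| ^2 - \left( \mu \left( t+1\right) + M\right) \left\| x^{t+1}-x\right\| ^2 \right) = M\left\| x^0-x\right\| ^2 - \left( \mu \left( k+1\right) + M\right) \left\| x^{k+1}-x\right\| ^2 \le M\left\| x^0-x\right\| ^2, \end{aligned}$$

and the resulting term \(M\left\| x^0-x\right\| ^2\) is combined with \(\left\| x^0-x\right\| _{\tau I_{n_1} - \beta A^TA}^2\) into \(\left\| x^0-x\right\| _{\left( \tau + M\right) I_{n_1} - \beta A^TA}^2\).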

Finally, taking expectation on both sides of (31) and following the arguments used to obtain (26) and (27), we obtain

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \left\| A \bar{x}_k+B \bar{y}_k-b\right\| \right] \\ \le&\frac{1}{2(k+1)}\left( \left\| x^0-x^*\right\| _{(\tau +M) I_{n_1}-\beta A^T A}^2+\left\| y^0-y^*\right\| _{\frac{\beta }{\alpha } B^T B+G_2}^2 +\frac{2}{\beta \alpha }\left( \left\| \lambda ^0-\lambda ^*\right\| ^2+1\right) \right) \\&+\frac{\sigma ^2}{2 \mu (k+1)}(1+\ln (k+1)) + \frac{1-\alpha }{(k+1) \alpha }\left( \lambda ^0-\lambda ^*\right) ^T B\left( y^0-y^*\right) \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \theta \left( {{{\bar{u}}_k}} \right) - \theta \left( {{u^ * }} \right) \le \theta \left( {{{\bar{u}}_k}} \right) - \theta \left( {{u^ * }} \right) + {\left( {{{\bar{w}}_k} - {w^ * }} \right) ^T}F\left( {{w^ * }} \right) + \left\| {{\lambda ^ * }} \right\| \left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| . \end{aligned}$$

Therefore, this theorem is proved.    \(\square \)
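The \(\left( 1 + \ln \left( k+1\right) \right) \) factor in the bound above comes from a harmonic sum. As an indicative sketch only (the precise choice of \(\alpha _t\) is specified in the main text and is not reproduced in this appendix), if \(\alpha _t\) is of the form \(\frac{1}{\mu \left( t+1\right) }\), then

$$\begin{aligned} \frac{1}{k+1}\sum \limits _{t = 0}^k \frac{\alpha _t}{2}\sigma ^2 = \frac{\sigma ^2}{2\mu \left( k+1\right) }\sum \limits _{t = 0}^k \frac{1}{t+1} \le \frac{\sigma ^2}{2\mu \left( k+1\right) }\left( 1 + \ln \left( k+1\right) \right) , \end{aligned}$$

which matches the middle term of the bound on \(\mathbb {E}\left[ \left\| A\bar{x}_k + B\bar{y}_k - b\right\| \right] \) above.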

Proof of Theorem 3

Proof

Since \(\left( x^*, y^*, \lambda ^*\right) \) is a solution of (1), we have

$${A^T}{\lambda ^ * } = \nabla {\theta _1}\left( {{x^ * }} \right) \ \textrm{and} \ {B^T}{\lambda ^ * } \in \partial {\theta _2}\left( {{y^ * }} \right) .$$

Hence, since \(\theta _1\) is strongly convex and \(\theta _2\) is convex, we have

$$\begin{aligned} {\theta _1}\left( {{{\bar{x}}_k}} \right) \ge {\theta _1}\left( {{x^ * }} \right) + {\left( {{\lambda ^ * }} \right) ^T}\left( {A{{\bar{x}}_k} - A{x^ * }} \right) + \frac{\mu }{2}{\left\| {{{\bar{x}}_k} - {x^ * }} \right\| ^2} \end{aligned}$$
(32)

and

$$\begin{aligned} {\theta _2}\left( {{{\bar{y}}_k}} \right) \ge {\theta _2}\left( {{y^ * }} \right) + {\left( {{\lambda ^ * }} \right) ^T}\left( {B{{\bar{y}}_k} - B{y^ * }} \right) . \end{aligned}$$
(33)

Adding up (32) and (33), we get

$$\theta \left( {{{\bar{u}}_k}} \right) \ge \theta \left( {{u^ * }} \right) + {\left( {{\lambda ^ * }} \right) ^T}\left( {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right) + \frac{\mu }{2}{\left\| {{{\bar{x}}_k} - {x^ * }} \right\| ^2}.$$

Rearranging gives

$$\begin{aligned} \begin{aligned} {\left\| {{{\bar{x}}_k} - {x^ * }} \right\| ^2} \le&\frac{2}{\mu }\left( {\theta \left( {{{\bar{u}}_k}} \right) - \theta \left( {{u^ * }} \right) } - {\left( {{\lambda ^ * }} \right) ^T}\left( {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right) \right) \\ \le&\frac{2}{\mu }\left( {\theta \left( {{{\bar{u}}_k}} \right) - \theta \left( {{u^ * }} \right) } + \left\| {{\lambda ^ * }} \right\| \left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| \right) . \end{aligned} \end{aligned}$$
(34)

On the other hand,

$$\begin{aligned} \begin{aligned} {\left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| } =&{\left\| {A\left( {{{\bar{x}}_k} - {x^ * }} \right) + B\left( {{{\bar{y}}_k} - {y^ * }} \right) } \right\| } \\ \ge&\left\| {B\left( {{{\bar{y}}_k} - {y^ * }} \right) } \right\| - \left\| A \right\| \left\| {{{\bar{x}}_k} - {x^ * }} \right\| , \end{aligned} \end{aligned}$$

which implies \({\left\| {B\left( {{{\bar{y}}_k} - {y^ * }} \right) } \right\| ^2} \le 2{\left\| A \right\| ^2}{\left\| {{{\bar{x}}_k} - {x^ * }} \right\| ^2} + 2{\left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| ^2}\) and hence

$$\begin{aligned} {\left\| {{{\bar{y}}_k} - {y^ * }} \right\| ^2} \le \frac{{2{{\left\| A \right\| }^2}}}{{s}}{\left\| {{{\bar{x}}_k} - {x^ * }} \right\| ^2} + \frac{2}{{s}}{\left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| ^2}. \end{aligned}$$
(35)
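Here (35) divides by s, which we read as the smallest eigenvalue of \(B^TB\) (assumed positive; the constant is defined in the main text and not restated in this appendix). Concretely, combining \(s\left\| \bar{y}_k - y^*\right\| ^2 \le \left\| B\left( \bar{y}_k - y^*\right) \right\| ^2\) with the preceding display and \(\left( a+b\right) ^2 \le 2a^2 + 2b^2\) gives

$$\begin{aligned} s\left\| \bar{y}_k - y^*\right\| ^2 \le \left\| B\left( \bar{y}_k - y^*\right) \right\| ^2 \le 2\left\| A\right\| ^2\left\| \bar{x}_k - x^*\right\| ^2 + 2\left\| A\bar{x}_k + B\bar{y}_k - b\right\| ^2, \end{aligned}$$

which is (35) after dividing by s.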

Adding (34) and (35) and taking expectations, we obtain

$$\begin{aligned} \begin{aligned} \mathbb {E}\left[ {\left\| {{{\bar{x}}_k} - {x^ * }} \right\| ^2} + {\left\| {{{\bar{y}}_k} - {y^ * }} \right\| ^2}\right] \le \left( \frac{2}{\mu } + \frac{4{{{\left\| A \right\| }^2}}}{\mu s}\right) \left( \mathbb {E}\left[ \theta \left( {{{\bar{u}}_k}} \right) - \theta \left( {{u^ * }} \right) \right] \right. \\ \left. + \left\| {{\lambda ^ * }} \right\| \mathbb {E}\left[ \left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| \right] \right) + \frac{2}{s}{\mathbb {E}\left[ \left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| ^2\right] }. \end{aligned} \end{aligned}$$
(36)

The remaining task is to estimate \(\mathbb {E}\left[ \left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| ^2\right] \).

In (9), let \(w = \left( x^*, y^*, \lambda \right) \), where \(\lambda = \lambda ^* + e\) and e is a vector satisfying \( - {e^T}\left( {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right) = {\left\| {A{{\bar{x}}_k} + B{{\bar{y}}_k} - b} \right\| ^2}\). Then, arguing as in the derivation of (26), we get

$$\begin{aligned} \begin{aligned}&\mathbb {E}\left[ \left\| A \bar{x}_k+B \bar{y}_k-b\right\| ^2\right] \\ \le&\frac{1}{2(k+1)}\left( M\left\| x^0-x^*\right\| _{G_{1,0}}^2+\left\| y^0-y^*\right\| _{\frac{\beta }{\alpha } B^T B+G_2}^2+\frac{2}{\beta \alpha }\left\| \lambda ^0-\lambda ^*\right\| ^2\right) \\&+\frac{1}{2 \sqrt{k}}\left( \sigma ^2+\left\| x^0-x^*\right\| _{G_{1,0}}^2\right) +\frac{1}{\beta \alpha (k+1)} \mathbb {E}\left[ \left\| A \bar{x}_k+B \bar{y}_k-b\right\| ^2\right] \\&+ \frac{1-\alpha }{(k+1) \alpha }\left( \lambda ^0-\lambda ^*\right) ^T B\left( y^0-y^*\right) \end{aligned} \end{aligned}$$
(37)

Rearranging this inequality, we obtain the desired bound and the proof is completed.    \(\square \)
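To spell out the rearrangement (a sketch only, with the constants abbreviated), write \(C_0\) for the bracket in the first line of the right-hand side of (37). Moving the \(\frac{1}{\beta \alpha \left( k+1\right) }\mathbb {E}\left[ \left\| A\bar{x}_k + B\bar{y}_k - b\right\| ^2\right] \) term to the left-hand side gives, whenever \(\beta \alpha \left( k+1\right) > 1\),

$$\begin{aligned} \left( 1 - \frac{1}{\beta \alpha \left( k+1\right) }\right) \mathbb {E}\left[ \left\| A\bar{x}_k + B\bar{y}_k - b\right\| ^2\right] \le \frac{C_0}{2\left( k+1\right) } + \frac{1}{2\sqrt{k}}\left( \sigma ^2 + \left\| x^0 - x^*\right\| _{G_{1,0}}^2\right) + \frac{1-\alpha }{\left( k+1\right) \alpha }\left( \lambda ^0 - \lambda ^*\right) ^TB\left( y^0 - y^*\right) , \end{aligned}$$

and dividing by the factor on the left yields the bound on \(\mathbb {E}\left[ \left\| A\bar{x}_k + B\bar{y}_k - b\right\| ^2\right] \) that is plugged into (36).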


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG


Cite this paper

Hu, J., Guo, T., Han, C. (2022). Complexity Analysis of a Stochastic Variant of Generalized Alternating Direction Method of Multipliers. In: Du, DZ., Du, D., Wu, C., Xu, D. (eds) Theory and Applications of Models of Computation. TAMC 2022. Lecture Notes in Computer Science, vol 13571. Springer, Cham. https://doi.org/10.1007/978-3-031-20350-3_18


  • DOI: https://doi.org/10.1007/978-3-031-20350-3_18


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20349-7

  • Online ISBN: 978-3-031-20350-3

