
On the Non-ergodic Convergence Rate of the Directed Nonsmooth Composite Optimization

  • Conference paper
Parallel and Distributed Computing, Applications and Technologies (PDCAT 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12606)


Abstract

This paper considers distributed "nonsmooth+nonsmooth" composite optimization problems in which n agents collaboratively minimize the sum of their local objective functions over a directed network. In particular, we focus on scenarios where the sought solution is desired to possess structural properties, e.g., sparsity. However, to ensure convergence, most existing methods output an ergodic solution produced by an averaging scheme, which destroys the desired structural properties of the output. To address this issue, we develop a new decentralized stochastic proximal gradient method, termed DSPG, in which the non-ergodic (last) iterate serves as the output. We show that DSPG achieves the non-ergodic convergence rate \(O(\log (T)/\sqrt{T})\) for generally convex objective functions and \(O(\log (T)/T)\) for strongly convex objective functions. When the structure-enhancing regularization is absent and the simple and suffix averaging schemes are used, the convergence rates of DSPG improve to \(O(1/\sqrt{T})\) for generally convex objective functions and \(O(1/T)\) for strongly convex objective functions, which improves on the rates \(O(\log (T)/\sqrt{T})\) and \(O(\log (T)/T)\) achieved by existing methods. Simulation examples further illustrate the effectiveness of the proposed method.
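Since the chapter body is not reproduced here, the sketch below is only an illustration of the general mechanism the abstract alludes to, not the authors' DSPG algorithm: a push-sum (subgradient-push) consensus step over a column-stochastic mixing matrix of a directed graph, followed by a proximal soft-thresholding step for an \(\ell_1\) regularizer, with the last iterate returned alongside a suffix average for comparison. Every name and parameter in it (soft_threshold, prox_subgradient_push, the \(1/\sqrt{t}\) step size, the directed ring, the noise level) is an assumption made for the example.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (elementwise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_subgradient_push(C, data, lam, T, alpha0, dim, rng):
    """Push-sum subgradient iteration with an l1 proximal step.

    C    : (n, n) column-stochastic mixing matrix of the directed graph;
           C[i, j] > 0 only if agent j has an edge to agent i.
    data : list of (a_i, b_i); agent i holds the nonsmooth local loss
           f_i(x) = |a_i @ x - b_i|, queried through a noisy subgradient.
    lam  : weight of the shared l1 regularizer.
    Returns (last iterates, suffix averages over the last T/2 rounds).
    """
    n = C.shape[0]
    w = np.zeros((n, dim))            # push-sum numerators
    y = np.ones(n)                    # push-sum weights
    z = np.zeros((n, dim))            # local estimates
    z_sum, k = np.zeros((n, dim)), 0
    for t in range(1, T + 1):
        alpha = alpha0 / np.sqrt(t)   # diminishing step size
        # Noisy local subgradients of f_i at z_i (stochastic oracle).
        g = np.empty((n, dim))
        for i, (a, b) in enumerate(data):
            g[i] = np.sign(a @ z[i] - b) * a + 0.01 * rng.standard_normal(dim)
        w = C @ (w - alpha * g)       # mix along the directed edges
        y = C @ y
        z = soft_threshold(w / y[:, None], alpha * lam)
        if t > T // 2:                # accumulate the suffix average
            z_sum += z
            k += 1
    return z, z_sum / k

# Toy run: 4 agents on a directed ring with an l1-regularized problem.
rng = np.random.default_rng(0)
n, dim = 4, 10
C = 0.5 * (np.eye(n) + np.roll(np.eye(n), 1, axis=0))  # column-stochastic
data = [(rng.standard_normal(dim), 1.0) for _ in range(n)]
z_last, z_avg = prox_subgradient_push(C, data, lam=0.2, T=4000,
                                      alpha0=0.5, dim=dim, rng=rng)
print("nonzeros, last iterate  :", (np.abs(z_last) > 1e-10).sum(axis=1))
print("nonzeros, suffix average:", (np.abs(z_avg) > 1e-10).sum(axis=1))
```

Because the last iterate comes directly out of the soft-thresholding map, it contains exact zeros, while the suffix average mixes the supports of many iterates and generally loses them. This is precisely the structure-destroying effect of ergodic outputs that the abstract describes.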

Supported by National Supercomputing Center in Shenzhen.




Author information


Correspondence to Zhuoxu Cui.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 79 KB)


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Dong, Y., Cui, Z., Zhang, Y., Feng, S. (2021). On the Non-ergodic Convergence Rate of the Directed Nonsmooth Composite Optimization. In: Zhang, Y., Xu, Y., Tian, H. (eds.) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2020. Lecture Notes in Computer Science, vol. 12606. Springer, Cham. https://doi.org/10.1007/978-3-030-69244-5_11


  • DOI: https://doi.org/10.1007/978-3-030-69244-5_11


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69243-8

  • Online ISBN: 978-3-030-69244-5

