Abstract
Recent distributed reinforcement learning techniques utilize networked agents to accelerate exploration and speed up learning. However, such techniques are not resilient in the presence of Byzantine agents, which can disrupt convergence. In this paper, we present a Byzantine-resilient aggregation rule for distributed reinforcement learning with networked agents that incorporates the idea of optimizing the objective function into the design of the aggregation rule. We evaluate our approach in multiple reinforcement learning environments for both value-based and policy-based methods with homogeneous and heterogeneous agents. The results show that cooperation using the proposed approach achieves better learning performance than the non-cooperative case and is resilient in the presence of an arbitrary number of Byzantine agents.
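The abstract's core idea, using the agent's own objective function to decide which neighbor models to aggregate, can be illustrated with a minimal sketch. The function names, the "keep the k lowest-loss candidates" rule, and the toy quadratic objective below are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def resilient_aggregate(own_params, neighbor_params, loss_fn, k):
    """Score each candidate model on the agent's own objective and
    average only the k best-scoring ones, so that a Byzantine neighbor
    broadcasting arbitrary parameters is filtered out by its high loss.

    loss_fn maps a parameter vector to a scalar loss evaluated on the
    agent's own experience.
    """
    candidates = [own_params] + list(neighbor_params)
    losses = [loss_fn(p) for p in candidates]
    best = np.argsort(losses)[:k]  # indices of the k lowest-loss models
    return np.mean([candidates[i] for i in best], axis=0)

# Toy usage: a quadratic loss centered at the true optimum; one
# Byzantine neighbor broadcasts an arbitrary, far-off parameter vector.
opt = np.array([1.0, -2.0])
loss = lambda p: float(np.sum((p - opt) ** 2))
honest = [opt + 0.1, opt - 0.1]
byzantine = [np.array([100.0, 100.0])]
agg = resilient_aggregate(opt + 0.2, honest + byzantine, loss, k=3)
# agg stays near opt; the Byzantine vector is excluded by its loss.
```

Because the filter depends only on each agent's locally evaluated objective, it degrades gracefully regardless of how many neighbors are Byzantine, which is consistent with the abstract's claim of resilience to an arbitrary number of adversaries.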
Notes
1. An episode is a sequence of states from the start state to a terminal state.
2. During learning, t increases while i remains the same; during model aggregation, i increases while t changes from \(\infty \) to 0. To simplify, hereafter we omit the subscripts of t for cooperation and the superscripts of i for learning.
3. Methods for updating these parameters are discussed in Sect. 3.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Li, J., Cai, F., Koutsoukos, X. (2022). Byzantine Resilient Aggregation in Distributed Reinforcement Learning. In: Matsui, K., Omatu, S., Yigitcanlar, T., González, S.R. (eds) Distributed Computing and Artificial Intelligence, Volume 1: 18th International Conference. DCAI 2021. Lecture Notes in Networks and Systems, vol 327. Springer, Cham. https://doi.org/10.1007/978-3-030-86261-9_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86260-2
Online ISBN: 978-3-030-86261-9
eBook Packages: Intelligent Technologies and Robotics (R0)