DOI: 10.1145/3611643.3616347
research-article

Understanding the Bug Characteristics and Fix Strategies of Federated Learning Systems

Published: 30 November 2023

Abstract

Federated learning (FL) is an emerging machine learning paradigm that aims to address the problem of isolated data islands. To preserve privacy, FL allows machine learning models and deep neural networks to be trained from decentralized data kept privately at individual devices. FL has been increasingly adopted in mission-critical fields such as finance and healthcare. However, bugs in FL systems are inevitable and may result in catastrophic consequences such as financial loss, inappropriate medical decisions, and violations of data privacy ordinances. While many recent studies were conducted to understand the bugs in machine learning systems, there is no existing study to characterize the bugs arising from the unique nature of FL systems. To fill the gap, we collected 395 real bugs from six popular FL frameworks (TensorFlow Federated, PySyft, FATE, Flower, PaddleFL, and Fedlearner) on GitHub and Stack Overflow, and then manually analyzed their symptoms and impacts, prone stages, root causes, and fix strategies. Furthermore, we report a series of findings and actionable implications that can potentially facilitate the detection of FL bugs.

Supplementary Material

Video (fse23main-p1188-p-video.mp4)
"Federated learning (FL) is an emerging machine learning paradigm that aims to address the problem of isolated data islands. To preserve privacy, FL allows machine learning models and deep neural networks to be trained from decentralized data kept privately at individual devices. FL has been increasingly adopted in mission-critical fields such as finance and healthcare. However, bugs in FL systems are inevitable and may result in catastrophic consequences such as financial loss, inappropriate medical decisions, and violations of data privacy ordinances. While many recent studies were conducted to understand the bugs in machine learning systems, there is no existing study to characterize the bugs arising from the unique nature of FL systems. To fill the gap, we collected 395 real bugs from six popular FL frameworks (TensorFlow Federated, PySyft, FATE, Flower, PaddleFL, and Fedlearner) on GitHub and Stack Overflow. We then manually analyzed their symptoms and impacts, prone stages, root causes, and fix strategies, and we report a series of findings and actionable implications. Finally, we provide possible suggestions or solutions for developers of FL systems based on these findings and implications."

Published In

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2023
2215 pages
ISBN:9798400703270
DOI:10.1145/3611643
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Bug Characteristics
  2. Empirical Study
  3. Federated Learning

Qualifiers

  • Research-article

Funding Sources

  • the National Natural Science Foundation of China
  • the Young Elite Scientists Sponsorship Program by CAST
  • the Hong Kong Research Grant Council/General Research Fund
  • the Hong Kong Research Grant Council/Research Impact Fund
  • the Hong Kong Innovation and Technology Fund

Conference

ESEC/FSE '23

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 208
  • Downloads (Last 12 months): 108
  • Downloads (Last 6 weeks): 5

Reflects downloads up to 20 Feb 2025
