Research Article · DOI: 10.1145/3589334.3645341

Towards Energy-efficient Federated Learning via INT8-based Training on Mobile DSPs

Published: 13 May 2024

Abstract

AI is making the Web an even cooler place, but it also introduces serious privacy risks due to extensive user data collection. Federated learning (FL), a privacy-preserving machine learning paradigm, enables mobile devices to collaboratively learn a shared prediction model while keeping all training data on the devices. However, a key obstacle to practical cross-device FL training is its huge energy consumption, especially on lightweight mobile devices. In this work, we perform a first-of-its-kind analysis of improving FL performance through low-precision training on the energy-friendly Digital Signal Processor (DSP) of mobile devices. We first demonstrate that directly integrating state-of-the-art INT8 (8-bit integer) training algorithms with classic FL protocols significantly degrades model accuracy. Moreover, we observe that unavoidable, frequent quantization operations on the device place extreme load on DSP-enabled INT8 training. To address these challenges, we present Q-FedUpdate, an FL framework that preserves model accuracy with ultra-low energy consumption. It maintains a global full-precision model into which tiny model updates are continuously accumulated, rather than being erased by quantization. Furthermore, it introduces pipelining to parallelize CPU-based quantization and DSP-based training, which reduces the floating-point computation overhead of frequent data quantization. Extensive experiments show that Q-FedUpdate reduces on-device energy consumption by 21× and accelerates FL convergence by 6.1× with only 2% accuracy loss.
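To make the first mechanism concrete, here is a minimal NumPy sketch of the update-accumulation idea, under our own assumptions rather than the paper's actual algorithm or API: a full-precision (FP32) master model is kept across rounds, each round trains on an INT8-quantized copy, and the tiny per-round updates are folded back into the FP32 master so they survive the next quantization. The helper names (quantize_int8, dequantize) and the synthetic delta update are purely illustrative.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor INT8 quantization; returns int8 values and a scale."""
    scale = float(np.max(np.abs(w))) / 127.0 + 1e-12
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
init = rng.standard_normal(1024).astype(np.float32)
master = init.copy()  # FP32 master model (the accumulation path)
naive = init.copy()   # baseline that lives entirely on the INT8 grid

for _ in range(100):
    q, s = quantize_int8(master)  # low-precision copy used for training
    # Stand-in for one round of DSP-based INT8 training: a tiny FP32 update,
    # deliberately much smaller than one quantization step s.
    delta = 0.01 * s * rng.standard_normal(init.shape).astype(np.float32)
    # Accumulation path: fold the update into the FP32 master, so updates
    # smaller than a quantization step still add up across rounds.
    master += delta
    # Naive path: apply the update to the quantized weights and re-quantize;
    # updates below ~s/2 round back to the same INT8 grid point and vanish.
    nq, ns = quantize_int8(dequantize(*quantize_int8(naive)) + delta)
    naive = dequantize(nq, ns)

print("FP32 master drift:", float(np.linalg.norm(master - init)))  # grows
print("INT8-only drift:  ",
      float(np.linalg.norm(naive - dequantize(*quantize_int8(init)))))  # ~0
```

The second mechanism, pipelining CPU-based quantization with DSP-based training, can be sketched in the same hedged spirit: while the (simulated) DSP trains on batch i, a background thread quantizes batch i+1. Both helpers below are placeholders we invented for illustration, not the paper's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def cpu_quantize(batch):
    """Placeholder for the CPU-side FP32 -> INT8 conversion of one batch."""
    return batch

def dsp_train_step(qbatch):
    """Placeholder for the DSP-offloaded INT8 training step on one batch."""
    pass

batches = [object() for _ in range(8)]
with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(cpu_quantize, batches[0])
    for nxt in batches[1:]:
        qbatch = pending.result()                 # wait for quantized batch i
        pending = pool.submit(cpu_quantize, nxt)  # overlap: quantize batch i+1 ...
        dsp_train_step(qbatch)                    # ... while training on batch i
    dsp_train_step(pending.result())              # drain the final batch
```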

Supplemental Material

Supplemental video (MKV file)


Cited By

  • (2025) Small models, big impact: A review on the power of lightweight Federated Learning. Future Generation Computer Systems, 162:107484, January 2025. DOI: 10.1016/j.future.2024.107484
  • (2024) FwdLLM. In Proceedings of the 2024 USENIX Annual Technical Conference, pages 579-596, July 2024. DOI: 10.5555/3691992.3692028


    Published In

    WWW '24: Proceedings of the ACM Web Conference 2024
May 2024, 4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 13 May 2024


    Author Tags

    1. energy efficiency
    2. federated learning
    3. mobile computing

    Qualifiers

    • Research-article

    Conference

WWW '24: The ACM Web Conference 2024
May 13-17, 2024
Singapore, Singapore

    Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions (23%)

