Research Article · DOI: 10.1145/3589334.3645341

Towards Energy-efficient Federated Learning via INT8-based Training on Mobile DSPs

Published: 13 May 2024

Abstract

AI is making the Web an even cooler place, but it also introduces serious privacy risks due to extensive user data collection. Federated learning (FL), a privacy-preserving machine learning paradigm, enables mobile devices to collaboratively learn a shared prediction model while keeping all training data on the devices. However, a key obstacle to practical cross-device FL training is its huge energy consumption, especially on lightweight mobile devices. In this work, we perform a first-of-its-kind analysis of improving FL performance through low-precision training on the energy-friendly Digital Signal Processor (DSP) of mobile devices. We first demonstrate that directly integrating state-of-the-art INT8 (8-bit integer) training algorithms with classic FL protocols significantly degrades model accuracy. Moreover, we observe that unavoidable, frequent quantization operations on the device place extreme load on DSP-enabled INT8 training. To address these challenges, we present Q-FedUpdate, an FL framework that preserves model accuracy with ultra-low energy consumption. It maintains a global full-precision model into which tiny model updates are continuously accumulated, rather than being erased by quantization. Furthermore, it introduces pipelining to parallelize CPU-based quantization and DSP-based training, which reduces the floating-point computation overhead of frequent data quantization. Extensive experiments show that Q-FedUpdate reduces on-device energy consumption by 21× and accelerates FL convergence by 6.1× with only 2% accuracy loss.
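To make the first mechanism concrete, here is a minimal NumPy sketch of the update-accumulation idea, under our own assumptions rather than the paper's actual algorithm or API: a full-precision (FP32) master model is kept across rounds, each round trains on an INT8-quantized copy, and the tiny per-round updates are folded back into the FP32 master so they survive the next quantization. The helper names (quantize_int8, dequantize) and the synthetic delta update are purely illustrative.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor INT8 quantization; returns int8 values and a scale."""
    scale = float(np.max(np.abs(w))) / 127.0 + 1e-12
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
init = rng.standard_normal(1024).astype(np.float32)
master = init.copy()  # FP32 master model (the accumulation path)
naive = init.copy()   # baseline that lives entirely on the INT8 grid

for _ in range(100):
    q, s = quantize_int8(master)  # low-precision copy used for training
    # Stand-in for one round of DSP-based INT8 training: a tiny FP32 update,
    # deliberately much smaller than one quantization step s.
    delta = 0.01 * s * rng.standard_normal(init.shape).astype(np.float32)
    # Accumulation path: fold the update into the FP32 master, so updates
    # smaller than a quantization step still add up across rounds.
    master += delta
    # Naive path: apply the update to the quantized weights and re-quantize;
    # updates below ~s/2 round back to the same INT8 grid point and vanish.
    nq, ns = quantize_int8(dequantize(*quantize_int8(naive)) + delta)
    naive = dequantize(nq, ns)

print("FP32 master drift:", float(np.linalg.norm(master - init)))  # grows
print("INT8-only drift:  ",
      float(np.linalg.norm(naive - dequantize(*quantize_int8(init)))))  # ~0
```

The second mechanism, pipelining CPU-based quantization with DSP-based training, can be sketched in the same hedged spirit: while the (simulated) DSP trains on batch i, a background thread quantizes batch i+1. Both helpers below are placeholders we invented for illustration, not the paper's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def cpu_quantize(batch):
    """Placeholder for the CPU-side FP32 -> INT8 conversion of one batch."""
    return batch

def dsp_train_step(qbatch):
    """Placeholder for the DSP-offloaded INT8 training step on one batch."""
    pass

batches = [object() for _ in range(8)]
with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(cpu_quantize, batches[0])
    for nxt in batches[1:]:
        qbatch = pending.result()                 # wait for quantized batch i
        pending = pool.submit(cpu_quantize, nxt)  # overlap: quantize batch i+1 ...
        dsp_train_step(qbatch)                    # ... while training on batch i
    dsp_train_step(pending.result())              # drain the final batch
```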

Supplemental Material

Supplemental video (MKV file)


Cited By

  • (2025) Small models, big impact: A review on the power of lightweight Federated Learning. Future Generation Computer Systems, 162:107484, January 2025. DOI: 10.1016/j.future.2024.107484
  • (2024) FwdLLM. In Proceedings of the 2024 USENIX Annual Technical Conference, pages 579-596, July 2024. DOI: 10.5555/3691992.3692028


    Published In

    WWW '24: Proceedings of the ACM Web Conference 2024
May 2024, 4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 13 May 2024


    Author Tags

    1. energy efficiency
    2. federated learning
    3. mobile computing

    Qualifiers

    • Research-article

    Conference

WWW '24: The ACM Web Conference 2024
May 13-17, 2024
Singapore, Singapore

    Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions (23%)

