FusionFlow: Neural Fusion and Compression for Communication-Efficient Edge-Cloud Collaborative Computing

Zhang, Ningkang; Wu, Guangyu; Gu, Chao; Yuan, Mu; Li, Xiang-Yang

doi:10.1007/978-3-031-71467-2_41

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14998)

Included in the following conference series:

International Conference on Wireless Artificial Intelligent Computing Systems and Applications

Abstract

With the recent advancement of the Internet of Things (IoT) and edge computing, demand for effective intelligent data analytics, especially vision analysis, has surged. Neural network (NN) partitioning, an edge-cloud collaborative computing technique that divides NN models into distributable segments, offers a promising approach to deploying efficient AI-based intelligent systems. However, to complete tasks such as after-the-fact video query and manual labeling in the cloud, the servers need to receive both intermediate features and original images, leading to significant communication redundancy. In this paper, we propose an approach that transmits a single data flow to support both image restoration and NN inference tasks. We design a bottleneck unit for fusing and compressing intermediate features and an image restoration module with self-attention blocks. Our proposed framework can be easily adapted to existing NN partitioning systems without modifying the structure of the model. Comprehensive evaluations show that our system reduces the amount of network transmission data by 57.3% while keeping accuracy loss within 1% and introducing negligible overhead on local devices.
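To make the transmission accounting in the abstract concrete, the following is a minimal sketch, not the authors' bottleneck unit. It assumes a plain uniform 8-bit quantizer as a stand-in for the learned fusion and compression step, and the names `quantize`, `dequantize`, and `reduction` are hypothetical helpers introduced only for illustration.

```python
def quantize(features, levels=256):
    """Uniformly quantize floats to one byte each (assumes levels <= 256)."""
    lo, hi = min(features), max(features)
    # Step size between adjacent codes; fall back to 1.0 for constant input.
    scale = (hi - lo) / (levels - 1) or 1.0
    codes = bytes(round((x - lo) / scale) for x in features)
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Map byte codes back to approximate float values on the cloud side."""
    return [lo + c * scale for c in codes]


def reduction(baseline_bytes, sent_bytes):
    """Fraction of transmission saved relative to a baseline payload."""
    return 1.0 - sent_bytes / baseline_bytes


if __name__ == "__main__":
    feats = [i / 100 for i in range(1000)]  # illustrative values, not real features
    codes, lo, scale = quantize(feats)
    restored = dequantize(codes, lo, scale)
    print(max(abs(a - b) for a, b in zip(feats, restored)))
    # A 57.3% reduction means the single flow is 42.7% of the baseline payload.
    print(reduction(1000, 427))
```

Under this reading, the reported 57.3% figure corresponds to sending roughly 42.7% of the bytes that a baseline sending both original images and intermediate features would transmit; the byte counts in the usage lines are made-up numbers chosen only to reproduce that ratio.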

Author information

Authors and Affiliations

School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230027, China
Ningkang Zhang, Guangyu Wu, Chao Gu, Mu Yuan & Xiang-Yang Li

Corresponding author

Correspondence to Xiang-Yang Li.

Editor information

Editors and Affiliations

Georgia State University, Atlanta, GA, USA
Zhipeng Cai
Old Dominion University, Norfolk, VA, USA
Daniel Takabi
Beijing University of Posts and Telecommunications, Beijing, China
Shaoyong Guo
Shandong University, Qingdao, China
Yifei Zou

About this paper

Cite this paper

Zhang, N., Wu, G., Gu, C., Yuan, M., Li, XY. (2025). FusionFlow: Neural Fusion and Compression for Communication-Efficient Edge-Cloud Collaborative Computing. In: Cai, Z., Takabi, D., Guo, S., Zou, Y. (eds) Wireless Artificial Intelligent Computing Systems and Applications. WASA 2024. Lecture Notes in Computer Science, vol 14998. Springer, Cham. https://doi.org/10.1007/978-3-031-71467-2_41

DOI: https://doi.org/10.1007/978-3-031-71467-2_41
Published: 14 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71466-5
Online ISBN: 978-3-031-71467-2
eBook Packages: Computer Science, Computer Science (R0)
