Abstract
With the recent advancement of the Internet of Things (IoT) and edge computing, there is a surge in demand for effective intelligent data analytics, especially vision analysis. Neural network (NN) partitioning, an edge-cloud collaborative computing technique that divides NN models into distributable segments, offers a promising approach to deploying efficient AI-based intelligent systems. However, to complete tasks such as after-the-fact video query and manual labeling in the cloud, the servers need to receive both intermediate features and original images, leading to significant communication redundancy. In this paper, we propose an approach that transmits a single data flow to support both image restoration and NN inference tasks. We design a bottleneck unit for fusing and compressing intermediate features and an image restoration module with self-attention blocks. Our proposed framework can be easily adapted to existing NN partitioning systems without modifying the structure of the model. Comprehensive evaluations show that our system can reduce the amount of network transmission data by 57.3% while keeping accuracy loss within 1% and introducing negligible overhead on local devices.
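To make the redundancy argument concrete, the sketch below compares the payload of a conventional partitioned pipeline, which uploads both the intermediate features and the original image, with a single fused flow. The function names and byte counts are hypothetical and chosen only for illustration; they are not code or measurements from the paper. Only the 57.3% figure comes from the abstract.

```python
def dual_flow_bytes(feature_bytes: int, image_bytes: int) -> int:
    """Payload when features and the original image are sent separately."""
    return feature_bytes + image_bytes


def relative_saving(baseline_bytes: int, fused_bytes: int) -> float:
    """Fraction of transmitted data saved relative to the baseline."""
    if baseline_bytes <= 0:
        raise ValueError("baseline_bytes must be positive")
    return 1.0 - fused_bytes / baseline_bytes


# Hypothetical sizes (assumptions, not results reported by the authors).
baseline = dual_flow_bytes(feature_bytes=60_000, image_bytes=40_000)
fused = 42_700  # assumed size of the single fused, compressed flow

saving = relative_saving(baseline, fused)
print(f"baseline={baseline} B, fused={fused} B, saving={saving:.1%}")
```

With these assumed sizes the single flow carries 42.7% of the baseline payload, which corresponds to the 57.3% reduction quoted above.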
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, N., Wu, G., Gu, C., Yuan, M., Li, XY. (2025). FusionFlow: Neural Fusion and Compression for Communication-Efficient Edge-Cloud Collaborative Computing. In: Cai, Z., Takabi, D., Guo, S., Zou, Y. (eds) Wireless Artificial Intelligent Computing Systems and Applications. WASA 2024. Lecture Notes in Computer Science, vol 14998. Springer, Cham. https://doi.org/10.1007/978-3-031-71467-2_41
DOI: https://doi.org/10.1007/978-3-031-71467-2_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71466-5
Online ISBN: 978-3-031-71467-2
eBook Packages: Computer Science (R0)