FusionFlow: Neural Fusion and Compression for Communication-Efficient Edge-Cloud Collaborative Computing

  • Conference paper
Wireless Artificial Intelligent Computing Systems and Applications (WASA 2024)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14998))

Abstract

With the recent advancement of the Internet of Things (IoT) and edge computing, demand for effective intelligent data analytics, especially vision analysis, has surged. Neural network (NN) partitioning, an edge-cloud collaborative computing technique that divides NN models into distributable segments, offers a promising approach to deploying efficient AI-based intelligent systems. However, to complete tasks such as after-the-fact video query and manual labeling in the cloud, the servers need to receive both the intermediate features and the original images, which leads to significant communication redundancy. In this paper, we propose an approach that transmits a single data flow to support both image restoration and NN inference tasks. We design a bottleneck unit for fusing and compressing intermediate features, and an image restoration module with self-attention blocks. Our proposed framework can be easily adapted to existing NN partitioning systems without modifying the structure of the model. In comprehensive evaluations, our system reduces the amount of data transmitted over the network by 57.3% while keeping accuracy loss within 1% and introducing negligible overhead on local devices.
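
To make the reported saving concrete, the following is a minimal arithmetic sketch, not the authors' evaluation code. It assumes the 57.3% reduction is measured against a baseline that sends the intermediate features and the original image as two separate flows; the abstract does not spell out the baseline, and the function names and byte sizes below are hypothetical.

```python
def baseline_bytes(feature_bytes: int, image_bytes: int) -> int:
    """Bytes sent when features and the original image travel as separate flows.

    Assumption: this two-flow setup is the baseline for the reported reduction.
    """
    return feature_bytes + image_bytes


def fused_flow_bytes(feature_bytes: int, image_bytes: int,
                     reduction: float = 0.573) -> float:
    """Bytes sent by a single fused flow, given a fractional reduction.

    `reduction` defaults to the 57.3% figure from the abstract; the input
    sizes are illustrative values supplied by the caller.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be a fraction between 0 and 1")
    return baseline_bytes(feature_bytes, image_bytes) * (1.0 - reduction)


if __name__ == "__main__":
    # Hypothetical payload sizes for one frame, in bytes.
    features, image = 300_000, 700_000
    total = baseline_bytes(features, image)
    fused = fused_flow_bytes(features, image)
    print(f"two flows: {total} B, single fused flow: {fused:.0f} B")
```

Under these assumptions, the fused flow carries 42.7% of the bytes that the two-flow baseline would send, whatever the split between feature and image sizes.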

Author information

Corresponding author

Correspondence to Xiang-Yang Li.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, N., Wu, G., Gu, C., Yuan, M., Li, XY. (2025). FusionFlow: Neural Fusion and Compression for Communication-Efficient Edge-Cloud Collaborative Computing. In: Cai, Z., Takabi, D., Guo, S., Zou, Y. (eds) Wireless Artificial Intelligent Computing Systems and Applications. WASA 2024. Lecture Notes in Computer Science, vol 14998. Springer, Cham. https://doi.org/10.1007/978-3-031-71467-2_41

  • DOI: https://doi.org/10.1007/978-3-031-71467-2_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-71466-5

  • Online ISBN: 978-3-031-71467-2

  • eBook Packages: Computer Science, Computer Science (R0)
