Attention-based variable-size feature compression module for edge inference

Artificial intelligence has made significant breakthroughs in many fields, and the broad deployment of edge devices provides opportunities to develop and apply intelligent models in edge networks. Edge device-server co-inference has gradually become the mainstream approach to edge intelligent computing. However, existing feature-processing methods in edge inference frameworks do not consider which features are important, so the processed features remain redundant, which reduces inference efficiency. In this paper, we propose a novel attention-based variable-size feature compression module that improves the inference efficiency of edge systems by exploiting the varying importance of the input data. First, a multi-scale attention mechanism is introduced that operates jointly over the channel and spatial dimensions to compute importance weights from the intermediate output features of the edge device. These weights are then used to assign different transmission probabilities, filtering out irrelevant feature data and prioritizing task-relevant information. Second, a new loss function and a progressive model training strategy are designed to optimize the proposed module, enabling the model to adapt gradually and effectively to the reduced feature data. Finally, experimental results on the CIFAR-10 and ImageNet datasets demonstrate the effectiveness of the proposed solution, showing a significant reduction in the volume of data output by edge devices, and hence in communication overhead, with minimal loss in model accuracy.
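The idea of turning importance weights into transmission probabilities can be illustrated with a minimal sketch. This is not the paper's module: the mean absolute activation stands in for the learned multi-scale attention, the sigmoid mapping and its `temperature` parameter are assumptions, and all function names are hypothetical. It only shows how per-channel importance can yield a payload whose size varies with the input.

```python
import math
import random


def channel_importance(features):
    """Score each channel by its mean absolute activation.

    Assumption: a hand-written stand-in for learned attention weights.
    """
    scores = []
    for channel in features:
        values = [abs(v) for row in channel for v in row]
        scores.append(sum(values) / len(values))
    return scores


def transmission_probabilities(scores, temperature=1.0):
    """Map scores into (0, 1) with a sigmoid centred on the mean score.

    Assumption: the mapping and `temperature` are illustrative only.
    """
    mean = sum(scores) / len(scores)
    return [1.0 / (1.0 + math.exp(-(s - mean) / temperature)) for s in scores]


def select_channels(features, probs, rng):
    """Keep channel i with probability probs[i].

    Returns the kept channel indices and the corresponding feature data,
    so the payload size depends on the input.
    """
    kept = [i for i, p in enumerate(probs) if rng.random() < p]
    return kept, [features[i] for i in kept]


# Toy intermediate feature map: 3 channels, each 2x2.
features = [
    [[0.0, 0.1], [0.0, 0.1]],    # weak channel
    [[2.0, -3.0], [1.5, 2.5]],   # strong channel
    [[0.2, 0.1], [0.0, 0.3]],    # weak channel
]

rng = random.Random(0)  # seeded for repeatability
scores = channel_importance(features)
probs = transmission_probabilities(scores)
kept, payload = select_channels(features, probs, rng)
```

In this toy input the second channel receives the highest transmission probability, so it is the most likely to be sent to the server, while the weak channels are dropped more often.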


Data availability

Not applicable.


This work is funded by the National Natural Science Foundation of China (No. 61972417).

Shibao Li conceived the idea; Chenxu Ma and Yunwu Zhang wrote and tested the code; Longfei Li and Chengzhi Wang prepared all figures and tables; and all authors contributed to writing the article and reviewed the manuscript.

Shibao Li

We declare that the authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.




Li, S., Ma, C., Zhang, Y. et al. Attention-based variable-size feature compression module for edge inference. J Supercomput 80, 8469–8484 (2024).


