Adaptive nonlinear deep coding using hybrid attention for wireless image transmission

Li, Chao; Zeng, Yangling; Ye, Zhiwei; Peng, Qin; Sun, Yu; Wang, Chen

doi:10.1007/s11760-025-03826-0

Original Paper
Published: 28 January 2025

Volume 19, article number 251 (2025)

Signal, Image and Video Processing

Chao Li¹, Yangling Zeng¹, Zhiwei Ye¹, Qin Peng², Yu Sun¹ & Chen Wang¹

Abstract

Recent work has shown that deep learning-based joint source-channel coding (DeepJSCC) methods can achieve noise-resilient performance in wireless image transmission. Among them, nonlinear transform source-channel coding (NTSCC) achieves superior performance by using a nonlinear transform as a prior to extract semantic features of the source and by developing a hyperprior-aided codec refinement mechanism. However, the NTSCC framework cannot adapt its code rate to different channel signal-to-noise ratios (SNRs), which reduces its flexibility and bandwidth efficiency. In addition, the entropy model in NTSCC does not adequately capture the channel and spatial correlations between latent features, which leads to inaccurate rate transmission. In this paper, we propose an adaptive nonlinear deep coding (ANDC) framework that enables flexible code rate optimization and improves the accuracy of rate transmission. ANDC is built on two components: a hybrid attention multi-reference entropy model (HA-MREM), which captures the correlations of latent features along the channel and spatial dimensions to better guide rate allocation, and an adaptive hybrid attention (AHA) module, which incorporates the SNR into both channel and spatial attention so that the transmission strategy can be adjusted flexibly for different SNRs. Combining these two structures, ANDC enables flexible and efficient image transmission. Simulation results show that the proposed model matches or outperforms the existing NTSCC model on several metrics.
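
The abstract gives no architectural details for the AHA module, so the following is only an illustrative sketch of the general idea of conditioning channel and spatial attention on the SNR. It is not the authors' implementation: the function names, the parameter dictionary, the layer sizes, and the randomly initialized weights are all assumptions made for illustration. A latent feature map is scaled by a per-channel gate and a per-position gate, and both gates take the SNR (in dB) as an extra input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(num_channels, hidden=8, seed=0):
    # Hypothetical, untrained weights; a real model would learn these.
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(scale=0.1, size=(hidden, num_channels + 1)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(scale=0.1, size=(num_channels, hidden)),
        "b2": np.zeros(num_channels),
        "a": 1.0,   # weight on the spatial descriptor
        "c": 0.05,  # weight on the SNR in the spatial branch
        "d": 0.0,   # spatial bias
    }

def snr_hybrid_attention(feat, snr_db, params):
    """Scale a (C, H, W) latent by SNR-conditioned channel and spatial gates."""
    # Channel branch: global average pool per channel, append the SNR,
    # then a two-layer MLP yields one gate in (0, 1) per channel.
    pooled = feat.mean(axis=(1, 2))                       # shape (C,)
    context = np.concatenate([pooled, [snr_db]])          # shape (C + 1,)
    hidden = np.maximum(params["w1"] @ context + params["b1"], 0.0)
    channel_gate = sigmoid(params["w2"] @ hidden + params["b2"])  # (C,)

    # Spatial branch: average over channels, shift by an SNR term,
    # giving one gate in (0, 1) per spatial position.
    spatial_desc = feat.mean(axis=0)                      # shape (H, W)
    spatial_gate = sigmoid(
        params["a"] * spatial_desc + params["c"] * snr_db + params["d"]
    )                                                     # shape (H, W)

    return feat * channel_gate[:, None, None] * spatial_gate[None, :, :]

rng = np.random.default_rng(1)
feat = rng.normal(size=(4, 6, 6))       # toy latent: 4 channels, 6x6 positions
params = init_params(num_channels=4)
out_low = snr_hybrid_attention(feat, snr_db=0.0, params=params)
out_high = snr_hybrid_attention(feat, snr_db=20.0, params=params)
print(out_low.shape, float(np.abs(out_low - out_high).max()))
```

Because the SNR enters both gates, the same latent is reweighted differently at 0 dB and at 20 dB, which is the kind of SNR-dependent behavior the abstract attributes to the AHA module. How the paper actually fuses the two branches is not stated in the abstract.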

Acknowledgements

This work was supported by the project of “Research on the identification of quality and safety risks of typical industrial products based on knowledge graph” (Project No. 262020Y-7506), which is funded by the Central Fundamental Operational Costs Project.

Author information

Authors and Affiliations

School of Computer Science, Hubei University of Technology, Wuhan, 430068, Hubei, China
Chao Li, Yangling Zeng, Zhiwei Ye, Yu Sun & Chen Wang
China National Institute of Standardization, Beijing, 100191, China
Qin Peng

Corresponding author

Correspondence to Qin Peng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, C., Zeng, Y., Ye, Z. et al. Adaptive nonlinear deep coding using hybrid attention for wireless image transmission. SIViP 19, 251 (2025). https://doi.org/10.1007/s11760-025-03826-0

Received: 13 April 2024
Revised: 14 August 2024
Accepted: 07 January 2025
Published: 28 January 2025
DOI: https://doi.org/10.1007/s11760-025-03826-0
