End-to-end image compression method based on perception metric

Liu, Shuai; Huang, Yingcong; Yang, Huoxiang; Liang, Yongsheng; Liu, Wei

doi:10.1007/s11760-022-02137-y

Original Paper
Published: 01 February 2022

Volume 16, pages 1803–1810, (2022)
Signal, Image and Video Processing

Shuai Liu¹, Yingcong Huang¹, Huoxiang Yang¹, Yongsheng Liang² (ORCID: orcid.org/0000-0002-0891-5577) & Wei Liu³

Abstract

In recent years, deep-learning-based image compression has received extensive research attention. Most methods minimize the mean squared error (MSE) to obtain reconstructed images with a higher peak signal-to-noise ratio (PSNR). However, pixel-wise distortion has a fairly limited ability to capture perceptual differences between images, so reconstructions optimized for it can have poor perceived visual quality. To address this problem, we propose a novel rate-distortion loss for learned image compression that incorporates a perception metric, which enhances the capacity of the compression model to capture perceptual differences and semantic information in images. As a result, the rate-distortion performance of the proposed model in terms of multi-scale structural similarity (MS-SSIM) and the classification accuracy on reconstructed images are both improved. Comprehensive experimental results demonstrate that the proposed method achieves comparable PSNR, while its MS-SSIM performance exceeds that of traditional image codecs such as JPEG and BPG, as well as previous end-to-end compression methods. More significantly, the visual quality of the reconstructed images is dramatically improved.
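
The abstract does not state the exact form of the loss, so the following is only a minimal sketch of one way a perception term could enter a rate-distortion objective, assuming a weighted sum of estimated bits per pixel, pixel-wise MSE, and a feature-space distance. Every name here (`rd_perceptual_loss`, `toy_features`, `lam_mse`, `lam_perc`) is hypothetical, and `toy_features` is a toy stand-in for whatever pretrained feature extractor a real perception metric would use.

```python
import math


def mse(x, y):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def toy_features(x):
    """Hypothetical stand-in for a learned feature extractor.

    Uses first differences between neighbouring samples; a real
    perception metric would use features from a pretrained network.
    """
    return [b - a for a, b in zip(x, x[1:])]


def rate_bits(likelihoods):
    """Estimated code length in bits: sum of -log2(p) over latent symbols."""
    return sum(-math.log2(p) for p in likelihoods)


def rd_perceptual_loss(x, x_hat, likelihoods, lam_mse=1.0, lam_perc=1.0):
    """Assumed loss form: bpp + lam_mse * MSE + lam_perc * feature distance."""
    bpp = rate_bits(likelihoods) / len(x)                    # rate term
    d_pixel = mse(x, x_hat)                                  # pixel-wise distortion
    d_perc = mse(toy_features(x), toy_features(x_hat))       # perception term
    return bpp + lam_mse * d_pixel + lam_perc * d_perc


# Toy 1-D "image", a flat reconstruction, and two latent symbol likelihoods.
x = [0.0, 0.5, 1.0, 0.5]
x_hat = [0.5, 0.5, 0.5, 0.5]
likelihoods = [0.5, 0.25]
loss = rd_perceptual_loss(x, x_hat, likelihoods)
```

In this toy setting the flat reconstruction is penalized twice: once for its pixel error and once for losing the local structure that the feature term measures. Setting `lam_perc=0.0` reduces the sketch to a conventional MSE-based rate-distortion loss.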

Acknowledgements

This research was supported by the National Natural Science Foundation of China (Grant Nos. 61871154 and 62031013), the Youth Program of the National Natural Science Foundation of China (Grant Nos. 61906103 and 61906124), and the Basic and Applied Basic Research Fund of Guangdong Province (Grant No. 2019A1515011307).

Author information

Authors and Affiliations

College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
Shuai Liu, Yingcong Huang & Huoxiang Yang
School of Electronics and Information Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
Yongsheng Liang
Pengcheng Laboratory, Shenzhen, China
Wei Liu

Corresponding author

Correspondence to Yongsheng Liang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, S., Huang, Y., Yang, H. et al. End-to-end image compression method based on perception metric. SIViP 16, 1803–1810 (2022). https://doi.org/10.1007/s11760-022-02137-y

Received: 30 July 2021
Revised: 23 October 2021
Accepted: 06 January 2022
Published: 01 February 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s11760-022-02137-y
