An Enhanced Multi-frequency Learned Image Compression Method

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13021)

Abstract

Learned image compression methods have recently shown the potential to outperform traditional image compression methods. However, current learned image compression methods use the same spatial resolution for all latent variables, which leaves spatial redundancy in the representation. Representing latent variables of different frequencies at different spatial resolutions reduces this redundancy and improves rate-distortion (R-D) performance. Building on the recently introduced generalized octave convolutions, which factorize latent variables into different frequency components, we propose an enhanced multi-frequency learned image compression method. We incorporate a channel attention module into the multi-frequency learned image compression network to improve adaptive codeword assignment. Because the attention module captures the global correlation of the latent variables, complex parts of the image such as textures and boundaries can be reconstructed better. In addition, an enhancement module on the decoder side provides further gains. Our method produces visually pleasing reconstructions and, at low bit rates, achieves better MS-SSIM scores than standard codecs and other learning-based image compression methods.
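
As an illustration of the channel attention idea mentioned in the abstract, the following is a minimal sketch that assumes a squeeze-and-excitation style gate applied separately to a high-frequency latent and a lower-resolution low-frequency latent. The abstract does not specify the module's actual design, so the function name `channel_attention`, the weights `w1` and `w2`, the toy latents, and the half-resolution choice are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(latent, w1, w2):
    """Rescale each channel of `latent` by a gate computed from global statistics.

    latent: list of C channels, each an H x W list of lists of floats.
    w1: R x C weight matrix (reduction layer).
    w2: C x R weight matrix (expansion layer).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in latent
    ]
    # Excitation: a bottleneck (ReLU, then sigmoid) yields one gate in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: every spatial position of a channel is multiplied by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(latent, gates)]
    return scaled, gates


# Hypothetical toy latents: the low-frequency latent is stored at half the
# spatial resolution of the high-frequency one (an assumption for illustration).
high = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]  # 2 channels, 2 x 2
low = [[[5.0]], [[-1.0]]]                                     # 2 channels, 1 x 1

# Hypothetical weights; in a trained network these would be learned.
w1 = [[1.0, -1.0]]    # reduce 2 channels to 1 hidden unit
w2 = [[1.0], [-1.0]]  # expand back to 2 channel gates

high_out, high_gates = channel_attention(high, w1, w2)
low_out, low_gates = channel_attention(low, w1, w2)
```

Because the gate depends only on per-channel global averages, the same function works unchanged on latents of different spatial sizes, which is why it can be applied to each frequency branch independently in this sketch.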

Author information

Corresponding author

Correspondence to Zhihui Wei.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

He, L., Wei, Z., Xu, Y., Wu, Z. (2021). An Enhanced Multi-frequency Learned Image Compression Method. In: Ma, H., et al. Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol 13021. Springer, Cham. https://doi.org/10.1007/978-3-030-88010-1_16

  • DOI: https://doi.org/10.1007/978-3-030-88010-1_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88009-5

  • Online ISBN: 978-3-030-88010-1

  • eBook Packages: Computer Science, Computer Science (R0)
