A Differentiable Entropy Model for Learned Image Compression

Presta, Alberto; Fiandrotti, Attilio; Tartaglione, Enzo; Grangetto, Marco

doi:10.1007/978-3-031-43148-7_28

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14233)

Included in the following conference series:

International Conference on Image Analysis and Processing

Abstract

In an end-to-end learned image compression framework, an encoder projects the image onto a low-dimensional, quantized latent space, while a decoder recovers the original image. The encoder and decoder are jointly trained with standard gradient backpropagation to minimize a rate-distortion (RD) cost function that accounts for both the distortion between the original and reconstructed images and the rate of the quantized latent space. State-of-the-art methods rely on an auxiliary neural network to estimate the rate R of the latent space. We propose a non-parametric entropy model that estimates the statistical frequencies of the quantized latent space during training. The proposed model is differentiable, so it can be plugged into the cost function as a rate proxy, and it can be adapted to a given context without retraining. Our experiments show performance comparable to that of a learned rate estimator, and better performance when the model is adapted over a temporal context.
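
To make the idea of a frequency-based rate proxy concrete, here is a minimal sketch. It is not the authors' implementation (their code is in the repository listed under Notes). It assumes a softmax over negative squared distances as the soft assignment of each latent value to a quantization level, and an RD cost of the common form R + lambda * D. The function names, the `sharpness` parameter, and the cost form are all hypothetical choices made for this example. Pure Python keeps the block self-contained; in training, the same computation would be written in an automatic-differentiation framework so that gradients reach the encoder.

```python
import math


def soft_frequencies(latents, levels, sharpness=10.0):
    """Soft histogram of `latents` over the quantization `levels`.

    Each latent spreads one unit of mass over the levels through a softmax
    on negative squared distance (an assumed relaxation). The result is a
    smooth function of the latents, unlike a hard count.
    """
    totals = [0.0] * len(levels)
    for y in latents:
        logits = [-sharpness * (y - c) ** 2 for c in levels]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(v - top) for v in logits]
        norm = sum(weights)
        for k, w in enumerate(weights):
            totals[k] += w / norm
    return [t / len(latents) for t in totals]


def entropy_bits(probs, eps=1e-12):
    """Shannon entropy in bits per symbol of a frequency vector."""
    return -sum(p * math.log2(p + eps) for p in probs if p > 0.0)


def rd_cost(rate_bits, distortion, lam):
    """Assumed RD cost form: rate plus lambda-weighted distortion."""
    return rate_bits + lam * distortion


if __name__ == "__main__":
    levels = [-1.0, 0.0, 1.0]
    latents = [0.1, -0.2, 0.9, 1.1]  # made-up values for illustration
    freqs = soft_frequencies(latents, levels, sharpness=10.0)
    rate = entropy_bits(freqs)
    print("soft frequencies:", freqs)
    print("rate proxy (bits/symbol):", rate)
    print("RD cost:", rd_cost(rate, distortion=0.05, lam=10.0))
```

As `sharpness` grows, the soft frequencies approach the hard histogram of the quantized latents, so the entropy estimate approaches the empirical entropy. Smaller values give smoother gradients at the price of a looser rate estimate.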

Notes

1. The code is publicly available at https://github.com/EIDOSLAB/SFC.

Author information

Authors and Affiliations

Computer Science Department, University of Turin, Turin, Italy
Alberto Presta, Attilio Fiandrotti & Marco Grangetto
LTCI, Telecom Paris, Institut Polytechnique de Paris, Palaiseau, France
Enzo Tartaglione

Corresponding author

Correspondence to Alberto Presta.

Editor information

Editors and Affiliations

University of Udine, Udine, Italy
Gian Luca Foresti
University of Udine, Udine, Italy
Andrea Fusiello
University of York, York, UK
Edwin Hancock

Electronic supplementary material

Supplementary material 1 (PDF, 45110 KB)

About this paper

Cite this paper

Presta, A., Fiandrotti, A., Tartaglione, E., Grangetto, M. (2023). A Differentiable Entropy Model for Learned Image Compression. In: Foresti, G.L., Fusiello, A., Hancock, E. (eds) Image Analysis and Processing – ICIAP 2023. ICIAP 2023. Lecture Notes in Computer Science, vol 14233. Springer, Cham. https://doi.org/10.1007/978-3-031-43148-7_28

DOI: https://doi.org/10.1007/978-3-031-43148-7_28
Published: 05 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43147-0
Online ISBN: 978-3-031-43148-7
eBook Packages: Computer Science, Computer Science (R0)
