research-article

Lambda-Domain Rate Control for Neural Image Compression

Authors:

Yuan Zhang

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

Article No.: 3, Pages 1 - 7

https://doi.org/10.1145/3595916.3626372

Published: 01 January 2024

Abstract

Rate control based on rate-distortion modeling is a classic problem in lossy image compression. Despite extensive research on neural image compression, rate control for it remains understudied. In this paper, we introduce a variable-rate neural image compression scheme that supports precise rate control with one-pass encoding. Our approach uses the Lagrangian multiplier method to transform rate control into an unconstrained optimization problem, mapping the target bitrate to a λ value that adjusts the rate-distortion trade-off. We propose an improved exponential R-λ model and fit it using bitrates estimated by a hybrid convolution-transformer network. The encoder is controlled by λ, and a multi-layer modulation mechanism provides variable-rate capability. In our experiments, the proposed method outperforms the intra-frame coding of Versatile Video Coding (VVC). The average rate control error is less than 5.1%, while the method maintains almost identical rate-distortion performance and acceptable complexity.
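
As an illustration of how a target bitrate could be mapped to λ, the following Python sketch fits a plain exponential relation λ = α·exp(β·R) to a few (bitrate, λ) samples and inverts it for a target. The functional form, the function names, and the sample numbers are assumptions made for this example. The abstract does not give the paper's improved R-λ model, and the samples here stand in for the bitrates that the paper estimates with its network.

```python
import math


def fit_exponential_r_lambda(rates, lambdas):
    """Least-squares fit of ln(lambda) = ln(alpha) + beta * R.

    Assumed model form for illustration only, not the paper's improved model.
    """
    n = len(rates)
    log_lambdas = [math.log(lam) for lam in lambdas]
    mean_r = sum(rates) / n
    mean_y = sum(log_lambdas) / n
    sxx = sum((r - mean_r) ** 2 for r in rates)
    sxy = sum((r - mean_r) * (y - mean_y) for r, y in zip(rates, log_lambdas))
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_r)
    return alpha, beta


def lambda_for_target(alpha, beta, target_bpp):
    """Map a target bitrate (bits per pixel) to lambda under the fitted model."""
    return alpha * math.exp(beta * target_bpp)


def rate_control_error(target_bpp, actual_bpp):
    """Relative deviation of the achieved bitrate from the target."""
    return abs(actual_bpp - target_bpp) / target_bpp


# Hypothetical samples generated from alpha = 0.002, beta = 3.0.
sample_rates = [0.2, 0.4, 0.6, 0.8]
sample_lambdas = [0.002 * math.exp(3.0 * r) for r in sample_rates]

alpha, beta = fit_exponential_r_lambda(sample_rates, sample_lambdas)
lam = lambda_for_target(alpha, beta, target_bpp=0.5)
error = rate_control_error(target_bpp=0.5, actual_bpp=0.52)
```

In a one-pass setting, a λ obtained this way would be passed to the encoder once, and the relative error function corresponds to the kind of rate control error the abstract reports.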

Supplementary Material

supplementary pdf (Xue_supp.pdf)

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Computer graphics
    1. Image compression

Information

Published In

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

December 2023

745 pages

ISBN:9798400702051

DOI:10.1145/3595916

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

MMAsia '23

Sponsor:

SIGMM

MMAsia '23: ACM Multimedia Asia

December 6 - 8, 2023

Tainan, Taiwan

Acceptance Rates

Overall Acceptance Rate 59 of 204 submissions, 29%
