DOI: 10.1145/3595916.3626372
research-article

Lambda-Domain Rate Control for Neural Image Compression

Published: 01 January 2024

Abstract

Rate control based on rate-distortion modeling is a classic problem in lossy image compression. Despite extensive research in neural image compression, its rate control remains understudied. In this paper, we introduce a variable rate neural image compression scheme that supports precise rate control with one-pass encoding. Our approach utilizes the Lagrangian multiplier method to transform rate control into an unconstrained optimization problem, mapping the target bitrate to λ for rate-distortion trade-off adjustment. We propose an improved exponential R-λ model and estimate the bitrates with a hybrid convolution-transformer network for model fitting. The encoder is controlled by λ, and a multi-layer modulation mechanism ensures variable rate ability. In our experiments, the proposed method outperforms the intra-frame coding of Versatile Video Coding (VVC). Meanwhile, the average rate control error is less than 5.1%, while maintaining almost identical rate-distortion performance and acceptable complexity.
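The mapping from a target bitrate to λ described in the abstract can be illustrated with a small sketch. It assumes a plain exponential form, λ = α·exp(β·R), fitted by least squares on ln λ; this is not the paper's improved model, whose exact form is not given here. It also assumes the convention that a larger λ yields a higher bitrate. The function names, the probe points, and the parameter values are all hypothetical. In the paper the bitrate estimates used for fitting come from a hybrid convolution-transformer network, which the sketch replaces with hand-written sample points.

```python
import math


def fit_exponential_r_lambda(rates, lambdas):
    """Fit ln(lambda) = ln(alpha) + beta * R by ordinary least squares.

    Assumed model (illustrative only): lambda = alpha * exp(beta * R).
    `rates` are bitrates in bits per pixel, `lambdas` the matching multipliers.
    """
    n = len(rates)
    if n < 2 or n != len(lambdas):
        raise ValueError("need at least two (rate, lambda) pairs of equal length")
    log_lambdas = [math.log(lam) for lam in lambdas]
    mean_r = sum(rates) / n
    mean_y = sum(log_lambdas) / n
    sxx = sum((r - mean_r) ** 2 for r in rates)
    if sxx == 0.0:
        raise ValueError("rates must not all be identical")
    sxy = sum((r - mean_r) * (y - mean_y) for r, y in zip(rates, log_lambdas))
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_r)
    return alpha, beta


def lambda_for_target_rate(alpha, beta, target_bpp):
    """Evaluate the fitted model at the target bitrate to obtain lambda."""
    return alpha * math.exp(beta * target_bpp)


def rate_control_error(target_bpp, actual_bpp):
    """Relative deviation of the achieved bitrate from the target."""
    return abs(actual_bpp - target_bpp) / target_bpp


# Hypothetical probe points generated from alpha = 0.002, beta = 3.0.
probe_rates = [0.2, 0.5, 0.9]
probe_lambdas = [0.002 * math.exp(3.0 * r) for r in probe_rates]

alpha, beta = fit_exponential_r_lambda(probe_rates, probe_lambdas)
lam = lambda_for_target_rate(alpha, beta, target_bpp=0.6)
```

Under these assumptions, a single evaluation of the fitted model gives the λ passed to the encoder, which is what makes one-pass encoding possible. `rate_control_error` computes the kind of relative error that the abstract reports as averaging below 5.1%.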

Supplementary Material

supplementary pdf (Xue_supp.pdf)



      Published In

      MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia
      December 2023
      745 pages
      ISBN:9798400702051
      DOI:10.1145/3595916

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. Deep learning
      2. Neural image compression
      3. Rate control
      4. Rate-distortion optimization

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      MMAsia '23: ACM Multimedia Asia
      December 6 - 8, 2023
      Tainan, Taiwan

      Acceptance Rates

      Overall Acceptance Rate 59 of 204 submissions, 29%
