Variable-length image compression based on controllable learning network

Zhao, Dong; Sun, Jiande; Chen, Lei; Wu, Yulin; Zhou, Hongchao

doi:10.1007/s11042-020-10346-1

Published: 05 March 2021

Multimedia Tools and Applications, volume 80, pages 20065–20087 (2021)

Dong Zhao ORCID: orcid.org/0000-0002-3991-9570¹,
Jiande Sun²,
Lei Chen¹,
Yulin Wu¹ &
Hongchao Zhou¹

Abstract

Neural-network-based lossy image compression methods have shown impressive performance, but most of them produce a fixed-length code from a network trained for one specific rate. In practice, it is essential to support variable-length compression, or to meet a target rate, while maintaining high coding performance. This paper advances neural-network-based image compression by enabling a single network model to produce variable compression rates. Our network model combines an auto-encoder (AE) and a generative adversarial network (GAN) for generative compression. We introduce a noise interference mechanism to train the feature representation produced by the encoder, which makes the training of the feature nodes controllable and orders the nodes from top to bottom according to their importance in feature expression. Based on this importance ordering, the latent nodes are quantized into bits, and variable-length compression is achieved by discarding the bits of the less important feature nodes until the compression target is met. We propose several noise interference methods, and experiments confirm that two of them, Random-add and Dropout, are feasible for controllable learning. Further experiments show that our compression method not only achieves variable-length compression but also recovers high-quality images at extremely low bit rates, outperforming fixed-rate compression.
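The idea of ordering latent nodes by importance and then discarding the tail can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one bit per latent node obtained by sign thresholding, a decoder that fills discarded nodes with zeros, and a noise scheme in which the drop probability grows linearly with the node index. All function names and the `p_max` parameter are hypothetical, and the paper's Random-add and Dropout variants may be defined differently.

```python
import random


def index_dependent_dropout(latent, rng, p_max=0.9):
    """Assumed noise interference: zero node i with a probability that
    grows linearly with its index, so early nodes are the most reliable
    and training is pushed to store the important information there."""
    n = len(latent)
    noisy = []
    for i, value in enumerate(latent):
        p_drop = p_max * i / (n - 1) if n > 1 else 0.0
        noisy.append(0.0 if rng.random() < p_drop else value)
    return noisy


def quantize_to_bits(latent):
    """Assumed quantizer: one bit per node, taken from the sign."""
    return [1 if value >= 0.0 else 0 for value in latent]


def truncate_bits(bits, budget):
    """Meet a bit budget by keeping only the leading (most important) bits."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    return bits[:budget]


def dequantize(bits, n_nodes):
    """Map kept bits back to +1/-1 and fill the discarded nodes with 0."""
    values = [1.0 if bit else -1.0 for bit in bits]
    return values + [0.0] * (n_nodes - len(values))


if __name__ == "__main__":
    rng = random.Random(0)
    latent = [0.8, -0.3, 0.5, -0.9, 0.1, -0.2]

    # Training time: perturb the latent code before it reaches the decoder.
    print(index_dependent_dropout(latent, rng))

    # Test time: one trained model, several target lengths.
    bits = quantize_to_bits(latent)
    for budget in (6, 4, 2):
        kept = truncate_bits(bits, budget)
        print(budget, dequantize(kept, len(latent)))
```

Under these assumptions the rate is chosen only at encoding time: the same bit string serves every target length, and a shorter code is always a prefix of a longer one.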

Acknowledgements

This work is supported by the Natural Science Foundation for Distinguished Young Scholars of Shandong Province (JQ201718), the Natural Science Foundation of China (U1736122), the National Natural Science Foundation of China under Grant No. 62001267, and the Fundamental Research Funds of Shandong University under Grant No. 2020HW017.

Author information

Authors and Affiliations

School of Information Science and Engineering, Shandong University, Jinan, 250100, China
Dong Zhao, Lei Chen, Yulin Wu & Hongchao Zhou
School of Information Science and Engineering, Shandong Normal University, Jinan, 250014, China
Jiande Sun

Corresponding author

Correspondence to Hongchao Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhao, D., Sun, J., Chen, L. et al. Variable-length image compression based on controllable learning network. Multimed Tools Appl 80, 20065–20087 (2021). https://doi.org/10.1007/s11042-020-10346-1

Received: 08 June 2020
Revised: 25 September 2020
Accepted: 22 December 2020
Published: 05 March 2021
Issue Date: May 2021
DOI: https://doi.org/10.1007/s11042-020-10346-1
