Abstract
While neural-network-based lossy image compression methods have shown impressive performance, most of them produce a fixed-length code using a network trained for one specific rate. In practice, however, it is essential to support variable-length compression, or to meet a target rate, while maintaining high coding performance. This paper advances neural-network-based image compression by making it possible for a single network model to produce variable compression rates. Our network model combines an auto-encoder (AE) and a generative adversarial network (GAN) for generative compression. We introduce a noise interference mechanism to train the feature representation produced by the encoder, which makes the training of the feature nodes controllable and orders the nodes from top to bottom according to their importance in feature expression. Based on this importance distribution, the latent nodes are quantized into bits, and variable-length compression is achieved by discarding the bits of the less important feature nodes to meet the compression target. We propose several noise interference methods, and experiments confirm the feasibility of the Random-add and Dropout methods for controllable learning. Further experiments show that our compression method not only achieves variable-length compression but also recovers high-quality images at extremely low bit rates, outperforming fixed-rate compression.
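The truncation step described in the abstract can be illustrated with a minimal sketch. It assumes the encoder has already ordered its latent nodes from most to least important, that latent values lie in [-1, 1], and that each node is uniformly quantized to a fixed number of bits. The function names, the 4-bit default, and the choice to fill discarded nodes with 0.0 before decoding are all illustrative assumptions, not details taken from the paper.

```python
def quantize(latent, bits_per_node=4):
    """Uniformly quantize values in [-1, 1] to integer codes (assumed scheme)."""
    levels = 2 ** bits_per_node
    codes = []
    for v in latent:
        v = max(-1.0, min(1.0, v))  # clip to the assumed latent range
        codes.append(round((v + 1.0) / 2.0 * (levels - 1)))
    return codes


def dequantize(codes, bits_per_node=4):
    """Map integer codes back to values in [-1, 1]."""
    levels = 2 ** bits_per_node
    return [c / (levels - 1) * 2.0 - 1.0 for c in codes]


def truncate_to_budget(codes, bit_budget, bits_per_node=4):
    """Keep only the leading (most important) nodes that fit in the budget."""
    keep = min(len(codes), bit_budget // bits_per_node)
    return codes[:keep]


def restore_latent(kept_codes, total_nodes, bits_per_node=4):
    """Dequantize kept nodes; fill discarded nodes with 0.0 (an assumption)."""
    values = dequantize(kept_codes, bits_per_node)
    return values + [0.0] * (total_nodes - len(values))


# Hypothetical 8-node latent vector, already sorted by importance.
latent = [0.9, -0.7, 0.5, -0.3, 0.2, 0.1, -0.05, 0.0]
codes = quantize(latent)
kept = truncate_to_budget(codes, bit_budget=20)  # 20 bits -> 5 nodes at 4 bits
restored = restore_latent(kept, total_nodes=len(latent))
```

Under these assumptions, lowering `bit_budget` shortens the transmitted code by dropping trailing nodes, so one trained model can serve several rates; how the decoder should treat the missing nodes is left open here.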
Acknowledgements
This work is supported by the Natural Science Foundation for Distinguished Young Scholars of Shandong Province (JQ201718), the Natural Science Foundation of China (U1736122), the National Natural Science Foundation of China under Grant No. 62001267, and the Fundamental Research Funds of Shandong University under Grant No. 2020HW017.
Cite this article
Zhao, D., Sun, J., Chen, L. et al. Variable-length image compression based on controllable learning network. Multimed Tools Appl 80, 20065–20087 (2021). https://doi.org/10.1007/s11042-020-10346-1
DOI: https://doi.org/10.1007/s11042-020-10346-1