Variable-length image compression based on controllable learning network

Multimedia Tools and Applications

Abstract

Neural-network-based lossy image compression methods have shown impressive performance, but most of them output a fixed-length code from a network trained for one specific rate. In practice, however, it is essential to support variable-length compression, or to meet a target rate, while maintaining high coding performance. This paper advances neural-network-based image compression by enabling a single network model to produce variable compression rates. Our network model combines an auto-encoder (AE) and a generative adversarial network (GAN) for generative compression. We introduce a noise interference mechanism for training the feature representation produced by the encoder, which makes the training of the feature nodes controllable and orders the nodes from top to bottom according to their importance in feature expression. Based on this importance distribution, the latent nodes are quantized into bits, and variable-length compression is achieved by discarding the bits of the less important feature nodes to meet the compression target. We propose several noise interference methods, and experiments confirm the feasibility of the Random-add and Dropout methods for controllable learning. Further experiments show that our compression method not only achieves variable-length compression but also recovers high-quality images at extremely low bit rates, outperforming fixed-rate compression.
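The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function names are hypothetical, the prefix-style mask is only one plausible form of Dropout-like noise interference, the quantizer is a plain uniform one, and whole trailing nodes are discarded rather than individual bits.

```python
import random


def tail_noise_mask(num_nodes, rng):
    """Hypothetical training-time interference (an assumption, not the paper's rule).

    Keeps a random-length prefix of the latent nodes, so nodes near the top
    survive more often than nodes near the bottom. This is one plausible way
    to push the most important information toward the top.
    """
    keep = rng.randint(1, num_nodes)  # randint is inclusive on both ends
    return [1 if i < keep else 0 for i in range(num_nodes)]


def quantize(latent, bits_per_node):
    """Uniformly quantize values in [0, 1] to integers in [0, 2**bits - 1]."""
    levels = (1 << bits_per_node) - 1
    return [round(min(max(v, 0.0), 1.0) * levels) for v in latent]


def truncate_to_budget(codes, bits_per_node, bit_budget):
    """Keep only the leading (assumed most important) nodes that fit the budget."""
    keep = min(len(codes), bit_budget // bits_per_node)
    return codes[:keep]


def dequantize(codes, bits_per_node, num_nodes, fill=0.0):
    """Map codes back to [0, 1]; discarded nodes get an assumed fill value."""
    levels = (1 << bits_per_node) - 1
    values = [c / levels for c in codes]
    return values + [fill] * (num_nodes - len(values))


if __name__ == "__main__":
    latent = [0.0, 1.0, 0.2, 0.8]              # toy encoder output, ordered by importance
    codes = quantize(latent, bits_per_node=4)
    kept = truncate_to_budget(codes, bits_per_node=4, bit_budget=10)
    print(kept, dequantize(kept, 4, len(latent)))
```

With 4 bits per node and a 10-bit budget, only the first two nodes are transmitted. A decoder trained under the same kind of interference would be expected to tolerate the missing tail, which is what lets a single model serve several rates.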

Acknowledgements

This work is supported by the Natural Science Foundation for Distinguished Young Scholars of Shandong Province (JQ201718), the Natural Science Foundation of China (U1736122), the National Natural Science Foundation of China under Grant No. 62001267, and the Fundamental Research Funds of Shandong University under Grant No. 2020HW017.

Author information

Corresponding author

Correspondence to Hongchao Zhou.

Cite this article

Zhao, D., Sun, J., Chen, L. et al. Variable-length image compression based on controllable learning network. Multimed Tools Appl 80, 20065–20087 (2021). https://doi.org/10.1007/s11042-020-10346-1

  • DOI: https://doi.org/10.1007/s11042-020-10346-1
