Task-Aware Quantization Network for JPEG Image Compression

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12365)

Abstract

We propose to learn a deep neural network for JPEG image compression that predicts image-specific optimized quantization tables fully compatible with the standard JPEG encoder and decoder. Moreover, our approach makes it possible to learn task-specific quantization tables in a principled way by adjusting the objective function of the network. The main challenge in realizing this idea is that the encoder contains non-differentiable components, such as run-length encoding and Huffman coding, and that it is not straightforward to predict the probability distribution of the quantized image representations. We address these issues by learning a differentiable loss function that approximates bitrates using simple network blocks: two MLPs and an LSTM. We evaluate the proposed algorithm using multiple task-specific losses (two for semantic image understanding and another two for conventional image compression) and demonstrate the effectiveness of our approach on the individual tasks.
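To make the role of a quantization table concrete, the following is a minimal sketch of the standard JPEG quantization step on a single 8x8 block, written in plain Python. It is not the network proposed in the paper: the two flat tables, the sample block, and all function names are illustrative assumptions, and the count of nonzero coefficients is only a crude stand-in for the bitrate that the paper approximates with a learned loss. The rounding inside `quantize` is one reason gradients do not flow through a standard encoder.

```python
import math

N = 8  # JPEG transforms and quantizes 8x8 blocks


def dct2(block):
    """Orthonormal 2-D DCT-II of an 8x8 block given as a list of lists."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer.

    The rounding is piecewise constant, so its gradient is zero almost
    everywhere.
    """
    return [[round(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]


def dequantize(q, table):
    """Decoder side: multiply the stored integers by the same table."""
    return [[q[u][v] * table[u][v] for v in range(N)] for u in range(N)]


def nonzero_count(q):
    """Crude bitrate proxy: more nonzero coefficients cost more bits."""
    return sum(1 for row in q for c in row if c != 0)


# Hypothetical level-shifted block: a smooth ramp (assumption for the demo).
block = [[8 * x + 4 * y - 42 for y in range(N)] for x in range(N)]
coeffs = dct2(block)

# Two hypothetical flat tables; real JPEG tables vary per frequency.
fine_table = [[4] * N for _ in range(N)]
coarse_table = [[32] * N for _ in range(N)]

fine_q = quantize(coeffs, fine_table)
coarse_q = quantize(coeffs, coarse_table)

print("nonzero coefficients (fine):  ", nonzero_count(fine_q))
print("nonzero coefficients (coarse):", nonzero_count(coarse_q))
```

Because the decoder only needs the table stored in the file header, any table of positive entries, however it was chosen, decodes with a standard JPEG decoder. A coarser table zeroes more coefficients (fewer bits) at the cost of a larger reconstruction error, which is the trade-off that a predicted table would control.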

Notes

  1. http://r0k.us/graphics/kodak/

Acknowledgments

This work was partly supported by Kakao and Kakao Brain Corporation, and IITP grant funded by the Korea government (MSIT) (2016-0-00563, 2017-0-01779). We also thank Hyeonwoo Noh for fruitful discussions.

Author information

Corresponding author

Correspondence to Bohyung Han.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1222 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Choi, J., Han, B. (2020). Task-Aware Quantization Network for JPEG Image Compression. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12365. Springer, Cham. https://doi.org/10.1007/978-3-030-58565-5_19

  • DOI: https://doi.org/10.1007/978-3-030-58565-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58564-8

  • Online ISBN: 978-3-030-58565-5

  • eBook Packages: Computer Science, Computer Science (R0)
