Abstract
We propose to learn a deep neural network for JPEG image compression that predicts image-specific optimized quantization tables fully compatible with the standard JPEG encoder and decoder. Our approach can also learn task-specific quantization tables in a principled way by adjusting the objective function of the network. The main challenge in realizing this idea is that the encoder contains non-differentiable components, such as run-length encoding and Huffman coding, and that it is not straightforward to predict the probability distribution of the quantized image representations. We address these issues by learning a differentiable loss function that approximates bitrates using simple network blocks: two MLPs and an LSTM. We evaluate the proposed algorithm using multiple task-specific losses (two for semantic image understanding and two for conventional image compression) and demonstrate the effectiveness of our approach on the individual tasks.
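For readers unfamiliar with the role of a quantization table in JPEG, the following is a minimal sketch, not the method of this paper. It applies an orthonormal 8x8 DCT to a toy block and divides the coefficients element-wise by a table before rounding. The block contents, both tables, and all function names are illustrative assumptions; they come neither from the paper nor from the JPEG standard's example tables.

```python
import math

N = 8  # JPEG transforms and quantizes 8x8 blocks


def dct_matrix(n=N):
    """Orthonormal DCT-II basis C, so that coeffs = C * block * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(block):
    """Two-dimensional DCT of one 8x8 block."""
    c = dct_matrix()
    return matmul(matmul(c, block), transpose(c))


def quantize(coeffs, table):
    """Element-wise division by the table followed by rounding to integers."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]


def count_nonzero(q):
    """Number of nonzero quantized coefficients (a crude proxy for coding cost)."""
    return sum(1 for row in q for x in row if x != 0)


# Toy block (assumption): a smooth ramp, level-shifted by 128 as in JPEG.
block = [[(8 * x + 4 * y) - 128 for y in range(N)] for x in range(N)]
coeffs = dct2(block)

# Two hypothetical tables: the coarse one is 8x the fine one at every frequency.
fine_table = [[1 + u + v for v in range(N)] for u in range(N)]
coarse_table = [[8 * (1 + u + v) for v in range(N)] for u in range(N)]

q_fine = quantize(coeffs, fine_table)
q_coarse = quantize(coeffs, coarse_table)

print("nonzero coefficients, fine table:  ", count_nonzero(q_fine))
print("nonzero coefficients, coarse table:", count_nonzero(q_coarse))
```

Larger table entries push more coefficients to zero, which the subsequent run-length and Huffman stages can code more cheaply at the cost of reconstruction accuracy. The nonzero count used here is only a rough stand-in for bitrate; the abstract describes a learned, differentiable bitrate approximation instead.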
Acknowledgments
This work was partly supported by Kakao and Kakao Brain Corporation, and by an IITP grant funded by the Korea government (MSIT) (2016-0-00563, 2017-0-01779). We also thank Hyeonwoo Noh for fruitful discussions.
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Choi, J., Han, B. (2020). Task-Aware Quantization Network for JPEG Image Compression. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12365. Springer, Cham. https://doi.org/10.1007/978-3-030-58565-5_19
DOI: https://doi.org/10.1007/978-3-030-58565-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58564-8
Online ISBN: 978-3-030-58565-5
eBook Packages: Computer Science, Computer Science (R0)