Enhancing Image Rescaling Using High Frequency Guidance and Attentions in Downscaling and Upscaling Network

Gui, Yan; Xie, Yan; Kuang, Lidan; Chen, Zhihua; Zhang, Jin

doi:10.1007/978-3-031-50069-5_35

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14495)

Included in the following conference series:

Computer Graphics International Conference

Abstract

Recent image rescaling methods adopt invertible bijective transformations to model downscaling and upscaling simultaneously: the high-frequency information learned during downscaling is used to recover the high-resolution image by running the model in reverse. However, less attention has been paid to exploiting the high-frequency information when upscaling. In this paper, we develop an efficient end-to-end learning model for image rescaling based on a newly designed neural network. The network consists of a downscaling generation sub-network (DSNet) and a super-resolution sub-network (SRNet), and learns to recover high-frequency signals. Concretely, we introduce dense attention blocks into the DSNet to produce a visually pleasing low-resolution (LR) image, and we model the distribution of the high-frequency information using a latent variable that follows a specified distribution. For the SRNet, we adapt an enhanced deep residual network by using residual attention blocks and adding a long skip connection; during upscaling, the SRNet transforms the predicted LR image and random samples of the latent variable back into a high-resolution image. Finally, we define a joint loss and adopt a multi-stage training strategy to optimize the whole network. Experimental results demonstrate the superior performance of our model over existing methods in terms of both quantitative metrics and visual quality.
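
To make the idea of invertible rescaling in the abstract concrete, the following is a minimal sketch and not the authors' DSNet or SRNet. It assumes a fixed 2x2 Haar-style transform in place of a learned network: downscaling splits a grayscale image into a half-size low-frequency image plus three high-frequency detail bands, and upscaling inverts that split exactly. The function names (haar_downscale, haar_upscale, sample_details) and the Gaussian choice for the latent variable are illustrative assumptions; the abstract only states that the latent variable follows "a specified distribution".

```python
import random


def haar_downscale(img):
    """Split an even-sized grayscale image (list of rows) into a half-size
    low-frequency image and per-pixel high-frequency detail triples."""
    lr, details = [], []
    for y in range(0, len(img), 2):
        lr_row, d_row = [], []
        for x in range(0, len(img[0]), 2):
            a, b = img[y][x], img[y][x + 1]          # top-left, top-right
            c, d = img[y + 1][x], img[y + 1][x + 1]  # bottom-left, bottom-right
            lr_row.append((a + b + c + d) / 4.0)     # block average (LR pixel)
            d_row.append((
                (a - b + c - d) / 4.0,  # left column minus right column
                (a + b - c - d) / 4.0,  # top row minus bottom row
                (a - b - c + d) / 4.0,  # diagonal difference
            ))
        lr.append(lr_row)
        details.append(d_row)
    return lr, details


def haar_upscale(lr, details):
    """Exact inverse of haar_downscale when given the true detail bands."""
    out = []
    for lr_row, d_row in zip(lr, details):
        top, bottom = [], []
        for avg, (dh, dv, dd) in zip(lr_row, d_row):
            top += [avg + dh + dv + dd, avg - dh + dv - dd]
            bottom += [avg + dh - dv - dd, avg - dh - dv + dd]
        out += [top, bottom]
    return out


def sample_details(height, width, sigma=1.0, seed=0):
    """Assumption for illustration: stand in for the discarded detail bands
    with zero-mean Gaussian samples of the same shape."""
    rng = random.Random(seed)
    return [[tuple(rng.gauss(0.0, sigma) for _ in range(3))
             for _ in range(width)] for _ in range(height)]
```

With the true detail bands, haar_upscale(*haar_downscale(img)) returns the original image. When only the LR image is stored, passing haar_upscale(lr, sample_details(len(lr), len(lr[0]))) yields a plausible but inexact reconstruction, which is the gap that a learned upscaling network with better use of high-frequency information aims to close.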

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Project Nos. 62272164, 61402053), the Hunan Provincial Natural Science Foundation of China (Grant No. 2023JJ30050) and the Scientific Research Fund of Education Department of Hunan Province (Grant No. 21B0287).

Author information

Authors and Affiliations

School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, 410114, Hunan, China
Yan Gui, Yan Xie, Lidan Kuang & Jin Zhang
Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha, 410114, Hunan, China
Yan Gui, Yan Xie, Lidan Kuang & Jin Zhang
Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
Zhihua Chen

Corresponding author

Correspondence to Yan Gui.

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Bin Sheng
Shanghai Jiao Tong University, Shanghai, China
Lei Bi
University of Sydney, Sydney, NSW, Australia
Jinman Kim
MIRALab-CUI, University of Geneve, Carouge, Geneve, Switzerland
Nadia Magnenat-Thalmann
Swiss Federal Institute of Technology, Lausanne, Switzerland
Daniel Thalmann

Cite this paper

Gui, Y., Xie, Y., Kuang, L., Chen, Z., Zhang, J. (2024). Enhancing Image Rescaling Using High Frequency Guidance and Attentions in Downscaling and Upscaling Network. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14495. Springer, Cham. https://doi.org/10.1007/978-3-031-50069-5_35

DOI: https://doi.org/10.1007/978-3-031-50069-5_35
Published: 20 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50068-8
Online ISBN: 978-3-031-50069-5
eBook Packages: Computer Science, Computer Science (R0)
