Skip to main content
Log in

Multi-scale gated network for efficient image super-resolution

  • Original article
  • Published:
The Visual Computer Aims and scope Submit manuscript

Abstract

Remarkable progress has been made in the field of single-image super-resolution (SISR), with convolutional neural network being widely adopted to achieve state-of-the-art performance. Recently, researchers have been increasingly interested in exploring the application of Transformer in SISR. However, the high computational cost of Transformer poses a challenge to its deployment on mobile devices. To address this issue, we propose a novel lightweight multi-scale gated network (MSGN) by exploring the variant of the Transformer which is built upon its general structure. MSGN utilizes efficient multi-scale gated block (EMGB) as the token mixer for the Transformer. Specifically, EMGB uses multi-scale filtering block and gating mechanism to extract and augment various features at multiple granularities. In addition, the simplified channel attention is used to extract channel global information. Furthermore, an enhanced multi-layer perceptron is employed instead of the MLP layer in Transformer to further improve the performance of the network. Our extensive experimental results demonstrate that MSGN achieves the best performance among the state-of-the-art efficient image SR models while utilizing the least number of parameters and FLOPs.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

Explore related subjects

Discover the latest articles, news and stories from top researchers in related subjects.

Data availability

The associated datasets of the current study are available from the author on reasonable request.

References

  1. Zhang, S., Liang, G., Pan, S., Zheng, L.: A fast medical image super resolution method based on deep learning network. IEEE Access 7, 12319–12327 (2018)

    Article  MATH  Google Scholar 

  2. Uiboupin, T., Rasti, P., Anbarjafari, G., Demirel, H.: Facial image super resolution using sparse representation for improving face recognition in surveillance monitoring. In: 2016 24th Signal Processing and Communication Application Conference (SIU), pp. 437–440 (2016)

  3. Wang, P., Bayram, B., Sertel, E.: A comprehensive review on deep learning based remote sensing image super-resolution methods. Earth-Sci. Rev. 232, 104110 (2022)

    Article  Google Scholar 

  4. Chen, Y., Phonevilay, V., Tao, J., Chen, X., Xia, R., Zhang, Q., Yang, K., Xiong, J., Xie, J.: The face image super-resolution algorithm based on combined representation learning. Multim. Tools Appl. 80, 30839–30861 (2021)

    Article  Google Scholar 

  5. Capel, D., Zisserman, A.: Computer vision applied to super resolution. IEEE Signal Process. Mag. 20(3), 75–86 (2003)

    Article  MATH  Google Scholar 

  6. Farsiu, S., Robinson, M.D., Elad, M., Milanfar, P.: Fast and robust multiframe super resolution. IEEE Trans. Image Process. 13(10), 1327–1344 (2004)

    Article  MATH  Google Scholar 

  7. Romano, Y., Isidoro, J., Milanfar, P.: RAISR: rapid and accurate image super resolution. IEEE Trans. Comput. Imaging 3(1), 110–125 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  8. Timofte, R., De Smet, V., Van Gool, L.: Anchored neighborhood regression for fast example-based super-resolution. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1920–1927 (2013)

  9. Zhang, Y., Du, Y., Ling, F., Fang, S., Li, X.: Example-based super-resolution land cover mapping using support vector regression. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 7(4), 1271–1283 (2014)

    Article  MATH  Google Scholar 

  10. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-resolution. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 184–199 (2014)

  11. Kim, J., Lee, J.K., Lee, K.M.: Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1646–1654 (2016)

  12. Zhang, Y., Tian, Y., Kong, Y., Zhong, B., Fu, Y.: Residual dense network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2472–2481 (2018)

  13. Lai, Q., Nie, Y., Sun, H., Xu, Q., Zhang, Z., Xiao, M.: Video super-resolution via pre-frame constrained and deep-feature enhanced sparse reconstruction. Pattern Recogn. 100, 107139 (2020)

    Article  Google Scholar 

  14. Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 286–301 (2018)

  15. Zhang, X., Zeng, H., Zhang, L.: Edge-oriented convolution block for real-time super resolution on mobile devices. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 4034–4043 (2021)

  16. Hui, Z., Gao, X., Yang, Y., Wang, X.: Lightweight image super-resolution with information multi-distillation network. In: Proceedings of the 27th ACM International Conference on Multimedia, pp. 2024–2032 (2019)

  17. Liu, J., Tang, J., Wu, G.: Residual feature distillation network for lightweight image super-resolution. In: Computer Vision–ECCV 2020 Workshops: Glasgow, UK, August 23–28, 2020, Proceedings, Part III 16, pp. 41–55 (2020)

  18. Li, Z., Liu, Y., Chen, X., Cai, H., Gu, J., Qiao, Y., Dong, C.: Blueprint separable residual network for efficient image super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 833–843 (2022)

  19. Wang, L., Li, D., Tian, L., Shan, Y.: Efficient image super-resolution with collapsible linear blocks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 817–823 (2022)

  20. Du, Z., Liu, D., Liu, J., Tang, J., Wu, G., Fu, L.: Fast and memory-efficient network towards efficient image super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 853–862 (2022)

  21. Wang, Y.: Edge-enhanced feature distillation network for efficient super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 777–785 (2022)

  22. Chen, H., Wang, Y., Guo, T., Xu, C., Deng, Y., Liu, Z., Ma, S., Xu, C., Xu, C., Gao, W.: Pre-trained image processing transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12299–12310 (2021)

  23. Lu, Z., Li, J., Liu, H., Huang, C., Zhang, L., Zeng, T.: Transformer for single image super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 457–466 (2022)

  24. Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., Timofte, R.: SwinIR: image restoration using swin transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1833–1844 (2021)

  25. Tolstikhin, I.O., Houlsby, N., Kolesnikov, A., Beyer, L., Zhai, X., Unterthiner, T., Yung, J., Steiner, A., Keysers, D., Uszkoreit, J.: MLP-mixer: an all-MLP architecture for vision. Adv. Neural Inf. Process. Syst. 34, 24261–24272 (2021)

    Google Scholar 

  26. Wang, G., Zhao, Y., Tang, C., Luo, C., Zeng, W.: When shift operation meets vision transformer: an extremely simple alternative to attention mechanism. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 2423–2430 (2022)

  27. Yu, W., Luo, M., Zhou, P., Si, C., Zhou, Y., Wang, X., Feng, J., Yan, S.: Metaformer is actually what you need for vision. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10819–10829 (2022)

  28. Chen, L., Chu, X., Zhang, X., Sun, J.: Simple baselines for image restoration. In: European Conference on Computer Vision, pp. 17–33. Springer (2022)

  29. Lim, B., Son, S., Kim, H., Nah, S., Mu Lee, K.: Enhanced deep residual networks for single image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 136–144 (2017)

  30. Luo, X., Xie, Y., Zhang, Y., Qu, Y., Li, C., Fu, Y.: LatticeNet: towards lightweight image super-resolution with lattice block. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXII 16, pp. 272–289 (2020)

  31. Ahn, N., Kang, B., Sohn, K.-A.: Fast, accurate, and lightweight super-resolution with cascading residual network. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 252–268 (2018)

  32. Hui, Z., Wang, X., Gao, X.: Fast and accurate single image super-resolution via information distillation network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 723–731 (2018)

  33. Zhang, W., Fan, W., Yang, X., Zhang, Q., Zhou, D.: Lightweight single-image super-resolution via multi-scale feature fusion CNN and multiple attention block. Vis. Comput. 39(8), 3519–3531 (2023)

    Article  MATH  Google Scholar 

  34. Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., Wang, Z.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1874–1883 (2016)

  35. Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization (2016). arXiv:1607.06450

  36. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456 (2015)

  37. Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.-H.: Restormer: efficient transformer for high-resolution image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5728–5739 (2022)

  38. Hendrycks, D., Gimpel, K.: Gaussian error linear units (gelus) (2016). arXiv:1606.08415

  39. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)

  40. Cho, S.-J., Ji, S.-W., Hong, J.-P., Jung, S.-W., Ko, S.-J.: Rethinking coarse-to-fine approach in single image deblurring. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4641–4650 (2021)

  41. Sun, L., Pan, J., Tang, J.: Shufflemixer: an efficient convnet for image super-resolution (2022). arXiv:2205.15175

  42. Timofte, R., Agustsson, E., Van Gool, L., Yang, M.-H., Zhang, L.: NTIRE 2017 challenge on single image super-resolution: methods and results. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 114–125 (2017)

  43. Bevilacqua, M., Roumy, A., Guillemot, C., Alberi-Morel, M.L.: Low-complexity single-image super-resolution based on nonnegative neighbor embedding (2012)

  44. Zeyde, R., Elad, M., Protter, M.: On single image scale-up using sparse-representations. In: Curves and Surfaces: 7th International Conference, Avignon, France, June 24-30, 2010, Revised Selected Papers 7, pp. 711–730 (2012)

  45. Martin, D., Fowlkes, C., Tal, D., Malik, J.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, vol. 2, pp. 416–423 (2001)

  46. Huang, J.-B., Singh, A., Ahuja, N.: Single image super-resolution from transformed self-exemplars. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5197–5206 (2015)

  47. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization (2014). arXiv:1412.6980

  48. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814 (2010)

  49. Elfwing, S., Uchibe, E., Doya, K.: Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw. 107, 3–11 (2018)

    Article  MATH  Google Scholar 

  50. Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network. In: Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part II 14, pp. 391–407 (2016)

  51. Kim, J., Lee, J.K., Lee, K.M.: Deeply-recursive convolutional network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1637–1645 (2016)

  52. Lai, W.-S., Huang, J.-B., Ahuja, N., Yang, M.-H.: Deep Laplacian pyramid networks for fast and accurate super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 624–632 (2017)

  53. Tai, Y., Yang, J., Liu, X.: Image super-resolution via deep recursive residual network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3147–3155 (2017)

  54. Tai, Y., Yang, J., Liu, X., Xu, C.: MemNet: a persistent memory network for image restoration. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4539–4547 (2017)

  55. Wang, L., Dong, X., Wang, Y., Ying, X., Lin, Z., An, W., Guo, Y.: Exploring sparsity in image super-resolution for efficient inference. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4917–4926 (2021)

Download references

Acknowledgements

This research was supported by National Key Research and Development Program of China (Grant No. 2020YFA0714003), Science and Technology Planning Project of Sichuan Province (Grant No. 2021YFQ0059), National Major Project of China (Grant No. GJXM92579).

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Zheng Li or Ning Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Miao, X., Li, S., Li, Z. et al. Multi-scale gated network for efficient image super-resolution. Vis Comput 41, 1227–1239 (2025). https://doi.org/10.1007/s00371-024-03410-6

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00371-024-03410-6

Keywords