Knowledge-Distillation-Warm-Start Training Strategy for Lightweight Super-Resolution Networks

  • Conference paper
Neural Information Processing (ICONIP 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1966)

Abstract

In recent years, research on lightweight networks has advanced rapidly in image Super-Resolution (SR). Although lightweight SR networks are computationally efficient and use few parameters, their simplified structure inevitably limits performance. To further improve lightweight networks, we propose a Knowledge-Distillation-Warm-Start (KDWS) training strategy. The strategy uses dark knowledge from traditional large-scale SR networks to further optimize lightweight networks during warm-start training, and it empirically improves the performance of lightweight models. In our experiments, we choose several traditional large-scale SR networks as teacher networks and several lightweight networks as student networks. The student network is first trained with a conventional warm-start strategy and then undergoes further warm-start training with additional supervision from the teacher network. Evaluation on common test datasets shows that the proposed training strategy improves the performance of lightweight SR networks. Furthermore, because the approach is not limited by network structure or task type, it can be adopted in the training of any deep learning network, not only for image SR tasks.
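
The two-stage idea in the abstract (plain warm-start training first, then warm-start training with extra teacher supervision) can be illustrated with a small sketch. The abstract does not give the loss or schedule, so everything below is an assumption made for illustration: the L1 pixel loss, the blending weight `alpha`, and the names `l1_loss`, `kdws_loss`, and `two_stage_schedule` are hypothetical and are not taken from the paper.

```python
from typing import Iterator, Sequence, Tuple


def l1_loss(pred: Sequence[float], ref: Sequence[float]) -> float:
    """Mean absolute error between two equally sized, flattened images."""
    if len(pred) != len(ref):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(pred)


def kdws_loss(
    student_sr: Sequence[float],
    teacher_sr: Sequence[float],
    hr: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Assumed loss form: blend the ground-truth loss with a teacher-imitation loss.

    alpha = 0 ignores the teacher; alpha = 1 uses only the teacher output.
    """
    return (1.0 - alpha) * l1_loss(student_sr, hr) + alpha * l1_loss(student_sr, teacher_sr)


def two_stage_schedule(plain_epochs: int, distill_epochs: int) -> Iterator[Tuple[int, bool]]:
    """Yield (epoch, use_teacher): plain warm-start epochs first, then teacher-supervised ones."""
    for epoch in range(plain_epochs + distill_epochs):
        yield epoch, epoch >= plain_epochs
```

In a real training loop, the `use_teacher` flag would decide whether an epoch uses the plain ground-truth loss or the blended loss; how the teacher signal is actually weighted and scheduled in KDWS is described in the full paper, not here.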

Acknowledgments

This work is supported by the Ministry of Science and Technology of China (No. G2022036009L), the Open Fund of Intelligent Terminal Key Laboratory of Sichuan Province (No. SCTLAB-2007), the Yibin Science and Technology Program (No. 2021CG003), and the Science and Technology Program of Yibin Sanjiang New Area (No. 2023SJXQYBKJJH001).

Author information

Corresponding author

Correspondence to Hui Xu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Lei, M., He, K., Xu, H., Yang, Y., Shao, J. (2024). Knowledge-Distillation-Warm-Start Training Strategy for Lightweight Super-Resolution Networks. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1966. Springer, Singapore. https://doi.org/10.1007/978-981-99-8148-9_22

  • DOI: https://doi.org/10.1007/978-981-99-8148-9_22

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8147-2

  • Online ISBN: 978-981-99-8148-9

  • eBook Packages: Computer Science, Computer Science (R0)