Abstract
Diffusion models (DMs) have achieved impressive results on low-level vision tasks, and recent studies attempt to design efficient diffusion models for image super-resolution (SR). However, they have mainly focused on reducing the number of parameters and FLOPs through various network designs. Although these methods can decrease the number of parameters and floating-point operations, they do not necessarily reduce actual running time. To make DM inference faster on limited computational resources while retaining quality and flexibility, we propose a Reparameterized Lightweight Diffusion Model SR network (RDSR), which consists of a Latent Prior Encoder (LPE), a Reparameterized Decoder (RepD), and a diffusion model conditioned on degraded images. Specifically, we first pretrain the LPE, which takes paired HR and LR patches as input and maps them from pixel space to latent space. RepD has a VGG-like inference-time body composed of nothing but a stack of 3\(\times \)3 convolutions and ReLU, while the training-time model has a multi-branch topology. Our diffusion model serves as a bridge between the LPE and RepD: the LPE supervises the reverse diffusion process with a distillation loss, and the output of the reverse diffusion process acts as a modulator that guides RepD to reconstruct high-quality results. RDSR effectively reduces GPU memory consumption and improves inference speed. Extensive experiments on SR benchmarks demonstrate the superiority of RDSR over state-of-the-art DM-based methods, e.g., RDSR-2.2M achieves 30.11 dB PSNR on the DIV2K100 dataset, surpassing DM-based models of the same order, while trading off parameters, efficiency, and accuracy well: running \({{\boldsymbol{55.8}}} \times \uparrow \) faster than DiffIR on an Intel(R) Xeon(R) Platinum 8255C CPU.
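The VGG-like inference body of RepD relies on RepVGG-style structural reparameterization (Ding et al.): at training time a 3\(\times \)3 branch, a 1\(\times \)1 branch, and an identity branch run in parallel, and at inference time they are folded into one equivalent 3\(\times \)3 kernel by linearity of convolution. The sketch below is an illustration of that folding only, not the paper's code; all names and shapes are our assumptions.

```python
import numpy as np

def conv2d(x, k):
    """Naive stride-1 convolution with padding 1.
    x: (C_in, H, W), k: (C_out, C_in, 3, 3)."""
    c_out, c_in, _, _ = k.shape
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, w))
    for o in range(c_out):
        for i in range(c_in):
            for di in range(3):
                for dj in range(3):
                    out[o] += k[o, i, di, dj] * xp[i, di:di + h, dj:dj + w]
    return out

def reparameterize(k3, k1):
    """Fold 3x3, 1x1, and identity branches into one 3x3 kernel."""
    merged = k3.copy()
    # the 1x1 kernel acts only at the centre tap of a 3x3 kernel
    merged[:, :, 1, 1] += k1[:, :, 0, 0]
    # identity branch: a 3x3 kernel with 1 at the centre, per channel
    for c in range(k3.shape[0]):
        merged[c, c, 1, 1] += 1.0
    return merged

rng = np.random.default_rng(0)
c = 2
k3 = rng.normal(size=(c, c, 3, 3))
k1 = rng.normal(size=(c, c, 1, 1))
x = rng.normal(size=(c, 5, 5))

# training-time multi-branch output: 3x3 conv + 1x1 conv + identity
multi_branch = conv2d(x, k3) + np.einsum('oi,ihw->ohw', k1[:, :, 0, 0], x) + x
# inference-time single-branch output with the merged kernel
merged = conv2d(x, reparameterize(k3, k1))
assert np.allclose(multi_branch, merged)
```

Because each branch is linear in the input, the fold is exact, so the single-branch inference body computes the same function as the multi-branch training body while running as a plain stack of 3\(\times \)3 convolutions.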
References
Agustsson, E., Timofte, R.: NTIRE 2017 challenge on single image super-resolution: dataset and study. In: CVPRW, pp. 1122–1131 (2017). https://doi.org/10.1109/CVPRW.2017.150
Chen, H., et al.: Pre-trained image processing transformer. In: CVPR (2021)
Chen, X., Wang, X., Zhou, J., Dong, C.: Activating more pixels in image super-resolution transformer. arXiv:2205.04437 (2022)
Chen, Y., Tai, Y., Liu, X., Shen, C., Yang, J.: FSRNet: end-to-end learning face super-resolution with facial priors. In: CVPR, pp. 2492–2501 (2018)
Dhariwal, P., Nichol, A.: Diffusion models beat GANs on image synthesis. NeurIPS (2021)
Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., Sun, J.: RepVGG: making VGG-style ConvNets great again. arXiv:2101.03697 (2021)
Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional networks. arXiv:1501.00092 (2015)
Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network. In: ECCV (2016)
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. NeurIPS (2020)
Ho, J., Saharia, C., Chan, W., Fleet, D.J., Norouzi, M., Salimans, T.: Cascaded diffusion models for high fidelity image generation. JMLR (2022)
Huang, J.B., Singh, A., Ahuja, N.: Single image super-resolution from transformed self-exemplars. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2015)
Kim, J., Kwon Lee, J., Mu Lee, K.: Accurate image super-resolution using very deep convolutional networks. In: CVPR (2016)
Kingma, D., Salimans, T., Poole, B., Ho, J.: Variational diffusion models. NeurIPS (2021)
Ledig, C., et al.: Photo-realistic single image super-resolution using a generative adversarial network. In: CVPR (2017)
Li, H., et al.: SRDiff: single image super-resolution with diffusion probabilistic models. Neurocomputing (2022)
Li, W., Lu, X., Qian, S., Lu, J., Zhang, X., Jia, J.: On efficient transformer and image pre-training for low-level vision. arXiv:2112.10175 (2021)
Li, W., Zhou, K., Qi, L., Lu, L., Lu, J.: Best-buddy GANs for highly detailed image super-resolution. In: AAAI (2022)
Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., Timofte, R.: SwinIR: image restoration using Swin Transformer. In: ICCVW (2021)
Lim, B., Son, S., Kim, H., Nah, S., Mu Lee, K.: Enhanced deep residual networks for single image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 136–144 (2017)
Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: ICCV (2021)
Lu, Z., Li, J., Liu, H., Huang, C., Zhang, L., Zeng, T.: Transformer for single image super-resolution. In: CVPR Workshops (2022)
Lugmayr, A., Danelljan, M., Romero, A., Yu, F., Timofte, R., Van Gool, L.: RePaint: inpainting using denoising diffusion probabilistic models. In: CVPR (2022)
Ma, C., Rao, Y., Cheng, Y., Chen, C., Lu, J., Zhou, J.: Structure-preserving super resolution with gradient guidance. In: CVPR (2020)
Matsui, Y., Ito, K., Aramaki, Y., Fujimoto, A., Ogawa, T., Yamasaki, T., Aizawa, K.: Sketch-based manga retrieval using Manga109 dataset. Multimed. Tools Appl. 76, 21811–21838 (2017)
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: CVPR (2022)
Saharia, C., et al.: Palette: image-to-image diffusion models. In: ACM SIGGRAPH (2022)
Saharia, C., Ho, J., Chan, W., Salimans, T., Fleet, D.J., Norouzi, M.: Image super-resolution via iterative refinement. TPAMI (2022)
Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., Ganguli, S.: Deep unsupervised learning using nonequilibrium thermodynamics. In: ICML (2015)
Song, Y., Sohl-Dickstein, J., Kingma, D.P., Kumar, A., Ermon, S., Poole, B.: Score-based generative modeling through stochastic differential equations. ICLR (2021)
Tai, Y., Yang, J., Liu, X., Xu, C.: MemNet: a persistent memory network for image restoration. In: ICCV (2017)
Wang, X., Xie, L., Dong, C., Shan, Y.: Real-ESRGAN: training real-world blind super-resolution with pure synthetic data. In: ICCV (2021)
Wang, X., Yu, K., Dong, C., Loy, C.C.: Recovering realistic texture in image super-resolution by deep spatial feature transform. In: CVPR (2018)
Wang, X., et al.: ESRGAN: enhanced super-resolution generative adversarial networks. In: ECCVW (2018)
Xia, B., et al.: DiffIR: efficient diffusion model for image restoration. In: ICCV (2023)
Zeyde, R., Elad, M., Protter, M.: On single image scale-up using sparse-representations. In: Curves and Surfaces: 7th International Conference, Avignon, France, June 24-30, 2010, Revised Selected Papers 7, pp. 711–730. Springer (2012)
Zhang, K., Gool, L.V., Timofte, R.: Deep unfolding network for image super-resolution. In: CVPR (2020)
Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: ECCV (2018)
Zhou, S., Zhang, J., Zuo, W., Loy, C.C.: Cross-scale internal graph neural network for image super-resolution. In: Advances in Neural Information Processing Systems (2020)
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Sun, O., Long, J., Huang, W., Yang, Z., Li, C. (2025). RDSR: Reparameterized Lightweight Diffusion Model for Image Super-Resolution. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15038. Springer, Singapore. https://doi.org/10.1007/978-981-97-8685-5_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-8684-8
Online ISBN: 978-981-97-8685-5