Abstract
Diffusion models (DMs) have achieved impressive results on low-level vision tasks, and recent studies attempt to design efficient diffusion models for image super-resolution (SR). However, they have mainly focused on reducing the number of parameters and FLOPs through various network designs. Although these methods can decrease the number of parameters and floating-point operations, they do not necessarily reduce actual running time. To make DM inference faster on limited computational resources while retaining quality and flexibility, we propose a Reparameterized Lightweight Diffusion Model SR network (RDSR), which consists of a Latent Prior Encoder (LPE), a Reparameterized Decoder (RepD), and a diffusion model conditioned on degraded images. Specifically, we first pretrain the LPE, which takes paired HR and LR patches as input and maps them from pixel space to latent space. RepD has a VGG-like inference-time body composed of nothing but a stack of 3\(\times \)3 convolutions and ReLU, while the training-time model has a multi-branch topology. Our diffusion model serves as a bridge between the LPE and RepD: the LPE supervises the reverse diffusion process with a distillation loss, and the output of the reverse diffusion process acts as a modulator that guides RepD to reconstruct high-quality results. RDSR effectively reduces GPU memory consumption and improves inference speed. Extensive experiments on SR benchmarks demonstrate the superiority of RDSR over state-of-the-art DM-based methods, e.g., RDSR-2.2M achieves 30.11 dB PSNR on the DIV2K100 dataset, surpassing DM-based models of the same order, while trading off parameters, efficiency, and accuracy well: running \({{\boldsymbol{55.8}}} \times \uparrow \) faster than DiffIR on an Intel(R) Xeon(R) Platinum 8255C CPU.
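The VGG-like inference body of RepD relies on RepVGG-style structural reparameterization (Ding et al.): at training time a 3\(\times \)3 branch, a 1\(\times \)1 branch, and an identity branch run in parallel, and at inference time they are folded into one equivalent 3\(\times \)3 kernel by linearity of convolution. The sketch below is an illustration of that folding only, not the paper's code; all names and shapes are our assumptions.

```python
import numpy as np

def conv2d(x, k):
    """Naive stride-1 convolution with padding 1.
    x: (C_in, H, W), k: (C_out, C_in, 3, 3)."""
    c_out, c_in, _, _ = k.shape
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, w))
    for o in range(c_out):
        for i in range(c_in):
            for di in range(3):
                for dj in range(3):
                    out[o] += k[o, i, di, dj] * xp[i, di:di + h, dj:dj + w]
    return out

def reparameterize(k3, k1):
    """Fold 3x3, 1x1, and identity branches into one 3x3 kernel."""
    merged = k3.copy()
    # the 1x1 kernel acts only at the centre tap of a 3x3 kernel
    merged[:, :, 1, 1] += k1[:, :, 0, 0]
    # identity branch: a 3x3 kernel with 1 at the centre, per channel
    for c in range(k3.shape[0]):
        merged[c, c, 1, 1] += 1.0
    return merged

rng = np.random.default_rng(0)
c = 2
k3 = rng.normal(size=(c, c, 3, 3))
k1 = rng.normal(size=(c, c, 1, 1))
x = rng.normal(size=(c, 5, 5))

# training-time multi-branch output: 3x3 conv + 1x1 conv + identity
multi_branch = conv2d(x, k3) + np.einsum('oi,ihw->ohw', k1[:, :, 0, 0], x) + x
# inference-time single-branch output with the merged kernel
merged = conv2d(x, reparameterize(k3, k1))
assert np.allclose(multi_branch, merged)
```

Because each branch is linear in the input, the fold is exact, so the single-branch inference body computes the same function as the multi-branch training body while running as a plain stack of 3\(\times \)3 convolutions.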
References
Agustsson, E., Timofte, R.: NTIRE 2017 challenge on single image super-resolution: dataset and study. In: CVPRW, pp. 1122–1131 (2017). https://doi.org/10.1109/CVPRW.2017.150
Chen, H., et al.: Pre-trained image processing transformer. In: CVPR (2021)
Chen, X., Wang, X., Zhou, J., Dong, C.: Activating more pixels in image super-resolution transformer. arXiv:2205.04437 (2022)
Chen, Y., Tai, Y., Liu, X., Shen, C., Yang, J.: FSRNet: end-to-end learning face super-resolution with facial priors. In: CVPR, pp. 2492–2501 (2018)
Dhariwal, P., Nichol, A.: Diffusion models beat GANs on image synthesis. NeurIPS (2021)
Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., Sun, J.: RepVGG: making VGG-style ConvNets great again. arXiv:2101.03697 (2021)
Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional networks. arXiv:1501.00092 (2015)
Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network. In: ECCV (2016)
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. NeurIPS (2020)
Ho, J., Saharia, C., Chan, W., Fleet, D.J., Norouzi, M., Salimans, T.: Cascaded diffusion models for high fidelity image generation. JMLR (2022)
Huang, J.B., Singh, A., Ahuja, N.: Single image super-resolution from transformed self-exemplars. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2015)
Kim, J., Kwon Lee, J., Mu Lee, K.: Accurate image super-resolution using very deep convolutional networks. In: CVPR (2016)
Kingma, D., Salimans, T., Poole, B., Ho, J.: Variational diffusion models. NeurIPS (2021)
Ledig, C., et al.: Photo-realistic single image super-resolution using a generative adversarial network. In: CVPR (2017)
Li, H., et al.: SRDiff: single image super-resolution with diffusion probabilistic models. Neurocomputing (2022)
Li, W., Lu, X., Qian, S., Lu, J., Zhang, X., Jia, J.: On efficient transformer and image pre-training for low-level vision. arXiv:2112.10175 (2021)
Li, W., Zhou, K., Qi, L., Lu, L., Lu, J.: Best-buddy GANs for highly detailed image super-resolution. In: AAAI (2022)
Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., Timofte, R.: SwinIR: image restoration using Swin Transformer. In: ICCVW (2021)
Lim, B., Son, S., Kim, H., Nah, S., Mu Lee, K.: Enhanced deep residual networks for single image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 136–144 (2017)
Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: ICCV (2021)
Lu, Z., Li, J., Liu, H., Huang, C., Zhang, L., Zeng, T.: Transformer for single image super-resolution. In: CVPR Workshops (2022)
Lugmayr, A., Danelljan, M., Romero, A., Yu, F., Timofte, R., Van Gool, L.: RePaint: inpainting using denoising diffusion probabilistic models. In: CVPR (2022)
Ma, C., Rao, Y., Cheng, Y., Chen, C., Lu, J., Zhou, J.: Structure-preserving super resolution with gradient guidance. In: CVPR (2020)
Matsui, Y., Ito, K., Aramaki, Y., Fujimoto, A., Ogawa, T., Yamasaki, T., Aizawa, K.: Sketch-based manga retrieval using Manga109 dataset. Multimed. Tools Appl. 76, 21811–21838 (2017)
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: CVPR (2022)
Saharia, C., et al.: Palette: image-to-image diffusion models. In: ACM SIGGRAPH (2022)
Saharia, C., Ho, J., Chan, W., Salimans, T., Fleet, D.J., Norouzi, M.: Image super-resolution via iterative refinement. TPAMI (2022)
Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., Ganguli, S.: Deep unsupervised learning using nonequilibrium thermodynamics. In: ICML (2015)
Song, Y., Sohl-Dickstein, J., Kingma, D.P., Kumar, A., Ermon, S., Poole, B.: Score-based generative modeling through stochastic differential equations. ICLR (2021)
Tai, Y., Yang, J., Liu, X., Xu, C.: MemNet: a persistent memory network for image restoration. In: ICCV (2017)
Wang, X., Xie, L., Dong, C., Shan, Y.: Real-ESRGAN: training real-world blind super-resolution with pure synthetic data. In: ICCV (2021)
Wang, X., Yu, K., Dong, C., Loy, C.C.: Recovering realistic texture in image super-resolution by deep spatial feature transform. In: CVPR (2018)
Wang, X., et al.: ESRGAN: enhanced super-resolution generative adversarial networks. In: ECCVW (2018)
Xia, B., et al.: DiffIR: efficient diffusion model for image restoration. In: ICCV (2023)
Zeyde, R., Elad, M., Protter, M.: On single image scale-up using sparse-representations. In: Curves and Surfaces: 7th International Conference, Avignon, France, June 24-30, 2010, Revised Selected Papers 7, pp. 711–730. Springer (2012)
Zhang, K., Gool, L.V., Timofte, R.: Deep unfolding network for image super-resolution. In: CVPR (2020)
Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: ECCV (2018)
Zhou, S., Zhang, J., Zuo, W., Loy, C.C.: Cross-scale internal graph neural network for image super-resolution. In: Advances in Neural Information Processing Systems (2020)
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Sun, O., Long, J., Huang, W., Yang, Z., Li, C. (2025). RDSR: Reparameterized Lightweight Diffusion Model for Image Super-Resolution. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15038. Springer, Singapore. https://doi.org/10.1007/978-981-97-8685-5_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-8684-8
Online ISBN: 978-981-97-8685-5