Accelerating Score-Based Generative Models with Preconditioned Diffusion Sampling

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13683)

Abstract

Score-based generative models (SGMs) have recently emerged as a promising class of generative models. However, a fundamental limitation is that their inference is very slow, requiring many (e.g., 2000) sequential iterations. An intuitive acceleration method is to reduce the number of sampling iterations, but this causes severe performance degradation. We investigate this problem by viewing the diffusion sampling process as a Metropolis-adjusted Langevin algorithm, which reveals that the underlying cause is ill-conditioned curvature. Based on this insight, we propose a model-agnostic preconditioned diffusion sampling (PDS) method that leverages matrix preconditioning to alleviate this problem. Crucially, PDS is theoretically proven to converge to the original target distribution of an SGM, with no need for retraining. Extensive experiments on three image datasets spanning a variety of resolutions and diversity validate that PDS consistently accelerates off-the-shelf SGMs whilst maintaining synthesis quality. In particular, PDS achieves up to \(29\times \) acceleration on the more challenging high-resolution (1024\(\times \)1024) image generation task.
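
To make the preconditioning idea concrete, the following is a minimal sketch, not the authors' released implementation, of a single preconditioned Langevin update. The names `score_fn` and `precond_sqrt` are illustrative assumptions: `score_fn` stands for a trained score network estimating \(\nabla_{\textbf{x}} \log p(\textbf{x})\), and `precond_sqrt` applies the square root of a fixed symmetric positive-definite preconditioner \(M\). A classical property of preconditioned Langevin dynamics is that a constant \(M\) rescales both the drift and the noise without changing the stationary distribution, which is consistent with PDS requiring no retraining.

    import math
    import torch

    def preconditioned_langevin_step(x, score_fn, precond_sqrt, step_size):
        # One preconditioned Langevin update (illustrative sketch only).
        #   x            : current samples, e.g. shape (B, C, H, W)
        #   score_fn     : trained network estimating the score grad_x log p(x)
        #   precond_sqrt : callable applying M^{1/2} for a fixed symmetric
        #                  positive-definite preconditioner M (assumed name)
        #   step_size    : Langevin step size epsilon
        score = score_fn(x)
        # The drift is scaled by M = M^{1/2} M^{1/2} and the noise by M^{1/2},
        # so a constant M leaves the stationary distribution unchanged.
        drift = 0.5 * step_size * precond_sqrt(precond_sqrt(score))
        noise = math.sqrt(step_size) * precond_sqrt(torch.randn_like(x))
        return x + drift + noise

In the paper's setting, \(M\) would be chosen to compensate for the ill-conditioned curvature identified through the MALA view; the sketch deliberately leaves its construction abstract.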

L. Zhang—School of Data Science, Fudan University.

H. Ma and J. Feng—Institute of Science and Technology for Brain-inspired Intelligence, Fudan University.

X. Zhu—Surrey Institute for People-Centred Artificial Intelligence, CVSSP, University of Surrey.

Notes

  1. Further theoretical explanation of why the frequency domain of a diffusion process can be regulated directly is provided in the supplementary material (an illustrative frequency-space operator is sketched after these notes).

  2. For NCSN++ [33], we use \(\nabla_{\textbf{x}} \log p_t(\textbf{x})\), where \(p_t\) is the distribution of \(\textbf{x}\) at time \(t\), since \(\nabla_{\textbf{x}} \log p^{*}(\textbf{x})\) is inaccessible in NCSN++.
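
To illustrate what "directly regulating the frequency domain" in Note 1 could look like in code, here is a hypothetical sketch of a diagonal operator in the 2D Fourier domain, applied with FFTs. The tensor `freq_filter` is an assumed placeholder for whatever per-frequency weights the method prescribes, not the paper's actual filter.

    import torch

    def apply_frequency_preconditioner(x, freq_filter):
        # Apply a diagonal operator in the 2D Fourier domain (illustrative only).
        #   x           : real tensor of shape (B, C, H, W)
        #   freq_filter : assumed real, non-negative tensor of shape (H, W)
        #                 giving per-frequency weights (hypothetical values)
        spectrum = torch.fft.fft2(x)             # pixel space -> frequency space
        spectrum = spectrum * freq_filter        # rescale each frequency band
        return torch.fft.ifft2(spectrum).real   # back to pixel space

An operator of this form (using the elementwise square root of the filter) could play the role of `precond_sqrt` in the Langevin sketch given after the abstract, since a diagonal operator in frequency space is symmetric positive semi-definite.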

References

  1. Bao, F., Li, C., Zhu, J., Zhang, B.: Analytic-DPM: an analytic estimate of the optimal reverse variance in diffusion probabilistic models. In: ICLR (2022)

  2. Bovik, A.C.: The essential guide to image processing (2009)

  3. Brigham, E.O.: The fast Fourier transform and its applications (1988)

  4. Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis. In: ICLR (2019)

  5. De Bortoli, V., Thornton, J., Heng, J., Doucet, A.: Diffusion Schrödinger bridge with applications to score-based generative modeling. In: NeurIPS (2021)

  6. Dhariwal, P., Nichol, A.: Diffusion models beat GANs on image synthesis. In: NeurIPS (2021)

  7. Dockhorn, T., Vahdat, A., Kreis, K.: Score-based generative modeling with critically-damped Langevin diffusion. In: ICLR (2022)

  8. Gardiner, C.W., et al.: Handbook of stochastic methods (1985)

  9. Girolami, M., Calderhead, B.: Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.) (2011)

  10. Goodfellow, I., et al.: Generative adversarial nets. In: NeurIPS (2014)

  11. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: NeurIPS (2017)

  12. Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: NeurIPS (2020)

  13. Ho, J., Saharia, C., Chan, W., Fleet, D.J., Norouzi, M., Salimans, T.: Cascaded diffusion models for high fidelity image generation. arXiv preprint (2021)

  14. Hwang, C.R., Hwang-Ma, S.Y., Sheu, S.J.: Accelerating diffusions. Ann. Appl. Probab. (2005)

  15. Hyvärinen, A., Dayan, P.: Estimation of non-normalized statistical models by score matching. JMLR 6, 695–709 (2005)

  16. Jolicoeur-Martineau, A., Li, K., Piché-Taillefer, R., Kachman, T., Mitliagkas, I.: Gotta go fast when generating data with score-based models. arXiv preprint (2021)

  17. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. In: ICLR (2018)

  18. Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: CVPR (2019)

  19. Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)

  20. Lelievre, T., Nier, F., Pavliotis, G.A.: Optimal non-reversible linear drift for the convergence to equilibrium of a diffusion. J. Stat. Phys. 152, 237–274 (2013)

  21. Li, C., Chen, C., Carlson, D., Carin, L.: Preconditioned stochastic gradient Langevin dynamics for deep neural networks. In: AAAI (2016)

  22. Neal, R.M., et al.: MCMC using Hamiltonian dynamics. In: Handbook of Markov Chain Monte Carlo (2011)

  23. Nichol, A., et al.: GLIDE: towards photorealistic image generation and editing with text-guided diffusion models. arXiv preprint (2021)

  24. Nichol, A.Q., Dhariwal, P.: Improved denoising diffusion probabilistic models. In: ICML (2021)

  25. Ottobre, M.: Markov chain Monte Carlo and irreversibility. Rep. Math. Phys. 77, 267–292 (2016)

  26. Rey-Bellet, L., Spiliopoulos, K.: Irreversible Langevin samplers and variance reduction: a large deviations approach. Nonlinearity 28, 2081 (2015)

  27. Roberts, G.O., Stramer, O.: Langevin diffusions and Metropolis-Hastings algorithms. Methodol. Comput. Appl. Probab. 4, 337–357 (2002)

  28. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., Ganguli, S.: Deep unsupervised learning using nonequilibrium thermodynamics. In: ICML (2015)

  29. Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. In: ICLR (2021)

  30. Song, Y., Durkan, C., Murray, I., Ermon, S.: Maximum likelihood training of score-based diffusion models. In: NeurIPS (2021)

  31. Song, Y., Ermon, S.: Generative modeling by estimating gradients of the data distribution. In: NeurIPS (2019)

  32. Song, Y., Ermon, S.: Improved techniques for training score-based generative models. In: NeurIPS (2020)

  33. Song, Y., Sohl-Dickstein, J., Kingma, D.P., Kumar, A., Ermon, S., Poole, B.: Score-based generative modeling through stochastic differential equations. In: ICLR (2021)

  34. Vahdat, A., Kreis, K., Kautz, J.: Score-based generative modeling in latent space. In: NeurIPS (2021)

  35. Welling, M., Teh, Y.W.: Bayesian learning via stochastic gradient Langevin dynamics. In: ICML (2011)

  36. Xiao, Z., Kreis, K., Vahdat, A.: Tackling the generative learning trilemma with denoising diffusion GANs. In: ICLR (2022)

  37. Yu, F., Seff, A., Zhang, Y., Song, S., Funkhouser, T., Xiao, J.: LSUN: construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint (2015)

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Grant No. 6210020439), Lingang Laboratory (Grant No. LG-QS-202202-07), the Natural Science Foundation of Shanghai (Grant No. 22ZR1407500), the Shanghai Municipal Science and Technology Major Project (Grant Nos. 2018SHZDZX01 and 2021SHZDZX0103), and the Science and Technology Innovation 2030 - Brain Science and Brain-Inspired Intelligence Project (Grant No. 2021ZD0200204).

Author information

Corresponding author

Correspondence to Li Zhang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 19411 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ma, H., Zhang, L., Zhu, X., Feng, J. (2022). Accelerating Score-Based Generative Models with Preconditioned Diffusion Sampling. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13683. Springer, Cham. https://doi.org/10.1007/978-3-031-20050-2_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-20050-2_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20049-6

  • Online ISBN: 978-3-031-20050-2

  • eBook Packages: Computer Science (R0)
