Abstract
We present Flow-Guided Density Ratio Learning (FDRL), a simple and scalable approach to generative modeling that builds on the stale (time-independent) approximation of the gradient flow of entropy-regularized f-divergences introduced in recent work. Specifically, the intractable time-dependent density ratio is approximated by a stale estimator given by a GAN discriminator. This approximation is sufficient for sample refinement, where the source and target distributions of the flow are close to each other. However, the assumption breaks down for generation, and a naive application of the stale estimator fails because of the large chasm between the two distributions. FDRL addresses this by training the density ratio estimator on the progressively improving samples produced during the training process itself. We show that this simple scheme alleviates the density chasm problem, allowing FDRL to generate images at resolutions as high as \(128\times 128\) and to outperform existing gradient flow baselines on quantitative benchmarks. We also demonstrate the flexibility of FDRL with two use cases. First, unconditional FDRL can be easily composed with external classifiers to perform class-conditional generation. Second, FDRL can be applied directly to unpaired image-to-image translation with no modifications to the framework. Our code and relevant supplementary material are available at https://github.com/clear-nus/fdrl.
A. F. Ansari—Work done while at the National University of Singapore, prior to joining Amazon.
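To make the mechanism described in the abstract concrete, the following is a minimal, self-contained PyTorch sketch of the idea on 2-D toy data. It is an illustration under stated assumptions, not the authors' released implementation: the network architecture, hyperparameters, and helper names such as `flow_refine` are all hypothetical. A discriminator serves as the stale density-ratio estimator; particles drawn from the prior are transported along its gradient (with a small noise term standing in for the entropy regularization), and the discriminator is then refit against these progressively improving samples.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    # Stale (time-independent) density-ratio estimator: its output plays the
    # role of an estimate of log(p_data(x) / p_x(x)).
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 256), nn.SiLU(),
            nn.Linear(256, 256), nn.SiLU(),
            nn.Linear(256, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def flow_refine(d, x, n_steps=10, step_size=0.1, noise_scale=0.01):
    # Transport source samples along the stale estimate of the gradient flow:
    # ascend the discriminator's score, with small Gaussian noise standing in
    # for the entropy-regularization term.
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        (grad,) = torch.autograd.grad(d(x).sum(), x)
        x = x + step_size * grad + noise_scale * torch.randn_like(x)
    return x.detach()

dim, batch = 2, 128
d = Discriminator(dim)
opt = torch.optim.Adam(d.parameters(), lr=1e-4)
for _ in range(1000):
    real = torch.randn(batch, dim) + 5.0            # stand-in for data samples
    fake = flow_refine(d, torch.randn(batch, dim))  # flow prior noise toward the data
    # Standard GAN-style density-ratio objective; because `fake` improves as
    # training proceeds, the estimator learns from progressively better samples.
    loss = F.binary_cross_entropy_with_logits(d(real), torch.ones(batch)) \
         + F.binary_cross_entropy_with_logits(d(fake), torch.zeros(batch))
    opt.zero_grad()
    loss.backward()
    opt.step()

Note how, unlike in a standard GAN, there is no generator network: samples are produced purely by flowing prior particles through the discriminator's gradient field, which is what allows the same estimator to be composed with external classifiers or applied to image-to-image translation.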
Acknowledgments
This research is supported by the National Research Foundation Singapore and DSO National Laboratories under the AI Singapore Programme (AISG Award No: AISG2-RP-2020-016).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Heng, A., Ansari, A.F., Soh, H. (2024). Generative Modeling with Flow-Guided Density Ratio Learning. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14942. Springer, Cham. https://doi.org/10.1007/978-3-031-70344-7_15
DOI: https://doi.org/10.1007/978-3-031-70344-7_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70343-0
Online ISBN: 978-3-031-70344-7
eBook Packages: Computer Science (R0)