Abstract
In this paper, we propose ProCreate, a simple and easy-to-implement method to improve the sample diversity and creativity of diffusion-based image generative models and to prevent training data reproduction. ProCreate operates on a set of reference images and actively propels the generated image embedding away from the reference embeddings during the generation process. We propose FSCG-8 (Few-Shot Creative Generation 8), a few-shot creative generation dataset spanning eight categories that cover different concepts, styles, and settings, on which ProCreate achieves the highest sample diversity and fidelity. Furthermore, we show that ProCreate is effective at preventing the replication of training data in a large-scale evaluation using training text prompts. Code and FSCG-8 are available at https://github.com/Agentic-Learning-AI-Lab/procreate-diffusion-public.
Project Webpage: https://procreate-diffusion.github.io.
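To make the mechanism described in the abstract concrete, the sketch below shows one way a propulsive guidance term could be wired into a DDIM-style sampling loop: at each step, the predicted clean image is embedded, a similarity energy to the reference embeddings is computed, and the sample is nudged against that energy's gradient. This is a minimal illustrative sketch, not the authors' released implementation; the `denoiser`, `image_encoder`, function names, and the specific cosine-similarity energy are all assumptions made for this example (see the linked repository for the actual ProCreate code).

```python
# Hypothetical sketch of propulsive guidance in a diffusion sampling loop.
# `denoiser` and `image_encoder` are stand-in callables, not ProCreate's code.
import torch
import torch.nn.functional as F


@torch.no_grad()
def embed_references(image_encoder, reference_images):
    """Embed the reference set once, before sampling. Returns (N, D)."""
    return F.normalize(image_encoder(reference_images), dim=-1)


def propulsive_grad(image_encoder, x0_pred, ref_embeds, scale):
    """Gradient step that pushes the predicted clean image away from references.

    Energy is the summed cosine similarity to the reference embeddings;
    stepping against its gradient moves the sample away from them.
    """
    with torch.enable_grad():
        x0 = x0_pred.detach().requires_grad_(True)
        emb = F.normalize(image_encoder(x0), dim=-1)   # (B, D)
        energy = (emb @ ref_embeds.T).sum()            # similarity energy
        grad = torch.autograd.grad(energy, x0)[0]
    return -scale * grad                               # move away from references


def sample_with_propulsion(denoiser, image_encoder, ref_embeds,
                           x_T, alphas_cumprod, guidance_scale=1.0):
    """Deterministic DDIM-style loop with an extra propulsive term on the x0 estimate.

    `alphas_cumprod` is a 1-D tensor of cumulative noise-schedule products.
    """
    x_t = x_T
    for t in range(len(alphas_cumprod) - 1, 0, -1):
        a_t, a_prev = alphas_cumprod[t], alphas_cumprod[t - 1]
        eps = denoiser(x_t, t)                                  # predicted noise
        x0_pred = (x_t - (1 - a_t).sqrt() * eps) / a_t.sqrt()   # predicted clean image
        x0_pred = x0_pred + propulsive_grad(image_encoder, x0_pred,
                                            ref_embeds, guidance_scale)
        x_t = a_prev.sqrt() * x0_pred + (1 - a_prev).sqrt() * eps  # DDIM update
    return x_t
```

In this sketch the guidance scale trades off fidelity against distance from the reference set; the released implementation and paper should be consulted for the actual energy formulation and schedule.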
Acknowledgement
We thank Zhun Deng and members of the NYU Agentic Learning AI Lab for their helpful discussions. The compute was supported by the NYU High-Performance Computing resources, services, and staff expertise.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lu, J., Teehan, R., Ren, M. (2025). ProCreate, Don’t Reproduce! Propulsive Energy Diffusion for Creative Generation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15118. Springer, Cham. https://doi.org/10.1007/978-3-031-73027-6_23
Print ISBN: 978-3-031-73026-9
Online ISBN: 978-3-031-73027-6