Exploring Image Transformations with Diffusion Models: A Survey of Applications and Implementation Code

Arellano, Silvia; Otero, Beatriz; Tous, Ruben

doi:10.1007/978-3-031-53966-4_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14506)

Included in the following conference series:

International Conference on Machine Learning, Optimization, and Data Science

Abstract

Diffusion models have become increasingly popular in recent years, and their applications span a wide range of fields. This survey focuses on their use in computer vision, especially in image transformations. It provides an overview of state-of-the-art applications of diffusion models to image transformation tasks, including inpainting, super-resolution, restoration, translation, and editing, through a selection of notable papers and repositories. The applications are presented in a practical and concise manner to facilitate understanding of the concepts behind diffusion models and how they function. The survey also includes a curated collection of GitHub repositories featuring popular examples of these subjects.

Acknowledgements

This work is partially supported by the Spanish Ministry of Science and Innovation under contracts PID2019-107255GB and PID2021-124463OB-IOO, and by the Generalitat de Catalunya under grants 2021-SGR-00478 and 2021-SGR-00326. The research leading to these results has also received funding from the European Union’s Horizon 2020 research and innovation programme under the HORIZON-EU VITAMIN-V (101093062) project.

Author information

Authors and Affiliations

Universitat Politècnica de Catalunya, Barcelona, Spain
Silvia Arellano, Beatriz Otero & Ruben Tous

Authors

Silvia Arellano
Beatriz Otero
Ruben Tous

Corresponding author

Correspondence to Beatriz Otero.

Editor information

Editors and Affiliations

University of Catania, Catania, Italy
Giuseppe Nicosia
Newcastle University, Newcastle upon Tyne, UK
Varun Ojha
University of Oxford, Oxford, UK
Emanuele La Malfa
University of Cambridge, Cambridge, UK
Gabriele La Malfa
University of Florida, Gainesville, FL, USA
Panos M. Pardalos
Dana-Farber Cancer Institute, Boston, MA, USA
Renato Umeton

About this paper

Cite this paper

Arellano, S., Otero, B., Tous, R. (2024). Exploring Image Transformations with Diffusion Models: A Survey of Applications and Implementation Code. In: Nicosia, G., Ojha, V., La Malfa, E., La Malfa, G., Pardalos, P.M., Umeton, R. (eds) Machine Learning, Optimization, and Data Science. LOD 2023. Lecture Notes in Computer Science, vol 14506. Springer, Cham. https://doi.org/10.1007/978-3-031-53966-4_2

DOI: https://doi.org/10.1007/978-3-031-53966-4_2
Published: 15 February 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53965-7
Online ISBN: 978-3-031-53966-4
eBook Packages: Computer Science, Computer Science (R0)
