Abstract
Diffusion models have become increasingly popular in recent years, and their applications span a wide range of fields. This survey focuses on the use of diffusion models in computer vision, specifically in image transformations. Its objective is to provide an overview of state-of-the-art applications of diffusion models to image transformations, including image inpainting, super-resolution, restoration, translation, and editing. It presents a selection of notable papers and repositories that include practical applications of diffusion models for image transformations. The applications are presented in a practical and concise manner to facilitate understanding of the concepts behind diffusion models and how they function. Additionally, the survey includes a curated collection of GitHub repositories featuring popular examples of these subjects.
References
Yang, B., et al.: Paint by Example: Exemplar-Based Image Editing with Diffusion Models. GitHub repository. https://github.com/Fantasy-Studio/Paint-by-Example. Accessed 20 Apr 2023
Mackay, D.: Dallin Mackay’s Hugging Face repository. https://huggingface.co/dallinmackay. Accessed 22 Apr 2023
Ho, J., Jain, A., Abbeel, P.: Denoising Diffusion Probabilistic Models (2020). https://doi.org/10.48550/arXiv.2006.11239
Ho, J., Salimans, T.: Classifier-free diffusion guidance. arXiv preprint arXiv:2207.12598 (2022)
Pinkney, J.: Implementation of Imagic: Text-Based Real Image Editing with Diffusion Models using Stable Diffusion. https://github.com/justinpinkney/stable-diffusion/blob/main/notebooks/imagic.ipynb. Accessed 22 Apr 2023
Pinkney, J.: Text to Pokemon Generator. https://www.justinpinkney.com/pokemon-generator/. Accessed 22 Apr 2023
Kawar, B., et al.: Imagic: text-based real image editing with diffusion models. In: Conference on Computer Vision and Pattern Recognition 2023 (2023). https://doi.org/10.48550/arXiv.2210.09276
Jiang, L.: Image Super-Resolution via Iterative Refinement. GitHub repository. https://github.com/Janspiry/Image-Super-Resolution-via-Iterative-Refinement. Accessed 20 Apr 2023
Jiang, L., Belousov, Y.: Palette: Image-to-Image Diffusion Models. GitHub repository. https://github.com/Janspiry/Palette-Image-to-Image-Diffusion-Models. Accessed 22 Apr 2023
Lugmayr, A., Danelljan, M., Romero, A., Yu, F., Timofte, R., Van Gool, L.: Repaint: inpainting using denoising diffusion probabilistic models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11461–11471 (2022). https://doi.org/10.48550/arXiv.2201.09865
Lugmayr, A., Danelljan, M., Romero, A., Yu, F., Timofte, R., Van Gool, L.: Repaint. GitHub repository. https://github.com/andreas128/RePaint. Accessed 20 Apr 2023
Nguyen, C.M., Chan, E.R., Bergman, A.W., Wetzstein, G.: Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition (2023). https://doi.org/10.48550/arXiv.2303.04291
Nguyen, C.M., Chan, E.R., Bergman, A.W., Wetzstein, G.: Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition. Project web. https://ccnguyen.github.io/diffusion-in-the-dark/. Accessed 21 Apr 2023
Nichol, A., et al.: Glide: towards photorealistic image generation and editing with text-guided diffusion models. arXiv preprint arXiv:2112.10741 (2021)
Nichol, A., et al.: GLIDE. GitHub repository. https://github.com/openai/glide-text2im. Accessed 25 Apr 2023
Özdenizci, O., Legenstein, R.: Restoring vision in adverse weather conditions with patch-based denoising diffusion models. GitHub repository. https://github.com/IGITUGraz/WeatherDiffusion. Accessed 21 Apr 2023
Özdenizci, O., Legenstein, R.: Restoring vision in adverse weather conditions with patch-based denoising diffusion models. IEEE Trans. Pattern Anal. Mach. Intell. 1–12 (2023). https://doi.org/10.1109/TPAMI.2023.3238179
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10684–10695 (2022). https://doi.org/10.48550/arXiv.2112.10752
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. GitHub repository. https://github.com/CompVis/latent-diffusion. Accessed 20 Apr 2023
Ruiz, N., Li, Y., Jampani, V., Pritch, Y., Rubinstein, M., Aberman, K.: DreamBooth: fine tuning text-to-image diffusion models for subject-driven generation. arXiv preprint arXiv:2208.12242 (2022)
Runwayml: Stable-Diffusion-Inpainting. https://huggingface.co/runwayml/stable-diffusion-inpainting. Accessed 20 Apr 2023
Sahak, H., Watson, D., Saharia, C., Fleet, D.: Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild (2023). https://doi.org/10.48550/arXiv.2302.07864
Saharia, C., et al.: Palette: image-to-image diffusion models. In: ACM SIGGRAPH 2022 Conference Proceedings, pp. 1–10 (2022). https://doi.org/10.48550/arXiv.2111.05826
Saharia, C., et al.: Photorealistic text-to-image diffusion models with deep language understanding. In: Advances in Neural Information Processing Systems, vol. 35, pp. 36479–36494 (2022). https://doi.org/10.48550/arXiv.2205.11487
Saharia, C., Ho, J., Chan, W., Salimans, T., Fleet, D.J., Norouzi, M.: Image super-resolution via iterative refinement. IEEE Trans. Pattern Anal. Mach. Intell. (2022). https://doi.org/10.1109/TPAMI.2022.3204461
Seff, A.: What are Diffusion Models? (2022). https://www.youtube.com/watch?v=fbLgFrlTnGU
Wang, X., Xie, L., Dong, C., Shan, Y.: Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data (2021)
Weng, L.: What are diffusion models? lilianweng.github.io (2021). https://lilianweng.github.io/posts/2021-07-11-diffusion-models/
Yang, B., et al.: Paint by Example: Exemplar-Based Image Editing with Diffusion Models (2022). https://doi.org/10.48550/arXiv.2211.13227
Yang, L., et al.: Diffusion models: a comprehensive survey of methods and applications (2022). https://doi.org/10.48550/arXiv.2209.00796
Zhang, Z., Han, L., Ghosh, A., Metaxas, D., Ren, J.: SINE: SINgle Image Editing with Text-to-Image Diffusion Models. arXiv preprint arXiv:2212.04489 (2022)
Zhang, Z., Han, L., Ghosh, A., Metaxas, D., Ren, J.: SINE: SINgle Image Editing with Text-to-Image Diffusion Models. https://zhang-zx.github.io/SINE/. Accessed 25 Apr 2023
Acknowledgements
This work is partially supported by the Spanish Ministry of Science and Innovation under contracts PID2019-107255GB and PID2021-124463OB-IOO, and by the Generalitat de Catalunya under grants 2021-SGR-00478 and 2021-SGR-00326. The research leading to these results has also received funding from the European Union’s Horizon 2020 research and innovation programme under the HORIZON-EU VITAMIN-V (101093062) project.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Arellano, S., Otero, B., Tous, R. (2024). Exploring Image Transformations with Diffusion Models: A Survey of Applications and Implementation Code. In: Nicosia, G., Ojha, V., La Malfa, E., La Malfa, G., Pardalos, P.M., Umeton, R. (eds) Machine Learning, Optimization, and Data Science. LOD 2023. Lecture Notes in Computer Science, vol 14506. Springer, Cham. https://doi.org/10.1007/978-3-031-53966-4_2
DOI: https://doi.org/10.1007/978-3-031-53966-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53965-7
Online ISBN: 978-3-031-53966-4
eBook Packages: Computer Science (R0)