DOI: 10.1145/3594409.3594430
research-article

Usability of Pre-trained Diffusion Models in Generating Novel Datasets and Its Performance Evaluation

Published: 26 July 2023

ABSTRACT

Although sophisticated deep learning methods keep improving, they still rely on large amounts of data, and it is not always possible to acquire large datasets for every kind of problem. Diffusion models are now popular for their creative applications, and they have already been shown to generate more realistic-looking synthetic images than Generative Adversarial Networks (GANs). GANs are a popular option for image synthesis that supports data sampling for datasets that are small or imbalanced. In our work, we experimented with a pre-trained text-to-image diffusion model to generate datasets for two different classes of problems. Both are common problems that could benefit from deep learning-based solutions but are hampered by a lack of datasets. We used the diffusion model to generate synthetic images and used those images as the training and validation data for the problems we tried to solve. We then tested the trained models on manually collected real-world data and report the comparative performance of this method. From our experiments, we found that the diffusion model can generate realistic images and is up to 50 times faster at data generation than the manual human process. In our testing, the Convolutional Neural Networks trained on these synthetic data achieved accuracy scores of up to 80% and 89%.
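The evaluation protocol described in the abstract (train and validate on synthetic images only, then test on manually collected real images) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `Sample` record, the `build_splits` and `accuracy` helpers, the file names, and the 20% validation fraction are all assumptions, since the abstract does not state how the data were organized or split.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    path: str        # image file path (hypothetical naming)
    label: str       # class name
    synthetic: bool  # True if the image came from the diffusion model


def build_splits(synthetic, real, val_fraction=0.2, seed=0):
    """Train/validate on synthetic samples only; hold out every real sample for testing.

    val_fraction is an assumed value, not one reported in the paper.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    if any(not s.synthetic for s in synthetic) or any(s.synthetic for s in real):
        raise ValueError("synthetic and real pools must not be mixed")
    shuffled = list(synthetic)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for a given seed
    n_val = max(1, int(len(shuffled) * val_fraction))
    return {"train": shuffled[n_val:], "val": shuffled[:n_val], "test": list(real)}


def accuracy(predicted, expected):
    """Fraction of predicted labels that match the expected labels."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)
```

Keeping the real images out of both the training and validation splits is what makes the reported test accuracy a measure of how well a model trained on synthetic data transfers to real-world data.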


Published in

ICIAI '23: Proceedings of the 2023 7th International Conference on Innovation in Artificial Intelligence
March 2023
212 pages
ISBN: 9781450398398
DOI: 10.1145/3594409

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

