
An Investigation into AI-Generated Art Through GANs and ML Neural Network

  • Conference paper
  • First Online:
Intelligent Computing (SAI 2024)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 1016)


Abstract

Art has long been a primary means of human expression, shaped by the tools available to its makers. Among the most recent additions to those tools are generative adversarial networks (GANs) and related neural networks, which perform the creative task of art generation through a range of techniques. The scope of this paper is to investigate and compare GANs and machine-learning neural techniques by replicating state-of-the-art applications whose code is not publicly available, and then to evaluate the resulting implementation against those applications. A VQGAN + CLIP architecture with an additional neural style transfer (NST) layer was built and compared to other AI art-generation applications. The results showed that while the VQGAN + CLIP images were creative and unique, they lacked structure, were noticeably blurry, and were of lower overall quality, which points to directions for future work. Of the applications compared, Midjourney performed best in most categories.
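At its core, the VQGAN + CLIP approach optimizes a latent code so that the CLIP embedding of the generated image moves toward the embedding of a text prompt. The following is a minimal NumPy sketch of that guided-optimization loop only; the random linear maps `G` and `E` are hypothetical stand-ins for the real VQGAN decoder and CLIP encoder, and the finite-difference gradient stands in for autograd. It illustrates the loop's structure, not the actual networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy stand-ins for the real networks:
# G maps a latent code to "image" pixels (VQGAN decoder stand-in),
# E maps "image" pixels to an embedding (CLIP encoder stand-in).
G = rng.normal(size=(64, 16))
E = rng.normal(size=(8, 64))
text_emb = rng.normal(size=8)    # embedding of the text prompt

def cosine_loss(z):
    """1 - cosine similarity between the generated image's embedding
    and the prompt embedding (the quantity the pipeline minimizes)."""
    emb = E @ (G @ z)
    return 1.0 - emb @ text_emb / (np.linalg.norm(emb) * np.linalg.norm(text_emb))

def grad(z, eps=1e-5):
    """Central finite-difference gradient (the real pipeline uses autograd)."""
    g = np.zeros_like(z)
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = eps
        g[i] = (cosine_loss(z + dz) - cosine_loss(z - dz)) / (2 * eps)
    return g

z = rng.normal(size=16)          # latent code being optimized
losses = [cosine_loss(z)]
for _ in range(200):             # plain gradient descent on the latent
    z -= 0.1 * grad(z)
    losses.append(cosine_loss(z))

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The NST layer described in the paper would add a further loss term on top of this objective; in Gatys-style transfer that term compares Gram matrices of feature maps between the generated image and a style image.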




Author information


Correspondence to Jean Claude Sacco.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Sacco, J.C., Camilleri, V. (2024). An Investigation into AI-Generated Art Through GANs and ML Neural Network. In: Arai, K. (eds) Intelligent Computing. SAI 2024. Lecture Notes in Networks and Systems, vol 1016. Springer, Cham. https://doi.org/10.1007/978-3-031-62281-6_7
