Abstract
Art has long been a primary means of human expression, shaped by the tools available to its creators. Among the most recent of these tools are generative adversarial networks (GANs) and related neural networks, which perform the creative task of art generation through a range of techniques. The scope of this paper is to investigate and compare the use of GANs and neural machine learning techniques by replicating state-of-the-art applications whose code is not publicly available, and to evaluate the resulting implementation against those applications. A VQGAN + CLIP architecture with an additional neural style transfer (NST) layer was implemented and compared to other AI art generation applications. The results showed that while the VQGAN + CLIP images were creative and unique, they lacked structure, were noticeably blurry, and were of lower overall quality, which points to directions for future work. Of the applications compared, Midjourney outperformed the others in most categories.
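For readers unfamiliar with the VQGAN + CLIP approach summarised above, the following Python/PyTorch sketch illustrates the core idea: a VQGAN latent is optimised by gradient descent so that the decoded image's CLIP embedding matches the embedding of a text prompt. This is a minimal illustration, not the paper's implementation; the `load_vqgan` helper, checkpoint name, latent shape, and hyperparameters are assumptions, and CLIP's input normalisation is omitted for brevity.

```python
# Minimal sketch of CLIP-guided VQGAN latent optimisation.
# Assumes a pretrained VQGAN (e.g. from taming-transformers) and
# OpenAI's CLIP (https://github.com/openai/CLIP) are installed.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
perceptor, _ = clip.load("ViT-B/32", device=device)  # CLIP image/text encoder

# Hypothetical loader for a taming-transformers VQGAN checkpoint.
vqgan = load_vqgan("vqgan_imagenet_f16_1024.ckpt").to(device)

# Encode the text prompt once; it stays fixed during optimisation.
prompt = "a surrealist painting of a lighthouse"
with torch.no_grad():
    text_features = perceptor.encode_text(clip.tokenize(prompt).to(device))

# Optimise a latent tensor z so the decoded image matches the prompt.
# Shape assumes a 256x256 image with an f16 VQGAN (16x16 latent grid).
z = torch.randn(1, 256, 16, 16, device=device, requires_grad=True)
optimizer = torch.optim.Adam([z], lr=0.1)

for step in range(300):
    image = vqgan.decode(z)                        # latent -> RGB in [-1, 1]
    image = (image.clamp(-1, 1) + 1) / 2           # rescale to [0, 1]
    image = torch.nn.functional.interpolate(image, size=224)  # CLIP input size
    image_features = perceptor.encode_image(image.to(perceptor.dtype))
    # Maximise cosine similarity between image and text embeddings.
    loss = -torch.cosine_similarity(image_features, text_features).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In the pipeline evaluated here, an NST pass in the style of Gatys et al. would then be applied to the decoded image as a separate stylisation step.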
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sacco, J.C., Camilleri, V. (2024). An Investigation into AI-Generated Art Through GANs and ML Neural Network. In: Arai, K. (eds) Intelligent Computing. SAI 2024. Lecture Notes in Networks and Systems, vol 1016. Springer, Cham. https://doi.org/10.1007/978-3-031-62281-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-62280-9
Online ISBN: 978-3-031-62281-6
eBook Packages: Intelligent Technologies and Robotics (R0)