Generative Model of Suitable Meme Sentences for Images Using AutoEncoder

Yamatomi, Ryo; Mahboubi, Shahrzad; Ninomiya, Hiroshi

doi:10.1007/978-981-99-7019-3_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14325)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

This paper proposes GUMI-AE, a new image caption generative model for memes. In this paper, a meme denotes a humorous short sentence suited to a given image. An image caption generative model usually consists of an image encoder and a sentence decoder, and most conventional models use a pre-trained neural network as the image encoder, e.g., ResNet152 trained on ImageNet. However, a pre-trained ResNet152 may not be an effective encoder for extracting features from arbitrary images. This matters here because the training samples for the meme generative model are obtained from “Bokete”, a Japanese-language website where people post images together with humorous short sentences associated with them. The images posted on Bokete are highly varied and include illustrations and text-only images, which may fall outside the range of the ImageNet training images. This paper therefore proposes an image caption generative model that incorporates an AutoEncoder (AE) as the image encoder. The AE can be trained on the samples obtained from Bokete without image annotation, which enables the proposed method to generate humorous short sentences for memes. Finally, the proposed GUMI-AE is compared with the conventional model, and its evaluation is discussed.
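
The point that an AE needs no image annotation can be illustrated with a minimal sketch. It is not the GUMI-AE architecture, which the abstract does not specify. It is a toy linear autoencoder in plain Python, trained only to reconstruct its own inputs. The names `train_autoencoder`, `encode` and `toy_images`, the 4-pixel "images", the code size and the learning rate are all assumptions made for illustration.

```python
import random


def encode(enc, x):
    """Map an input vector to its low-dimensional code (the image feature)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in enc]


def train_autoencoder(images, code_size=2, epochs=300, lr=0.1, seed=0):
    """Fit a toy linear autoencoder by minimizing reconstruction error.

    No labels or captions are used: the training target is the input itself.
    """
    rng = random.Random(seed)
    n = len(images[0])
    # Hypothetical small random initial weights for encoder and decoder.
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(code_size)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(code_size)] for _ in range(n)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in images:
            z = encode(enc, x)                               # encoder output
            y = [sum(w * zk for w, zk in zip(row, z)) for row in dec]
            err = [yi - xi for yi, xi in zip(y, x)]          # reconstruction error
            total += sum(e * e for e in err) / n             # mean squared error
            g_y = [2.0 * e / n for e in err]                 # dLoss/dy
            # Backpropagate to the code before the decoder weights change.
            g_z = [sum(g_y[i] * dec[i][k] for i in range(n))
                   for k in range(code_size)]
            for i in range(n):
                for k in range(code_size):
                    dec[i][k] -= lr * g_y[i] * z[k]
            for k in range(code_size):
                for j in range(n):
                    enc[k][j] -= lr * g_z[k] * x[j]
        losses.append(total / len(images))
    return enc, dec, losses


# Made-up unlabeled 4-pixel "images" standing in for unannotated samples.
toy_images = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
enc, dec, losses = train_autoencoder(toy_images)
feature = encode(enc, toy_images[0])  # would be passed to a sentence decoder
```

In a captioning pipeline of the kind the abstract describes, the output of `encode` would take the place of the features from a pre-trained network such as ResNet152 and be fed to the sentence decoder. The decoder is omitted here.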

Acknowledgements

This work is supported by the Japan Society for the Promotion of Science (JSPS), KAKENHI (23K11267).

Author information

Authors and Affiliations

Shonan Institute of Technology (SIT), 1-1-25 Tsujido-nishikaigan, Fujisawa, Kanagawa, 251-8511, Japan
Ryo Yamatomi, Shahrzad Mahboubi & Hiroshi Ninomiya

Corresponding author

Correspondence to Ryo Yamatomi.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Fenrong Liu
SEEK Limited, Cremorne, NSW, Australia
Arun Anand Sadanandan
MIMOS (Malaysia), Kuala Lumpur, Malaysia
Duc Nghia Pham
Universitas Indonesia, Depok, Indonesia
Petrus Mursanto
Tabcorp Holdings Limited, Melbourne, VIC, Australia
Dickson Lukose

About this paper

Cite this paper

Yamatomi, R., Mahboubi, S., Ninomiya, H. (2024). Generative Model of Suitable Meme Sentences for Images Using AutoEncoder. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14325. Springer, Singapore. https://doi.org/10.1007/978-981-99-7019-3_23

DOI: https://doi.org/10.1007/978-981-99-7019-3_23
Published: 10 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7018-6
Online ISBN: 978-981-99-7019-3
eBook Packages: Computer Science, Computer Science (R0)
