ABSTRACT
Interest in research on meme propagation and generation has grown rapidly in recent years. Meme datasets available online are either specific to a single context or lack class information. Here, we prepare a large-scale dataset of memes with captions and class labels, consisting of 1.1 million meme captions from 128 classes. We also provide a rationale for the existence of broad categories, called 'themes', across the meme dataset; each theme comprises multiple meme classes. Our generation system uses a trained state-of-the-art transformer-based model with an encoder-decoder architecture for caption generation. We develop a web interface, called Memeify, that lets users generate memes of their choice, and explain in detail the working of each component of the system. We also evaluate the generated memes qualitatively through a user study. A demonstration of the Memeify system is available at https://youtu.be/P_Tfs0X-czs.