Abstract
Humor plays an important role in human communication. Beyond language, multimodal information is also of great significance in expressing and understanding humor, which has motivated research on multimodal humor. In existing datasets, however, images and text usually stand in a one-to-one relationship, which makes it difficult to control the image modality as a variable. As a result, the two modalities show low correlation and add little to each other in humor recognition tasks. Moreover, with the development of Vision Transformers (ViTs), the generalization ability of visual models has improved greatly. ViTs alone can achieve impressive performance, but that performance is difficult to explain. In this paper, we introduce Memeplate (our dataset is available at https://github.com/chineselzf/memeplate), a novel multimodal humor dataset containing 203 templates, 5,184 memes, and manually annotated humor levels. The template turns the relationship between images and text into a one-to-many one, which makes it easier for researchers to approach multimodal humor through a linguistic lens and provides examples closer to human behavior for generation research. In addition, we provide multiple baseline results on the humor recognition task, which demonstrate the effectiveness of our control over the image modality and the importance of introducing multimodal cues.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (NSFC) Program (No. 62076046). We would also like to thank the anonymous reviewers for their insightful and valuable comments.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Z., Lin, H., Yang, L., Xu, B., Zhang, S. (2022). Memeplate: A Chinese Multimodal Dataset for Humor Understanding in Meme Templates. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science, vol 13551. Springer, Cham. https://doi.org/10.1007/978-3-031-17120-8_41
DOI: https://doi.org/10.1007/978-3-031-17120-8_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17119-2
Online ISBN: 978-3-031-17120-8
eBook Packages: Computer Science, Computer Science (R0)