A Cross-Modal Classification Dataset on Social Network

Hu, Yong; Huang, Heyan; Chen, Anfan; Mao, Xian-Ling

doi:10.1007/978-3-030-60450-9_55

Conference paper
First Online: 02 October 2020

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12430)

Abstract

Classifying tweets into general categories, such as food, music and games, is an essential task for social network platforms and forms the basis for information recommendation, user profiling and content construction. As far as we know, nearly all existing general tweet classification datasets contain only textual content. However, the text of a tweet may be short, uninformative, or absent altogether, which harms classification performance. Images and videos, by contrast, are widespread in tweets and can provide useful additional information. To fill this gap, we construct CMCD, a novel Cross-Modal Classification Dataset built from Weibo. Specifically, we collect tweets with three modalities (text, image and video) from 18 general categories, and then filter out tweets that can easily be classified from their textual content alone. The final dataset consists of 85,860 tweets, all of them manually labelled. Among them, 64.4% contain images and 16.2% contain videos. We implement classical baselines for tweet classification and report human performance. Empirical results show that classification on CMCD is challenging and requires further effort.
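The abstract does not say how text-easy tweets are identified, so the following is only a minimal sketch of one plausible reading: drop a tweet when a text-only scorer predicts its gold label with high confidence, then measure the modality shares of what remains. The Tweet record, the keyword_scorer stand-in, the 0.8 threshold and all function names are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tweet:
    """Hypothetical record layout; the real CMCD schema is not given in the abstract."""
    text: str
    label: str
    has_image: bool = False
    has_video: bool = False


# Hypothetical interface for a text-only classifier: text -> probability per category.
TextScorer = Callable[[str], Dict[str, float]]


def keyword_scorer(text: str) -> Dict[str, float]:
    """Toy stand-in (assumption): confident only when exactly one category keyword appears."""
    keywords = {"food": "noodles", "music": "concert", "games": "esports"}
    hits = [cat for cat, word in keywords.items() if word in text.lower()]
    if len(hits) == 1:
        return {cat: (0.9 if cat == hits[0] else 0.05) for cat in keywords}
    # No keyword, or several competing ones: fall back to a uniform guess.
    return {cat: 1.0 / len(keywords) for cat in keywords}


def filter_text_easy(
    tweets: List[Tweet], scorer: TextScorer, threshold: float = 0.8
) -> List[Tweet]:
    """Keep tweets that the text-only scorer does NOT classify correctly with confidence."""
    kept = []
    for tweet in tweets:
        probs = scorer(tweet.text)
        predicted = max(probs, key=probs.get)
        easy = predicted == tweet.label and probs[predicted] >= threshold
        if not easy:
            kept.append(tweet)
    return kept


def modality_share(tweets: List[Tweet]) -> Dict[str, float]:
    """Fraction of tweets that carry an image and a video, respectively."""
    n = len(tweets)
    if n == 0:
        return {"image": 0.0, "video": 0.0}
    return {
        "image": sum(t.has_image for t in tweets) / n,
        "video": sum(t.has_video for t in tweets) / n,
    }


if __name__ == "__main__":
    sample = [
        Tweet("Best noodles in town", "food"),                     # text alone suffices
        Tweet("Look at this!", "food", has_image=True),            # needs the image
        Tweet("", "music", has_video=True),                        # no text at all
        Tweet("concert noodles tonight", "music", has_image=True), # ambiguous text
    ]
    hard = filter_text_easy(sample, keyword_scorer)
    print(len(hard), modality_share(hard))
```

In a real pipeline the toy scorer would be replaced by a trained text classifier, and the threshold would control how aggressively text-easy tweets are removed; neither choice is specified in the abstract.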

Acknowledgement

This work is supported by the National Key R&D Plan (No. 2016QY03D0602), NSFC (No. U19B2020, 61772076, 61751201 and 61602197) and NSFB (No. Z181100008918002).

Author information

Authors and Affiliations

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
Yong Hu, Heyan Huang & Xian-Ling Mao
University of Science and Technology of China, Hefei, China
Anfan Chen

Corresponding author

Correspondence to Heyan Huang.

Editor information

Editors and Affiliations

ECE & Ingenuity Labs Research Institute, Queen’s University, Kingston, ON, Canada
Xiaodan Zhu
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Min Zhang
School of Computer Science and Technology, Soochow University, Suzhou, China
Yu Hong
College of Intelligence and Computing, Tianjin University, Tianjin, China
Ruifang He

About this paper

Cite this paper

Hu, Y., Huang, H., Chen, A., Mao, XL. (2020). A Cross-Modal Classification Dataset on Social Network. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol 12430. Springer, Cham. https://doi.org/10.1007/978-3-030-60450-9_55

DOI: https://doi.org/10.1007/978-3-030-60450-9_55
Published: 02 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60449-3
Online ISBN: 978-3-030-60450-9
eBook Packages: Computer Science, Computer Science (R0)
