Abstract
Multimodal Named Entity Recognition (MNER) is a fundamental natural language processing task for social media posts. Current MNER models fail to handle the relation between text and image entities, which introduces textual noise, image noise, and even multimodal noise during processing. In this paper, we first introduce part-of-speech (POS) information, which is used to eliminate non-entity words and filter textual noise. A POS-based gated cross-modal attention network is then established to learn precise textual and visual representations and remove image noise. Finally, a Mixture-of-Experts (MOE) module is proposed for multimodal integration, which improves named entity identification and filters multimodal noise. We evaluate the proposed model on the Twitter dataset, and the experimental results provide strong evidence of state-of-the-art performance.
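As a rough illustration of the general mixture-of-experts idea named in the abstract, the sketch below fuses one token's text and image feature vectors through a softmax gate over a few experts. It is not the authors' architecture, which the abstract does not specify: the function `moe_fuse`, the three toy experts, the gate weights, and the two-dimensional features are all hypothetical and chosen only to keep the example small.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def moe_fuse(text_vec, image_vec, gate_weights, experts):
    """Fuse text and image features with a softmax-gated mixture of experts.

    gate_weights holds one weight vector per expert (an assumption of this
    sketch); each is scored against the concatenated text+image features.
    """
    joint = text_vec + image_vec  # list concatenation
    gate = softmax([dot(w, joint) for w in gate_weights])
    outputs = [expert(text_vec, image_vec) for expert in experts]
    dim = len(outputs[0])
    fused = [sum(g * out[i] for g, out in zip(gate, outputs)) for i in range(dim)]
    return fused, gate


# Hypothetical experts: text only, image only, and a simple average.
def text_only(t, v):
    return list(t)


def image_only(t, v):
    return list(v)


def average(t, v):
    return [(a + b) / 2 for a, b in zip(t, v)]


text_vec = [1.0, 0.0]
image_vec = [0.0, 1.0]
gate_weights = [
    [1.0, 0.0, 0.0, 0.0],  # favors the text-only expert
    [0.0, 0.0, 0.0, 1.0],  # favors the image-only expert
    [0.0, 0.0, 0.0, 0.0],  # neutral score for the averaging expert
]
fused, gate = moe_fuse(text_vec, image_vec, gate_weights,
                       [text_only, image_only, average])
print(gate, fused)
```

In a trained model the gate weights and experts would be learned, so that a noisy modality can receive a smaller gate value; here they are fixed by hand purely for demonstration.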





Funding
This work was supported by the National Statistical Science Research Project of China under Grant No. 2016LY98, the Characteristic Innovation Projects of Guangdong Colleges and Universities under Grant No. 2018KTSCX049, and the Science and Technology Plan Project of Guangzhou under Grant Nos. 202102080258 and 201903010013.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
About this article
Cite this article
Chen, J., Xue, Y., Zhang, H. et al. On development of multimodal named entity recognition using part-of-speech and mixture of experts. Int. J. Mach. Learn. & Cyber. 14, 2181–2192 (2023). https://doi.org/10.1007/s13042-022-01754-w
DOI: https://doi.org/10.1007/s13042-022-01754-w