On development of multimodal named entity recognition using part-of-speech and mixture of experts

Chen, Jianying; Xue, Yun; Zhang, Haolan; Ding, Weiping; Zhang, Zhengxuan; Chen, Jiehai

doi:10.1007/s13042-022-01754-w

Original Article
Published: 24 December 2022

Volume 14, pages 2181–2192, (2023)
International Journal of Machine Learning and Cybernetics

Jianying Chen¹,
Yun Xue⁴,
Haolan Zhang²,
Weiping Ding³,
Zhengxuan Zhang⁴ &
…
Jiehai Chen⁴


Abstract

Multimodal Named Entity Recognition (MNER) is a fundamental task in natural language processing for social media posts. Current MNER models fail to deal with the relation between text and image entities, which results in textual noise, image noise, and even multimodal noise during processing. In this paper, we first introduce part-of-speech (POS) information, which is used to eliminate non-entity words and filter textual noise. A POS-based gated cross-modal attention network is established to learn the textual and visual representations precisely and to remove image noise. Then, a Mixture-of-Experts (MOE) module is proposed for multimodal integration, which improves named entity identification and filters multimodal noise. We evaluate the proposed model on the Twitter dataset, and the experimental results provide strong evidence of state-of-the-art performance.
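
The abstract names a Mixture-of-Experts step for combining modalities but gives no architectural detail. As a rough illustration only, the sketch below shows a generic softmax-gated mixture of experts applied to one token's concatenated text and image features. It is not the authors' model: the function names, the dimensions, the random weights, and the use of plain linear experts are all assumptions made for this example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def moe_fuse(text_vec, image_vec, experts, gate):
    """Generic softmax-gated mixture of experts (assumed design, not the paper's)."""
    fused = text_vec + image_vec                      # concatenate both modalities
    gate_weights = softmax(linear(gate, fused))       # one weight per expert
    expert_outputs = [linear(e, fused) for e in experts]
    dim = len(expert_outputs[0])
    # Weighted sum of the expert outputs, dimension by dimension.
    mixed = [
        sum(g * out[i] for g, out in zip(gate_weights, expert_outputs))
        for i in range(dim)
    ]
    return mixed, gate_weights


# Illustrative sizes only; none of these values come from the article.
TEXT_DIM, IMAGE_DIM, OUT_DIM, NUM_EXPERTS = 4, 3, 5, 3
rng = random.Random(0)
experts = [random_matrix(OUT_DIM, TEXT_DIM + IMAGE_DIM, rng) for _ in range(NUM_EXPERTS)]
gate = random_matrix(NUM_EXPERTS, TEXT_DIM + IMAGE_DIM, rng)
text_vec = [rng.uniform(-1, 1) for _ in range(TEXT_DIM)]
image_vec = [rng.uniform(-1, 1) for _ in range(IMAGE_DIM)]

mixed, gate_weights = moe_fuse(text_vec, image_vec, experts, gate)
print("gate weights:", [round(g, 3) for g in gate_weights])
print("fused feature length:", len(mixed))
```

In a sketch like this, the gate weights sum to one, so an expert that receives a small weight for a given token contributes little to the fused feature. This is one plausible reading of how such a module could down-weight noisy multimodal signals; the article itself would need to be consulted for the actual formulation.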


Funding

This work was supported by the National Statistical Science Research Project of China under Grant No. 2016LY98, the Characteristic Innovation Projects of Guangdong Colleges and Universities (No. 2018KTSCX049), and the Science and Technology Plan Project of Guangzhou under Grant Nos. 202102080258 and 201903010013.

Author information

Authors and Affiliations

Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Tele-Communication Engineering, South China Normal University, Guangzhou, 510006, China
Jianying Chen
NIT, Zhejiang University, Zhejiang, 310058, China
Haolan Zhang
School of Information Science and Technology, Nantong University, Nantong, 226019, Nantong, China
Weiping Ding
School of Electronics and Information Engineering, South China Normal University, 528225, Foshan, China
Yun Xue, Zhengxuan Zhang & Jiehai Chen

Corresponding author

Correspondence to Yun Xue.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, J., Xue, Y., Zhang, H. et al. On development of multimodal named entity recognition using part-of-speech and mixture of experts. Int. J. Mach. Learn. & Cyber. 14, 2181–2192 (2023). https://doi.org/10.1007/s13042-022-01754-w

Received: 04 January 2022
Accepted: 13 December 2022
Published: 24 December 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s13042-022-01754-w
