Owner named entity recognition in website based on multidimensional text guidance and space alignment co-attention

Zheng, Xin; He, Xin; Ren, Yimo; Wang, Jinfa; Yu, Junyang

doi:10.1007/s00530-023-01170-2

Regular Paper
Published: 30 August 2023

Multimedia Systems, Volume 29, pages 3757–3770 (2023)

Xin Zheng¹,², Xin He¹, Yimo Ren², Jinfa Wang² & Junyang Yu¹

Abstract

Owner Named Entity Recognition (ONER) in websites has recently been proposed as a specific, practical application of Multimodal Named Entity Recognition (MNER). ONER aims to identify the true owner of a website on the Internet, which plays a crucial role in network security. Existing methods identify the owner’s name from the text, image, and domain of the website content, where owner information usually appears. However, most previous methods simply extract features from the image and the domain as two independent modalities and do not fully utilize the text information they contain. In addition, these methods do not consider that the features of each modality are trained in that modality’s own feature space, which makes cross-modal interactions difficult to model. To address these two issues, this paper proposes a Multidimensional Text Guidance and Space Alignment Co-Attention (MTGSAC) model for owner named entity recognition in websites. MTGSAC uses the text information in the image and domain modalities to guide feature extraction in the text modality. The model also includes a feature fusion module, based on the Transformer and a co-attention gate mechanism, to model cross-modal interactions effectively. Furthermore, to address the insufficient samples and poor diversity of the existing ONER dataset, we extend it and propose the ONER-2.0 dataset. Experimental results on both the ONER and ONER-2.0 datasets show that our model achieves state-of-the-art performance.
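
The abstract does not give the equations of the fusion module, so the following is only a generic illustration of the two ideas it names: projecting modalities into a shared feature space, then fusing them with cross-modal attention and a gate. It is a minimal sketch and not the MTGSAC architecture. All names (`matmul`, `cross_attention`, `gated_fuse`), the projection matrices, and the gate weights are hypothetical toy values chosen for readability.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys):
    """Scaled dot-product attention where the keys also serve as values."""
    scale = math.sqrt(len(keys[0]))
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        attended.append(
            [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(len(q))]
        )
    return attended

def gated_fuse(text, attended, w_gate):
    """Add attended features to each text token, scaled by a scalar sigmoid gate."""
    fused = []
    for t, a in zip(text, attended):
        # t + a concatenates the two feature lists before the gate projection.
        z = sum(w * v for w, v in zip(w_gate, t + a))
        g = 1.0 / (1.0 + math.exp(-z))
        fused.append([ti + g * ai for ti, ai in zip(t, a)])
    return fused

# Toy inputs (assumed values): 2 text tokens of size 2, 1 image region of size 3.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[1.0, 2.0, 3.0]]

# Hypothetical projections that map both modalities into one 2-dimensional space.
W_text = [[1.0, 0.0], [0.0, 1.0]]
W_image = [[0.5, 0.0], [0.0, 0.5], [0.0, 0.0]]
w_gate = [0.0, 0.0, 0.0, 0.0]  # all-zero weights give a gate value of 0.5

text_shared = matmul(text_feats, W_text)
image_shared = matmul(image_feats, W_image)
fused = gated_fuse(text_shared, cross_attention(text_shared, image_shared), w_gate)
print(fused)
```

Projecting first matters because the dot products inside the attention step are only meaningful when both modalities share a dimensionality and coordinate system. In a trained model the projections and gate weights would be learned rather than fixed as they are here.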

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author or the first author on reasonable request.

Acknowledgements

This work was supported by the Major Science and Technology Special Project of Henan Province (No. 201300210400) and the Science and Technology Department of Henan Province (No. 222102520006).

Funding

This article was funded by the Major Science and Technology Special Project of Henan Province (No. 201300210400) and the Science and Technology Department of Henan Province (No. 222102520006).

Author information

Authors and Affiliations

School of Software, Henan University, Kaifeng, 475004, China
Xin Zheng, Xin He & Junyang Yu
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100089, China
Xin Zheng, Yimo Ren & Jinfa Wang

Contributions

All authors contributed to the study’s conception and design. The first draft of the manuscript was prepared by XZ, while XH was responsible for reviewing and editing the paper. YR and XZ carried out the data collection and analysis for the study. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Xin He or Yimo Ren.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Communicated by B. Bao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zheng, X., He, X., Ren, Y. et al. Owner named entity recognition in website based on multidimensional text guidance and space alignment co-attention. Multimedia Systems 29, 3757–3770 (2023). https://doi.org/10.1007/s00530-023-01170-2

Received: 03 July 2023
Accepted: 21 August 2023
Published: 30 August 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s00530-023-01170-2
