Owner name entity recognition in websites based on heterogeneous and dynamic graph transformer

Ren, Yimo; Li, Hong; Liu, Peipei; Liu, Jie; Li, Zhi; Zhu, Hongsong; Sun, Limin

doi:10.1007/s10115-023-01908-4

Regular Paper
Published: 01 June 2023

Volume 65, pages 4411–4429, (2023)
Knowledge and Information Systems

Yimo Ren^1,2,
Hong Li^1,2,
Peipei Liu^1,2,
Jie Liu^1,2,
Zhi Li^1,2,
Hongsong Zhu^1,2 &
Limin Sun^1,2

Abstract

Identifying the owners of devices on the Internet enables numerous network security applications. For example, accurate Owner Name Entity Recognition (ONER) on websites is critical for finding affected owners when new security threats emerge. ONER is a specific task of Multimodal Named Entity Recognition (MNER). Most existing MNER models use only texts and images, so they cannot effectively exploit the multimodal data of devices to perform ONER accurately. In addition, most existing MNER models handle the information within each modality and the information between modalities separately, so the fusion is inconsistent and the results are unsatisfactory. The paper therefore proposes HDGT, a Heterogeneous and Dynamic Graph Transformer, to improve the performance of ONER. The core components of HDGT are a dynamic graph and a two-stream mechanism, which learn the relationships between modalities during training as well as the structure of the graph. To evaluate HDGT, the paper manually labels a multimodal dataset containing texts, images, and domains, and also conducts experiments on existing public MNER datasets. The results show that HDGT achieves an F1 score of 84.88% on owner entity recognition, 75.21% on Twitter2015, and 87.03% on Twitter2017, outperforming other existing MNER models.
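
The abstract reports F1 scores but does not say how they are computed. As a rough illustration only, the sketch below assumes the convention common in NER evaluation: entity-level, exact-match F1 over BIO-tagged sequences. The function names and the `ORG` label are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match entity-level F1 (assumed convention, not the paper's code)."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans stay distinct.
        gold_spans |= {(idx, *s) for s in extract_spans(g)}
        pred_spans |= {(idx, *s) for s in extract_spans(p)}
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: two gold owner entities, one of them predicted.
gold = [["B-ORG", "I-ORG", "O", "B-ORG"]]
pred = [["B-ORG", "I-ORG", "O", "O"]]
score = entity_f1(gold, pred)  # precision 1.0, recall 0.5
```

Under this convention a prediction counts as correct only if both its boundaries and its label match a gold entity, so partially recognized owner names contribute nothing to the score.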

Funding

This work is supported by the National Natural Science Foundation of China (No. U1766215).

Author information

Authors and Affiliations

School of Cyber Security, University of Chinese Academy of Science, Beijing, China
Yimo Ren, Hong Li, Peipei Liu, Jie Liu, Zhi Li, Hongsong Zhu & Limin Sun
Institute of Information Engineering, University of Chinese Academy of Science, Beijing, China
Yimo Ren, Hong Li, Peipei Liu, Jie Liu, Zhi Li, Hongsong Zhu & Limin Sun

Contributions

YR: Conceptualization, Methodology, Writing – original draft; HL: Supervision; PL: Investigation, Validation; JL: Investigation, Formal Analysis; HZ: Data curation, Writing – review & editing; LS: Writing – review & editing, Funding acquisition.

Corresponding author

Correspondence to Yimo Ren.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Ethical approval

The authors declare that the manuscript has not been submitted to more than one journal for simultaneous consideration. The submitted work is original and has not been published elsewhere in any form or language (partially or in full). Results are presented honestly and without fabrication, falsification, or inappropriate data manipulation.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ren, Y., Li, H., Liu, P. et al. Owner name entity recognition in websites based on heterogeneous and dynamic graph transformer. Knowl Inf Syst 65, 4411–4429 (2023). https://doi.org/10.1007/s10115-023-01908-4

Received: 15 January 2023
Revised: 28 February 2023
Accepted: 11 May 2023
Published: 01 June 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s10115-023-01908-4
