Abstract
Identifying the owners of devices on the Internet enables numerous network security applications. For example, accurate Owner Name Entity Recognition (ONER) on websites is critical for finding the affected owners when a new security threat emerges. ONER is a specific task of Multimodal Named Entity Recognition (MNER). Most existing MNER models use only texts and images, so they cannot effectively exploit the multimodal data of devices to perform ONER accurately. In addition, most existing MNER models handle the information within each modality and the information between modalities separately, so the fusion is inconsistent and the results are unsatisfactory. The paper therefore proposes HDGT, a Heterogeneous and Dynamic Graph Transformer, to improve the performance of ONER. The core components that HDGT uses to realize MNER are a dynamic graph and a two-stream mechanism, which learn both the relationships between different modalities during training and the structure of the graph. To evaluate HDGT, the paper manually labels a multimodal dataset containing texts, images, and domains, and also conducts experiments on existing public MNER datasets. The results show that HDGT achieves an F1 score of 84.88% on the recognition of owner entities, 75.21% on Twitter2015, and 87.03% on Twitter2017, outperforming other existing MNER models.
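The abstract reports F1 scores but does not say how they are computed. The sketch below assumes the entity-level, exact-match F1 commonly used for NER benchmarks: a predicted entity counts as correct only if both its span and its type match the gold annotation. The BIO tag scheme, the `ORG` and `PER` labels, and the function names are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed tag scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged entity-level F1 over a list of tagged sentences."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = bio_to_spans(g), bio_to_spans(p)
        tp += len(gold_spans & pred_spans)  # exact span and type match
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one of two gold entities is recovered.
gold = [["B-ORG", "I-ORG", "O", "B-PER"]]
pred = [["B-ORG", "I-ORG", "O", "O"]]
print(round(entity_f1(gold, pred), 4))
```

In this example precision is 1.0 and recall is 0.5, so the F1 score is 2/3. A stray `I-` tag that does not continue an open span of the same type is ignored, which is one of several possible conventions.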
Funding
This work is supported by the National Natural Science Foundation of China (No. U1766215).
Author information
Contributions
YR: Conceptualization, Methodology, Writing - original draft. HL: Supervision. PL: Investigation, Validation. JL: Investigation, Formal Analysis. HZ: Data curation, Writing - review & editing. LS: Writing - review & editing, Funding acquisition.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Ethical approval
The authors declare that the manuscript has not been submitted to more than one journal for simultaneous consideration. The submitted work is original and has not been published elsewhere in any form or language (partially or in full). Results are presented honestly and without fabrication, falsification, or inappropriate data manipulation.
About this article
Cite this article
Ren, Y., Li, H., Liu, P. et al. Owner name entity recognition in websites based on heterogeneous and dynamic graph transformer. Knowl Inf Syst 65, 4411–4429 (2023). https://doi.org/10.1007/s10115-023-01908-4