Abstract
Named entity recognition (NER) is a word-level sequence tagging task. The key to Chinese cybersecurity NER is to obtain meaningful word representations and to model the relations between words carefully. However, Chinese is a language of compound words and lacks morphological inflections. Moreover, the role and meaning of a word depend on the context in a complicated way. In this paper, we present an NER model named Star-HGCN, short for Star-Transformer with Hybrid embeddings and Graph Convolutional Network. To make full use of intra-word information, we place a hybrid embedding layer at the input, which enriches word representations with character-level information and part-of-speech features. More importantly, we further enhance the hybrid embeddings in two ways: we model implicit local and long-range semantic associations between words using the efficient Star-Transformer architecture, and we model the explicit syntactic dependencies between words in the dependency tree using a graph convolutional network. Experiments on the Chinese cybersecurity dataset show that our model outperforms other neural network methods for NER and achieves a significant relative improvement of 36.59% for the class of software entities. Experiments on other public datasets also validate the effectiveness of the model in other general and specific domains.
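To make two of the ideas in the abstract concrete, the following is a minimal sketch of a hybrid embedding built by concatenation, followed by one graph convolution over a dependency tree. It is an illustration under stated assumptions, not the Star-HGCN implementation: the function names, the vector sizes (8, 4 and 2), the head indices, the random weights and the row-normalised averaging are all hypothetical choices, and the Star-Transformer component is left out entirely.

```python
import random


def hybrid_embedding(word_vecs, char_vecs, pos_vecs):
    """Concatenate word, character-level and POS vectors for each word."""
    return [w + c + p for w, c, p in zip(word_vecs, char_vecs, pos_vecs)]


def dependency_adjacency(heads):
    """Build a symmetric adjacency matrix with self-loops.

    heads[i] is the index of the syntactic head of word i, or -1 for the root.
    """
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = 1.0
            adj[head][child] = 1.0
    return adj


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(x, adj, weight):
    """One graph convolution: average over neighbours, project, apply ReLU."""
    norm_adj = [[v / sum(row) for v in row] for row in adj]
    hidden = matmul(matmul(norm_adj, x), weight)
    return [[max(v, 0.0) for v in row] for row in hidden]


def random_matrix(rows, cols, rng):
    """Stand-in for learned parameters or pretrained vectors (assumption)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
heads = [1, -1, 1, 2]  # hypothetical parse of a four-word sentence
n_words = len(heads)

x = hybrid_embedding(
    random_matrix(n_words, 8, rng),  # word vectors
    random_matrix(n_words, 4, rng),  # character-level vectors
    random_matrix(n_words, 2, rng),  # part-of-speech vectors
)
adj = dependency_adjacency(heads)
weight = random_matrix(len(x[0]), 6, rng)
h = gcn_layer(x, adj, weight)
print(len(h), len(h[0]))  # 4 6
```

Each output row mixes a word's own features with those of its head and dependents, which is the sense in which a graph convolution over the dependency tree encodes explicit syntactic relations.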
Funding
This work was supported by the National Natural Science Foundation of China (Grant numbers 62102279 and 11702289), the Key Core Technology and Generic Technology R&D Project of Shanxi Province (Grant number 2020XXX013), the Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (Grant number 2020L0102), and the National Key R&D Program.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Yang, D., Lian, T., Zheng, W. et al. Enriching Word Information Representation for Chinese Cybersecurity Named Entity Recognition. Neural Process Lett 55, 7689–7707 (2023). https://doi.org/10.1007/s11063-023-11280-7
DOI: https://doi.org/10.1007/s11063-023-11280-7