
TRAPPER: Learning with Weak Supervision for Threat Intelligence Entity Recognition

Published: 17 January 2023

Abstract

The growth of threat intelligence provides a stronger foundation for tracing the source of network attacks, but it also demands a significant amount of manual analysis. Data-driven automatic information extraction can reduce this manual effort, but it is limited by the scarcity of labeled data in the threat intelligence domain. To overcome this limitation, we propose TRAPPER, a threat entity recognition framework that infers real threat entities from unlabeled threat sentences, avoiding laborious manual labeling. TRAPPER relies on label functions and three components (a label aggregator, a label predictor, and a label expander), which together guide the model with weak supervision and use transferred knowledge as an aid. The label functions let us inject expert knowledge into the label aggregator, which generates the inputs needed by the label predictor; this enables the label predictor to learn to recognize threat entities. The label expander combines the multi-source noisy label information with transferred entity recognition semantic knowledge to further expand the recognized entities. Throughout the process, the components reinforce one another by learning from each other. Comparative experiments on three threat intelligence-related datasets show that our method effectively identifies threat entities and achieves a maximum F1 score improvement of 6.3% over the best baseline.
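
To make the idea of label functions feeding a label aggregator concrete, the following is a minimal sketch and not TRAPPER's implementation. The abstract does not specify how the aggregator works, so a plain per-token majority vote is swapped in here purely for illustration. The entity types (`MALWARE`, `VULN`), the gazetteer contents, and all function names are hypothetical.

```python
import re
from collections import Counter
from typing import Callable, List, Optional

Label = Optional[str]  # None means the label function abstains on that token
LabelFunction = Callable[[List[str]], List[Label]]

# Hypothetical expert knowledge: a tiny gazetteer and a CVE identifier pattern.
MALWARE_NAMES = {"emotet", "trickbot"}
CVE_PATTERN = re.compile(r"^CVE-\d{4}-\d{4,}$")


def lf_malware_gazetteer(tokens: List[str]) -> List[Label]:
    """Vote MALWARE for tokens found in the gazetteer."""
    return ["MALWARE" if t.lower() in MALWARE_NAMES else None for t in tokens]


def lf_cve_regex(tokens: List[str]) -> List[Label]:
    """Vote VULN for tokens shaped like a CVE identifier."""
    return ["VULN" if CVE_PATTERN.match(t) else None for t in tokens]


def lf_trigger_word(tokens: List[str]) -> List[Label]:
    """Noisy heuristic: a capitalized token right after 'trojan' or 'malware'."""
    labels: List[Label] = [None] * len(tokens)
    for i in range(1, len(tokens)):
        if tokens[i - 1].lower() in {"trojan", "malware"} and tokens[i][:1].isupper():
            labels[i] = "MALWARE"
    return labels


def aggregate(tokens: List[str], label_functions: List[LabelFunction]) -> List[str]:
    """Majority vote per token (an assumed stand-in for a learned aggregator).

    Tokens on which every function abstains get the outside label "O".
    Ties are broken in favor of the vote that was seen first.
    """
    if not label_functions:
        return ["O"] * len(tokens)
    votes_per_function = [lf(tokens) for lf in label_functions]
    aggregated = []
    for token_votes in zip(*votes_per_function):
        counts = Counter(v for v in token_votes if v is not None)
        aggregated.append(counts.most_common(1)[0][0] if counts else "O")
    return aggregated


if __name__ == "__main__":
    sentence = "The trojan Emotet exploits CVE-2017-11882 .".split()
    functions = [lf_malware_gazetteer, lf_cve_regex, lf_trigger_word]
    for token, label in zip(sentence, aggregate(sentence, functions)):
        print(f"{token}\t{label}")
```

In a weak-supervision pipeline of this general shape, the aggregated labels would serve as noisy training targets for a downstream sequence tagger, which is the role the abstract assigns to the label predictor.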

Published In

ACM Other conferences
AISS '22: Proceedings of the 4th International Conference on Advanced Information Science and System
November 2022
396 pages
ISBN: 9781450397933
DOI: 10.1145/3573834
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. knowledge transfer
  2. threat intelligence
  3. weak supervision

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Opening Project of Intelligent Policing Key Laboratory of Sichuan Province
  • Criminal Examination Key Laboratory of Sichuan Province

Conference

AISS 2022

Acceptance Rates

Overall Acceptance Rate 41 of 95 submissions, 43%
