
APTBert: Abstract Generation and Event Extraction from APT Reports

  • Conference paper
Digital Forensics and Cyber Crime (ICDF2C 2023)

Abstract

Due to the rapid development of information technology in this century, APT (Advanced Persistent Threat) attacks occur more and more frequently. The best way to combat APTs is to quickly extract and integrate the roles of the attack events described in published APT reports, so that security professionals can further perceive, analyze, and prevent such attacks. With these issues in mind, an event extraction model for APT attacks, called APTBert, is proposed. The model feeds security-domain text representations generated by the APTBert pre-training model into a multi-head self-attention neural network for training, improving the accuracy of sequence labelling. In the experiments, we first pre-trained the APTBert model on 1,300 open-source APT attack reports collected from security vendors and forums. We then annotated 600 APT reports with event roles, which were used to train the extraction model and evaluate the effect of event extraction. Experimental results show that the proposed method outperforms traditional extraction methods such as BiLSTM in both training time and F1 score (77.4%).
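The abstract describes feeding contextual token representations from the domain pre-trained APTBert encoder into a multi-head self-attention network for event-role sequence labelling. As a rough, non-authoritative PyTorch sketch of how such a tagging head could be wired (the hidden size, head count, and label count below are illustrative assumptions, not values taken from the paper):

import torch
import torch.nn as nn

class EventRoleTagger(nn.Module):
    """Sketch: multi-head self-attention tagging head over encoder embeddings."""
    def __init__(self, hidden_size=768, num_heads=8, num_labels=9):
        super().__init__()
        # Multi-head self-attention over the contextual token embeddings
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_size)                 # residual + layer norm, Transformer-style
        self.classifier = nn.Linear(hidden_size, num_labels)  # per-token event-role (e.g. BIO) logits

    def forward(self, embeddings, padding_mask=None):
        # embeddings: (batch, seq_len, hidden_size) produced by the pre-trained encoder
        attn_out, _ = self.attn(embeddings, embeddings, embeddings,
                                key_padding_mask=padding_mask)
        hidden = self.norm(embeddings + attn_out)
        return self.classifier(hidden)                        # (batch, seq_len, num_labels)

# Example with random tensors standing in for APTBert encoder output.
dummy_embeddings = torch.randn(2, 128, 768)
tagger = EventRoleTagger()
logits = tagger(dummy_embeddings)                             # shape: (2, 128, 9)

In practice, such a head would be trained with a token-level cross-entropy loss over the annotated event-role labels, which is also how sequence-labelling baselines such as BiLSTM taggers are typically trained.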



Acknowledgment

This work was supported in part by the National Key Research and Development Program of China (No. 2021YFB3100500), the Sichuan Science and Technology Program (No. 2023YFG0162), and the Open Fund of the Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation (No. CSSAE-2021-001).

Author information

Correspondence to Cheng Huang.


Copyright information

© 2024 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper


Cite this paper

Zhou, C., Huang, C., Wang, Y., Zuo, Z. (2024). APTBert: Abstract Generation and Event Extraction from APT Reports. In: Goel, S., Nunes de Souza, P.R. (eds) Digital Forensics and Cyber Crime. ICDF2C 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 571. Springer, Cham. https://doi.org/10.1007/978-3-031-56583-0_14


  • DOI: https://doi.org/10.1007/978-3-031-56583-0_14


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56582-3

  • Online ISBN: 978-3-031-56583-0

  • eBook Packages: Computer Science, Computer Science (R0)
