Abstract
Due to the rapid development of information technology in this century, APT attacks(Advanced Persistent Threat) occur more frequently. The best way to combat APT is to quickly extract and integrate the roles of the attack events involved in the report from the APT reports that have been released, and to further perceive, analyze and prevent APT for the relevant security professionals. With the above issues in mind, an event extraction model for APT attack is proposed. This model, which is called APTBert, uses targeted text characterization results from the security filed text generated by the APTBert pre-training model to feed into the multi-head self-attention mechanism neural network for training, improving the accuracy of sequence labelling. At the experiment stage, on the basis of 1300 open source APT attack reports from security vendors and forums, we first pre-trained an APTBert pre-training model. We ended up annotating 600 APT reports with event roles, which were used to train the extraction model and evaluate the effect of event extraction. Experiment results show that the proposed method has better performance in training time and F1(77.4%) as compared to traditional extraction methods like BiLSTM.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Moon, D., Im, H., Kim, I., et al.: DTB-IDS: an intrusion detection system based on decision tree using behavior analysis for preventing APT attacks. J. Supercomput. 73, 2881ā2895 (2017). https://doi.org/10.1007/s11227-015-1604-8
Rush, A.M., Chopra, S., Weston, J.: A neural attention model for abstractive sentence summarization. In: Conference on Empirical Methods in Natural Language Processing (2015)
Mihalcea, R., Tarau, P.: TextRank: bringing order into text. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pp. 404ā411 (2004)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3(Jan), 993ā1022 (2003)
Ostendorff, M., et al.: Enriching BERT with knowledge graph embeddings for document classification. arXiv preprint: arXiv:1909.08402 (2019)
Lou, D., et al.: MLBiNet: a cross-sentence collective event detection network. arXiv preprint: arXiv:2105.09458 (2021)
Chen, Y., et al.: Event extraction via dynamic multi-pooling convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 167ā176 (2015)
Yagcioglu, S., et al.: Detecting cybersecurity events from noisy short text. arXiv preprint: arXiv:1904.05054 (2019)
LeCun, Y., Bottou, L., Bengio, Y., et al.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278ā2324 (1998)
Luo, N., et al.: A framework for document-level cybersecurity event extraction from open source data. In: 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pp. 422ā427. IEEE (2021)
Piskorski, J., Tanev, H., Balahur, A.: Exploiting twitter for border security-related intelligence gathering. In: 2013 European Intelligence and Security Informatics Conference, pp. 239ā246. IEEE (2013)
Burr, B., et al.: On the detection of persistent attacks using alert graphs and event feature embeddings. In: NOMS 2020ā2020 IEEE/IFIP Network Operations and Management Symposium, pp. 1ā4. IEEE (2020)
Nguyen, T.H., Cho, K., Grishman, R.: Joint event extraction via recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 300ā309. Association for Computational Linguistics, San Diego, California (2016)
Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451ā2471 (2000). https://doi.org/10.1162/089976600300015015
Guo, Q., Huang, J., Xiong, N., Wang, P.: MS-pointer network: abstractive text summary based on multi-head self-attention. IEEE Access 7, 138603ā138613 (2019). https://doi.org/10.1109/ACCESS.2019.2941964
Satyapanich, T., Ferraro, F., Finin, T.: CASIE: extracting cybersecurity event information from text. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 05, pp. 8749ā8757 (2020)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
He, K., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770ā778 (2016)
Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization. arXiv preprint: arXiv:1607.06450 (2016)
Santurkar, S., et al.: How does batch normalization help optimization? In: Advances in Neural Information Processing Systems, vol. 31 (2018)
Manning, C.D., et al.: The Stanford CoreNLP natural language processing toolkit. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55ā60 (2014)
Wang, Q., Mao, Z., Wang, B., et al.: Knowledge graph embedding: a survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 29(12), 2724ā2743 (2017)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735ā1780 (1997)
Chung, J., et al.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint: arXiv:1412.3555 (2014)
Huang, Z., Xu, W., Yu, K.: Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint: arXiv:1508.01991 (2015)
Acknowledgment
This work was supported in part by National Key Research and Development Program of China (No.2021YFB3100500), Sichuan Science and Technology Program (No.2023YFG0162), and Open Fund of Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation (No.CSSAE-2021-001).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
Ā© 2024 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Zhou, C., Huang, C., Wang, Y., Zuo, Z. (2024). APTBert: Abstract Generation and Event Extraction from APT Reports. In: Goel, S., Nunes de Souza, P.R. (eds) Digital Forensics and Cyber Crime. ICDF2C 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 571. Springer, Cham. https://doi.org/10.1007/978-3-031-56583-0_14
Download citation
DOI: https://doi.org/10.1007/978-3-031-56583-0_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56582-3
Online ISBN: 978-3-031-56583-0
eBook Packages: Computer ScienceComputer Science (R0)