A BERT-Based Framework for Automated Extraction of Behavioral Indicators of Compromise from Security Incident Reports

Bekhouche, Mohamed El Amine; Adi, Kamel

doi:10.1007/978-3-031-57537-2_14

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14551)

Included in the following conference series:

International Symposium on Foundations and Practice of Security

Abstract

The exponential growth of cyberattacks in recent years has highlighted the inadequacy of existing detection mechanisms and, therefore, the need to develop more relevant predictive models and methods in the field of Cyber Threat Intelligence (CTI). Many cybersecurity systems use behavioral indicators of compromise (IoCs), such as tactics, techniques, and procedures (TTPs), to design their defense strategies and detect future attack attempts at an early stage. Behavioral IoCs are typically gathered from unstructured incident reports, often written in natural language, and extracted through manual analysis by cybersecurity experts. However, given the huge number of reports released daily, this manual task has become too difficult and time-consuming to perform effectively. In this paper, we propose a framework based on Bidirectional Encoder Representations from Transformers (BERT) to identify and recognize behavioral IoCs in incident reports. Our results show a significant improvement in F1-score over state-of-the-art approaches.

Acknowledgments

I am grateful to Prof. Kamel ADI for his mentorship and guidance throughout this paper. I also extend my thanks to the members of the Computer Security Research Laboratory (LRSI) at the University of Quebec in Outaouais for their collaborative support and insightful discussions that greatly enhanced this work.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Quebec in Outaouais (UQO), Quebec, Canada
Mohamed El Amine Bekhouche & Kamel Adi

Corresponding author

Correspondence to Mohamed El Amine Bekhouche.

Editor information

Editors and Affiliations

University of Bordeaux, Bordeaux, France
Mohamed Mosbah
Toulouse III - Paul Sabatier University, Toulouse, France
Florence Sèdes
Université Laval, Québec, QC, Canada
Nadia Tawbi
University of Bordeaux, Bordeaux, France
Toufik Ahmed
Polytechnique Montréal, Montreal, QC, Canada
Nora Boulahia-Cuppens
Telecom SudParis, Palaiseau, France
Joaquin Garcia-Alfaro

Cite this paper

Bekhouche, M.E.A., Adi, K. (2024). A BERT-Based Framework for Automated Extraction of Behavioral Indicators of Compromise from Security Incident Reports. In: Mosbah, M., Sèdes, F., Tawbi, N., Ahmed, T., Boulahia-Cuppens, N., Garcia-Alfaro, J. (eds) Foundations and Practice of Security. FPS 2023. Lecture Notes in Computer Science, vol 14551. Springer, Cham. https://doi.org/10.1007/978-3-031-57537-2_14

DOI: https://doi.org/10.1007/978-3-031-57537-2_14
Published: 25 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-57536-5
Online ISBN: 978-3-031-57537-2
eBook Packages: Computer Science, Computer Science (R0)
