FAF-BM: An Approach for False Alerts Filtering Using BERT Model with Semi-supervised Active Learning

Du, Dan; Li, Yunpeng; Cao, Yiyang; Liu, Yuling; Meng, Guozhu; Li, Ning; Han, Dongxu; Feng, Huamin

doi:10.1007/978-981-96-2417-1_16

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 15441))

Included in the following conference series:

International Conference on Science of Cyber Security

Abstract

In cybersecurity, the deluge of alerts exceeds what human analysts can review. Despite existing solutions, more advanced methods are still needed to improve the effectiveness and accuracy of false alert filtering. In this paper, we propose FAF-BM, an approach that integrates the BERT model, semi-supervised learning, and active learning to enhance alert filtering. FAF-BM uses a fine-tuned BERT model to exploit the deep semantics of alerts without being constrained by alert format. Semi-supervised learning then mines the information in unlabeled data, extending learning beyond the labeled dataset. In addition, active learning draws on the expertise of security professionals to guide the learning process, so that the approach adapts to the evolving threat landscape. Experiments demonstrate that FAF-BM not only improves filtering effectiveness but also generalizes better across heterogeneous alerts.
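
The abstract does not spell out how semi-supervised learning and active learning share the unlabeled alerts, so the following is only a minimal sketch of one common arrangement, not the authors' method. It assumes a classifier (for example, a fine-tuned BERT model) has already produced a probability that each unlabeled alert is a false alert. Confident predictions become pseudo-labels, and the most uncertain alerts are sent to an analyst. The function name `split_unlabeled`, the threshold `confident`, and the query `budget` are hypothetical names chosen for illustration.

```python
from typing import List, Tuple


def split_unlabeled(
    probs: List[float],
    confident: float = 0.95,
    budget: int = 2,
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Partition unlabeled alerts by the classifier's P(false alert).

    Returns (pseudo_labels, queries):
      pseudo_labels: (index, label) pairs for confident predictions,
                     where 1 = false alert and 0 = true alert
      queries:       indices of up to `budget` uncertain alerts to send
                     to a security analyst for manual labeling
    All thresholds here are illustrative assumptions.
    """
    pseudo_labels: List[Tuple[int, int]] = []
    uncertain: List[int] = []
    for i, p in enumerate(probs):
        if p >= confident:
            pseudo_labels.append((i, 1))   # confidently a false alert
        elif p <= 1.0 - confident:
            pseudo_labels.append((i, 0))   # confidently a true alert
        else:
            uncertain.append(i)            # model is unsure
    # Uncertainty sampling: alerts closest to P = 0.5 are queried first.
    uncertain.sort(key=lambda i: abs(probs[i] - 0.5))
    return pseudo_labels, uncertain[:budget]


# Example probabilities standing in for model outputs on six alerts.
probs = [0.99, 0.52, 0.03, 0.70, 0.45, 0.97]
pseudo, queries = split_unlabeled(probs)
print(pseudo)
print(queries)
```

In a full loop, the pseudo-labeled and analyst-labeled alerts would be added to the training set and the model fine-tuned again before the next round. The choice of threshold and budget trades labeling cost against the risk of training on wrong pseudo-labels.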

Acknowledgments

This study is funded by a science and technology project of the headquarters of State Grid Corporation of China (Research and Application of Network Security Situation Awareness Technology Based on Knowledge Graph, Project Code: 5700-202352606A-3-2-ZN). It is also supported by the Key Laboratory of Network Assessment Technology at the Chinese Academy of Sciences and the Beijing Key Laboratory of Network Security and Protection Technology.

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Dan Du, Yunpeng Li, Yiyang Cao, Yuling Liu, Guozhu Meng, Ning Li & Dongxu Han
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Dan Du, Yunpeng Li, Yiyang Cao, Yuling Liu, Guozhu Meng, Ning Li & Dongxu Han
Beijing Electronic Science and Technology Institute, Beijing, China
Huamin Feng

Corresponding author

Correspondence to Dongxu Han.

Editor information

Editors and Affiliations

Nanyang Technological University, Singapore, Singapore
Jun Zhao
Technical University of Denmark, Kongens Lyngby, Denmark
Weizhi Meng

About this paper

Cite this paper

Du, D. et al. (2025). FAF-BM: An Approach for False Alerts Filtering Using BERT Model with Semi-supervised Active Learning. In: Zhao, J., Meng, W. (eds) Science of Cyber Security. SciSec 2024. Lecture Notes in Computer Science, vol 15441. Springer, Singapore. https://doi.org/10.1007/978-981-96-2417-1_16

DOI: https://doi.org/10.1007/978-981-96-2417-1_16
Published: 04 March 2025
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-2416-4
Online ISBN: 978-981-96-2417-1
eBook Packages: Computer Science, Computer Science (R0)
