Log Analysis for Feature Engineering and Application of a Boosting Algorithm to Detect Insider Threats

Besnaci, Samiha; Hafidi, Mohamed; Lamia, Mahnane

doi:10.1007/978-3-031-46335-8_21

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1940)

Included in the following conference series:

International Conference on Intelligent Systems and Pattern Recognition

Abstract

Insider threats have attracted considerable research attention because they are a sensitive and critical issue for most organizations in today’s digital world. They are also a major source of information security risk and can cause more damage and financial loss than any other threat. In this article, we apply feature engineering to derive features that represent users’ day-to-day activities. We evaluated several machine learning models, including random forest, XGBoost, and CatBoost. The data used to detect malicious activity is imbalanced: the malicious (target) class is small. We therefore used KMeansSMOTE to balance the training classes so that the algorithms can learn both classes well, and we used the CatBoost algorithm to identify malicious users. The model was evaluated on the CERT v4.2 dataset. CatBoost outperformed the other models, achieving the highest F1-score, 95%.
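As a rough illustration of two ideas in the abstract, the sketch below aggregates raw log events into per-user daily activity counts and computes an F1-score on an imbalanced label set. It is a minimal sketch under stated assumptions, not the authors' pipeline: the record layout, the `after_hours` feature, the working-hours window, and the function names are all hypothetical, and the class balancing (KMeansSMOTE) and CatBoost training steps are omitted.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log records: (timestamp, user, activity). This layout is an
# assumption for illustration, not the schema of the CERT dataset.
EVENTS = [
    ("2010-01-04 08:02:11", "u1", "logon"),
    ("2010-01-04 08:15:40", "u1", "usb_connect"),
    ("2010-01-04 23:47:05", "u1", "logon"),
    ("2010-01-04 09:01:00", "u2", "logon"),
    ("2010-01-05 09:03:30", "u2", "logon"),
]


def daily_features(events, work_start=8, work_end=18):
    """Aggregate raw events into one feature dict per (user, day).

    Each activity type becomes a count; events outside the assumed
    working hours [work_start, work_end) also increment "after_hours".
    """
    rows = defaultdict(Counter)
    for stamp, user, activity in events:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        key = (user, ts.date().isoformat())
        rows[key][activity] += 1
        if not (work_start <= ts.hour < work_end):
            rows[key]["after_hours"] += 1
    return {key: dict(counts) for key, counts in rows.items()}


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


features = daily_features(EVENTS)

# 98 benign and 2 malicious samples: predicting "benign" for everything is
# 98% accurate yet detects no malicious user, which F1 makes visible.
y_true = [0] * 98 + [1] * 2
all_benign = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, all_benign)) / len(y_true)
f1_all_benign = f1_score(y_true, all_benign)
```

Here `accuracy` is 0.98 while `f1_all_benign` is 0.0, which is one reason an imbalanced detection task is usually reported with F1 rather than accuracy.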

Author information

Authors and Affiliations

LRS Laboratory, computer science department, Badji Mokhtar University, Annaba, Algeria
Samiha Besnaci, Mohamed Hafidi & Mahnane Lamia

Corresponding author

Correspondence to Samiha Besnaci.

Editor information

Editors and Affiliations

Larbi Tebessi University, Tebessa, Algeria
Akram Bennour
Sharjah University, Sharjah, United Arab Emirates
Ahmed Bouridane
University of Toulouse, Toulouse, France
Lotfi Chaari

Cite this paper

Besnaci, S., Hafidi, M., Lamia, M. (2024). Log Analysis for Feature Engineering and Application of a Boosting Algorithm to Detect Insider Threats. In: Bennour, A., Bouridane, A., Chaari, L. (eds) Intelligent Systems and Pattern Recognition. ISPR 2023. Communications in Computer and Information Science, vol 1940. Springer, Cham. https://doi.org/10.1007/978-3-031-46335-8_21

DOI: https://doi.org/10.1007/978-3-031-46335-8_21
Published: 05 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46334-1
Online ISBN: 978-3-031-46335-8
eBook Packages: Computer Science, Computer Science (R0)
