Abstract
The insider threat has attracted the attention of many researchers because it is a sensitive and critical issue for most organizations in today’s digital world. It is also a major source of information security risk and can cause more damage and financial loss than any other threat. In this article, we apply feature engineering to derive features that represent users’ day-to-day activities. We evaluated several machine learning models, including random forest, XGBoost, and CatBoost. The data used to detect malicious activity is imbalanced: the malicious (target) class is small. We therefore used KMeansSMOTE to balance the training classes so that the algorithms can learn both classes well, and we used the CatBoost algorithm to identify malicious users. The model was evaluated on the CERT v4.2 dataset. CatBoost outperformed the other models, achieving the highest F1-score of 95%.
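To illustrate why the F1-score is a natural metric when the malicious class is small, the following minimal sketch compares accuracy and F1 on a toy label set. The labels, the 95/5 class split, and the helper name `f1_report` are hypothetical and chosen only for illustration; they are not taken from the CERT data or from the authors' experiments.

```python
def f1_report(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Guard against division by zero when nothing is predicted positive.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical toy labels: 95 normal users (0) and 5 malicious users (1).
y_true = [0] * 95 + [1] * 5

# A trivial model that never flags anyone.
always_normal = [0] * 100

# A detector with one false alarm that catches 4 of the 5 malicious users.
detector = [0] * 94 + [1] + [1] * 4 + [0]

for name, y_pred in [("always_normal", always_normal), ("detector", detector)]:
    acc, prec, rec, f1 = f1_report(y_true, y_pred)
    print(f"{name}: accuracy={acc:.2f} precision={prec:.2f} "
          f"recall={rec:.2f} f1={f1:.2f}")
```

In this toy setting the trivial model reaches high accuracy while its F1-score for the malicious class is zero, which is why accuracy alone can be misleading on imbalanced data.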
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Besnaci, S., Hafidi, M., Lamia, M. (2024). Log Analysis for Feature Engineering and Application of a Boosting Algorithm to Detect Insider Threats. In: Bennour, A., Bouridane, A., Chaari, L. (eds) Intelligent Systems and Pattern Recognition. ISPR 2023. Communications in Computer and Information Science, vol 1940. Springer, Cham. https://doi.org/10.1007/978-3-031-46335-8_21
DOI: https://doi.org/10.1007/978-3-031-46335-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46334-1
Online ISBN: 978-3-031-46335-8
eBook Packages: Computer Science, Computer Science (R0)