Log Analysis for Feature Engineering and Application of a Boosting Algorithm to Detect Insider Threats

  • Conference paper
Intelligent Systems and Pattern Recognition (ISPR 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1940)

Abstract

The insider threat has attracted the attention of many researchers, as it is a sensitive and critical issue for most organizations in today’s digital world. It is also a major source of information security risk and can cause more damage and financial loss than any other threat. In this article, we apply feature engineering to build features that represent users’ day-to-day activities. We tried different machine learning models such as random forest, XGBoost and CatBoost. The data used to detect malicious activity is imbalanced: the target (malicious) class is small. We therefore used KMeansSMOTE to balance the training classes so that the algorithms can learn both classes well, and we used the CatBoost algorithm to identify malicious users. The dataset used to evaluate this model is CERT v4.2. CatBoost outperformed the other models, with the highest F1-score of 95%.
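To make the class-balancing step concrete, here is a minimal sketch of SMOTE-style oversampling in plain Python. It is not the authors' implementation: it shows only the interpolation idea (a synthetic minority row is placed on the segment between a minority row and one of its nearest minority neighbours), while KMeansSMOTE additionally clusters the data first and oversamples only inside selected clusters. The function name `smote_oversample`, the parameter `k`, the binary labels and the toy feature values are all assumptions made for illustration.

```python
import random
from collections import Counter


def smote_oversample(X, y, minority_label, k=3, seed=0):
    """Add synthetic minority rows until both classes have equal counts.

    Assumes binary labels. Each synthetic row is a random point on the
    segment between a minority row and one of its k nearest minority
    neighbours (plain SMOTE). The k-means clustering step that
    KMeansSMOTE performs beforehand is deliberately omitted.
    """
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_needed = n_majority - len(minority)
    if len(minority) < 2 or n_needed <= 0:
        return list(X), list(y)

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    X_new, y_new = list(X), list(y)
    for _ in range(n_needed):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base row, excluding itself.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sq_dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        X_new.append([b + gap * (o - b) for b, o in zip(base, other)])
        y_new.append(minority_label)
    return X_new, y_new


# Hypothetical toy data: two features per user-day, label 1 = malicious.
X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [1.1, 2.2], [1.0, 1.8], [0.8, 2.0],
     [5.0, 6.0], [5.5, 6.2], [5.2, 5.8]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = smote_oversample(X, y, minority_label=1)
print(Counter(y_bal))
```

Oversampling of this kind is normally applied to the training split only, so that synthetic rows do not leak into the evaluation data. In practice a maintained implementation (for example the `KMeansSMOTE` class in the imbalanced-learn package) would be preferable to a hand-written one.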

Author information

Corresponding author

Correspondence to Samiha Besnaci.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Besnaci, S., Hafidi, M., Lamia, M. (2024). Log Analysis for Feature Engineering and Application of a Boosting Algorithm to Detect Insider Threats. In: Bennour, A., Bouridane, A., Chaari, L. (eds) Intelligent Systems and Pattern Recognition. ISPR 2023. Communications in Computer and Information Science, vol 1940. Springer, Cham. https://doi.org/10.1007/978-3-031-46335-8_21

  • DOI: https://doi.org/10.1007/978-3-031-46335-8_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46334-1

  • Online ISBN: 978-3-031-46335-8

  • eBook Packages: Computer Science, Computer Science (R0)
