Abstract
Advanced persistent threats (APTs) are increasingly sophisticated and pose a serious risk to organizations’ cybersecurity. Detecting APT attacks promptly is crucial to limiting the damage they cause. However, hunting for APT attacks requires access to large amounts of sensitive data, which is typically spread across different organizations, making it difficult to train effective detection models while preserving data privacy. To address this challenge, this paper proposes XFedGraph-Hunter, an interpretable federated learning framework for detecting APT attacks in provenance graphs. The framework uses federated learning to train APT hunting models collaboratively on decentralized data stored on multiple devices, which helps preserve data privacy and security while improving model performance. The underlying machine learning (ML) model is GraphSAGE, and a pre-trained transformer model is incorporated into feature preprocessing to enhance GraphSAGE’s performance. In addition, GNNExplainer is employed to explain the model’s predictions, increasing transparency and interpretability. The framework is evaluated on the DARPA TC E3 datasets, with FedAvg as the federated learning algorithm. The results indicate that the framework detects APT attacks effectively, achieving high accuracy and F1 scores, and the explanations produced by GNNExplainer help identify the features that contribute to the detection of APT attacks. This collaborative approach enables multiple parties to contribute their data while preserving privacy, providing an effective and scalable solution for APT detection.
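To illustrate the aggregation step that FedAvg performs on the server, the following is a minimal sketch of sample-weighted parameter averaging. It assumes each client reports its model parameters as flat lists of floats together with its local sample count; the function name `fedavg` and the dictionary layout are hypothetical and are not taken from the paper's implementation.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def fedavg(client_weights: Sequence[Weights], client_sizes: Sequence[int]) -> Weights:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        acc = [0.0] * len(client_weights[0][name])
        for weights, size in zip(client_weights, client_sizes):
            for i, value in enumerate(weights[name]):
                # Each client contributes in proportion to its share of the data.
                acc[i] += value * size / total
        averaged[name] = acc
    return averaged


# Two hypothetical clients holding 1 and 3 local samples respectively.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
global_weights = fedavg(clients, [1, 3])
```

Only the parameters and sample counts reach the server in this sketch; the raw provenance data never leaves the clients, which is the property the framework relies on for privacy.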
Acknowledgement
This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM), Vietnam, under grant number DS2022-26-02.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Son, N.D.H., Thi, H.T., Duy, P.T., Pham, VH. (2023). XFedGraph-Hunter: An Interpretable Federated Learning Framework for Hunting Advanced Persistent Threat in Provenance Graph. In: Meng, W., Yan, Z., Piuri, V. (eds) Information Security Practice and Experience. ISPEC 2023. Lecture Notes in Computer Science, vol 14341. Springer, Singapore. https://doi.org/10.1007/978-981-99-7032-2_32
DOI: https://doi.org/10.1007/978-981-99-7032-2_32
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7031-5
Online ISBN: 978-981-99-7032-2
eBook Packages: Computer Science, Computer Science (R0)