Abstract
In response to the growing threat that Advanced Persistent Threats (APTs) pose to network security, we introduce APTMalKG, a knowledge graph for APT malware attribution. The graph is constructed from comprehensive APT malware data and refined through a multi-stage graph clustering process. Its ontology model captures complex APT malware characteristics and behaviors, which are extracted from sandbox analysis reports and expanded intelligence. To manage the graph’s granularity and scale, we categorize nodes based on domain knowledge, form a correlation subgraph, and progressively adjust similarity thresholds and edge weights. The refined graph retains the data crucial for attribution while reducing complexity. By integrating domain-specific meta-paths into the GraphSAGE graph embedding algorithm, we improve APT attribution performance, reaching an average accuracy of 91.16%, an F1 score of 89.82%, and an average AUC of 98.99%. This study offers network security analysts an intuitive knowledge graph, explores large-scale graph computing methods for practical scenarios, and provides a multi-dimensional perspective on APT malware analysis and attribution, highlighting the value of knowledge graphs in network security.
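To make the idea of meta-path-guided aggregation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy heterogeneous graph whose node names, node types, edges, and feature vectors are all hypothetical. It selects neighbors along a typed meta-path such as malware, domain, malware, then applies a GraphSAGE-style mean aggregation. The learned weight matrix and nonlinearity of a real GraphSAGE layer are omitted.

```python
from collections import defaultdict

# Hypothetical toy heterogeneous graph (not data from the paper).
NODE_TYPE = {
    "m1": "malware", "m2": "malware", "m3": "malware",
    "d1": "domain", "a1": "api",
}
EDGES = [("m1", "d1"), ("m2", "d1"), ("m2", "a1"), ("m3", "a1")]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def metapath_neighbors(adj, node_type, start, metapath):
    """Return nodes reached from `start` by following the node-type
    sequence in `metapath`, excluding `start` itself."""
    if node_type[start] != metapath[0]:
        return set()
    frontier = {start}
    for wanted in metapath[1:]:
        frontier = {n for u in frontier for n in adj[u]
                    if node_type[n] == wanted}
    frontier.discard(start)
    return frontier


def mean_aggregate(features, node, neighbors):
    """Concatenate a node's own features with the element-wise mean of
    its neighbors' features (zeros when there are no neighbors)."""
    dim = len(features[node])
    if neighbors:
        mean = [sum(features[n][i] for n in neighbors) / len(neighbors)
                for i in range(dim)]
    else:
        mean = [0.0] * dim
    return list(features[node]) + mean


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    # Malware samples that share a contacted domain with m1.
    shared = metapath_neighbors(adj, NODE_TYPE, "m1",
                                ["malware", "domain", "malware"])
    feats = {"m1": [1.0, 0.0], "m2": [0.0, 2.0], "m3": [4.0, 4.0]}
    print(sorted(shared))
    print(mean_aggregate(feats, "m1", shared))
```

Restricting aggregation to meta-path neighbors means each malware node's embedding is shaped only by samples related through a chosen kind of evidence, here a shared domain, instead of by every adjacent node.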
Supported by Youth Innovation Promotion Association, CAS (No. 2023170).
Funding
Supported by Youth Innovation Promotion Association, CAS (No. 2020166) and Youth Innovation Promotion Association, CAS (No. 2023170).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jing, R., Jiang, Z., Wang, Q., Wang, S., Li, H., Chen, X. (2024). From Fine-Grained to Refined: APT Malware Knowledge Graph Construction and Attribution Analysis Driven by Multi-stage Graph Computation. In: Franco, L., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2024. ICCS 2024. Lecture Notes in Computer Science, vol 14832. Springer, Cham. https://doi.org/10.1007/978-3-031-63749-0_6
DOI: https://doi.org/10.1007/978-3-031-63749-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63748-3
Online ISBN: 978-3-031-63749-0
eBook Packages: Computer Science, Computer Science (R0)