From Fine-Grained to Refined: APT Malware Knowledge Graph Construction and Attribution Analysis Driven by Multi-stage Graph Computation

Jing, Rongqi; Jiang, Zhengwei; Wang, Qiuyun; Wang, Shuwei; Li, Hao; Chen, Xiao

doi:10.1007/978-3-031-63749-0_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14832)

Included in the following conference series:

International Conference on Computational Science

Abstract

In response to the growing danger that Advanced Persistent Threats (APTs) pose to network security, we introduce APTMalKG, a knowledge graph for APT malware attribution. The graph is constructed from comprehensive APT malware data and refined through a multi-stage graph clustering process, and we incorporate domain-specific meta-paths into the GraphSAGE graph embedding algorithm to improve attribution. Our approach includes an ontology model that captures complex APT malware characteristics and behaviors, extracted from sandbox analysis reports and expanded intelligence. To manage the graph's granularity and scale, we categorize nodes based on domain knowledge, form a correlation subgraph, and progressively adjust similarity thresholds and edge weights. The refined graph retains crucial attribution data while reducing complexity. With domain-specific meta-paths integrated into GraphSAGE, APT attribution reaches an average accuracy of 91.16%, an F1 score of 89.82%, and an average AUC of 98.99%. This study gives network security analysts an intuitive knowledge graph, explores large-scale graph computing methods for practical scenarios, and offers a multi-dimensional perspective on APT malware analysis and attribution research, highlighting the value of knowledge graphs in network security.
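
The abstract does not give the refinement algorithm itself, so the following is only a minimal sketch of what a staged, threshold-driven graph coarsening could look like. It is not the authors' implementation. The function names (`coarsen`, `refine`), the union-find merging, and the choice to keep the maximum weight when parallel edges collapse are all assumptions made for illustration.

```python
from collections import defaultdict


def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def coarsen(edges, threshold):
    """Merge the endpoints of every edge whose weight >= threshold.

    edges: dict mapping (u, v) -> similarity weight.
    Returns (node -> cluster representative, coarsened edge dict).
    """
    parent = {}
    for u, v in edges:
        parent.setdefault(u, u)
        parent.setdefault(v, v)
    for (u, v), w in edges.items():
        if w >= threshold:
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                # Keep the smaller label as representative (arbitrary choice).
                parent[max(ru, rv)] = min(ru, rv)
    mapping = {n: find(parent, n) for n in parent}
    merged = defaultdict(float)
    for (u, v), w in edges.items():
        cu, cv = mapping[u], mapping[v]
        if cu != cv:
            key = (min(cu, cv), max(cu, cv))
            # Assumption: parallel edges collapse to their maximum weight.
            merged[key] = max(merged[key], w)
    return mapping, dict(merged)


def refine(edges, thresholds):
    """Apply coarsen() once per threshold, composing the cluster maps."""
    nodes = {n for e in edges for n in e}
    assignment = {n: n for n in nodes}
    for t in thresholds:
        mapping, edges = coarsen(edges, t)
        # Clusters that no longer touch any edge keep their current label.
        assignment = {n: mapping.get(c, c) for n, c in assignment.items()}
    return assignment, edges


if __name__ == "__main__":
    toy = {("a", "b"): 0.95, ("b", "c"): 0.7, ("c", "d"): 0.4}
    print(refine(toy, [0.9, 0.6]))
```

On the toy graph, the first stage (threshold 0.9) merges `a` and `b`, and the second stage (threshold 0.6) absorbs `c`, leaving one weak edge to `d`. A real pipeline would presumably choose thresholds per node category and use a proper community detection algorithm rather than plain connected components.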

Supported by Youth Innovation Promotion Association, CAS (No. 2023170).

Funding

Supported by Youth Innovation Promotion Association, CAS (No. 2020166) and Youth Innovation Promotion Association, CAS (No. 2023170).

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China
Rongqi Jing, Zhengwei Jiang, Qiuyun Wang, Shuwei Wang, Hao Li & Xiao Chen
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, 100049, China
Rongqi Jing, Zhengwei Jiang, Qiuyun Wang, Shuwei Wang, Hao Li & Xiao Chen

Corresponding author

Correspondence to Qiuyun Wang.

Editor information

Editors and Affiliations

University of Malaga, Malaga, Spain
Leonardo Franco
University of Amsterdam, Amsterdam, The Netherlands
Clélia de Mulatier
AGH University of Science and Technology, Krakow, Poland
Maciej Paszynski
University of Amsterdam, Amsterdam, The Netherlands
Valeria V. Krzhizhanovskaya
University of Tennessee, Knoxville, TN, USA
Jack J. Dongarra
University of Amsterdam, Amsterdam, The Netherlands
Peter M. A. Sloot

About this paper

Cite this paper

Jing, R., Jiang, Z., Wang, Q., Wang, S., Li, H., Chen, X. (2024). From Fine-Grained to Refined: APT Malware Knowledge Graph Construction and Attribution Analysis Driven by Multi-stage Graph Computation. In: Franco, L., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2024. ICCS 2024. Lecture Notes in Computer Science, vol 14832. Springer, Cham. https://doi.org/10.1007/978-3-031-63749-0_6

DOI: https://doi.org/10.1007/978-3-031-63749-0_6
Published: 28 June 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63748-3
Online ISBN: 978-3-031-63749-0
eBook Packages: Computer Science, Computer Science (R0)
