Abstract
Malware homology identification is important for tracing attack events, generating emergency response schemes, and predicting event trends. Current malware homology identification methods still rely on manual analysis, which is inefficient and cannot respond quickly to outbreaks of attack events. To address these problems, we propose a new malware homology identification method from a gene perspective. A malware gene is represented by a subgraph that can describe the homology of malware families. We extract key subgraphs from the function dependency graph as malware genes by selecting key application programming interfaces (APIs) and applying a community partition algorithm. Then, we encode the genes and design a frequent subgraph mining algorithm to find the common genes between malware families. Finally, we use the family genes to guide homology-based malware identification. We evaluate our method on a public dataset, and the experimental results show that the accuracy of malware classification reaches 97% with high efficiency.
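The pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it (`KEY_APIS`, `communities`, `extract_genes`, `family_genes`, `identify`) is hypothetical. It makes three simplifying assumptions: connected components stand in for the community partition algorithm, exact matching of encoded edge sets stands in for frequent subgraph mining, and node labels are assumed to be comparable across samples.

```python
from collections import Counter

# Hypothetical set of "key" APIs; the paper's selection criteria are not given here.
KEY_APIS = {"CreateRemoteThread", "WriteProcessMemory", "RegSetValueExA"}


def communities(edges):
    """Group nodes into connected components (a simple stand-in for community partition)."""
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)
    seen, groups = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(group)
    return groups


def extract_genes(edges, key_apis=KEY_APIS):
    """Encode each community that touches a key API as a sorted tuple of its edges."""
    genes = set()
    for group in communities(edges):
        if group & key_apis:
            genes.add(tuple(sorted((s, d) for s, d in edges if s in group)))
    return genes


def family_genes(samples, min_support=0.5):
    """Keep genes found in at least min_support of a family's samples (exact match only)."""
    counts = Counter(g for edges in samples for g in extract_genes(edges))
    return {g for g, n in counts.items() if n / len(samples) >= min_support}


def identify(edges, families):
    """Return the family sharing the most genes with the sample, or None if none match."""
    genes = extract_genes(edges)
    scores = {name: len(genes & fam) for name, fam in families.items()}
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy function dependency graphs as (caller, callee) edges; all data is made up.
injector_a = [("main", "inject"), ("inject", "WriteProcessMemory"),
              ("inject", "CreateRemoteThread"), ("log", "printf")]
injector_b = [("main", "inject"), ("inject", "WriteProcessMemory"),
              ("inject", "CreateRemoteThread")]
persist_a = [("start", "persist"), ("persist", "RegSetValueExA")]
persist_b = [("start", "persist"), ("persist", "RegSetValueExA"), ("helper", "strlen")]

families = {
    "injector": family_genes([injector_a, injector_b]),
    "persistence": family_genes([persist_a, persist_b]),
}

unknown = [("main", "inject"), ("inject", "WriteProcessMemory"),
           ("inject", "CreateRemoteThread"), ("ui", "MessageBoxA")]
print(identify(unknown, families))  # injector
```

A real system would need a genuine community detection algorithm and subgraph matching that tolerates renamed functions; the sketch only shows how genes extracted per sample can be aggregated per family and then used to label a new sample.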
Additional information
Project supported by the National Natural Science Foundation of China (Nos. 61472447 and 61802435)
About this article
Cite this article
Zhao, Bl., Shan, Z., Liu, Fd. et al. Malware homology identification based on a gene perspective. Frontiers Inf Technol Electronic Eng 20, 801–815 (2019). https://doi.org/10.1631/FITEE.1800523
DOI: https://doi.org/10.1631/FITEE.1800523