Abstract
Metamorphic malware has recently attracted considerable research interest because it can transform its own program code stochastically. Many detectors fail to detect this malware family because of how quickly it obfuscates its code. Machine learning (ML) models have also been shown to lack robustness against these attacks, because the constant code mutation of metamorphic malware leaves insufficient data to train them. Although recent studies have shown how to generate samples of metamorphic malware to serve as training data, this process can be computationally expensive. One way to improve the performance of these ML models is to transfer learning from other fields that have robust models, as has been done with the transfer of learning from computer vision and image processing to improve malware detection. In this work, we introduce an evolutionary-based transfer learning approach that uses evolved mutants of malware generated by a traditional Evolutionary Algorithm (EA), together with models from Natural Language Processing (NLP) text classification, to improve the classification of metamorphic malware. Our preliminary results demonstrate that using NLP models can improve the classification of metamorphic malware in some instances.
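The pipeline outlined in the abstract (evolve mutants, then classify behaviour as text) can be illustrated with a toy sketch. Everything below is an assumption made for illustration, since the abstract does not specify the chapter's representation, variation operators, fitness function or NLP models: traces are lists of made-up system-call tokens, the evolutionary loop uses insertion mutation with elitist truncation selection on a bigram Jaccard distance, and a bag-of-tokens nearest-centroid classifier stands in for a pretrained NLP text classifier. No transfer learning is performed here.

```python
import random
from collections import Counter

# Hypothetical neutral tokens a mutant may insert; not taken from the chapter.
NOISE_TOKENS = ["getpid", "gettimeofday", "nanosleep"]


def mutate(trace, rng):
    """Copy `trace` and insert one noise token at a random position."""
    child = list(trace)
    child.insert(rng.randrange(len(child) + 1), rng.choice(NOISE_TOKENS))
    return child


def bigram_distance(a, b):
    """Jaccard distance between the sets of adjacent token pairs of a and b."""
    ga, gb = set(zip(a, a[1:])), set(zip(b, b[1:]))
    union = ga | gb
    return 1.0 - len(ga & gb) / len(union) if union else 0.0


def evolve(parent, generations=3, pop_size=4, seed=0):
    """Toy EA: mutate random members, keep the pop_size most distant from parent."""
    rng = random.Random(seed)
    population = [list(parent)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        ranked = sorted(population + children,
                        key=lambda t: bigram_distance(parent, t), reverse=True)
        population = ranked[:pop_size]
    return population


def cosine(c1, c2):
    """Cosine similarity between two token-count vectors."""
    dot = sum(c1[t] * c2[t] for t in c1)
    norm = (sum(v * v for v in c1.values()) * sum(v * v for v in c2.values())) ** 0.5
    return dot / norm if norm else 0.0


def train(labelled_traces):
    """Sum the token counts of each class into one centroid per label."""
    centroids = {}
    for trace, label in labelled_traces:
        centroids.setdefault(label, Counter()).update(trace)
    return centroids


def predict(centroids, trace):
    """Return the label whose centroid is most similar to the trace."""
    bag = Counter(trace)
    return max(centroids, key=lambda label: cosine(centroids[label], bag))


# Made-up traces used only to exercise the sketch.
MALICIOUS = ["open", "read", "connect", "sendto", "unlink"]
BENIGN = ["open", "read", "write", "close", "exit"]

mutants = evolve(MALICIOUS)
# Augment the training set with the evolved mutants, all labelled as malware.
model = train([(BENIGN, "benign"), (MALICIOUS, "malware")]
              + [(m, "malware") for m in mutants])
unseen = mutate(MALICIOUS, random.Random(1))
print(predict(model, unseen), predict(model, BENIGN))
```

In a fuller setting the token sequences would presumably be passed to a pretrained text model rather than a count-based centroid, but that step depends on details the abstract leaves out.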
Notes
- 1. Droidbox - https://www.honeynet.org/taxonomy/term/191.
- 2. Google Play - https://play.google.com/store?hl=en.
- 3. Apkdownloader - https://apps.evozi.com/apk-downloader/.
- 4. Wandoujia Play - www.wandoujia.com.
- 5. Contagio Minidump - http://contagiominidump.blogspot.com/2015/01/android-hideicon-malware-samples.html.
- 6. Malgenome - http://www.malgenomeproject.org/.
- 7.
- 8.
- 9.
- 10. Strace - https://linux.die.net/man/1/strace.
- 11. Monkeyrunner - https://developer.android.com/studio/test/monkey.
- 12. Keras - https://github.com/fchollet/keras.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Babaagba, K.O., Ayodele, M. (2023). Evolutionary Based Transfer Learning Approach to Improving Classification of Metamorphic Malware. In: Correia, J., Smith, S., Qaddoura, R. (eds) Applications of Evolutionary Computation. EvoApplications 2023. Lecture Notes in Computer Science, vol 13989. Springer, Cham. https://doi.org/10.1007/978-3-031-30229-9_11
DOI: https://doi.org/10.1007/978-3-031-30229-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30228-2
Online ISBN: 978-3-031-30229-9
eBook Packages: Computer Science (R0)