Evolutionary Based Transfer Learning Approach to Improving Classification of Metamorphic Malware

  • Conference paper
Applications of Evolutionary Computation (EvoApplications 2023)

Abstract

Metamorphic malware has recently attracted considerable research interest because of its ability to transform its program code stochastically. Many detectors fail to detect this malware family because of how quickly it obfuscates its code. Machine learning (ML) models have also been shown not to be robust to these attacks, because the constant code mutation of metamorphic malware leaves insufficient data to train them. Although recent studies have shown how to generate samples of metamorphic malware to serve as training data, this process can be computationally expensive. One way to improve the performance of these ML models is to transfer learning from other fields that have robust models, as has been done with the transfer of learning from computer vision and image processing to improve malware detection. In this work, we introduce an evolutionary-based transfer learning approach that uses evolved mutants of malware generated by a traditional Evolutionary Algorithm (EA), together with models from Natural Language Processing (NLP) text classification, to improve the classification of metamorphic malware. Our preliminary results demonstrate that using NLP models can improve the classification of metamorphic malware in some instances.
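
As a rough illustration of the idea in the abstract, the following minimal sketch treats a behaviour trace as a short text, augments the malware class with mutated variants, and trains a text classifier on the result. It is not the authors' implementation: the token names, the toy traces, and the two mutation operators (junk insertion and adjacent swap) are hypothetical, only the variation step of an EA is shown (no fitness evaluation or selection), and a small bag-of-words naive Bayes classifier stands in for the NLP text classification models.

```python
import math
import random
from collections import Counter

# Toy behaviour traces as space-separated tokens; all names are hypothetical.
MALWARE = ["open read send_sms connect write close"]
BENIGN = ["open read render write close", "open render render close"]


def mutate(trace, rng):
    """Return a variant of `trace`: insert a junk token or swap two neighbours."""
    tokens = trace.split()
    if rng.random() < 0.5:
        tokens.insert(rng.randrange(len(tokens) + 1), "nop")
    else:
        i = rng.randrange(len(tokens) - 1)
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    return " ".join(tokens)


def train(samples):
    """Fit multinomial naive Bayes on (trace, label) pairs using token counts."""
    counts, docs = {}, Counter()
    for trace, label in samples:
        counts.setdefault(label, Counter()).update(trace.split())
        docs[label] += 1
    vocab = set().union(*counts.values())
    return counts, docs, vocab


def predict(model, trace):
    """Return the label with the highest log-posterior for `trace`."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label, tokens in counts.items():
        # Laplace smoothing, with one extra slot reserved for unseen tokens.
        denom = sum(tokens.values()) + len(vocab) + 1
        score = math.log(docs[label] / total_docs)
        for token, n in Counter(trace.split()).items():
            score += n * math.log((tokens[token] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Augment the malware class with mutants, then classify a fresh variant.
rng = random.Random(0)
mutants = [mutate(MALWARE[0], rng) for _ in range(5)]
training = [(t, "malware") for t in MALWARE + mutants]
training += [(t, "benign") for t in BENIGN]
model = train(training)
unseen_variant = mutate(MALWARE[0], rng)
label = predict(model, unseen_variant)
```

Because this stand-in classifier ignores token order, the swap mutation leaves its features unchanged; an order-aware NLP model would see such variants as distinct inputs, which is where mutant-augmented training data would be expected to matter more.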

Author information

Corresponding author

Correspondence to Kehinde O. Babaagba.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Babaagba, K.O., Ayodele, M. (2023). Evolutionary Based Transfer Learning Approach to Improving Classification of Metamorphic Malware. In: Correia, J., Smith, S., Qaddoura, R. (eds) Applications of Evolutionary Computation. EvoApplications 2023. Lecture Notes in Computer Science, vol 13989. Springer, Cham. https://doi.org/10.1007/978-3-031-30229-9_11

  • DOI: https://doi.org/10.1007/978-3-031-30229-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30228-2

  • Online ISBN: 978-3-031-30229-9

  • eBook Packages: Computer Science, Computer Science (R0)
