Multi-context features for detecting malicious programs

  • Original Paper
  • Journal of Computer Virology and Hacking Techniques

Abstract

Malware detection remains an open problem. Every day, numerous attacks use malware to steal private information, disrupt services, or sabotage industrial systems. In this paper, we combine three kinds of contextual information, namely static, dynamic, and instruction-based, for malware detection. This leads to the definition of more than thirty thousand features, a large feature set that covers a wide range of a sample's characteristics. Through experiments with one million files, we show that this feature set leads to machine-learning-based models that can detect both malware seen at roughly the time the models are built and malware first seen even months after the models were built (i.e., the detection models remain effective months into the future). This may be due to the comprehensiveness of the feature set.
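The two ideas in the abstract, merging features from several contexts into one vector and evaluating on samples first seen after the model was built, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the feature names (`section_count`, `files_written`, `nop_ratio`), the sample records, and the cutoff date are all hypothetical and chosen only for illustration.

```python
from datetime import date


def merge_contexts(static, dynamic, instruction):
    """Merge three per-context feature dicts into one flat dict.

    Keys are prefixed with the context name so that features with the
    same name in different contexts do not overwrite each other.
    """
    merged = {}
    contexts = (("static", static), ("dynamic", dynamic), ("instr", instruction))
    for prefix, features in contexts:
        for name, value in features.items():
            merged[f"{prefix}:{name}"] = value
    return merged


def temporal_split(samples, cutoff):
    """Split samples by first-seen date.

    Samples first seen before `cutoff` go to the training set; samples
    first seen on or after `cutoff` go to the test set, which mimics
    testing a model on files that appear after it was built.
    """
    train = [s for s in samples if s["first_seen"] < cutoff]
    test = [s for s in samples if s["first_seen"] >= cutoff]
    return train, test


# Hypothetical sample records (all values are made up for illustration).
samples = [
    {
        "first_seen": date(2014, 11, 3),
        "label": "malicious",
        "features": merge_contexts(
            {"section_count": 5}, {"files_written": 12}, {"nop_ratio": 0.31}
        ),
    },
    {
        "first_seen": date(2015, 3, 20),
        "label": "benign",
        "features": merge_contexts(
            {"section_count": 4}, {"files_written": 1}, {"nop_ratio": 0.02}
        ),
    },
]

train, test = temporal_split(samples, cutoff=date(2015, 1, 1))
```

Prefixing each key with its context keeps the three feature families distinguishable after merging, which would also make it possible to train on a single context for comparison.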

Notes

  1. For more information, see https://goo.gl/FCEPLh

Acknowledgements

We thank VirusTotal for providing us with the dataset analyzed in this paper. We also thank John Charlton for proofreading the paper. The research was supported in part by ARO Grant #W911NF-13-1-0141 and NSF Grants #1111925, #IIS-1213026, and #CNS-1461926.

Author information

Corresponding author

Correspondence to Moustafa Saleh.

About this article

Cite this article

Saleh, M., Li, T. & Xu, S. Multi-context features for detecting malicious programs. J Comput Virol Hack Tech 14, 181–193 (2018). https://doi.org/10.1007/s11416-017-0304-8

  • DOI: https://doi.org/10.1007/s11416-017-0304-8
