Multi-context features for detecting malicious programs

Saleh, Moustafa; Li, Tao; Xu, Shouhuai

doi:10.1007/s11416-017-0304-8

Original Paper
Published: 24 August 2017

Journal of Computer Virology and Hacking Techniques, Volume 14, pages 181–193 (2018)

Abstract

Malware detection remains an open problem. Attacks in which malware is used to steal private information, disrupt services, or sabotage industrial systems take place every day. In this paper, we combine three kinds of contextual information, namely static, dynamic, and instruction-based, for malware detection. This leads to the definition of more than thirty thousand features, a large feature set that covers a wide range of a sample's characteristics. Through experiments with one million files, we show that this feature set leads to machine-learning-based models that can detect both malware seen roughly at the time the models are built and malware first seen months after the models were built (i.e., the detection models remain effective for months after they are built). This may be due to the comprehensiveness of the feature set.
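As a rough illustration of the two ideas in the abstract (merging features from several contexts into one feature set, and testing on samples first seen after the training period), the following minimal sketch may help. It is not the authors' implementation: the feature names, the `merge_contexts` and `temporal_split` helpers, the dates, and the sample records are all hypothetical and chosen only for illustration.

```python
from datetime import date


def merge_contexts(static, dynamic, instruction):
    """Merge three per-context feature dicts into one flat feature dict.

    Keys are prefixed with the context name so that features with the
    same name in different contexts do not collide. (Hypothetical helper.)
    """
    merged = {}
    contexts = (("static", static), ("dynamic", dynamic), ("instr", instruction))
    for prefix, feats in contexts:
        for name, value in feats.items():
            merged[f"{prefix}:{name}"] = value
    return merged


def temporal_split(samples, cutoff):
    """Split samples by first-seen date: train before cutoff, test on or after it."""
    train = [s for s in samples if s["first_seen"] < cutoff]
    test = [s for s in samples if s["first_seen"] >= cutoff]
    return train, test


# Two made-up samples; the feature names and values are assumptions.
samples = [
    {
        "first_seen": date(2015, 1, 10),
        "features": merge_contexts(
            {"num_sections": 4}, {"files_written": 2}, {"count_xor": 17}
        ),
    },
    {
        "first_seen": date(2015, 6, 2),
        "features": merge_contexts(
            {"num_sections": 7}, {"files_written": 0}, {"count_xor": 250}
        ),
    },
]

# Models would be trained on `train` and evaluated on the later `test` samples.
train, test = temporal_split(samples, cutoff=date(2015, 3, 1))
```

Prefixing each key with its context keeps the merged feature set unambiguous, and splitting by first-seen date (rather than at random) is one simple way to check whether a model still detects malware that appears after it was built.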

Notes

For more information, see https://goo.gl/FCEPLh

Acknowledgements

We thank VirusTotal for providing us with the dataset analyzed in the present paper. We also thank John Charlton for proofreading the paper. The research was supported in part by ARO Grant #W911NF-13-1-0141 and NSF Grants #1111925, #IIS-1213026, and #CNS-1461926.

Author information

Authors and Affiliations

Microsoft Malware Protection Center, Microsoft, One Microsoft Way, Redmond, WA, USA
Moustafa Saleh
School of Computer Science, Florida International University, Miami, FL, USA
Tao Li
Department of Computer Science, University of Texas at San Antonio, San Antonio, TX, USA
Shouhuai Xu

Corresponding author

Correspondence to Moustafa Saleh.

About this article

Cite this article

Saleh, M., Li, T. & Xu, S. Multi-context features for detecting malicious programs. J Comput Virol Hack Tech 14, 181–193 (2018). https://doi.org/10.1007/s11416-017-0304-8

Received: 14 November 2016
Accepted: 03 August 2017
Published: 24 August 2017
Issue Date: May 2018
DOI: https://doi.org/10.1007/s11416-017-0304-8
