Improved Malware Classification through Sensor Fusion Using Disjoint Union

  • Conference paper
Information Systems, Technology and Management (ICISTM 2012)

Abstract

An open research question in malware classification is how to combine similar data extracted by different program analyzers so that the analyzers' advantages accrue and their errors are minimized. We propose an approach to fusing the outputs of multiple program analyses by abstracting the features to a common form and applying a disjoint union fusion function. The approach is evaluated in an experiment that measures classification accuracy on fused dynamic trace data from over 18,000 malware files. The results indicate that a naïve fusion approach can yield improvements over non-fused results, and that the disjoint union fusion function outperforms naïve union by a statistically significant amount in three of the four classification methods applied.
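The abstract does not define the two fusion functions, so the following is only a minimal sketch under the standard set-theoretic reading: a naïve union merges identical features reported by different analyzers, while a disjoint union tags each feature with the analyzer that produced it, so the source is preserved. The analyzer names, feature strings, and function names here are hypothetical and are not taken from the paper.

```python
from typing import Dict, Set, Tuple


def naive_union(features_by_analyzer: Dict[str, Set[str]]) -> Set[str]:
    """Merge all feature sets; identical features from different analyzers collapse."""
    fused: Set[str] = set()
    for features in features_by_analyzer.values():
        fused |= features
    return fused


def disjoint_union(features_by_analyzer: Dict[str, Set[str]]) -> Set[Tuple[str, str]]:
    """Tag each feature with its analyzer, so same-named features stay distinct."""
    return {
        (analyzer, feature)
        for analyzer, features in features_by_analyzer.items()
        for feature in features
    }


# Hypothetical abstracted features for one sample, as reported by two analyzers.
sample = {
    "analyzer_a": {"CreateFile", "RegSetValue"},
    "analyzer_b": {"CreateFile", "Connect"},
}

naive = naive_union(sample)        # "CreateFile" appears once
disjoint = disjoint_union(sample)  # "CreateFile" appears once per analyzer
```

Under this reading, a classifier trained on the disjoint union can weight the same behavior differently depending on which analyzer observed it, which the naïve union cannot express.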

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

LeDoux, C., Walenstein, A., Lakhotia, A. (2012). Improved Malware Classification through Sensor Fusion Using Disjoint Union. In: Dua, S., Gangopadhyay, A., Thulasiraman, P., Straccia, U., Shepherd, M., Stein, B. (eds) Information Systems, Technology and Management. ICISTM 2012. Communications in Computer and Information Science, vol 285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29166-1_32

  • DOI: https://doi.org/10.1007/978-3-642-29166-1_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29165-4

  • Online ISBN: 978-3-642-29166-1

  • eBook Packages: Computer Science, Computer Science (R0)
