Study of Soft Computing Methods for Large-Scale Multinomial Malware Types and Families Detection

Recent Developments and the New Direction in Soft-Computing Foundations and Applications

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 361)

Abstract

Malware can be identified in several ways. The most common approach, used by anti-virus vendors, is signature-based detection, in which signatures that include one-way cryptographic hash sums characterize each particular malware sample. In most cases such detection yields only a binary classification into malware and goodware. In modern information security, separating goodware from malware is no longer enough. The reason is the increasingly complex functionality of the various malware families, of which several thousand new ones have been created during the last decade. In addition, a number of new malware types have emerged. We believe that Soft Computing (SC) may help to understand such complicated multinomial problems better. To study this, we assembled a novel large-scale dataset based on 400k malware samples. Furthermore, we investigated the limitations of community-accepted Soft Computing methods and clearly observed that optimization is required for such a non-trivial task. The contribution of this paper is a thorough investigation of large-scale multinomial malware classification by Soft Computing using static characteristics.
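
The signature-based detection mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method or any vendor's implementation. The signature table, the sample bytes, and the function names `sha256_of` and `lookup` are hypothetical, and SHA-256 is an assumed choice of one-way hash. The sketch shows why a hash lookup gives only a match or no match for a known sample and carries no information about malware type or family unless the table stores it.

```python
import hashlib

# Hypothetical signature table: SHA-256 digest of a known sample -> label.
# The sample bytes here are a stand-in, not real malware.
KNOWN_SIGNATURES = {
    hashlib.sha256(b"example malicious payload").hexdigest(): "malware",
}


def sha256_of(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def lookup(data: bytes) -> str:
    """Return the stored label for an exact hash match, else 'no match'."""
    return KNOWN_SIGNATURES.get(sha256_of(data), "no match")
```

In this sketch, changing a single byte of a sample produces a different digest, so the lookup no longer matches. This is one reason hash signatures alone do not generalize to new variants of a family.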

Acknowledgements

The authors would like to acknowledge help and support by Karl Hiramoto from VirusTotal. Special thanks to Carl Leichter for valuable and critical comments. Also we are grateful for sponsorship and support from COINS Research School of Computer and Information Security.

Author information

Corresponding author

Correspondence to Lars Strande Grini.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Grini, L.S., Shalaginov, A., Franke, K. (2018). Study of Soft Computing Methods for Large-Scale Multinomial Malware Types and Families Detection. In: Zadeh, L., Yager, R., Shahbazova, S., Reformat, M., Kreinovich, V. (eds) Recent Developments and the New Direction in Soft-Computing Foundations and Applications. Studies in Fuzziness and Soft Computing, vol 361. Springer, Cham. https://doi.org/10.1007/978-3-319-75408-6_26

  • DOI: https://doi.org/10.1007/978-3-319-75408-6_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75407-9

  • Online ISBN: 978-3-319-75408-6

  • eBook Packages: Engineering, Engineering (R0)
