Study of Soft Computing Methods for Large-Scale Multinomial Malware Types and Families Detection

Grini, Lars Strande; Shalaginov, Andrii; Franke, Katrin

doi:10.1007/978-3-319-75408-6_26

Part of the book series: Studies in Fuzziness and Soft Computing ((STUDFUZZ,volume 361))

Abstract

Several methods of malware identification exist. The most common is the signature-based approach used by anti-virus vendors, which relies on one-way cryptographic hash sums to characterize each particular malware sample. In most cases such detection results in a simple classification into malware and goodware. In modern information security, however, it is not enough to separate only goodware from malware. The reason is the increasingly complex functionality used by various malware families, of which several thousand new ones have been created during the last decade. In addition, a number of new malware types have emerged. We believe that Soft Computing (SC) may help to understand such complicated multinomial problems better. To study this, we assembled a novel large-scale dataset based on 400k malware samples. Furthermore, we investigated the limitations of community-accepted Soft Computing methods and can clearly observe that optimization is required for such a non-trivial task. The contribution of this paper is a thorough investigation of large-scale multinomial malware classification by Soft Computing using static characteristics.
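
As an illustration of the hash-based signature matching the abstract describes, the following minimal Python sketch looks up a sample's digest in a set of known signatures. It is not taken from the chapter: the choice of SHA-256, the sample bytes, and the names `SIGNATURES` and `classify` are assumptions made only for this example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical "known malware" sample; a real signature database would
# hold digests supplied by an anti-virus vendor, not bytes invented here.
KNOWN_SAMPLE = b"MZ\x90\x00 hypothetical malicious payload"
SIGNATURES = {sha256_hex(KNOWN_SAMPLE)}


def classify(data: bytes, signatures: set = SIGNATURES) -> str:
    """Binary decision only: the digest either matches a signature or not."""
    return "malware" if sha256_hex(data) in signatures else "goodware"


if __name__ == "__main__":
    print(classify(KNOWN_SAMPLE))            # exact copy of the known sample
    print(classify(KNOWN_SAMPLE + b"\x00"))  # same sample with one byte appended
```

A lookup of this kind yields only the two labels and treats any modified copy as unseen, which is consistent with the abstract's argument that a malware/goodware split is not enough for identifying types and families.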

Notes

1. https://testimon.ccis.no.

Acknowledgements

The authors would like to acknowledge the help and support of Karl Hiramoto from VirusTotal. Special thanks go to Carl Leichter for valuable and critical comments. We are also grateful for sponsorship and support from the COINS Research School of Computer and Information Security.

Author information

Authors and Affiliations

Norwegian Information Security Laboratory, Center for Cyber and Information Security, Norwegian University of Science and Technology, Trondheim, Norway
Lars Strande Grini, Andrii Shalaginov & Katrin Franke

Corresponding author

Correspondence to Lars Strande Grini.

Editor information

Editors and Affiliations

Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California, USA
Lotfi A. Zadeh
Machine Intelligence Institute, Iona College, New Rochelle, New York, USA
Ronald R. Yager
Department of Information Technology and Programming, Azerbaijan Technical University, Baku, Azerbaijan
Shahnaz N. Shahbazova
Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada
Marek Z. Reformat
Department of Computer Science, University of Texas at El Paso, El Paso, Texas, USA
Vladik Kreinovich

About this chapter

Cite this chapter

Grini, L.S., Shalaginov, A., Franke, K. (2018). Study of Soft Computing Methods for Large-Scale Multinomial Malware Types and Families Detection. In: Zadeh, L., Yager, R., Shahbazova, S., Reformat, M., Kreinovich, V. (eds) Recent Developments and the New Direction in Soft-Computing Foundations and Applications. Studies in Fuzziness and Soft Computing, vol 361. Springer, Cham. https://doi.org/10.1007/978-3-319-75408-6_26

DOI: https://doi.org/10.1007/978-3-319-75408-6_26
Published: 29 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75407-9
Online ISBN: 978-3-319-75408-6
eBook Packages: Engineering, Engineering (R0)
