Malware Detection Using Black-Box Neural Method

  • Conference paper
Multimedia and Network Information Systems (MISSI 2018)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 833)

Abstract

Because of the great loss and damage caused by malware, malware detection has become a central issue in computer security. Detection has to be fast and very accurate. Developing suitable methods requires high-quality benchmarks. One such benchmark is the Microsoft Kaggle malware challenge, run in 2015. Since then, over 50 papers have been published on this benchmark. The best results were achieved with complex feature engineering. In this work we analyze the black-box neural method and, what is novel, evaluate its results against the Microsoft Kaggle malware challenge benchmark. It is tempting to use convolutional neural networks for malware analysis, following their great success in image analysis. However, even with balanced classes and dropout, the neural method does not beat XGBoost with feature engineering, although some room for improvement exists. The situation is similar to that in language analysis: language is much more hierarchical than images, and apparently malware is too. Malware analysis still awaits an optimal neural network architecture.
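
The abstract mentions training with balanced classes but does not say how the balancing was done. As an illustration only, the sketch below shows one common option, inverse-frequency class weights (the same heuristic as the "balanced" mode in scikit-learn). The function name and the label counts are hypothetical and are not taken from the paper, which may have used a different technique such as oversampling.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and frequent classes get weights below 1,
    so every class contributes equally to a weighted loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Made-up label distribution for illustration; not data from the benchmark.
labels = ["family_a"] * 6 + ["family_b"] * 3 + ["family_c"] * 1
weights = balanced_class_weights(labels)
```

With such weights, each class has the same total weight (`weights[c] * count[c]` is identical for every class), which is what "balanced" means in this sketch.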

Author information

Corresponding author

Correspondence to Czesław Jędrzejek.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Pieczyński, D., Jędrzejek, C. (2019). Malware Detection Using Black-Box Neural Method. In: Choroś, K., Kopel, M., Kukla, E., Siemiński, A. (eds) Multimedia and Network Information Systems. MISSI 2018. Advances in Intelligent Systems and Computing, vol 833. Springer, Cham. https://doi.org/10.1007/978-3-319-98678-4_20
