A Dynamic Malicious Document Detection Method Based on Multi-Memory Features

Advances in Digital Forensics XIX (DigitalForensics 2023)

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 687)

Abstract

The widespread use of Microsoft Office documents underscores the need for effective malicious document detection techniques. Most detection methods characterize document behavior using application programming interface traces or other descriptive information, but ignore memory information because it is inherently difficult to collect and represent. Since many malicious behavior patterns manifest only in memory, these detection methods are vulnerable to ubiquitous evasion attacks. One difficulty in extracting malicious behavior information from memory is that only high-coverage memory dump sequences are meaningful, yet no established method exists for obtaining them. Another difficulty is that no efficient method exists for representing the numerous long memory dump sequences associated with malicious document samples.

This chapter describes a multi-memory-feature-based method that leverages memory information to detect malicious documents. The detection method employs a high-coverage memory dump service and a multiple memory dump sequence reduction approach. The memory dump service hooks system application programming interfaces to cover the entire lifetimes of processes while also monitoring the initial Office process and every spawned subprocess. The multiple memory dump sequence reduction approach efficiently represents each memory dump in terms of the difference from its adjacent dump. Ablation experiments demonstrate that the memory dump sequence reduction approach performs best using a long short-term memory classifier, yielding an accuracy of 98.27%. Experiments also demonstrate that the detection method outperforms state-of-the-art methods based on application programming interfaces in terms of accuracy and precision.
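
The idea of representing each dump by its difference from the adjacent dump can be illustrated with a minimal Python sketch. This is not the chapter's implementation, which the abstract does not detail. It assumes that dumps are raw byte strings compared in fixed-size pages, and the names `changed_pages`, `reduce_sequence` and the 4096-byte `PAGE_SIZE` are hypothetical choices made for illustration only.

```python
from typing import List, Tuple

PAGE_SIZE = 4096  # hypothetical comparison granularity, not taken from the chapter

Delta = List[Tuple[int, bytes]]


def changed_pages(prev: bytes, curr: bytes, page_size: int = PAGE_SIZE) -> Delta:
    """Return (page_index, page_bytes) for every page of curr that differs from prev."""
    n_pages = (max(len(prev), len(curr)) + page_size - 1) // page_size
    delta: Delta = []
    for i in range(n_pages):
        start = i * page_size
        old = prev[start:start + page_size]
        new = curr[start:start + page_size]
        if old != new:
            delta.append((i, new))
    return delta


def reduce_sequence(dumps: List[bytes], page_size: int = PAGE_SIZE) -> Tuple[bytes, List[Delta]]:
    """Keep the first dump in full and only page-level differences for later dumps."""
    if not dumps:
        return b"", []
    deltas = [changed_pages(a, b, page_size) for a, b in zip(dumps, dumps[1:])]
    return dumps[0], deltas
```

In this sketch, a dump that is identical to its predecessor reduces to an empty list, so long runs of unchanged memory cost almost nothing to store. How such differences would be turned into classifier features is left open here.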

Author information

Corresponding author

Correspondence to Min Yu.

Copyright information

© 2023 IFIP International Federation for Information Processing

About this chapter

Cite this chapter

Wang, Y. et al. (2023). A Dynamic Malicious Document Detection Method Based on Multi-Memory Features. In: Peterson, G., Shenoi, S. (eds) Advances in Digital Forensics XIX. DigitalForensics 2023. IFIP Advances in Information and Communication Technology, vol 687. Springer, Cham. https://doi.org/10.1007/978-3-031-42991-0_11

  • DOI: https://doi.org/10.1007/978-3-031-42991-0_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-42990-3

  • Online ISBN: 978-3-031-42991-0

  • eBook Packages: Computer Science, Computer Science (R0)
