Identifying Drawbacks in Malicious PDF Detectors

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 878)

Abstract

Despite continuous countermeasures, embedding malware in PDF documents and using them as a malware distribution mechanism remains a threat. This is due to the popularity of PDF as a document exchange format, the lack of user awareness of its dangers, and its ability to carry and execute malware. The academic community has proposed several malicious PDF detection tools to address this threat, all of which suffer from drawbacks that limit their utility. In this paper, we present the drawbacks of current state-of-the-art malicious PDF detectors. We do so by surveying all recent malicious PDF detectors and then comparatively evaluating the available tools. Our results show that concept drift is a major drawback of the detectors, even though many of them use machine learning approaches.

Notes

  1. The search was conducted using the “PDF” keyword only. [22] reports much higher numbers, presumably using the “adobe acrobat reader” keyword.

  2. Neither the Slayer article [10] nor the PDFrate article [16] covers its feature set in full detail, which prevents a full analysis of the observation.

References

  1. Adobe: Adobe reader security patches (2017). https://helpx.adobe.com/security/products/reader.html

  2. Adobe: PDF technology center (2017). http://www.adobe.com/devnet/pdf.html

  3. Carmony, C., Hu, X., Yin, H., Bhaskar, A.V., Zhang, M.: Extract me if you can: abusing PDF parsers in malware detectors. In: NDSS (2016)

  4. Contagio: Contagio malware dump (2017). http://contagiodump.blogspot.com.au

  5. CVE: PDF-related vulnerabilities (2017). https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=PDF

  6. Esparza, J.M.: PDF attack - a journey from the exploit kit to the shellcode (2014). https://www.blackhat.com/docs/eu-14/materials/eu-14-Esparza-PDF-Attack-A-Journey-From-The-Exploit-Kit-To-The-Shellcode.pdf

  7. Laskov, P., Šrndić, N.: Static detection of malicious JavaScript-bearing PDF documents. In: Proceedings of the 27th Annual Computer Security Applications Conference, pp. 373–382. ACM (2011)

  8. Li, W.-J., Stolfo, S., Stavrou, A., Androulaki, E., Keromytis, A.D.: A study of malcode-bearing documents. In: Hämmerli, B.M., Sommer, R. (eds.) DIMVA 2007. LNCS, vol. 4579, pp. 231–250. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73614-1_14

  9. Liu, D., Wang, H., Stavrou, A.: Detecting malicious JavaScript in PDF through document instrumentation. In: 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 100–111. IEEE (2014)

  10. Maiorca, D., Ariu, D., Corona, I., Giacinto, G.: A structural and content-based approach for a precise and robust detection of malicious PDF files. In: 2015 International Conference on Information Systems Security and Privacy (ICISSP), pp. 27–36. IEEE (2015)

  11. Maiorca, D., Giacinto, G., Corona, I.: A pattern recognition system for malicious PDF files detection. In: Perner, P. (ed.) MLDM 2012. LNCS (LNAI), vol. 7376, pp. 510–524. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31537-4_40

  12. McAfee: McAfee September 2017 threat report (2017). https://www.mcafee.com/au/resources/reports/rp-quarterly-threats-sept-2017.pdf

  13. Nelson, T.: PDF collection (2017). https://github.com/tpn/pdfs

  14. Neupane, A., Saxena, N., Maximo, J.O., Kana, R.: Neural markers of cybersecurity: an fMRI study of phishing and malware warnings. IEEE Trans. Inf. Forensics Secur. 11(9), 1970–1983 (2016). https://doi.org/10.1109/TIFS.2016.2566265

  15. NIST: National vulnerability database (2017). https://nvd.nist.gov

  16. Smutz, C., Stavrou, A.: Malicious PDF detection using metadata and structural features. In: Proceedings of the 28th Annual Computer Security Applications Conference, pp. 239–248. ACM (2012)

  17. Smutz, C., Stavrou, A.: When a tree falls: using diversity in ensemble classifiers to identify evasion in malware detectors. In: NDSS (2016)

  18. Šrndić, N., Laskov, P.: Detection of malicious PDF files based on hierarchical document structure. In: Proceedings of the 20th Annual Network and Distributed System Security Symposium (2013)

  19. Šrndić, N., Laskov, P.: Hidost: a static machine-learning-based detector of malicious files. EURASIP J. Inf. Secur. 2016(1), 22 (2016)

  20. Tabish, S.M., Shafiq, M.Z., Farooq, M.: Malware detection using statistical analysis of byte-level file content. In: Proceedings of the ACM SIGKDD Workshop on CyberSecurity and Intelligence Informatics, pp. 23–31. ACM (2009)

  21. VirusTotal: VirusTotal (2017). https://www.virustotal.com

  22. Xu, M., Kim, T.: PlatPal: detecting malicious documents with platform diversity. In: USENIX Security Symposium (2017)

Acknowledgements

We would like to thank Mustafa Al-Saegh for helping with dataset cleaning and preparation. We would also like to thank VirusTotal, the owner of the Contagio dataset, and TPN for providing access to their files. Finally, we are grateful to the authors and creators of PDFrate and Slayer for providing access to their tools.

Author information

Corresponding author

Correspondence to Ahmed Falah.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Falah, A., Pan, L., Abdelrazek, M., Doss, R. (2018). Identifying Drawbacks in Malicious PDF Detectors. In: Doss, R., Piramuthu, S., Zhou, W. (eds) Future Network Systems and Security. FNSS 2018. Communications in Computer and Information Science, vol 878. Springer, Cham. https://doi.org/10.1007/978-3-319-94421-0_10

  • DOI: https://doi.org/10.1007/978-3-319-94421-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94420-3

  • Online ISBN: 978-3-319-94421-0

  • eBook Packages: Computer Science, Computer Science (R0)
