A Survey of the Software Vulnerability Discovery Using Machine Learning Techniques

  • Conference paper
Artificial Intelligence and Security (ICAIS 2019)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11635)

Abstract

The study of vulnerability discovery has attracted widespread attention, and experts have proposed many different approaches over the past decades. To improve the efficiency of these methods, machine learning techniques have been introduced into the area. In this paper, we provide an extensive review of work in the field of software vulnerability discovery that utilizes machine learning techniques. For the three key technologies in the vulnerability discovery field, namely static analysis, symbolic execution, and fuzzing, we first explain the basic principles of each. We then review the state of research on software vulnerability discovery using machine learning techniques. Finally, we discuss both the advantages and the limitations of the approaches reviewed in the paper, and point out challenges and some uncharted territories in the three categories. This brief study of software vulnerability discovery using machine learning techniques is intended to support follow-up research work.

Acknowledgement

This work was supported by the National Key Research & Development Plan of China under Grant 2016QY05X1000, the National Natural Science Foundation of China under Grant No. 61771166, and the Dongguan Innovative Research Team Program under Grant No. 201636000100038.

Author information

Corresponding author

Correspondence to Jian Jiang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Jiang, J., Yu, X., Sun, Y., Zeng, H. (2019). A Survey of the Software Vulnerability Discovery Using Machine Learning Techniques. In: Sun, X., Pan, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2019. Lecture Notes in Computer Science, vol 11635. Springer, Cham. https://doi.org/10.1007/978-3-030-24268-8_29

  • DOI: https://doi.org/10.1007/978-3-030-24268-8_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-24267-1

  • Online ISBN: 978-3-030-24268-8

  • eBook Packages: Computer Science, Computer Science (R0)
