Detecting Vulnerabilities via Explicitly Leveraging Vulnerability Features on Program Slices

  • Conference paper
Theoretical Aspects of Software Engineering (TASE 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14777)

Abstract

As the size and complexity of software continue to increase, detecting software vulnerabilities becomes increasingly challenging. Traditional static and dynamic analysis methods often suffer from poor accuracy or heavy reliance on expert knowledge. In recent years, deep learning has emerged as a promising direction in this field because it can automatically learn subtle features from large volumes of software data. However, existing deep learning-based vulnerability detection methods have the following limitations: 1) They struggle to process long source code sequences effectively, leading to sub-optimal feature representation. 2) Although they can find similar vulnerability features across different vulnerable programs, they often fail to leverage these features explicitly, resulting in somewhat weaker detection performance. In this paper, we propose a vulnerability detection method called DV-LVF, based on explicitly leveraging vulnerability features on program slices, to effectively detect vulnerabilities at both the function and statement levels. Specifically, we introduce a Gated Recurrent Unit (GRU) statement embedding technique combined with program slicing to enhance program feature representation. We also introduce a vulnerability dictionary (vulDict) that explicitly summarizes and exploits these vulnerability features to improve detection performance. Our evaluation on real-world software shows that DV-LVF outperforms state-of-the-art methods in both function-level and statement-level detection.
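
To make the idea of GRU statement embedding on a program slice more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the class `TinyGRU`, the helpers `token_vector` and `embed_slice`, the whitespace tokenization, the dimensions, and the two example statements are all hypothetical, and the weights are random and untrained. The sketch only shows the general mechanism of running a GRU over the tokens of each statement and keeping the final hidden state as that statement's vector.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyGRU:
    """Untrained GRU cell without bias terms; weights are random placeholders."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, h = candidate state.
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrh"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], gated_h))]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def embed(self, token_vectors):
        # The final hidden state serves as the statement embedding.
        h = [0.0] * self.hidden_dim
        for x in token_vectors:
            h = self.step(x, h)
        return h


def token_vector(token, dim):
    # Hypothetical stand-in for a learned token embedding:
    # a deterministic pseudo-random vector seeded by the token text.
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def embed_slice(statements, gru, dim):
    # One vector per statement; tokens are split on whitespace (an assumption).
    return [gru.embed([token_vector(t, dim) for t in s.split()]) for s in statements]


# Hypothetical two-statement slice, already tokenized with spaces.
slice_statements = ["char buf [ 8 ] ;", "strcpy ( buf , src ) ;"]
gru = TinyGRU(input_dim=4, hidden_dim=6)
vectors = embed_slice(slice_statements, gru, dim=4)
print(len(vectors), len(vectors[0]))
```

Embedding each statement separately keeps every GRU input short, which is one plausible way to sidestep the long-sequence problem the abstract describes; a real system would train the weights and use a proper tokenizer and slicer.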

Author information

Corresponding author

Correspondence to Xiaodong Zhang.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Guo, H., Zhang, X., Zhang, Z., Shen, Y. (2024). Detecting Vulnerabilities via Explicitly Leveraging Vulnerability Features on Program Slices. In: Chin, WN., Xu, Z. (eds) Theoretical Aspects of Software Engineering. TASE 2024. Lecture Notes in Computer Science, vol 14777. Springer, Cham. https://doi.org/10.1007/978-3-031-64626-3_10

  • DOI: https://doi.org/10.1007/978-3-031-64626-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-64625-6

  • Online ISBN: 978-3-031-64626-3

  • eBook Packages: Computer Science (R0)
