Detecting Vulnerabilities via Explicitly Leveraging Vulnerability Features on Program Slices

Guo, Haoyu; Zhang, Xiaodong; Zhang, Zhiwei; Shen, Yulong

doi:10.1007/978-3-031-64626-3_10

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14777)

Included in the following conference series:

International Symposium on Theoretical Aspects of Software Engineering

Abstract

As the size and complexity of software continue to increase, detecting software vulnerabilities becomes increasingly challenging. Traditional static and dynamic analysis methods often suffer from poor accuracy or heavy reliance on expert knowledge. In recent years, deep learning has emerged as a promising direction in this field because it can automatically learn subtle features from massive amounts of software data. However, existing deep learning-based vulnerability detection methods have two limitations: 1) They struggle to process long source code sequences effectively, leading to sub-optimal feature representation. 2) Although they can find similar vulnerability features across different vulnerable programs, they often fail to explicitly leverage these features, which degrades detection performance. In this paper, we propose DV-LVF, a vulnerability detection method based on explicitly leveraging vulnerability features on program slices, to effectively detect vulnerabilities at both the function and statement levels. Specifically, we introduce a Gated Recurrent Unit (GRU) statement embedding technique combined with program slicing to enhance program feature representation. We also introduce a vulnerability dictionary (vulDict), which explicitly summarizes and exploits these vulnerability features, to improve detection performance. Our evaluation on real-world software shows that DV-LVF outperforms the state of the art in both function-level and statement-level detection.
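To make the idea of GRU statement embedding over a program slice concrete, the following is a minimal sketch, not the DV-LVF implementation. It assumes that each statement in a slice is split into tokens, that each token is mapped to a vector, and that the final GRU hidden state serves as the statement embedding. The class `TinyGRU`, the function `embed_statement`, the whitespace tokenizer, the toy token vectors, and the random weights are all hypothetical and chosen only for illustration; bias terms are omitted for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyGRU:
    """Single-layer GRU with randomly initialised (untrained) weights.

    Hypothetical helper for illustration only; biases are omitted.
    """

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One (W, U) pair per gate: update "z", reset "r", candidate "h".
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrh"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrh"}

    def step(self, x, h):
        """Apply one GRU update to hidden state h given input vector x."""
        z = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b)
                for a, b in zip(matvec(self.W["h"], x),
                                matvec(self.U["h"], reset_h))]
        # Interpolate between the old state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def embed(self, token_vectors):
        """Run the GRU over a token sequence and return the final state."""
        h = [0.0] * self.hidden_dim
        for x in token_vectors:
            h = self.step(x, h)
        return h


def embed_statement(statement, gru, token_dim=8):
    """Embed one statement: whitespace tokens -> toy vectors -> GRU state."""
    vectors = []
    for token in statement.split():
        rng = random.Random(token)  # deterministic toy vector per token
        vectors.append([rng.uniform(-1, 1) for _ in range(token_dim)])
    return gru.embed(vectors)


# A program slice is treated here as an ordered list of statements.
gru = TinyGRU(input_dim=8, hidden_dim=4)
slice_statements = ["char buf [ 16 ] ;", "strcpy ( buf , src ) ;"]
slice_vectors = [embed_statement(s, gru) for s in slice_statements]
```

Each statement in the slice becomes one fixed-length vector regardless of how many tokens it contains, so a downstream model sees a sequence of statements rather than one long token sequence. In a real system the token vectors and GRU weights would be learned rather than random.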

Author information

Authors and Affiliations

Xidian University, Xi’an, Shaanxi, 710071, China
Haoyu Guo, Xiaodong Zhang, Zhiwei Zhang & Yulong Shen

Corresponding author

Correspondence to Xiaodong Zhang.

Editor information

Editors and Affiliations

National University of Singapore, Singapore, Singapore
Wei-Ngan Chin
Shenzhen University, Guangdong, China
Zhiwu Xu

Cite this paper

Guo, H., Zhang, X., Zhang, Z., Shen, Y. (2024). Detecting Vulnerabilities via Explicitly Leveraging Vulnerability Features on Program Slices. In: Chin, WN., Xu, Z. (eds) Theoretical Aspects of Software Engineering. TASE 2024. Lecture Notes in Computer Science, vol 14777. Springer, Cham. https://doi.org/10.1007/978-3-031-64626-3_10

DOI: https://doi.org/10.1007/978-3-031-64626-3_10
Published: 14 July 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-64625-6
Online ISBN: 978-3-031-64626-3
eBook Packages: Computer Science (R0)
