Learning to Detect Memory-related Vulnerabilities

Published: 23 December 2023

Abstract

Memory-related vulnerabilities can result in performance degradation or even program crashes, constituting severe threats to the security of modern software. Despite the promising results of deep learning (DL)-based vulnerability detectors, three main limitations remain: (1) the rich contextual program semantics related to vulnerabilities have not yet been fully modeled; (2) multi-granularity vulnerability features in hierarchical code structures are still hard to capture; and (3) heterogeneous flow information is not well utilized. To address these limitations, in this article we propose a novel DL-based approach, called MVD+, to detect memory-related vulnerabilities at the statement level. Specifically, it conducts both intraprocedural and interprocedural analysis to model vulnerability features, and it adopts a hierarchical representation learning strategy to learn both syntactic and semantic features of vulnerable code: syntax-aware neural embedding within statements, and a novel Flow-Sensitive Graph Neural Network that captures structured context information across statements. To evaluate its performance, we conducted extensive experiments against eight state-of-the-art DL-based approaches as well as five well-known static analyzers on our constructed dataset with 6,879 vulnerabilities in 12 popular C/C++ applications. The experimental results confirm that MVD+ significantly outperforms current state-of-the-art baselines and achieves a good trade-off between effectiveness and efficiency.

Published In

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 2
February 2024
947 pages
EISSN: 1557-7392
DOI: 10.1145/3618077
  • Editor: Mauro Pezzè

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 December 2023
Online AM: 18 September 2023
Accepted: 01 September 2023
Revised: 27 August 2023
Received: 20 February 2023
Published in TOSEM Volume 33, Issue 2

Author Tags

  1. Statement-level vulnerability detection
  2. abstract syntax tree
  3. graph neural networks
  4. flow analysis

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Six Talent Peaks Project in Jiangsu Province
  • Jiangsu “333” Project and Yangzhou University Top-level Talents Support Program (2019)
  • Postgraduate Research & Practice Innovation Program of Jiangsu Province
  • Macao Science and Technology Development Fund
  • Open Funds of State Key Laboratory for Novel Software Technology of Nanjing University
  • China Scholarship Council Foundation

Article Metrics

  • Downloads (Last 12 months): 645
  • Downloads (Last 6 weeks): 52
Reflects downloads up to 23 Feb 2025

Cited By

  • (2025) Enhancing concurrency vulnerability detection through AST-based static fuzz mutation. Journal of Systems and Software. DOI: 10.1016/j.jss.2025.112352 (112352). Online publication date: Jan-2025
  • (2024) Snopy: Bridging Sample Denoising with Causal Graph Learning for Effective Vulnerability Detection. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. DOI: 10.1145/3691620.3695057 (606-618). Online publication date: 27-Oct-2024
  • (2024) Coca: Improving and Explaining Graph Neural Network-Based Vulnerability Detection Systems. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering. DOI: 10.1145/3597503.3639168 (1-13). Online publication date: 20-May-2024
  • (2024) EXVul: Toward Effective and Explainable Vulnerability Detection for IoT Devices. IEEE Internet of Things Journal 11:12. DOI: 10.1109/JIOT.2024.3381641 (22385-22398). Online publication date: 15-Jun-2024
