Interpretation of Learning-Based Automatic Source Code Vulnerability Detection Model Using LIME

Tang, Gaigai; Zhang, Long; Yang, Feng; Meng, Lianxiao; Cao, Weipeng; Qiu, Meikang; Ren, Shuangyin; Yang, Lin; Wang, Huiqiang

doi:10.1007/978-3-030-82153-1_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12817)

Included in the following conference series:

International Conference on Knowledge Science, Engineering and Management

Abstract

State-of-the-art methods for automatic vulnerability detection in source code are mainly learning-based, relying on machine learning or deep learning. These models learn vulnerability patterns from data, which makes detection more automatic and intelligent. However, although many learning-based vulnerability detection models achieve high accuracy, their outputs are not explainable. Verifying the credibility of these models is worthwhile, because it lets us better understand and use them in practice. To address this issue, we use an interpretation method called LIME to explain learning-based automatic vulnerability detection models. First, all preprocessing steps are interpretable, including symbolization and vector representation, for which we choose the bag-of-words model. Second, the vulnerability detection models we select are based on Logistic Regression and Bi-LSTM. The former is inherently interpretable and is used to verify the effectiveness of LIME in the field of source code vulnerability detection. The latter is not inherently interpretable, so we interpret it with LIME to assess its credibility for source code vulnerability detection. The experimental results show that LIME can effectively explain learning-based automatic vulnerability detection models. Moreover, we find that, under local interpretation, the predictions of the Bi-LSTM-based model are credible.
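The pipeline described in the abstract (symbolization, bag-of-words vectors, and a LIME-style local explanation) can be illustrated with a small sketch. This is not the authors' implementation and does not use the `lime` library. It is a from-scratch approximation of the general idea: perturb one sample by dropping words, query the model, and fit a distance-weighted linear surrogate. The keep-list `KEEP`, the hand-set `WEIGHTS` and `BIAS` of the stand-in detector, the distance measure, and the kernel width are all assumptions made for illustration.

```python
import math
import random
import re

# Hypothetical keep-list: keywords and library calls that are NOT renamed.
KEEP = {"char", "int", "strcpy", "strncpy", "sizeof", "if", "return"}

def symbolize(code):
    """Extract identifiers and rename user-defined ones to VAR1, VAR2, ..."""
    names, out = {}, []
    for tok in re.findall(r"[A-Za-z_]\w*", code):
        if tok not in KEEP:
            tok = names.setdefault(tok, f"VAR{len(names) + 1}")
        out.append(tok)
    return out

def bag_of_words(tokens, vocab):
    """Count how often each vocabulary word occurs in the token list."""
    return [tokens.count(w) for w in vocab]

# Hypothetical stand-in for a trained detector (weights are hand-set, not learned).
WEIGHTS = {"strcpy": 3.0, "sizeof": -0.5}
BIAS = -1.5

def black_box(tokens):
    """Return a pseudo-probability that the token list is vulnerable."""
    vocab = sorted(WEIGHTS)
    counts = bag_of_words(tokens, vocab)
    score = BIAS + sum(WEIGHTS[w] * c for w, c in zip(vocab, counts))
    return 1.0 / (1.0 + math.exp(-score))

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def explain(tokens, predict, n_samples=500, width=0.75, ridge=1e-3, seed=0):
    """Fit a locally weighted linear surrogate over word-presence masks."""
    rng = random.Random(seed)
    words = sorted(set(tokens))
    d = len(words)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        mask = [rng.randint(0, 1) for _ in words]
        kept = {w for w, keep in zip(words, mask) if keep}
        perturbed = [t for t in tokens if t in kept]
        dist = 1.0 - sum(mask) / d          # assumed distance: fraction of words removed
        rows.append(mask + [1])             # last column is the intercept
        targets.append(predict(perturbed))
        weights.append(math.exp(-(dist ** 2) / width ** 2))
    # Normal equations of weighted ridge regression (the small ridge term
    # is applied to every coefficient, including the intercept).
    k = d + 1
    a = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
          + (ridge if i == j else 0.0) for j in range(k)] for i in range(k)]
    b = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
         for i in range(k)]
    coef = solve(a, b)
    return sorted(zip(words, coef[:d]), key=lambda p: -abs(p[1]))

tokens = symbolize("char buf[10]; strcpy(buf, input);")
explanation = explain(tokens, black_box)
for word, weight in explanation:
    print(f"{word:8s} {weight:+.3f}")
```

Because the surrogate is linear in word presence, its coefficients can be read directly as per-word contributions to the prediction for this one sample. Comparing such coefficients against the known weights of an interpretable model is the kind of sanity check the abstract describes for Logistic Regression before applying LIME to Bi-LSTM.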

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Engineering University, Harbin, China
Gaigai Tang, Lianxiao Meng & Huiqiang Wang
National Key Laboratory of Science and Technology on Information System Security, Institute of System Engineering, Chinese Academy of Military Science, Beijing, China
Gaigai Tang, Long Zhang, Feng Yang, Lianxiao Meng, Shuangyin Ren & Lin Yang
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Weipeng Cao
Department of Computer Science, Texas A&M University-Commerce, Commerce, TX, 75428, USA
Meikang Qiu

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Han Qiu
Ibaraki University, Hitachi, Japan
Cheng Zhang
University of Kentucky, Lexington, KY, USA
Zongming Fei
Texas A&M University – Commerce, Commerce, TX, USA
Meikang Qiu
Princeton University, Princeton, NJ, USA
Sun-Yuan Kung

Cite this paper

Tang, G. et al. (2021). Interpretation of Learning-Based Automatic Source Code Vulnerability Detection Model Using LIME. In: Qiu, H., Zhang, C., Fei, Z., Qiu, M., Kung, SY. (eds) Knowledge Science, Engineering and Management. KSEM 2021. Lecture Notes in Computer Science, vol 12817. Springer, Cham. https://doi.org/10.1007/978-3-030-82153-1_23

DOI: https://doi.org/10.1007/978-3-030-82153-1_23
Published: 07 August 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82152-4
Online ISBN: 978-3-030-82153-1
eBook Packages: Computer Science, Computer Science (R0)
