Interpretation of Learning-Based Automatic Source Code Vulnerability Detection Model Using LIME

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12817)

Abstract

State-of-the-art automatic source code vulnerability detection methods are mainly learning-based, relying on machine learning and deep learning. These models learn vulnerability patterns from data, which makes detection more automatic and intelligent. However, the outputs of many learning-based vulnerability detection models cannot be explained, even though the models usually achieve high accuracy. Verifying the credibility of these models is therefore worthwhile, so that we can better understand them and use them in practice. To address this issue, we use an interpretation method called LIME to explain learning-based automatic vulnerability detection models. First, the preprocessing steps are all interpretable, including symbolization and vector representation, for which we choose the bag-of-words model. Second, the vulnerability detection models we select are based on Logistic Regression and Bi-LSTM. The former is inherently interpretable and is used to verify the effectiveness of LIME for source code vulnerability detection. The latter is not inherently interpretable, so we interpret it with LIME to assess its credibility for source code vulnerability detection. The experimental results show that LIME can effectively explain learning-based automatic vulnerability detection models. Moreover, we find that, under local interpretation, the predictions of the Bi-LSTM-based model are credible.
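
To make the idea of local interpretation concrete, the following is a minimal sketch of a LIME-style explanation over token-presence features. It is not the authors' implementation and does not use the LIME library: the detector `toy_detector`, the symbolized snippet (with illustrative names `VAR1` and `VAR2`), the distance measure (fraction of distinct tokens removed), the kernel width, and the ridge term are all assumptions chosen for illustration. The sketch perturbs one snippet by dropping tokens, queries the black-box model on each perturbation, and fits a proximity-weighted linear surrogate whose coefficients serve as per-token importance scores.

```python
import math
import random


def toy_detector(tokens):
    """Hypothetical stand-in for a trained detector; returns P(vulnerable)."""
    logit = -1.0 + 2.5 * ("strcpy" in tokens) + 0.5 * ("VAR2" in tokens)
    return 1.0 / (1.0 + math.exp(-logit))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def lime_style_explain(tokens, predict, num_samples=500, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate over token-presence features."""
    rng = random.Random(seed)
    vocab = sorted(set(tokens))
    d = len(vocab)
    rows, targets, weights = [], [], []
    for i in range(num_samples):
        # The first sample is the unperturbed snippet; the rest drop random tokens.
        mask = [1] * d if i == 0 else [rng.randint(0, 1) for _ in range(d)]
        present = {w for w, keep in zip(vocab, mask) if keep}
        perturbed = [t for t in tokens if t in present]
        distance = 1.0 - sum(mask) / d  # assumed distance: share of tokens removed
        rows.append(mask + [1])  # trailing 1 is the intercept column
        targets.append(predict(perturbed))
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
    # Weighted ridge regression via the normal equations.
    n, ridge = d + 1, 1e-3
    a = [[sum(w * r[j] * r[k] for w, r in zip(weights, rows))
          + (ridge if j == k else 0.0) for k in range(n)] for j in range(n)]
    b = [sum(w * r[j] * y for w, r, y in zip(weights, rows, targets))
         for j in range(n)]
    coef = solve(a, b)
    return dict(zip(vocab, coef[:d]))


# Illustrative symbolized snippet (identifiers replaced by VAR1, VAR2).
snippet = "char VAR1 [ 10 ] ; strcpy ( VAR1 , VAR2 ) ;".split()
explanation = lime_style_explain(snippet, toy_detector)
for token, weight in sorted(explanation.items(), key=lambda kv: -abs(kv[1]))[:3]:
    print(f"{token:>8s} {weight:+.3f}")
```

Because the toy detector depends mostly on the presence of `strcpy`, the surrogate should assign that token the largest positive weight. In practice, `toy_detector` would be replaced by the prediction function of the trained Logistic Regression or Bi-LSTM model. For Logistic Regression, the surrogate weights can then be compared against the model's own coefficients as a sanity check.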


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Tang, G. et al. (2021). Interpretation of Learning-Based Automatic Source Code Vulnerability Detection Model Using LIME. In: Qiu, H., Zhang, C., Fei, Z., Qiu, M., Kung, SY. (eds) Knowledge Science, Engineering and Management. KSEM 2021. Lecture Notes in Computer Science, vol 12817. Springer, Cham. https://doi.org/10.1007/978-3-030-82153-1_23

  • DOI: https://doi.org/10.1007/978-3-030-82153-1_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-82152-4

  • Online ISBN: 978-3-030-82153-1

  • eBook Packages: Computer Science, Computer Science (R0)
