A Review of Data Representation Methods for Vulnerability Mining Using Deep Learning

  • Conference paper
Frontiers in Cyber Security (FCS 2021)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1558))

Abstract

The rapid development of software has brought unprecedented challenges for software security. Traditional vulnerability mining methods are difficult to apply to large-scale software systems because they rely on manual inspection and suffer from low efficiency, high false-positive rates, and high false-negative rates. Recent work has applied deep learning models to vulnerability mining and has made good progress in the field. In this paper, we analyze the deep learning model framework applied to vulnerability mining and summarize its overall workflow and techniques. We then give a detailed analysis of five feature extraction methods for vulnerability mining: sequence characterization-based, abstract syntax tree-based, graph-based, text-based, and mixed characterization-based methods. In addition, we summarize their advantages and disadvantages from the perspectives of single and mixed feature extraction. Finally, we point out future research trends and prospects.
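To make the distinction between representation families concrete, the following minimal sketch derives three views of one code fragment: a flat token sequence, a list of abstract syntax tree node types, and a set of parent-child edges that could seed a graph representation. It is an illustration under stated assumptions, not the pipeline of any surveyed system: the sample function, the helper names (`token_sequence`, `ast_node_types`, `ast_edges`), and the choice of Python's standard `ast` and `tokenize` modules are all hypothetical and chosen only for convenience.

```python
import ast
import io
import tokenize

# Hypothetical sample fragment; not taken from the paper.
SOURCE = "def copy(buf, data):\n    buf[0:len(data)] = data\n"


def token_sequence(source):
    """Sequence-style view: the flat list of lexical tokens."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # Layout-only tokens carry no lexical content, so they are dropped.
    layout = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
              tokenize.DEDENT, tokenize.ENDMARKER}
    return [tok.string for tok in tokens if tok.type not in layout]


def ast_node_types(source):
    """AST-style view: node type names in the order ast.walk yields them."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def ast_edges(source):
    """Graph-style view: (parent type, child type) edges of the syntax tree."""
    tree = ast.parse(source)
    return [(type(parent).__name__, type(child).__name__)
            for parent in ast.walk(tree)
            for child in ast.iter_child_nodes(parent)]


if __name__ == "__main__":
    print(token_sequence(SOURCE))
    print(ast_node_types(SOURCE))
    print(ast_edges(SOURCE))
```

In a learning setting, each view would still need to be mapped to numeric vectors before being fed to a model; that embedding step is deliberately left out of this sketch.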

This work was supported by the Key Research and Development Science and Technology of Hainan Province (ZDYF202012), the National Key Research and Development Program of China (2018YFB0804701), and the National Natural Science Foundation of China (U1836210).

Corresponding author

Correspondence to Yuqing Zhang.

Copyright information

© 2022 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, Y. et al. (2022). A Review of Data Representation Methods for Vulnerability Mining Using Deep Learning. In: Cao, C., Zhang, Y., Hong, Y., Wang, D. (eds) Frontiers in Cyber Security. FCS 2021. Communications in Computer and Information Science, vol 1558. Springer, Singapore. https://doi.org/10.1007/978-981-19-0523-0_22

  • DOI: https://doi.org/10.1007/978-981-19-0523-0_22

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-0522-3

  • Online ISBN: 978-981-19-0523-0

  • eBook Packages: Computer Science, Computer Science (R0)
