Restructured Cloning Vulnerability Detection Based on Function Semantic Reserving and Reiteration Screening

Jiang, Weipeng; Wu, Bin; Yu, Xingxin; Xue, Rui; Yu, Zhengmin

doi:10.1007/978-3-030-58951-6_18

Conference paper
First Online: 12 September 2020

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12308)

Abstract

Although code cloning may speed up software development, it can be detrimental to software security, as undiscovered vulnerabilities are easily propagated through code clones. Even worse, since developers tend not to simply clone the original code fragments but also add variables and debug statements, detecting propagated vulnerable code clones is challenging. A few approaches have been proposed to detect such vulnerabilities, named restructured cloning vulnerabilities; however, they usually cannot effectively obtain the vulnerability context and related semantic information. To address this limitation, we propose in this paper a novel approach, called RCVD++, for detecting restructured cloning vulnerabilities, which introduces a new feature extraction for vulnerable code based on program slicing and optimizes the code abstraction and detection granularity. Our approach further features reiteration screening to compensate for the lack of retroactive detection in fingerprint matching. Compared with our previous work RCVD, RCVD++ utilizes two granularities, line and function, allowing additional detection of exact and renamed clones. In addition, it retains more semantics by identifying library functions and reduces false positives by screening the detection results. The experimental results on three different datasets indicate that RCVD++ performs better than other detection tools for restructured cloning vulnerability detection.

Acknowledgements

This research was supported by the National Natural Science Foundation of China under Grant No. U1936119 and Grant No. 61941116, and by the National Key R&D Program of China under Grant No. 2019QY(Y)0602.

Author information

Authors and Affiliations

State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Weipeng Jiang, Bin Wu, Xingxin Yu, Rui Xue & Zhengmin Yu
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Weipeng Jiang, Bin Wu, Xingxin Yu, Rui Xue & Zhengmin Yu

Corresponding author

Correspondence to Bin Wu.

Editor information

Editors and Affiliations

University of Surrey, Guildford, UK
Liqun Chen
Purdue University, West Lafayette, IN, USA
Ninghui Li
Delft University of Technology, Delft, The Netherlands
Kaitai Liang
University of Surrey, Guildford, UK
Steve Schneider

Appendix

In this section, we provide the greedy-based matching algorithm.

Greedy-Based Matching

Algorithm 1 gives the pseudocode for the matching algorithm. C is the target code and F is the fingerprint. The output R is True if C contains F, and False otherwise. If C is shorter than F, C cannot match F. The algorithm scans C with index n and F with index m: if the nth element of C is the same as the mth element of F, both n and m increase by one; otherwise, only n increases. If F is completely matched, the fingerprint matching is considered successful and the algorithm returns True. From Algorithm 1, it is clear that the time complexity is independent of the fingerprint length and is linear in the number of lines of code and the number of fingerprints.
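
The matching procedure described above can be sketched in Python as follows. This is a minimal illustration based only on the prose description, not the authors' implementation: the function name `greedy_match`, the representation of C and F as sequences of already-abstracted line strings, and the sample lines are all assumptions made for the example.

```python
from typing import Sequence


def greedy_match(code: Sequence[str], fingerprint: Sequence[str]) -> bool:
    """Return True if `fingerprint` occurs in `code` as an ordered subsequence.

    `code` plays the role of C and `fingerprint` the role of F. Both are
    assumed (for this sketch) to be sequences of normalized code lines.
    """
    # A target shorter than the fingerprint can never contain it.
    if len(code) < len(fingerprint):
        return False

    m = 0  # index of the next fingerprint element still to be matched
    for line in code:  # the loop variable advances n by one on every step
        if m == len(fingerprint):
            break  # every fingerprint element has already been matched
        if line == fingerprint[m]:
            m += 1  # matched: advance m together with n

    # Success only if the whole fingerprint was consumed.
    return m == len(fingerprint)


# Hypothetical target with an extra statement inserted between the
# fingerprint lines, as in a restructured clone.
target = [
    "int n = read();",
    "log(n);",
    "char buf[8];",
    "memcpy(buf, src, n);",
]
fingerprint = [
    "int n = read();",
    "char buf[8];",
    "memcpy(buf, src, n);",
]

print(greedy_match(target, fingerprint))        # True
print(greedy_match(target, fingerprint[::-1]))  # False: order is not preserved
```

Because the scan never moves backwards in C, inserted lines are tolerated but reordered fingerprint lines are not. In this sketch an empty fingerprint trivially matches any target.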

About this paper

Cite this paper

Jiang, W., Wu, B., Yu, X., Xue, R., Yu, Z. (2020). Restructured Cloning Vulnerability Detection Based on Function Semantic Reserving and Reiteration Screening. In: Chen, L., Li, N., Liang, K., Schneider, S. (eds) Computer Security – ESORICS 2020. ESORICS 2020. Lecture Notes in Computer Science, vol 12308. Springer, Cham. https://doi.org/10.1007/978-3-030-58951-6_18

DOI: https://doi.org/10.1007/978-3-030-58951-6_18
Published: 12 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58950-9
Online ISBN: 978-3-030-58951-6
eBook Packages: Computer Science, Computer Science (R0)
