Abstract
Although code cloning may speed up software development, it can be detrimental to software security, because undiscovered vulnerabilities are easily propagated through code clones. Even worse, developers tend not to simply clone the original code fragments but also add variables and debug statements, which makes detecting propagated vulnerable code clones challenging. A few approaches have been proposed to detect such vulnerabilities, termed restructured cloning vulnerabilities; however, they usually cannot effectively obtain the vulnerability context and related semantic information. To address this limitation, we propose a novel approach, called RCVD++, for detecting restructured cloning vulnerabilities. It introduces a new feature extraction method for vulnerable code based on program slicing and optimizes the code abstraction and detection granularity. Our approach further features reiteration screening to compensate for the lack of retroactive detection in fingerprint matching. Compared with our previous work RCVD, RCVD++ uses two granularities, line and function, which allows additional detection of exact and renamed clones. In addition, it retains more semantics by identifying library functions and reduces false positives by screening the detection results. Experimental results on three different datasets indicate that RCVD++ performs better than other detection tools for restructured cloning vulnerability detection.
Acknowledgements
This research was supported by the National Natural Science Foundation of China under Grant Nos. U1936119 and 61941116, and by the National Key R&D Program of China under Grant No. 2019QY(Y)0602.
Appendix
In this section, we provide the greedy-based matching algorithm.
Greedy-Based Matching
Algorithm 1 gives the pseudocode for the matching algorithm. C is the target code and F is the fingerprint. The output R is True if code C contains the fingerprint F, and False otherwise. If C is shorter than F, C cannot match F. Otherwise, the algorithm walks through C with index n and through F with index m: if the nth element of C is the same as the mth element of F, both n and m increase by one; if not, only n increases. If every element of F is matched in this way, the fingerprint matching is considered successful and the algorithm returns True. Algorithm 1 makes it clear that each fingerprint is checked in a single pass over C, so the time complexity does not depend on the fingerprint length and grows linearly with the number of lines of code and the number of fingerprints.
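The matching procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `greedy_match`, the representation of C and F as sequences of strings, and the sample lines are all assumptions made for this example.

```python
from typing import Sequence


def greedy_match(code: Sequence[str], fingerprint: Sequence[str]) -> bool:
    """Return True if every element of `fingerprint` occurs in `code`
    in the same order, possibly with other elements in between.

    `code` plays the role of C and `fingerprint` the role of F.
    Both are assumed to be already abstracted into comparable strings.
    """
    # A target shorter than the fingerprint can never contain it.
    if len(code) < len(fingerprint):
        return False

    n = 0  # index into the target code C
    m = 0  # index into the fingerprint F
    while n < len(code) and m < len(fingerprint):
        if code[n] == fingerprint[m]:
            m += 1  # matched this fingerprint element, move to the next
        n += 1      # always advance through the target code

    # Success only if the whole fingerprint was consumed.
    # Note: an empty fingerprint trivially matches under this sketch.
    return m == len(fingerprint)


# Hypothetical example: the clone adds a debug statement between
# the two fingerprint lines, which the greedy scan skips over.
fingerprint = ["int len = n;", "memcpy(buf, src, len);"]
target = [
    "char buf[8];",
    "int len = n;",
    'printf("debug");',
    "memcpy(buf, src, len);",
]
print(greedy_match(target, fingerprint))
print(greedy_match(list(reversed(target)), fingerprint))
```

Because n advances on every iteration, the loop runs at most once per line of the target, which is the single-pass behaviour the complexity argument relies on. Inserted lines are tolerated, while reordered fingerprint lines are not.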
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Jiang, W., Wu, B., Yu, X., Xue, R., Yu, Z. (2020). Restructured Cloning Vulnerability Detection Based on Function Semantic Reserving and Reiteration Screening. In: Chen, L., Li, N., Liang, K., Schneider, S. (eds) Computer Security – ESORICS 2020. ESORICS 2020. Lecture Notes in Computer Science(), vol 12308. Springer, Cham. https://doi.org/10.1007/978-3-030-58951-6_18
DOI: https://doi.org/10.1007/978-3-030-58951-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58950-9
Online ISBN: 978-3-030-58951-6
eBook Packages: Computer Science, Computer Science (R0)