Abstract
We present a vulnerability extrapolation method for identifying vulnerable functions in source code. Given a known vulnerable function, the method searches the code base for similar functions. It rests on the observation that the behavior underlying a known vulnerability may recur in many other functions. To capture similarity, we represent functions in terms of syntactic and semantic patterns, derived from code features such as API usage patterns, argument types, and the control flow graph (CFG) of each function. We employ a recent technique, graph kernels, to compute similarity directly on the CFGs of functions. We empirically demonstrate the capabilities of the method by applying it to real-world applications to identify vulnerabilities.
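To make the idea of computing similarity directly on CFGs concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a Weisfeiler-Lehman subtree kernel, which the abstract does not name, and it assumes each basic block carries a coarse label. The function names, the labels ("entry", "branch", "call", "return"), and the toy graphs are all hypothetical.

```python
from collections import Counter


def wl_features(adj, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree labels of a labeled directed graph.

    adj:    dict mapping each node to a list of its successors (CFG edges)
    labels: dict mapping each node to an initial label (assumed block category)
    """
    current = dict(labels)
    features = Counter(current.values())
    for _ in range(iterations):
        refined = {}
        for node in adj:
            # A node's new label combines its own label with the sorted
            # multiset of its successors' labels.
            successor_labels = sorted(current[succ] for succ in adj[node])
            refined[node] = (current[node], tuple(successor_labels))
        current = refined
        features.update(current.values())
    return features


def wl_kernel(f1, f2):
    """Dot product of two label-count vectors."""
    return sum(count * f2[label] for label, count in f1.items())


def cfg_similarity(g1, g2, iterations=2):
    """Normalized kernel value in [0, 1]; g1 and g2 are (adj, labels) pairs."""
    f1 = wl_features(g1[0], g1[1], iterations)
    f2 = wl_features(g2[0], g2[1], iterations)
    return wl_kernel(f1, f2) / (wl_kernel(f1, f1) * wl_kernel(f2, f2)) ** 0.5


# Hypothetical CFG of a known vulnerable function: a loop around a call.
known = (
    {0: [1], 1: [2, 3], 2: [1], 3: []},
    {0: "entry", 1: "branch", 2: "call", 3: "return"},
)
# Candidate with the same shape and labels, but different node ids.
same_shape = (
    {10: [11], 11: [12, 13], 12: [11], 13: []},
    {10: "entry", 11: "branch", 12: "call", 13: "return"},
)
# Candidate with straight-line control flow.
straight = ({0: [1], 1: []}, {0: "entry", 1: "return"})

# Rank candidates by similarity to the known vulnerable function.
ranking = sorted(
    [("same_shape", cfg_similarity(known, same_shape)),
     ("straight", cfg_similarity(known, straight))],
    key=lambda item: item[1],
    reverse=True,
)
```

In this sketch, extrapolation amounts to ranking every function in the code base by its similarity to the known vulnerable one and inspecting the top candidates. A fuller treatment would combine the CFG similarity with the other features the abstract mentions, such as API usage and argument types.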
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Jain, L., Chandran, A., Rawat, S., Srinathan, K. (2016). Discovering Vulnerable Functions by Extrapolation: A Control-Flow Graph Similarity Based Approach. In: Ray, I., Gaur, M., Conti, M., Sanghi, D., Kamakoti, V. (eds) Information Systems Security. ICISS 2016. Lecture Notes in Computer Science, vol 10063. Springer, Cham. https://doi.org/10.1007/978-3-319-49806-5_32
DOI: https://doi.org/10.1007/978-3-319-49806-5_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49805-8
Online ISBN: 978-3-319-49806-5
eBook Packages: Computer Science, Computer Science (R0)