Abstract
We present an automata-based approach for verifying string operations in PHP programs using symbolic string analysis. String analysis is a static analysis technique that determines the values a string expression can take at a given program point during execution. This information can be used to verify that string values are properly sanitized and to detect programming errors and security vulnerabilities. In our approach, we encode the set of values that each string variable can take as an automaton. We implement all string functions on a symbolic automata representation (the MBDD representation from the MONA automata package) and leverage efficient MBDD manipulations such as determinization and minimization. In particular, we propose a novel algorithm for language-based replacement: our replacement function takes three DFAs as arguments and outputs a DFA. Finally, we apply a widening operator defined on automata to approximate fixpoint computations. If this conservative approximation does not include any bad patterns (specified as regular expressions), we conclude that the program does not contain any errors or vulnerabilities with respect to those patterns. Our experimental results demonstrate that the approach works well for checking the correctness of sanitization operations in real-world PHP applications.
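The final check described in the abstract (the over-approximated set of string values must not intersect the set of bad patterns) can be illustrated with a small sketch. This is not the authors' implementation: it uses explicit-state DFAs stored in Python dictionaries rather than the symbolic MBDD representation from MONA, and the names `DFA`, `intersection_is_empty`, `sanitized`, `unsanitized`, and `attack` are hypothetical, chosen only for this example. The alphabet is reduced to two characters, `a` and `<`, with `<` standing in for a character that sanitization is supposed to remove.

```python
from collections import deque


class DFA:
    """Explicit-state DFA; a missing transition means the input is rejected."""

    def __init__(self, start, accepting, delta):
        self.start = start
        self.accepting = set(accepting)
        self.delta = delta  # maps (state, char) -> next state

    def accepts(self, text):
        state = self.start
        for ch in text:
            state = self.delta.get((state, ch))
            if state is None:
                return False
        return state in self.accepting


def intersection_is_empty(a, b, alphabet):
    """Explore the product automaton of a and b breadth-first.

    Returns False as soon as a reachable product state is accepting in
    both automata, i.e. some string belongs to both languages.
    """
    start = (a.start, b.start)
    seen = {start}
    queue = deque([start])
    while queue:
        p, q = queue.popleft()
        if p in a.accepting and q in b.accepting:
            return False
        for ch in alphabet:
            p_next = a.delta.get((p, ch))
            q_next = b.delta.get((q, ch))
            if p_next is None or q_next is None:
                continue  # one automaton rejects, so the product does too
            nxt = (p_next, q_next)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


ALPHABET = "a<"

# Assumed over-approximation of a variable after sanitization: strings of 'a' only.
sanitized = DFA(0, {0}, {(0, "a"): 0})

# Assumed over-approximation without sanitization: any string over the alphabet.
unsanitized = DFA(0, {0}, {(0, "a"): 0, (0, "<"): 0})

# Bad pattern: any string containing '<' (the regular expression .*<.*).
attack = DFA(0, {1}, {(0, "a"): 0, (0, "<"): 1, (1, "a"): 1, (1, "<"): 1})

print(intersection_is_empty(sanitized, attack, ALPHABET))
print(intersection_is_empty(unsanitized, attack, ALPHABET))
```

An empty intersection means no value in the approximation matches the bad pattern, so the program point is reported safe for that pattern. A non-empty intersection only signals a possible vulnerability, because the approximation is conservative and may contain strings the program never produces.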
This work is supported by NSF grants CCF-0614002 and CCF-0716095.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yu, F., Bultan, T., Cova, M., Ibarra, O.H. (2008). Symbolic String Verification: An Automata-Based Approach. In: Havelund, K., Majumdar, R., Palsberg, J. (eds) Model Checking Software. SPIN 2008. Lecture Notes in Computer Science, vol 5156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85114-1_21
DOI: https://doi.org/10.1007/978-3-540-85114-1_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85113-4
Online ISBN: 978-3-540-85114-1
eBook Packages: Computer Science (R0)