Abstract
Verifying string-manipulating programs is a crucial problem in computer security. String operations are used extensively within web applications to manipulate user input, and their erroneous use is the most common cause of security vulnerabilities in web applications. We present an automata-based approach for symbolic analysis of string-manipulating programs. We use deterministic finite automata (DFAs) to represent the possible values of string variables. Using forward reachability analysis, we compute an over-approximation of all possible values that string variables can take at each program point. Intersecting these over-approximations with a given attack pattern yields the potential attack strings if the program is vulnerable. Based on the presented techniques, we have implemented Stranger, an automata-based string analysis tool for detecting string-related security vulnerabilities in PHP applications. We evaluated Stranger on several open-source web applications, including one with more than 350,000 lines of code. Stranger is able to detect both known and previously unknown vulnerabilities and, after proper sanitization routines are inserted, to prove the absence of vulnerabilities with respect to given attack patterns.
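The intersection step described in the abstract can be illustrated with a minimal Python sketch. This is not Stranger's implementation: the `DFA` class, the two-symbol alphabet, and the three example automata are all hypothetical, chosen only to show how intersecting an over-approximation of a variable's values with an attack pattern either produces a candidate attack string or shows that none exists.

```python
from collections import deque


class DFA:
    """A complete DFA: delta maps every (state, symbol) pair to a state."""

    def __init__(self, alphabet, delta, start, accepting):
        self.alphabet = alphabet
        self.delta = delta
        self.start = start
        self.accepting = set(accepting)

    def intersect(self, other):
        """Product construction: accepts exactly the strings both DFAs accept."""
        assert self.alphabet == other.alphabet
        start = (self.start, other.start)
        delta, accepting = {}, set()
        seen, queue = {start}, deque([start])
        while queue:
            state = queue.popleft()
            p, q = state
            if p in self.accepting and q in other.accepting:
                accepting.add(state)
            for c in self.alphabet:
                nxt = (self.delta[p, c], other.delta[q, c])
                delta[state, c] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return DFA(self.alphabet, delta, start, accepting)

    def witness(self):
        """Return a shortest accepted string, or None if the language is empty."""
        seen, queue = {self.start}, deque([(self.start, "")])
        while queue:
            state, word = queue.popleft()
            if state in self.accepting:
                return word
            for c in self.alphabet:
                nxt = self.delta[state, c]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, word + c))
        return None


# Hypothetical toy alphabet: 'a' stands for any harmless character.
ALPHABET = "a<"

# State 0 = no '<' seen so far, state 1 = at least one '<' seen.
SEEN_LT = {(0, "a"): 0, (0, "<"): 1, (1, "a"): 1, (1, "<"): 1}

# Over-approximation for unsanitized user input: every string is possible.
unsanitized = DFA(ALPHABET, {(0, "a"): 0, (0, "<"): 0}, 0, {0})
# Over-approximation after a sanitizer that guarantees no '<' remains.
sanitized = DFA(ALPHABET, SEEN_LT, 0, {0})
# Attack pattern: any string containing '<'.
attack = DFA(ALPHABET, SEEN_LT, 0, {1})

print(unsanitized.intersect(attack).witness())  # a candidate attack string
print(sanitized.intersect(attack).witness())    # None: no attack string exists
```

A non-empty intersection only signals a potential vulnerability, because the analysis over-approximates the values a variable can take. An empty intersection, by contrast, shows that no string matching the attack pattern can reach that program point.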
Cite this article
Yu, F., Alkhalaf, M., Bultan, T. et al. Automata-based symbolic string analysis for vulnerability detection. Form Methods Syst Des 44, 44–70 (2014). https://doi.org/10.1007/s10703-013-0189-1
DOI: https://doi.org/10.1007/s10703-013-0189-1