Skip to main content

An Evaluation of Automata Algorithms for String Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 6538))

Abstract

There has been significant recent interest in automated reasoning techniques, in particular constraint solvers, for string variables. These techniques support a wide variety of clients, ranging from static analysis to automated testing. The majority of string constraint solvers rely on finite automata to support regular expression constraints. For these approaches, performance depends critically on fast automata operations such as intersection, complementation, and determinization. Existing work in this area has not yet provided conclusive results as to which core algorithms and data structures work best in practice.

In this paper, we study a comprehensive set of algorithms and data structures for performing fast automata operations. Our goal is to provide an apples-to-apples comparison between techniques that are used in current tools. To achieve this, we re-implemented a number of existing techniques. We use an established set of regular expressions benchmarks as an indicative workload. We also include several techniques that, to the best of our knowledge, have not yet been used for string constraint solving. Our results show that there is a substantial performance difference across techniques, which has implications for future tool design.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Bala, S.: Regular language matching and other decidable cases of the satisfiability problem for constraints between regular open terms. In: Diekert, V., Habib, M. (eds.) STACS 2004. LNCS, vol. 2996, pp. 596–607. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  2. Bjørner, N., Tillmann, N., Voronkov, A.: Path feasibility analysis for string-manipulating programs. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505, pp. 307–321. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  3. Blom, S., Orzan, S.: Distributed state space minimization. J. Software Tools for Technology Transfer 7(3), 280–291 (2005)

    Article  Google Scholar 

  4. Boigelot, B., Wolper, P.: Representing arithmetic constraints with finite automata: An overview. In: Stuckey, P.J. (ed.) ICLP 2002. LNCS, vol. 2401, pp. 1–19. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  5. Christensen, A.S., Møller, A., Schwartzbach, M.I.: Precise Analysis of String Expressions. In: Cousot, R. (ed.) SAS 2003. LNCS, vol. 2694, pp. 1–18. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  6. Coquand, T., Huet, G.P.: The calculus of constructions. Information and Computation 76(2/3), 95–120 (1988)

    Article  MathSciNet  MATH  Google Scholar 

  7. de Moura, L., Bjørner, N.S.: Z3: An Efficient SMT Solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  8. Godefroid, P., Kieżun, A., Levin, M.Y.: Grammar-based whitebox fuzzing. In: PLDI 2008, Tucson, AZ, USA, June 9-11 (2008)

    Google Scholar 

  9. Godefroid, P., Klarlund, N., Sen, K.: DART: directed automated random testing. In: PLDI 2005, pp. 213–223 (2005)

    Google Scholar 

  10. Henriksen, J.G., Jensen, J., Jørgensen, M., Klarlund, N., Paige, B., Rauhe, T., Sandholm, A.: Mona: Monadic second-order logic in practice. In: Brinksma, E., Steffen, B., Cleaveland, W.R., Larsen, K.G., Margaria, T. (eds.) TACAS 1995. LNCS, vol. 1019. Springer, Heidelberg (1995)

    Chapter  Google Scholar 

  11. Hooimeijer, P., Weimer, W.: A decision procedure for subset constraints over regular languages. In: PLDI, pp. 188–198 (2009)

    Google Scholar 

  12. Hooimeijer, P., Weimer, W.: Solving string constraints lazily. In: ASE 2010 (2010)

    Google Scholar 

  13. Hopcroft, J.E., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading (1979)

    MATH  Google Scholar 

  14. Ilie, L., Yu, S.: Follow automata. Information and Computation 186(1), 140–162 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  15. Kieżun, A., Ganesh, V., Guo, P.J., Hooimeijer, P., Ernst, M.D.: HAMPI: a solver for string constraints. In: ISSTA 2009, pp. 105–116. ACM, New York (2009)

    Google Scholar 

  16. Klarlund, N., Møller, A., Schwartzbach, M.I.: MONA implementation secrets. International Journal of Foundations of Computer Science 13(4), 571–586 (2002)

    Article  MathSciNet  MATH  Google Scholar 

  17. Kunc, M.: What do we know about language equations? In: Harju, T., Karhumäki, J., Lepistö, A. (eds.) DLT 2007. LNCS, vol. 4588, pp. 23–27. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  18. Li, N., Xie, T., Tillmann, N., de Halleux, P., Schulte, W.: Reggae: Automated test generation for programs using complex regular expressions. In: ASE 2009 (2009)

    Google Scholar 

  19. MSDN Library. System.text namespace, http://msdn.microsoft.com/en-us/library/system.text.aspx

  20. PHP Manual. Pcre; posix regex; strings, http://php.net/manual/en/book.strings.php

  21. Minamide, Y.: Static approximation of dynamically generated web pages. In: WWW 2005, pp. 432–441 (2005)

    Google Scholar 

  22. Van Noord, G., Gerdemann, D.: Finite state transducers with predicates and identities. Grammars 4 (2001)

    MathSciNet  MATH  Google Scholar 

  23. Pex, http://research.microsoft.com/projects/pex

  24. Saxena, P., Akhawe, D., Hanna, S., Mao, F., McCamant, S., Song, D.: A symbolic execution framework for javascript. In: SP 2010. IEEE Computer Society, Los Alamitos (2010)

    Google Scholar 

  25. Siek, J.G., Lee, L.-Q., Lumsdaine, A.: The Boost Graph Library: User Guide and Reference Manual. Addison-Wesley Professional, Reading (2001)

    Google Scholar 

  26. Veanes, M., Bjørner, N., de Moura, L.: Symbolic automata constraint solving. In: Fermüller, C.G., Voronkov, A. (eds.) LPAR-17. LNCS, vol. 6397, pp. 640–654. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  27. Veanes, M., de Halleux, P., Tillmann, N.: Rex: Symbolic Regular Expression Explorer. In: ICST 2010. IEEE, Los Alamitos (2010)

    Google Scholar 

  28. Wassermann, G., Su, Z.: Sound and precise analysis of web applications for injection vulnerabilities. In: PLDI 2007, pp. 32–41. ACM, New York (2007)

    Google Scholar 

  29. Yu, F., Alkhalaf, M., Bultan, T.: An automata-based string analysis tool for PHP. In: Esparza, J., Majumdar, R. (eds.) TACAS 2010. LNCS, vol. 6015, pp. 154–157. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  30. Yu, F., Bultan, T., Ibarra, O.H.: Symbolic string verification: Combining string analysis and size analysis. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505, pp. 322–336. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  31. Z3, http://research.microsoft.com/projects/z3

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hooimeijer, P., Veanes, M. (2011). An Evaluation of Automata Algorithms for String Analysis. In: Jhala, R., Schmidt, D. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2011. Lecture Notes in Computer Science, vol 6538. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18275-4_18

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-18275-4_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-18274-7

  • Online ISBN: 978-3-642-18275-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics