Abstract
Regular expressions are often an integral part of program customization and many algorithms have been proposed for transforming them into suitable data structures. These algorithms can be divided into two main classes: backtracking or automaton-based algorithms. Surprisingly, the latter class draws less attention than the former, even though automaton-based algorithms represent the oldest and by far the fastest solutions when carefully designed. Only two open-source automaton-based implementations stand out: PCRE and the recent RE2 from Google. We have developed, and present here, a competitive automaton-based regular expression engine on top of the LGPL C++ Automata Standard Template Library (ASTL), whose efficiency and scalability remain unmatched and which distinguishes itself through a unique and rigorous STL-like design.
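The difference between the two classes can be illustrated with a minimal sketch. It is not the ASTL API: the names ToyNfa, make_example and matches are hypothetical and exist only for this illustration. The sketch simulates a small hand-built automaton for (a|aa)*b by advancing all active states together, so each input character is read exactly once. A naive backtracking matcher without memoization, by contrast, may retry every way of splitting a run of a's into "a" and "aa" before failing on an input that contains no b.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy NFA for illustration only (not the ASTL API).
// Assumptions: at most 32 states, state 0 is the initial state,
// and there are no epsilon transitions.
struct ToyNfa {
    struct Edge { int from; char symbol; int to; };
    std::vector<Edge> edges;
    std::uint32_t accepting = 0;  // bitmask of accepting states
};

// Hand-built automaton for (a|aa)*b:
//   state 0: between repetitions, state 1: inside "aa", state 2: accept.
ToyNfa make_example() {
    ToyNfa nfa;
    nfa.edges = {{0, 'a', 0}, {0, 'a', 1}, {1, 'a', 0}, {0, 'b', 2}};
    nfa.accepting = 1u << 2;
    return nfa;
}

// State-set simulation: every active state is advanced on each character,
// so the work is bounded by input length times the number of edges.
bool matches(const ToyNfa& nfa, const std::string& input) {
    std::uint32_t active = 1u;  // only state 0 is active at the start
    for (char c : input) {
        std::uint32_t next = 0;
        for (const auto& e : nfa.edges) {
            if (e.symbol == c && (active & (1u << e.from))) {
                next |= 1u << e.to;
            }
        }
        active = next;
        if (active == 0) return false;  // no live state left, fail early
    }
    return (active & nfa.accepting) != 0;
}
```

Calling matches(make_example(), input) answers whether the whole input belongs to the language. A production engine would compile the expression into the automaton rather than build it by hand, and might determinize it for further speed. The sketch only shows why the matching cost stays linear in the input length.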
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Le Maout, V. (2011). Regular Expressions at Their Best: A Case for Rational Design. In: Domaratzki, M., Salomaa, K. (eds) Implementation and Application of Automata. CIAA 2010. Lecture Notes in Computer Science, vol 6482. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18098-9_33
DOI: https://doi.org/10.1007/978-3-642-18098-9_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18097-2
Online ISBN: 978-3-642-18098-9
eBook Packages: Computer Science (R0)