On the Semantics of Regular Expression Parsing in the Wild

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9223)

Abstract

We introduce prioritized transducers to formalize capturing groups in regular expression matching in a way that permits straightforward modelling of, and comparison with, the behaviors of real-world regular expression matching libraries. We also discuss the broader questions of parsing semantics and performance, as well as the complexity of deciding equivalence of regular expressions with capturing groups.
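
As an illustration of why capturing-group semantics need a formal treatment, the following is a minimal sketch and is not taken from the paper. It assumes Python's standard `re` module, which is a backtracking matcher; the helper name `groups` and the example patterns are hypothetical choices made for this illustration only. Each pattern accepts its whole input string, so the patterns differ only in how the matched string is split among the groups.

```python
import re


def groups(pattern, text):
    """Return the captured groups if `pattern` matches all of `text`, else None.

    Hypothetical helper for this illustration; it only wraps re.fullmatch.
    """
    match = re.fullmatch(pattern, text)
    return match.groups() if match else None


# Same language (a*), different priorities: the greedy star in the first
# group takes as many characters as it can, the lazy star as few as it can.
greedy = groups(r"(a*)(a*)", "aaa")
lazy = groups(r"(a*?)(a*)", "aaa")

# A backtracking matcher tries the left alternative first, so group 1 is "a".
# A POSIX-style leftmost-longest matcher would be expected to pick "ab" here.
ordered = groups(r"(a|ab)(c|bcd)(d*)", "abcd")

print(greedy)   # ('aaa', '')
print(lazy)     # ('', 'aaa')
print(ordered)  # ('a', 'bcd', '')
```

The three results show that acceptance alone does not determine the captures: a model of a matching library also has to record which of several possible parses the library prefers.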

Notes

  1. Java is a registered trademark of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Author information

Corresponding author

Correspondence to Brink van der Merwe.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Berglund, M., van der Merwe, B. (2015). On the Semantics of Regular Expression Parsing in the Wild. In: Drewes, F. (ed.) Implementation and Application of Automata. CIAA 2015. Lecture Notes in Computer Science, vol 9223. Springer, Cham. https://doi.org/10.1007/978-3-319-22360-5_24

  • DOI: https://doi.org/10.1007/978-3-319-22360-5_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22359-9

  • Online ISBN: 978-3-319-22360-5

  • eBook Packages: Computer Science, Computer Science (R0)
