DOI: 10.1145/3129416.3129440
research-article

Turning evil regexes harmless

Published: 26 September 2017

ABSTRACT

We explore the relationship between ambiguity in automata and regular expressions on the one hand, and the matching time of backtracking regular expression matchers on the other. We focus in particular on the extreme cases where we have either an exponential amount of ambiguity or no ambiguity at all. We also investigate techniques to reduce or remove ambiguity from regular expressions; these techniques can then be used to transform regular expressions that might be exploited through algorithmic complexity attacks into harmless equivalent expressions.
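As a rough illustration of the two extremes named in the abstract, the sketch below counts the distinct accepting runs of a small epsilon-free NFA on a given word. The two automata, for `(a|a)*` and the equivalent `a*`, are hypothetical examples chosen for this sketch and are not taken from the paper; the function name and the encoding of transitions are likewise assumptions. A backtracking matcher that has to exhaust every run before rejecting an input can, in the worst case, do work proportional to this count.

```python
from collections import defaultdict


def count_accepting_paths(transitions, start, accepting, word):
    """Count the distinct accepting runs of an epsilon-free NFA on `word`.

    `transitions` maps (state, symbol) to a list of target states. A target
    listed twice stands for two separate transitions, which is how the
    ambiguity of an expression such as (a|a)* is modelled here.
    """
    # paths[q] = number of distinct runs that end in state q so far
    paths = {start: 1}
    for symbol in word:
        nxt = defaultdict(int)
        for state, n in paths.items():
            for target in transitions.get((state, symbol), []):
                nxt[target] += n
        paths = dict(nxt)
    return sum(n for state, n in paths.items() if state in accepting)


# Hypothetical NFA for (a|a)*: two parallel 'a' loops on one state.
AMBIGUOUS = {(0, "a"): [0, 0]}
# Hypothetical NFA for the equivalent expression a*: a single 'a' loop.
UNAMBIGUOUS = {(0, "a"): [0]}


def ambiguity_table(max_len):
    """Return (n, runs for (a|a)*, runs for a*) for the words a^0 .. a^max_len."""
    rows = []
    for n in range(max_len + 1):
        word = "a" * n
        rows.append((
            n,
            count_accepting_paths(AMBIGUOUS, 0, {0}, word),
            count_accepting_paths(UNAMBIGUOUS, 0, {0}, word),
        ))
    return rows
```

In this toy case the number of runs for `(a|a)*` doubles with every additional `a`, while `a*` always has exactly one run, although both expressions accept the same language.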


Published in: SAICSIT '17: Proceedings of the South African Institute of Computer Scientists and Information Technologists, September 2017, 384 pages
ISBN: 9781450352505
DOI: 10.1145/3129416

                Copyright © 2017 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance rates: SAICSIT '17 paper acceptance rate 39 of 108 submissions, 36%; overall acceptance rate 187 of 439 submissions, 43%.
