Skip to main content

How to Settle the ReDoS Problem: Back to the Classical Automata Theory

  • Conference paper
  • First Online:
Implementation and Application of Automata (CIAA 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13266))

Included in the following conference series:

Abstract

Most regular-expression matching engines in practice are based on the Thompson construction and the Spencer matching algorithm. While these engines work fast and efficiently, a serious problem, the regular expression denial-of-service (ReDoS), has been reported recently. ReDoS is an algorithm complexity attack, which exploits the backtracking feature of the engine, and makes the service unresponsive indefinitely. Researchers suggested a few remedies to cope with the ReDoS problem, yet they are often ad-hoc or undesirable in practice. We instead propose a hybrid matching scheme that selects between the Thompson and the Spencer matching algorithms depending on the needed features. We also suggest to use the position construction for its intrinsic characteristics for fast matching. We evaluate the proposed approach using a benchmark dataset collected from various open-source projects, and compare the performance with the current approach. The experimental results show that a hybrid matcher reduces the ReDoS-vulnerability by 96% and 99.98% in full and partial matching, respectively. Moreover, 55% of the most problematic regular expressions become invulnerable to ReDoS by the position construction.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 54.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 69.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    https://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016.

  2. 2.

    Simplified. and $ match the start and the end of lines, respectively. matches the whitespace characters such as a space or a tab.

  3. 3.

    https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/.

  4. 4.

    https://pypi.org/project/xeger/.

  5. 5.

    https://valgrind.org/.

  6. 6.

    We have removed 359 ReDoS-vulnerable regexes in Table 1 that do not cause ReDoS behavior in both full and partial matching.

References

  1. Berglund, M., Drewes, F., van der Merwe, B.: Analyzing catastrophic backtracking behavior in practical regular expression matching. In: Proceedings of the 14th International Conference on Automata and Formal Languages, pp. 109–123 (2014)

    Google Scholar 

  2. Berglund, M., van der Merwe, B.: Regular expressions with backreferences re-examined. In: Proceedings of the Prague Stringology Conference 2017, pp. 30–41 (2017)

    Google Scholar 

  3. Brüggemann-Klein, A., Wood, D.: One-unambiguous regular languages. Inf. Comput. 140(2), 229–253 (1998)

    Article  MathSciNet  Google Scholar 

  4. Câmpeanu, C., Salomaa, K., Yu, S.: A formal study of practical regular expressions. Int. J. Found. Comput. Sci. 14(6), 1007–1018 (2003)

    Article  MathSciNet  Google Scholar 

  5. Câmpeanu, C., Salomaa, K., Yu, S.: Regex and extended regex. In: Proceedings of the 7th International Conference on Implementation and Application of Automata, pp. 77–84 (2003)

    Google Scholar 

  6. Caron, P., Ziadi, D.: Characterization of Glushkov automata. Theoret. Comput. Sci. 233(1–2), 75–90 (2000)

    Article  MathSciNet  Google Scholar 

  7. Chapman, C., Stolee, K.T.: Exploring regular expression usage and context in Python. In: Proceedings of the 25th International Symposium on Software Testing and Analysis, pp. 282–293 (2016)

    Google Scholar 

  8. Cheon, H., Hahn, J., Han, Y.S.: On the decidability of infix inclusion problem. In: Proceedings of the 26th International Conference on Developments in Language Theory (2022, in publishing)

    Google Scholar 

  9. Cortes, C., Mohri, M.: Learning with weighted transducers. In: Proceedings of the 7th International Workshop on Finite State Methods and Natural Language Processing, pp. 14–22 (2008)

    Google Scholar 

  10. Cox, R.: Regular expression matching in the wild (2010). http://swtch.com/~rsc/regexp/regexp3.html. Accessed 7 Apr 2022

  11. Davis, J.C., Coghlan, C.A., Servant, F., Lee, D.: The impact of regular expression denial of service (ReDoS) in practice: an empirical study at the ecosystem scale. In: Proceedings of the 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 246–256 (2018)

    Google Scholar 

  12. Davis, J.C., Michael IV, L.G., Coghlan, C.A., Servant, F., Lee, D.: Why aren’t regular expressions a lingua franca? An empirical study on the re-use and portability of regular expressions. In: Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 443–454 (2019)

    Google Scholar 

  13. Davis, J.C., Servant, F., Lee, D.: Using selective memoization to defeat regular expression denial of service (ReDoS). In: Proceedings of the 2021 IEEE Symposium on Security and Privacy, pp. 1–17 (2021)

    Google Scholar 

  14. Friedl, J.E.F.: Mastering Regular Expressions, 3rd edn. O’Reilly Media Inc. (2006)

    Google Scholar 

  15. Giammarresi, D., Ponty, J., Wood, D., Ziadi, D.: A characterization of Thompson digraphs. Discret. Appl. Math. 134(1–3), 317–337 (2004)

    Article  MathSciNet  Google Scholar 

  16. Glushkov, V.M.: The abstract theory of automata. Russ. Math. Surv. 16(5), 1–53 (1961)

    Article  MathSciNet  Google Scholar 

  17. Google: regexp (Go documentation). https://pkg.go.dev/regexp@go1.18. Accessed 7 Apr 2022

  18. Kirrage, J., Rathnayake, A., Thielecke, H.: Static analysis for regular expression denial-of-service attacks. In: Proceedings of the 7th International Conference on Network and System Security, pp. 135–148 (2013)

    Google Scholar 

  19. Kleene, S.C.: Representation of events in nerve nets and finite automata. Automata Stud. 34, 3–41 (1956)

    MathSciNet  Google Scholar 

  20. Liu, T., Sun, Y., Liu, A.X., Guo, L., Fang, B.: A prefiltering approach to regular expression matching for network security systems. In: Bao, F., Samarati, P., Zhou, J. (eds.) ACNS 2012. LNCS, vol. 7341, pp. 363–380. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31284-7_22

    Chapter  Google Scholar 

  21. Liu, Y., Zhang, M., Meng, W.: Revealer: detecting and exploiting regular expression denial-of-service vulnerabilities. In: Proceedings of the 2021 IEEE Symposium on Security and Privacy, pp. 1468–1484 (2021)

    Google Scholar 

  22. McNaughton, R., Yamada, H.: Regular expressions and state graphs for automata. IRE Trans. Electron. Comput. EC-9(1), 39–47 (1960)

    Google Scholar 

  23. van der Merwe, B., Mouton, J., van Litsenborgh, S., Berglund, M.: Memoized regular expressions. In: Maneth, S. (ed.) CIAA 2021. LNCS, vol. 12803, pp. 39–52. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-79121-6_4

    Chapter  Google Scholar 

  24. Nicaud, C.: On the average size of Glushkov’s automata. In: Dediu, A.H., Ionescu, A.M., Martín-Vide, C. (eds.) LATA 2009. LNCS, vol. 5457, pp. 626–637. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00982-2_53

    Chapter  MATH  Google Scholar 

  25. Rathnayake, A., Thielecke, H.: Static analysis for regular expression exponential runtime via substructural logics. arXiv preprint CoRR 1405.7058 (2014)

    Google Scholar 

  26. Schmid, M.L.: Regular expressions with backreferences: polynomial-time matching techniques. arXiv preprint CoRR 1903.05896 (2019)

    Google Scholar 

  27. Shen, Y., Jiang, Y., Xu, C., Yu, P., Ma, X., Lu, J.: ReScue: crafting regular expression DoS attacks. In: Proceedings of the 33rd IEEE/ACM International Conference on Automated Software Engineering, pp. 225–235 (2018)

    Google Scholar 

  28. Spencer, H.: A regular-expression matcher. In: Software Solutions in C, pp. 35–71. AP Professional (1994)

    Google Scholar 

  29. Thompson, K.: Programming techniques: regular expression search algorithm. Commun. ACM 11(6), 419–422 (1968)

    Article  Google Scholar 

  30. Weideman, N., van der Merwe, B., Berglund, M., Watson, B.: Analyzing matching time behavior of backtracking regular expression matchers by using ambiguity of NFA. In: Proceedings of the 21st International Conference on Implementation and Application of Automata, pp. 322–334 (2016)

    Google Scholar 

  31. Weidman, A., other contributers: Regular expression denial of service - ReDoS. https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS. Accessed 7 Apr 2022

  32. Wüstholz, V., Olivo, O., Heule, M.J.H., Dillig, I.: Static detection of DoS vulnerabilities in programs that use regular expressions. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10206, pp. 3–20. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54580-5_1

    Chapter  MATH  Google Scholar 

Download references

Acknowledgment

This research was supported by the NRF grant (NRF-2020R1A4A3079947) funded by MIST.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yo-Sub Han .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Sung, S., Cheon, H., Han, YS. (2022). How to Settle the ReDoS Problem: Back to the Classical Automata Theory. In: Caron, P., Mignot, L. (eds) Implementation and Application of Automata. CIAA 2022. Lecture Notes in Computer Science, vol 13266. Springer, Cham. https://doi.org/10.1007/978-3-031-07469-1_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-07469-1_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07468-4

  • Online ISBN: 978-3-031-07469-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics