DOI: 10.1145/3550198.3550425
research-article

Type-safe regular expressions

Published: 06 October 2022

Abstract

Regular expressions can easily go wrong. Capturing groups, in particular, require meticulous care to avoid off-by-one errors and null pointer exceptions. In this paper, we propose a new design for Scala's regular expressions that completely eliminates this class of errors. Our design makes extensive use of match types, Scala's new feature for type-level programming, to statically analyze regular expressions during type checking. We show that our approach has only a minor impact on compilation times, which makes it suitable for practical use.
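To make the problem and the general idea concrete, here is a minimal Scala 3 sketch. It is not the paper's implementation. The first half shows the hazard with the standard `scala.util.matching.Regex`: an optional capturing group that does not participate in a match is bound to `null`. The second half assumes a hypothetical encoding in which each capturing group is described by a marker type (`Required` or `Optional`, both invented for this illustration) and uses a match type to compute the tuple type of the captures. How a real design derives such group descriptions from the pattern itself is outside the scope of this sketch.

```scala
import scala.util.matching.Regex

// Runtime hazard with the standard library: the third group is optional,
// so when it does not participate in the match, `day` is bound to null.
val DatePattern: Regex = """(\d{4})-(\d{2})(?:-(\d{2}))?""".r

def dayOf(input: String): Option[String] = input match
  case DatePattern(_, _, day) => Option(day) // Option(null) is None
  case _                      => None

// Hypothetical marker types describing each capturing group.
// These names are assumptions made for illustration only.
final class Required
final class Optional

// A match type that maps a tuple of group markers to the tuple of
// capture types: String for required groups, Option[String] otherwise.
type Captures[Gs <: Tuple] <: Tuple = Gs match
  case EmptyTuple       => EmptyTuple
  case Required *: rest => String *: Captures[rest]
  case Optional *: rest => Option[String] *: Captures[rest]

// Verified during type checking: two required groups, then one optional.
val capturesEvidence =
  summon[
    Captures[Required *: Required *: Optional *: EmptyTuple] =:=
      (String *: String *: Option[String] *: EmptyTuple)
  ]

@main def demo(): Unit =
  println(dayOf("2022-10-06")) // Some(06)
  println(dayOf("2022-10"))    // None
```

With the capture types computed this way, a missing optional group would surface as `None` in the result type rather than as a `null` discovered at run time, and asking for a group that does not exist would be rejected by the type checker.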

Published In

Scala '22: Proceedings of the Scala Symposium
June 2022
33 pages
ISBN: 9781450394635
DOI: 10.1145/3550198
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. match types
  2. regular expressions
  3. type safety

Qualifiers

  • Research-article

Conference

Scala '22

Acceptance Rates

Overall Acceptance Rate 5 of 6 submissions, 83%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 58
  • Downloads (Last 12 months): 18
  • Downloads (Last 6 weeks): 0

Reflects downloads up to 13 Jan 2025
