DOI: 10.1145/3136014.3136016

Type-safe modular parsing

Published: 23 October 2017

ABSTRACT

Over the years, considerable effort has gone into solving extensibility problems while retaining important software engineering properties such as modular type-safety and separate compilation. Most previous work has focused on operations that traverse and process extensible Abstract Syntax Tree (AST) structures. However, there is almost no work on operations that build such extensible ASTs, including parsing.

This paper investigates solutions for the problem of modular parsing. We focus on semantic modularity and not just syntactic modularity. That is, the solutions should not only allow complete parsers to be built out of modular parsing components, but also enable the parsing components to be modularly type-checked and separately compiled. We present a technique based on parser combinators that enables modular parsing. Interestingly, the modularity requirements for modular parsing rule out several existing parser combinator approaches, which rely on non-modular techniques. We show that Packrat parsing techniques provide solutions for such modularity problems and enable reasonable performance in a modular setting. Extensibility is achieved using multiple inheritance and Object Algebras. To evaluate the approach, we conduct a case study based on the 'Types and Programming Languages' interpreters. The case study shows that the approach is effective at reusing parsing code from existing interpreters, and that the total parsing code is 69% shorter than an existing code base that uses a non-modular parsing approach.
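The combination of ideas named in the abstract can be illustrated with a rough sketch. It is not the paper's implementation: Python is used only for brevity and provides none of the static type-safety or separate-compilation guarantees the paper is about, Packrat memoization is omitted, and every name below (`LitAddParser`, `NegParser`, `ParenParser`, `FullParser`, `EvalAlg`, `ShowAlg`, and so on) is hypothetical. The sketch assumes a toy expression language and shows two independently written parsing extensions being composed by multiple inheritance, with AST construction delegated to an object-algebra-style factory.

```python
from typing import Any, Callable, Optional, Tuple

# A parser maps (text, position) to (value, new_position), or None on failure.
Result = Optional[Tuple[Any, int]]
Parser = Callable[[str, int], Result]


def skip_spaces(text: str, pos: int) -> int:
    while pos < len(text) and text[pos] == " ":
        pos += 1
    return pos


def token(s: str) -> Parser:
    """Parser for the literal string s, ignoring leading spaces."""
    def run(text: str, pos: int) -> Result:
        pos = skip_spaces(text, pos)
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return run


def number(text: str, pos: int) -> Result:
    """Parser for a non-negative integer literal."""
    pos = skip_spaces(text, pos)
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return (int(text[pos:end]), end) if end > pos else None


class LitAddParser:
    """Base component: integer literals and right-nested addition."""

    def __init__(self, alg):
        # Hypothetical object-algebra-style factory: the parser never names
        # a concrete AST type, it only calls alg.lit / alg.add / ...
        self.alg = alg

    def atom(self, text: str, pos: int) -> Result:
        r = number(text, pos)
        return (self.alg.lit(r[0]), r[1]) if r else None

    def expr(self, text: str, pos: int) -> Result:
        left = self.atom(text, pos)  # open recursion: extensions override atom
        if left is None:
            return None
        plus = token("+")(text, left[1])
        if plus is None:
            return left
        right = self.expr(text, plus[1])
        if right is None:
            return None
        return (self.alg.add(left[0], right[0]), right[1])


class NegParser(LitAddParser):
    """Extension 1: unary minus, written without knowledge of ParenParser."""

    def atom(self, text: str, pos: int) -> Result:
        minus = token("-")(text, pos)
        if minus is None:
            return super().atom(text, pos)  # fall through to other alternatives
        inner = self.atom(text, minus[1])
        return (self.alg.neg(inner[0]), inner[1]) if inner else None


class ParenParser(LitAddParser):
    """Extension 2: parenthesised expressions, written without NegParser."""

    def atom(self, text: str, pos: int) -> Result:
        lp = token("(")(text, pos)
        if lp is None:
            return super().atom(text, pos)
        inner = self.expr(text, lp[1])
        if inner is None:
            return None
        rp = token(")")(text, inner[1])
        return (inner[0], rp[1]) if rp else None


class FullParser(NegParser, ParenParser):
    """Composition by multiple inheritance; no parsing code is repeated."""


class EvalAlg:
    """Algebra that evaluates while parsing (base constructs only)."""
    def lit(self, n): return n
    def add(self, l, r): return l + r


class EvalNegAlg(EvalAlg):
    """Extended algebra adding the constructor that NegParser needs."""
    def neg(self, e): return -e


class ShowAlg:
    """A second interpretation of the same constructors: pretty-printing."""
    def lit(self, n): return str(n)
    def add(self, l, r): return f"({l} + {r})"
    def neg(self, e): return f"-{e}"


if __name__ == "__main__":
    src = "1 + -(2 + 3)"
    print(FullParser(EvalNegAlg()).expr(src, 0))
    print(FullParser(ShowAlg()).expr(src, 0))
```

In this sketch, each extension overrides `atom` and defers to `super()` when its own alternative does not apply, so the composed `FullParser` tries unary minus, then parentheses, then literals. Swapping the algebra changes what the parser builds without touching any parsing code. In a statically typed setting, the algebra would be an interface parameterised over the result type, which is where the modular type-checking discussed in the abstract would come in.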


