ABSTRACT
Scannerless Earley Virtual Machines (SEVM) is a new generalized context-free parsing method in which grammars are internally encoded using a special instruction intermediate language. In this paper, we show how just-in-time compilation can be used to translate grammars in this intermediate form into native machine code to improve parsing performance. We also present an efficient method for lexical disambiguation, which additionally significantly reduces the amount of code that needs to be just-in-time compiled. Finally, we compare our implementation of SEVM with other parser implementations and show that our parser provides acceptable performance for analysing real-world computer languages.