DOI: 10.1145/3387168.3387216
research-article

Just-in-time Parsing with Scannerless Earley Virtual Machines

Published: 25 May 2020

ABSTRACT

Scannerless Earley Virtual Machines (SEVM) is a new generalized context-free parsing method in which grammars are internally encoded in a special intermediate instruction language. In this paper we show how just-in-time compilation can be used to translate intermediate-form grammars into native machine code to improve parsing performance. We also present an efficient method for lexical disambiguation, which additionally makes it possible to significantly reduce the amount of code that needs to be just-in-time compiled. Finally, we compare our implementation of SEVM with other parser implementations and show that our parser provides acceptable performance for analysing real-world computer languages.

Published in

ICVISP 2019: Proceedings of the 3rd International Conference on Vision, Image and Signal Processing
August 2019
584 pages
ISBN: 9781450376259
DOI: 10.1145/3387168

Copyright © 2019 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

          Qualifiers

          • research-article
          • Research
          • Refereed limited

Acceptance Rates

ICVISP 2019 paper acceptance rate: 126 of 277 submissions (45%). Overall acceptance rate: 186 of 424 submissions (44%).
