Research-article
DOI:10.1145/2660193.2660202

Adaptive LL(*) parsing: the power of dynamic analysis

Published: 15 October 2014

Abstract

Despite the advances made by modern parsing strategies such as PEG, LL(*), GLR, and GLL, parsing is not a solved problem. Existing approaches suffer from a number of weaknesses, including difficulties supporting side-effecting embedded actions, slow and/or unpredictable performance, and counter-intuitive matching strategies. This paper introduces the ALL(*) parsing strategy that combines the simplicity, efficiency, and predictability of conventional top-down LL(k) parsers with the power of a GLR-like mechanism to make parsing decisions. The critical innovation is to move grammar analysis to parse-time, which lets ALL(*) handle any non-left-recursive context-free grammar. ALL(*) is O(n^4) in theory but consistently performs linearly on grammars used in practice, outperforming general strategies such as GLL and GLR by orders of magnitude. ANTLR 4 generates ALL(*) parsers and supports direct left-recursion through grammar rewriting. Widespread ANTLR 4 use (5000 downloads/month in 2013) provides evidence that ALL(*) is effective for a wide variety of applications.
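
The abstract notes that ANTLR 4 supports direct left-recursion through grammar rewriting. As a rough illustration of what such a rewrite can look like, the sketch below applies the classic textbook transformation, in which A -> A a | b becomes A -> b A_tail with A_tail -> a A_tail | empty. This is an assumption chosen for brevity and is not presented as ANTLR 4's actual rewriting scheme, which this page does not describe. The function name, the dictionary representation of a grammar, and the `_tail` suffix are all hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: rule name -> list of alternatives,
# each alternative a list of symbols; [] stands for the empty alternative.
Grammar = Dict[str, List[List[str]]]


def eliminate_direct_left_recursion(grammar: Grammar) -> Grammar:
    """Textbook rewrite of A -> A a | b into A -> b A_tail, A_tail -> a A_tail | empty."""
    result: Grammar = {}
    for head, alternatives in grammar.items():
        # Suffixes of the alternatives that start with the rule itself.
        recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
        # Alternatives that do not start with the rule itself.
        others = [list(alt) for alt in alternatives if not alt or alt[0] != head]
        if not recursive:
            result[head] = others
            continue
        tail = head + "_tail"  # hypothetical name for the helper rule
        if tail in grammar:
            raise ValueError(f"helper rule name {tail!r} is already taken")
        if not others or any(not suffix for suffix in recursive):
            raise ValueError(f"rule {head!r} cannot be rewritten by this sketch")
        result[head] = [alt + [tail] for alt in others]
        result[tail] = [suffix + [tail] for suffix in recursive] + [[]]
    return result


# A small directly left-recursive expression grammar (illustrative only).
grammar = {
    "expr": [["expr", "+", "term"], ["expr", "-", "term"], ["term"]],
    "term": [["INT"]],
}
rewritten = eliminate_direct_left_recursion(grammar)
for name, alts in rewritten.items():
    for alt in alts:
        print(name, "->", " ".join(alt) if alt else "<empty>")
```

The rewritten rules accept the same strings, but the resulting parse trees lean right instead of left, so a practical tool has to do additional work to preserve the intended associativity and precedence.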

Supplementary Material

TXT File (getting-started.txt)
Getting Started
TXT File (step-by-step.txt)
Step-by-Step Instructions
GZ File (allstar-paper-49.tar.gz)
All files

Published In

OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications
October 2014
946 pages
ISBN:9781450325851
DOI:10.1145/2660193

ACM SIGPLAN Notices, Volume 49, Issue 10
OOPSLA '14
October 2014
907 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2714064
Editor: Andy Gill

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ALL(*)
  2. augmented transition networks
  3. DFA
  4. GLL
  5. GLR
  6. grammar
  7. LL(*)
  8. nondeterministic parsing
  9. PEG

Qualifiers

  • Research-article

Conference

SPLASH '14

Acceptance Rates

OOPSLA '14 Paper Acceptance Rate 52 of 186 submissions, 28%;
Overall Acceptance Rate 268 of 1,244 submissions, 22%

Article Metrics

  • Downloads (Last 12 months): 146
  • Downloads (Last 6 weeks): 5
Reflects downloads up to 17 Feb 2025

Cited By

  • (2025) SNAILS: Schema Naming Assessments for Improved LLM-Based SQL Inference. Proceedings of the ACM on Management of Data 3:1 (1-26). DOI:10.1145/3709727. Online publication date: 11-Feb-2025
  • (2024) Paguroidea: Fused Parser Generator with Transparent Semantic Actions. Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction (39-48). DOI:10.1145/3640537.3641563. Online publication date: 17-Feb-2024
  • (2024) Fast Deterministic Black-box Context-free Grammar Inference. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-12). DOI:10.1145/3597503.3639214. Online publication date: 20-May-2024
  • (2024) A Novel Weight-Driven ATN-Based SQL Sentence Generator to Accommodate AI-Based Reinforcement Learning. 2024 International Conference on Computer, Information and Telecommunication Systems (CITS) (1-8). DOI:10.1109/CITS61189.2024.10608024. Online publication date: 17-Jul-2024
  • (2023) Ordered Context-Free Grammars Revisited. Electronic Proceedings in Theoretical Computer Science 388 (140-153). DOI:10.4204/EPTCS.388.13. Online publication date: 15-Sep-2023
  • (2023) flap: A Deterministic Parser with Fused Lexing. Proceedings of the ACM on Programming Languages 7:PLDI (1194-1217). DOI:10.1145/3591269. Online publication date: 6-Jun-2023
  • (2023) Interval Parsing Grammars for File Format Parsing. Proceedings of the ACM on Programming Languages 7:PLDI (1073-1095). DOI:10.1145/3591264. Online publication date: 6-Jun-2023
  • (2023) Statically Resolvable Ambiguity. Proceedings of the ACM on Programming Languages 7:POPL (1686-1712). DOI:10.1145/3571251. Online publication date: 11-Jan-2023
  • (2023) Type-based Termination Analysis for Parsing Expression Grammars. Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing (1372-1379). DOI:10.1145/3555776.3577620. Online publication date: 27-Mar-2023
  • (2023) Enabling Generative AI to Produce SQL Statements: A Framework for the Auto-Generation of Knowledge Based on EBNF Context-Free Grammars. IEEE Access 11 (123543-123564). DOI:10.1109/ACCESS.2023.3329071. Online publication date: 2023
