On the look-ahead problem in lexical analysis

Acta Informatica

Abstract

Modern programming languages use regular expressions to define valid tokens. Traditional lexical analyzers based on minimal deterministic finite automata for these regular expressions cannot handle the look-ahead problem on their own: the scanner writer must explicitly identify the look-ahead states and hand-code the buffering and re-scanning operations. We identify the class of finite look-ahead finite automata, which is general enough to include the finite automata of all practical lexical analyzers. Finite look-ahead finite automata are then transformed into suffix finite automata, which a new lexical analyzer uses to identify tokens. The new lexical analyzer solves the look-ahead problem with a table-driven approach and can detect lexical errors earlier than traditional lexical analyzers. The extra cost of the new lexical analyzer is a larger state transition table and three additional one-dimensional tables. Incremental lexical analysis is also discussed.
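To make the look-ahead problem concrete, the following is a minimal sketch of the traditional approach the abstract describes, not the suffix-automaton method of the paper. It assumes a hypothetical three-token language (INT, REAL, and the range operator `..`) and a small hand-built DFA; the names `step`, `scan`, and `ACCEPT` are illustrative only. The scanner remembers the last accepting position and, when the DFA gets stuck, backs up to it and re-scans the characters it read past that point.

```python
# Hypothetical token set (an assumption for illustration):
#   INT = [0-9]+    REAL = [0-9]+ "." [0-9]+    DOTDOT = ".."
ACCEPT = {1: "INT", 3: "REAL", 5: "DOTDOT"}

# Transition table of a small hand-built DFA; state 0 is the start state.
TABLE = {
    (0, "digit"): 1, (0, "."): 4,
    (1, "digit"): 1, (1, "."): 2,   # state 2: seen "digits." (not accepting)
    (2, "digit"): 3,
    (3, "digit"): 3,
    (4, "."): 5,
}

def step(state, ch):
    """Return the next DFA state, or None when no transition exists."""
    kind = "digit" if ch.isdigit() else ch
    return TABLE.get((state, kind))

def scan(text):
    """Longest-match scanning with hand-coded backup to the last accepting state.

    Returns the token list and the number of characters that were read
    past a token boundary and therefore had to be scanned again.
    """
    tokens, rescanned, pos = [], 0, 0
    while pos < len(text):
        state, i = 0, pos
        last_accept = None              # (end position, token kind)
        while i < len(text):
            nxt = step(state, text[i])
            if nxt is None:
                break
            state = nxt
            i += 1
            if state in ACCEPT:
                last_accept = (i, ACCEPT[state])
        if last_accept is None:
            raise ValueError(f"lexical error at offset {pos}")
        end, kind = last_accept
        rescanned += i - end            # characters consumed beyond the token
        tokens.append((kind, text[pos:end]))
        pos = end                       # back up: the next token starts here
    return tokens, rescanned

print(scan("1..10"))
```

On the input `1..10` the scanner reads `1.` hoping for a REAL, gets stuck on the second dot, and must return to the position just after `1`; that one character of look-ahead is the buffering and re-scanning the abstract refers to.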

Additional information

This work was supported in part by National Science Council, Taiwan, R.O.C. under grants NSC 83-0111-S-009-001-CL and NSC 84-2213-E-009-043

About this article

Cite this article

Yang, W. On the look-ahead problem in lexical analysis. Acta Informatica 32, 459–476 (1995). https://doi.org/10.1007/BF01213079

  • DOI: https://doi.org/10.1007/BF01213079
