Abstract
Finite automata and their application in lexical analysis play an important role in many parts of computer science, particularly in compiler construction. We measured 12 scanners using different implementation strategies and found that execution time differed by a factor of 74. Our analysis of the algorithms, together with run-time statistics on cache misses and instruction frequency, reveals substantive differences in code locality and certain kinds of overhead typical of specific implementation strategies. Some of the traditional statements on writing “fast” scanners could not be confirmed. Finally, we suggest an improved scanner generator.
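The abstract does not name the implementation strategies that were compared. As a rough illustration only, the sketch below contrasts two strategies commonly discussed for scanners: a table-driven automaton and a direct-coded one. The token classes, the transition table, and the names `char_class`, `scan_table`, and `scan_direct` are hypothetical and are not taken from the paper; the sketch says nothing about the measured performance of either style.

```python
# Illustrative sketch, not the paper's code: two ways to implement the same
# tiny automaton that recognizes identifiers and unsigned integers.

def char_class(c):
    """Map a character to a column index (hypothetical classes)."""
    if c.isalpha() or c == "_":
        return 0  # letter
    if c.isdigit():
        return 1  # digit
    return 2      # anything else

# Table-driven strategy: the automaton lives in data.
START, IDENT, NUMBER = 0, 1, 2
TABLE = [
    # letter  digit   other      (-1 means "no transition")
    [IDENT,  NUMBER, -1],      # START
    [IDENT,  IDENT,  -1],      # IDENT
    [-1,     NUMBER, -1],      # NUMBER
]
KIND = {IDENT: "ident", NUMBER: "number"}

def scan_table(text, pos=0):
    """Return (kind, lexeme) for the longest token starting at pos."""
    state, i = START, pos
    while i < len(text):
        nxt = TABLE[state][char_class(text[i])]
        if nxt < 0:
            break
        state, i = nxt, i + 1
    return KIND.get(state), text[pos:i]

# Direct-coded strategy: each state becomes a piece of control flow.
def scan_direct(text, pos=0):
    """Same automaton as scan_table, with the states encoded as code."""
    i = pos
    if i < len(text) and char_class(text[i]) == 0:
        i += 1
        while i < len(text) and char_class(text[i]) != 2:
            i += 1
        return "ident", text[pos:i]
    if i < len(text) and char_class(text[i]) == 1:
        i += 1
        while i < len(text) and char_class(text[i]) == 1:
            i += 1
        return "number", text[pos:i]
    return None, ""
```

Both functions accept the same inputs and return the same results; they differ only in whether the transitions are stored as data or expressed as branches and loops, which is the kind of difference that can affect code locality and per-character overhead.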
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brouwer, K., Gellerich, W., Ploedereder, E. (1998). Myths and facts about the efficient implementation of finite automata and lexical analysis. In: Koskimies, K. (eds) Compiler Construction. CC 1998. Lecture Notes in Computer Science, vol 1383. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026419
DOI: https://doi.org/10.1007/BFb0026419
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64304-3
Online ISBN: 978-3-540-69724-4
eBook Packages: Springer Book Archive