Abstract
The notion of a Boyer-Moore automaton was introduced by Knuth, Morris, and Pratt in their historical paper on fast pattern matching. It leads to an algorithm that requires more preprocessing but is more efficient than the original Boyer-Moore algorithm. We formalize the notion of a Boyer-Moore automaton and give an efficient algorithm for building it. We also present bounds on the number of states, and introduce the concept of the potential of a transition to improve the worst- and average-case behavior of these machines. We show that looking at the rightmost unknown character, as suggested by Knuth et al., is not necessarily optimal.
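To make the notion concrete, the following is a minimal Python sketch of a search driven by such an automaton, not the building algorithm of the paper. It assumes that a state is the set of window positions already known to match the pattern, that the machine always inspects the rightmost unknown position (the strategy attributed above to Knuth et al.), and that transitions are computed lazily and memoized rather than built in advance. The names `bm_automaton_search` and `step` are hypothetical.

```python
from functools import lru_cache


def bm_automaton_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    @lru_cache(maxsize=None)
    def step(state, c):
        # state: frozenset of window positions known to match the pattern.
        # c: text character read at the rightmost unknown position r.
        r = max(j for j in range(m) if j not in state)
        known = {j: pattern[j] for j in state}  # text characters seen so far
        known[r] = c
        matched = c == pattern[r]
        if matched and len(known) < m:
            # Partial match: stay in place and remember one more position.
            return 0, frozenset(known), False
        # Mismatch or full match: take the smallest shift s such that the
        # shifted pattern agrees with every remembered text character.
        # s = m always qualifies, so the loop always breaks.
        for s in range(1, m + 1):
            if all(pattern[j - s] == ch for j, ch in known.items() if j >= s):
                break
        # Remembered characters that stay inside the window now match it.
        return s, frozenset(j - s for j in known if j >= s), matched

    hits = []
    i, state = 0, frozenset()
    while i + m <= len(text):
        r = max(j for j in range(m) if j not in state)
        shift, state, matched = step(state, text[i + r])
        if matched:
            hits.append(i)
        i += shift
    return hits
```

Because a transition depends only on the current state and the character read, the memoized `step` table plays the role of the automaton's transition function. For example, `bm_automaton_search("abababa", "aba")` returns `[0, 2, 4]`, and no text character is inspected twice.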
References
A. V. Aho. Algorithms for finding patterns in strings. In Jan van Leeuwen, editor, Handbook of Theoretical Computer Science, volume A, pages 255–300. Elsevier, Amsterdam, 1990.
A. Apostolico and R. Giancarlo. The Boyer-Moore-Galil string searching strategies revisited. SIAM J. Comput., 15:98–105, 1986.
V. Bruyère. Thèse annexe, automates de Boyer-Moore. Technical Report, Institut de Mathématique et d'Informatique, Université de Mons-Hainaut, 1991.
R. Boyer and S. Moore. A fast string searching algorithm. Comm. ACM, 20:762–772, 1977.
R. A. Baeza-Yates. Efficient Text Searching. Ph.D. thesis, Dept. of Computer Science, University of Waterloo, May 1989. Also as Research Report CS-89-17.
R. A. Baeza-Yates. String searching algorithms revisited. In F. Dehne, J.-R. Sack, and N. Santoro, editors, Proceedings of the Workshop in Algorithms and Data Structures, pages 75–96, Ottawa, Canada, August 1989. Lecture Notes in Computer Science, Vol. 382. Springer-Verlag, Berlin, 1989.
R. Baeza-Yates, G. Gonnet, and M. Régnier. Analysis of Boyer-Moore-type string searching algorithms. Proceedings of the 1st ACM-SIAM Symposium on Discrete Algorithms, pages 328–343, San Francisco, January 1990.
R. Baeza-Yates and M. Régnier. Average running time of the Boyer-Moore-Horspool algorithm. Theoret. Comput. Sci., 92(1):19–31, 1992.
C. Choffrut. An optimal algorithm for building the Boyer-Moore automaton. Bull. EATCS, 40:217–224, 1990.
Z. Galil. On improving the worst case running time of the Boyer-Moore string matching algorithm. Comm. ACM, 22:505–508, 1979.
Z. Galil. Open problems in stringology. In A. Apostolico and Z. Galil, editors, Combinatorial Algorithms on Words. NATO ASI Series, volume F12, pages 1–8. Springer-Verlag, Berlin, 1985.
L. Guibas and A. Odlyzko. A new proof of the linearity of the Boyer-Moore string searching algorithm. SIAM J. Comput., 9:672–682, 1980.
R. N. Horspool. Practical fast searching in strings. Software—Practice and Experience, 10:501–506, 1980.
D. E. Knuth, J. Morris, and V. Pratt. Fast pattern matching in strings. SIAM J. Comput., 6:323–350, 1977.
J. G. Kemeny and J. L. Snell. Finite Markov Chains. Springer-Verlag, New York, 1983.
R. Rivest. On the worst-case behavior of string-searching algorithms. SIAM J. Comput., 6:669–674, 1977.
R. Scheihing. Personal communication, 1992.
A. C. Yao. The complexity of pattern matching for a random string. SIAM J. Comput., 8:368–387, 1979.
Additional information
Communicated by Alberto Apostolico.
R. A. Baeza-Yates gratefully acknowledges the support of Grant C-11001 from Fundación Andes, and C. Choffrut gratefully acknowledges the support of the PRC Mathématiques et Informatique.
About this article
Cite this article
Baeza-Yates, R.A., Choffrut, C. & Gonnet, G.H. On Boyer-Moore automata. Algorithmica 12, 268–292 (1994). https://doi.org/10.1007/BF01185428
DOI: https://doi.org/10.1007/BF01185428