Speeding up two string-matching algorithms

Crochemore, Maxime; Lecroq, Thierry; Czumaj, Artur; Gasieniec, Leszek; Jarominek, Stefan; Plandowski, Wojciech; Rytter, Wojciech

doi:10.1007/3-540-55210-3_215

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 577)

Included in the following conference series:

Annual Symposium on Theoretical Aspects of Computer Science

Abstract

We show how to speed up two string-matching algorithms: the Boyer-Moore algorithm (the BM algorithm) and a variant of it called here the reversed-factor algorithm (the RF algorithm). The RF algorithm is based on factor graphs for the reverse of the pattern. The main feature of both algorithms is that they scan the text right-to-left from the supposed right position of the pattern. The BM algorithm goes as far as the scanned segment is a suffix of the pattern, while the RF algorithm keeps scanning as long as the segment is a factor of the pattern. Both then shift the pattern, forget the history, and start again. The RF algorithm usually makes bigger shifts than BM, but is quadratic in the worst case. We show that it is enough to remember the last matched segment to speed up the RF algorithm considerably (it then makes a linear number of comparisons with a small coefficient) and to speed up the BM algorithm with match-shifts (it then makes at most 2n comparisons). Only constant additional memory is needed for the search phase. We give alternative versions of an accelerated RF algorithm: the first is based on combinatorial properties of primitive words, and the two others make extensive use of suffix trees.
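The right-to-left scan and shift described in the abstract can be illustrated with a minimal Python sketch of the basic (unaccelerated) reversed-factor idea. It is not the authors' implementation: the function name rf_search is hypothetical, and the factor test uses Python's substring operator in place of the factor graph of the reversed pattern, so the sketch shows only the scanning and shifting logic and carries none of the paper's complexity guarantees. The "remember the last matched segment" acceleration is not included.

```python
def rf_search(pattern: str, text: str) -> list[int]:
    """Return all start positions of `pattern` in `text` (illustrative sketch)."""
    m, n = len(pattern), len(text)
    matches: list[int] = []
    if m == 0 or m > n:
        return matches

    pos = 0  # left end of the current window text[pos:pos + m]
    while pos <= n - m:
        j = m       # the scanned segment is text[pos + j:pos + m]
        shift = m   # default: no prefix of the pattern ends the window
        # Extend the scanned segment leftwards while it stays a factor
        # (substring) of the pattern. Assumption: a naive `in` test stands
        # in for the factor graph of the reversed pattern.
        while j > 0 and text[pos + j - 1:pos + m] in pattern:
            j -= 1
            if pattern.startswith(text[pos + j:pos + m]):
                if j == 0:
                    matches.append(pos)  # the whole window equals the pattern
                else:
                    shift = j            # longest prefix seen so far
        # Shift so the pattern start aligns with the longest proper prefix
        # found at the right end of the window, then forget the history.
        pos += shift
    return matches


print(rf_search("aba", "abababa"))
```

Because the scan stops as soon as the segment is no longer a factor of the pattern, no occurrence can start to the left of the stopping point inside the window, which is what makes the shift safe.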

Work by these authors is partially supported by PRC “Mathématiques-Informatique”.

Work by this author is partially supported by NATO Grant CRG 900293

Author information

Authors and Affiliations

LITP, Institut Blaise Pascal, Université Paris 7, 2 Place Jussieu, 75251, Paris Cedex 05, France
Maxime Crochemore & Thierry Lecroq
Institute of Informatics, Warsaw University, ul. Banacha 2, 00-913, Warsaw 59, Poland
Artur Czumaj, Leszek Gasieniec, Stefan Jarominek, Wojciech Plandowski & Wojciech Rytter

Editor information

Alain Finkel, Matthias Jantzen

Cite this paper

Crochemore, M. et al. (1992). Speeding up two string-matching algorithms. In: Finkel, A., Jantzen, M. (eds) STACS 92. STACS 1992. Lecture Notes in Computer Science, vol 577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-55210-3_215

DOI: https://doi.org/10.1007/3-540-55210-3_215
Published: 02 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55210-9
Online ISBN: 978-3-540-46775-5
eBook Packages: Springer Book Archive
