Abstract
Consider a text string of length n, a pattern string of length m and a match vector of length n which declares each location in the text to be either a mismatch (the pattern does not occur beginning at that location in the text) or a potential match (the pattern may occur beginning at that location in the text). Some of the potential matches could be false, i.e., the pattern may not occur beginning at some location in the text declared to be a potential match. We investigate the complexity of two problems in this context, namely, checking if there is any false match, and identifying all the false matches in the match vector.
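To make the two problems concrete, here is a minimal brute-force sketch in Python. It is only an illustration of the problem statement and is not the algorithm of the paper: it runs in O(nm) sequential time, and the function names `false_matches` and `has_false_match` are hypothetical.

```python
from typing import List


def false_matches(text: str, pattern: str, match: List[bool]) -> List[int]:
    """Return every position declared a potential match where the pattern
    does not actually occur (brute force, for illustration only)."""
    n, m = len(text), len(pattern)
    if len(match) != n:
        raise ValueError("match vector must have one entry per text position")
    # A slice near the end of the text is shorter than the pattern, so a
    # potential match declared there is reported as false as well.
    return [i for i in range(n) if match[i] and text[i:i + m] != pattern]


def has_false_match(text: str, pattern: str, match: List[bool]) -> bool:
    """Checking problem: is there at least one false match?"""
    return len(false_matches(text, pattern, match)) > 0


if __name__ == "__main__":
    text, pattern = "abababa", "aba"
    # Positions 0, 2 and 4 are true occurrences; position 3 is a false match.
    match = [True, False, True, True, True, False, False]
    print(has_false_match(text, pattern, match))
    print(false_matches(text, pattern, match))
```

The first function corresponds to identifying all the false matches, the second to checking whether any exists.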
We present an algorithm on the CRCW PRAM that checks if there exists any false match in O(1) time using O(n) processors. Since string matching takes Ω(log log m) time on the CRCW PRAM, checking for false matches is provably simpler than string matching. As an important application, we use this simple algorithm to convert the Karp-Rabin Monte Carlo type string matching algorithm into a Las Vegas type algorithm without asymptotic loss in complexity. We also present an efficient algorithm for identifying all the false matches and as a consequence, show that string matching algorithms take Ω(log log m) time even given the flexibility to output a few false matches.
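The Monte Carlo to Las Vegas conversion can be sketched sequentially as follows. This is a hedged illustration, not the construction of the paper: it uses a polynomial rolling fingerprint with a random base and a fixed modulus (a common variant, whereas Karp and Rabin choose a random prime), and it substitutes a brute-force check for the constant-time parallel checker, so it does not preserve the complexity claimed above. The names `monte_carlo_matches`, `las_vegas_matches` and `_PRIME` are assumptions made for this example.

```python
import random
from typing import List, Optional

_PRIME = (1 << 61) - 1  # fixed Mersenne prime modulus (assumption of this sketch)


def monte_carlo_matches(text: str, pattern: str, base: int) -> List[bool]:
    """Fingerprint matching: never misses an occurrence, but may
    declare a false match when two fingerprints collide."""
    n, m = len(text), len(pattern)
    match = [False] * n
    if m == 0 or m > n:
        return match
    high = pow(base, m - 1, _PRIME)  # weight of the window's leading character
    p_hash = w_hash = 0
    for k in range(m):
        p_hash = (p_hash * base + ord(pattern[k])) % _PRIME
        w_hash = (w_hash * base + ord(text[k])) % _PRIME
    for i in range(n - m + 1):
        if w_hash == p_hash:
            match[i] = True
        if i + m < n:
            # Slide the window: drop text[i], append text[i + m].
            w_hash = ((w_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % _PRIME
    return match


def las_vegas_matches(text: str, pattern: str,
                      rng: Optional[random.Random] = None) -> List[bool]:
    """Rerun the Monte Carlo matcher with fresh randomness until the
    checker finds no false match; the returned vector is always exact."""
    rng = rng or random.Random()
    m = len(pattern)
    while True:
        base = rng.randrange(256, _PRIME)
        match = monte_carlo_matches(text, pattern, base)
        # Brute-force stand-in for the false-match checker.
        if all(text[i:i + m] == pattern
               for i, flag in enumerate(match) if flag):
            return match
```

Because equal strings always have equal fingerprints, the Monte Carlo step can only err by declaring false matches, which is why a checker for false matches is enough to make the output exact.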
In addition, we give a sequential algorithm for checking using three heads on a 2-way deterministic finite state automaton (DFA) in linear time and another on a 1-way DFA with a fixed number of heads.
This research was supported in part by NSF/DARPA under grant number CCR-89-06949 and by NSF under grant number CCR-91-03953.
The author sincerely thanks Ravi Boppana, Richard Cole, Babu Narayanan and Krishna Palem for very helpful discussions.
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Muthukrishnan, S. (1993). Detecting false matches in string matching algorithms. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1993. Lecture Notes in Computer Science, vol 684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0029804
DOI: https://doi.org/10.1007/BFb0029804
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56764-6
Online ISBN: 978-3-540-47732-7
eBook Packages: Springer Book Archive