Abstract
The best known algorithms for a number of string matching problems rely on algebraic convolutions, an approach pioneered by Fischer and Paterson [FP74]. Examples include classical string matching with wild cards and the k-mismatches problem. Muthukrishnan and Palem [MP94] studied generalizations of these problems, which they called non-standard stringology, and derived upper and lower bounds for non-standard string matching problems.
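As an illustration of the convolution approach mentioned above, the following minimal sketch counts mismatches at every alignment of a pattern in a text, treating a wild card as matching anything. It is not taken from the paper: the names `correlate` and `mismatch_counts` and the `?` wild card symbol are assumptions made for this example, and the correlation is computed naively for clarity. In the convolution-based algorithms each per-symbol correlation would instead be computed with a fast (FFT-based) convolution.

```python
def correlate(a, b):
    """Return c[i] = sum_j a[i + j] * b[j] for each alignment i of b inside a.

    Computed naively here; a fast convolution (e.g. via FFT) would
    replace this step in a convolution-based algorithm.
    """
    n, m = len(a), len(b)
    return [sum(a[i + j] * b[j] for j in range(m)) for i in range(n - m + 1)]


def mismatch_counts(text, pattern, wildcard="?"):
    """Number of mismatching positions at each alignment of pattern in text.

    The wildcard symbol (an assumption of this sketch) matches any symbol.
    """
    n, m = len(text), len(pattern)
    if m > n:
        return []
    alphabet = (set(text) | set(pattern)) - {wildcard}

    # Positions that can mismatch at all: both symbols are not wild cards.
    t_solid = [0 if c == wildcard else 1 for c in text]
    p_solid = [0 if c == wildcard else 1 for c in pattern]
    comparable = correlate(t_solid, p_solid)

    # One correlation of indicator vectors per symbol counts agreements.
    agree = [0] * (n - m + 1)
    for s in alphabet:
        t_ind = [1 if c == s else 0 for c in text]
        p_ind = [1 if c == s else 0 for c in pattern]
        for i, v in enumerate(correlate(t_ind, p_ind)):
            agree[i] += v

    return [c - a for c, a in zip(comparable, agree)]


# Alignments with zero mismatches are wild card matches; alignments
# with at most k mismatches answer the k-mismatches question.
print(mismatch_counts("abcabd", "ab?"))  # [0, 2, 2, 0]
```

The sketch uses one correlation per alphabet symbol, so its cost grows with the alphabet size; it is meant only to show how mismatch counting reduces to correlations of 0/1 vectors.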
In this paper, we pose several novel problems in the area of non-standard stringology. Some we have been able to resolve here; others we leave open. Among the technical results in this paper are:
1. improved bounds for string matching when a symbol in the string matches at most d others (motivated by noisy string matching),
2. first-known bounds for approximately counting mismatches in noisy string matching as above, and
3. improved bounds for the k-witnesses problem and its applications.
Our results are obtained using probabilistic proof techniques and randomized algorithmic methods; although standard, these techniques have seldom been used in combinatorial pattern matching.
Supported by DIMACS (Center for Discrete Mathematics and Theoretical Computer Science), a National Science Foundation Science and Technology Center under NSF contract STC-8809648.
References
A. Aho. Algorithms for finding patterns in strings. Handbook of theoretical computer science, Vol 1, Van Leeuwen Ed., 1989.
K. Abrahamson. Generalized string matching. SIAM J. Comp., 1987, 1039–1051.
A. Aho and M. Corasick. Efficient string searching: An aid to bibliographic search. Comm. of the ACM, 18(6), 1975, 333–340.
A. Amir and M. Farach. Efficient 2-dimensional Approximate Matching of Non-rectangular Figures. Proc of 2nd Ann ACM Symp on Discrete Algorithms, 1991, 212–222.
N. Alon, Z. Galil, O. Margalit and M. Naor. Witnesses for boolean matrix multiplication and for shortest paths. Proc. 33rd Ann. IEEE Symp. Foundations of CS, 1992, 417–426.
A. Aho, J. Hopcroft, and J. Ullman. The design and analysis of computer algorithms. Addison-Wesley Publishers, 1974.
A. Amir and G. Landau. Fast serial and parallel multidimensional approximate array matching. Theoretical Computer Science, 81, 1991, 97–115.
N. Alon and J. Spencer. The probabilistic method. Wiley, 1993.
R. Baeza-Yates and G. Gonnet. A new approach to text searching. Proc. ACM SIGIR, Cambridge, Mass., 12:168–175, 1989.
S. Cook. Linear time simulation of deterministic two-way pushdown automata. Proc IFIP Congress, 1971.
M. Dayhoff, R. Schwartz and B. Orcutt. A model for evolutionary change in proteins, in Dayhoff, ed., Atlas of Protein Sequence and Structure, 5, 1979, 345–352.
M. Fischer and M. Paterson. String Matching and other Products. SIAM-AMS Proceedings, Vol. 7, 113–125, 1974.
Z. Galil. Some open problems in the theory of computation as questions about two-way deterministic pushdown automaton languages. Mathematical Systems Theory, 1979, 211–228.
Z. Galil. Open problems in stringology. Combinatorial Algorithms on Words, A. Apostolico and Z. Galil Eds, Springer-Verlag Lecture Notes, 1985, 1–8.
Z. Galil and R. Giancarlo. Data structures and algorithms for approximate string matching. Journal of Complexity, 4(1988), 33–72.
J. Hopcroft and J. Ullman. Introduction to Automata Theory, Languages and Computation, Addison-Wesley, Reading, Mass., 1979.
H. Karloff. Fast algorithms for approximately counting mismatches. Manuscript, 1993.
S.R. Kosaraju. Efficient string searching. Manuscript, 1987.
S.R. Kosaraju. Efficient tree pattern matching. Proc IEEE Ann. Symp. on FOCS, 1989, 178–183.
D. E. Knuth, J. H. Morris, and V. R. Pratt. Fast pattern matching in strings. SIAM J. Computing, 6:323–350, 1977.
M. Karchmer, I. Newman, M. Saks and A. Wigderson. Non-deterministic communication complexity with few witnesses. Manuscript, 1992.
R. Karp and M.O. Rabin. Efficient randomized pattern matching algorithms. IBM Journal of Research and Development, 31(2), 1987, 249–260.
L. Lovász. Communication complexity: a survey. Paths, Flows and VLSI Layout, Korte, Lovász, Prömel, Schrijver Eds., Springer-Verlag, 1990, 235–266.
G.M. Landau and U. Vishkin. Fast parallel and serial approximate string matching. Journal of Algorithms, 10(2), 1989, 262–272.
R. Lowrance and R. Wagner. An extension of the string-to-string correction problem. Journal of Association of Computing Machinery, 22, 1975, 177–183.
S. Muthukrishnan and K. Palem. Non-standard stringology: algorithms and complexity. Proc. 26th Annual ACM Symp. on the Theory of Computing, 1994, 770–779.
S. Muthukrishnan and H. Ramesh. String matching under general match relation. Proc 12th FST & TCS, India, LNCS, Springer-Verlag, Vol. 652, 1992, 356–367.
W. Masek and M. Paterson. A faster algorithm for computing string-edit distances. Journal of Computer and System Sciences, 20(1), 1980, 18–31.
V. Pan. Personal Communication, 1994.
R.Y. Pinter. Efficient string matching with don't-care in patterns. Combinatorial Algorithms on Words, NATO-ASI series, pp. 11–29, 1985. Editors: A. Apostolico and Z. Galil.
R. Seidel. On the all-pairs-shortest-path problems. Proc. 24th Ann. ACM Symp. Theory of Computing, 1992, 745–749.
E. Ukkonen. Finding approximate patterns in strings. Journal of Algorithms, Vol.6, 1985, 132–137.
S. Wu and U. Manber. Fast text searching allowing errors. Communications of ACM, 35, 1992, 83–91.
© 1995 Springer-Verlag Berlin Heidelberg
Muthukrishnan, S. (1995). New results and open problems related to non-standard stringology. In: Galil, Z., Ukkonen, E. (eds) Combinatorial Pattern Matching. CPM 1995. Lecture Notes in Computer Science, vol 937. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60044-2_50
DOI: https://doi.org/10.1007/3-540-60044-2_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60044-2
Online ISBN: 978-3-540-49412-6