Abstract
We present new algorithms for the pattern matching problem in which the pattern may contain variable length gaps. Given a pattern P with variable length gaps and a text T, our first algorithm runs in \(O(n + m + \alpha \log(\max_{1 \le i \le l}(b_i - a_i)))\) time, where n is the length of the text, m is the sum of the lengths of the component subpatterns, α is the total number of occurrences of the component subpatterns in the text, and \(a_i\) and \(b_i\) are, respectively, the minimum and maximum number of don’t cares allowed between the ith and (i+1)st components of the pattern. We also present a second algorithm which, given a suffix array of the text, reports whether P occurs in T in \(O(m + \alpha \log\log n)\) time. Both algorithms record the information needed to report all occurrences of P in T. Furthermore, the techniques used in our algorithms are shown to be useful in many other contexts.
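To make the problem statement concrete, the following is a minimal baseline sketch in Python. It is not the algorithm of the paper and does not achieve the stated time bounds. It assumes the pattern is given as a list of non-empty component strings plus a list of `(a_i, b_i)` gap bounds, and the names `find_all` and `match_with_gaps` are hypothetical helpers introduced only for illustration. It locates every occurrence of each component, then keeps an occurrence of component i+1 only if some surviving occurrence of component i ends between `a_i` and `b_i` characters before it.

```python
from bisect import bisect_left


def find_all(text, sub):
    """Return the sorted start positions of all (possibly overlapping) occurrences of sub."""
    starts = []
    i = text.find(sub)
    while i != -1:
        starts.append(i)
        i = text.find(sub, i + 1)
    return starts


def match_with_gaps(text, components, gaps):
    """Return the sorted end positions (exclusive) at which the gapped pattern can end.

    components: non-empty strings P_1, ..., P_k (illustrative input format).
    gaps: list of (a_i, b_i) pairs, one per adjacent pair of components,
          giving the minimum and maximum number of don't cares between them.
    """
    if len(gaps) != len(components) - 1:
        raise ValueError("need exactly one gap per adjacent pair of components")

    # Start positions of the current component that extend some partial match.
    feasible = find_all(text, components[0])

    for k in range(1, len(components)):
        a, b = gaps[k - 1]
        prev_len = len(components[k - 1])
        # End positions (exclusive) of surviving occurrences; still sorted.
        prev_ends = [s + prev_len for s in feasible]

        survivors = []
        for s in find_all(text, components[k]):
            # A previous component must end at some e with s - b <= e <= s - a.
            j = bisect_left(prev_ends, s - b)
            if j < len(prev_ends) and prev_ends[j] <= s - a:
                survivors.append(s)
        feasible = survivors

    last_len = len(components[-1])
    return [s + last_len for s in feasible]


if __name__ == "__main__":
    # "ab", then between 1 and 5 don't cares, then "cd".
    print(match_with_gaps("abxxcdxcd", ["ab", "cd"], [(1, 5)]))
```

Each binary search here ranges over the occurrence list of the previous component, so this baseline costs more than the bounds quoted in the abstract; it is meant only to pin down what counts as an occurrence.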
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rahman, M.S., Iliopoulos, C.S., Lee, I., Mohamed, M., Smyth, W.F. (2006). Finding Patterns with Variable Length Gaps or Don’t Cares. In: Chen, D.Z., Lee, D.T. (eds) Computing and Combinatorics. COCOON 2006. Lecture Notes in Computer Science, vol 4112. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11809678_17
DOI: https://doi.org/10.1007/11809678_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36925-7
Online ISBN: 978-3-540-36926-4
eBook Packages: Computer Science (R0)