Abstract
We develop efficient dynamic programming algorithms for pattern matching with general gaps and character classes. We consider patterns of the form \(p_0\, g(a_0,b_0)\, p_1\, g(a_1,b_1) \ldots p_{m-1}\), where each \(p_i \subset \Sigma\) for some finite alphabet \(\Sigma\), and \(g(a_i,b_i)\) denotes a gap of length \(a_i \ldots b_i\) between symbols \(p_i\) and \(p_{i+1}\). The text symbol \(t_j\) matches \(p_i\) iff \(t_j \in p_i\). Moreover, we require that if \(p_i\) matches \(t_j\), then \(p_{i+1}\) should match one of the text symbols \(t_{j+a_{i}+1} \ldots t_{j+b_i+1}\). Either or both of \(a_i\) and \(b_i\) can be negative. We give algorithms that have efficient average and worst case running times. The algorithms have important applications in music information retrieval and computational biology. We give experimental results showing that the algorithms work well in practice.
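To make the matching condition concrete, the following is a minimal Python sketch of a naive dynamic programming scan that follows the definition literally. It is not the algorithm developed in the paper, and it makes no claim to the running times stated above. The function name `match_end_positions` and the representation of character classes as Python sets and gaps as `(a, b)` tuples are assumptions made for illustration only.

```python
def match_end_positions(text, classes, gaps):
    """Return the sorted text positions at which the last class can match.

    classes: list of sets of symbols, one per pattern position p_i (assumed encoding).
    gaps:    list of (a_i, b_i) pairs, one between each pair of consecutive classes.
    """
    if len(gaps) != len(classes) - 1:
        raise ValueError("need exactly one gap between consecutive classes")
    n = len(text)
    # active[j] is True if the pattern prefix processed so far can end at position j.
    active = [ch in classes[0] for ch in text]
    for cls, (a, b) in zip(classes[1:], gaps):
        nxt = [False] * n
        for j, ok in enumerate(active):
            if not ok:
                continue
            # The next class must match one of t[j+a+1] .. t[j+b+1].
            # Negative a or b moves this window to the left of (or onto) j.
            lo = max(0, j + a + 1)
            hi = min(n - 1, j + b + 1)
            for k in range(lo, hi + 1):
                if text[k] in cls:
                    nxt[k] = True
        active = nxt
    return [j for j, ok in enumerate(active) if ok]


# Class {a}, then a gap of 0 to 1 symbols, then class {b, c}.
print(match_end_positions("abcab", [{"a"}, {"b", "c"}], [(0, 1)]))  # [1, 2, 4]
# A negative gap: the second class must match the symbol just before the first.
print(match_end_positions("ba", [{"a"}, {"b"}], [(-2, -2)]))  # [0]
```

This sketch spends time proportional to the gap width for every active position, so it only illustrates the semantics of gaps (including negative ones) and character classes.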
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fredriksson, K., Grabowski, S. (2006). Efficient Algorithms for Pattern Matching with General Gaps and Character Classes. In: Crestani, F., Ferragina, P., Sanderson, M. (eds) String Processing and Information Retrieval. SPIRE 2006. Lecture Notes in Computer Science, vol 4209. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880561_22
DOI: https://doi.org/10.1007/11880561_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45774-9
Online ISBN: 978-3-540-45775-6
eBook Packages: Computer Science, Computer Science (R0)