Abstract
We consider the following string matching problem. Pattern p 0 p 1 p 2 ... p m − − 1 (δ,α)-matches the text substring \(t_{i_0} t_{i_1} t_{i_2} \ldots t_{i_{m-1}}\), if \(|p_j-t_{i_j}|\leq\delta\) for j ∈ {0,..., m–1}, where 0 < i j + 1 – i j ≤ α + 1. The task is then to find all text positions i m − − 1 that (δ,α)-match the pattern. For a text of length n, the best previously known algorithms for this string matching problem run in time O(nm) and in time O(n⌈mα/w⌉), where the former is based on dynamic programming, and the latter on bit-parallelism with w bits in computer word (32 or 64 typically). We improve these to take O(nδ + ⌈n/w⌉m) and O(n ⌈m log(α)/w⌉), respectively, worst case time using bit-parallelism. On average the algorithms run in O(⌈n/w⌉⌈αδ/σ⌉ + n)and O(n) time. Our experimental results show that the algorithms work extremely well in practice. Our algorithms handle general gaps as well, having important applications in computational biology.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Baeza-Yates, R.A., Gonnet, G.H.: A new approach to text searching. Communications of the ACM 35(10), 74–82 (1992)
Cantone, D., Cristofaro, S., Faro, S.: An efficient algorithm for δ-approximate matching with α-bounded gaps in musical sequences. In: Nikoletseas, S.E. (ed.) WEA 2005. LNCS, vol. 3503, pp. 428–439. Springer, Heidelberg (2005)
Cantone, D., Cristofaro, S., Faro, S.: On tuning the (δ,α)-sequential-sampling algorithm for δ-approximate matching with α-bounded gaps in musical sequences. In: Proceedings of ISMIR 20005 (2005)
Crochemore, M., Iliopoulos, C., Makris, C., Rytter, W., Tsakalidis, A., Tsichlas, K.: Approximate string matching with gaps. Nordic Journal of Computing 9(1), 54–65 (2002)
Crochemore, M., Iliopoulos, C., Navarro, G., Pinzon, Y., Salinger, A.: Bit-parallel (δ,γ)-matching suffix automata. Journal of Discrete Algorithms (JDA) 3(2–4), 198–214 (2005)
Mäkinen, V.: Parameterized approximate string matching and local-similarity-based point-pattern matching. PhD thesis, Department of Computer Science, University of Helsinki (August 2003)
Mäkinen, V., Navarro, G., Ukkonen, E.: Transposition invariant string matching. Journal of Algorithms 56(2), 124–153 (2005)
Myers, E.W.: Approximate matching of network expression with spacers. Journal of Computational Biology 3(1), 33–51 (1996)
Navarro, G., Raffinot, M.: Fast and simple character classes and bounded gaps pattern matching, with applications to protein searching. Journal of Computational Biology 10(6), 903–923 (2003)
Pinzón, Y.J., Wang, S.: Simple algorithm for pattern-matching with bounded gaps in genomic sequences. In: Proceedings of ICNAAM 2005, pp. 827–831 (2005)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fredriksson, K., Grabowski, S. (2006). Efficient Bit-Parallel Algorithms for (δ,α)-Matching. In: Àlvarez, C., Serna, M. (eds) Experimental Algorithms. WEA 2006. Lecture Notes in Computer Science, vol 4007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11764298_15
Download citation
DOI: https://doi.org/10.1007/11764298_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34597-8
Online ISBN: 978-3-540-34598-5
eBook Packages: Computer ScienceComputer Science (R0)