Abstract
Finding the similarity between two sequences is a major problem in computer science. It is motivated by many issues in computational biology as well as in information retrieval and image processing. These fields must account for possible corruption of the data caused by genome rearrangements, typing mistakes, and more. Therefore, many applications do not require an exact match between the sequences, but rather an approximate matching. We consider mismatches and swaps as natural errors, of which a small number is allowed. The edit distance problem with swap and mismatch operations was discussed by Amir et al. [3], who solved it in \(O(n\sqrt{m}\log m)\) time. Since then, the problem of string matching with at most k swap and mismatch errors has remained open.
In this paper we present an algorithm that finds all locations where the pattern has at most k mismatch and swap errors in time \(O(n\sqrt{k\log m})\).
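To make the problem statement concrete, the following is a minimal brute-force sketch, not the algorithm of this paper. It assumes the usual swap model in which a swap exchanges two adjacent pattern characters, costs one operation, and each character takes part in at most one swap; a mismatch replaces a single character and also costs one. The function names are hypothetical. It runs in O(nm) time and is only meant as a reference for what "at most k mismatch and swap errors" means at each text location.

```python
def swap_mismatch_distance(pattern, window):
    """Minimum number of swaps and mismatches turning pattern into window.

    Assumes len(pattern) == len(window) and that each character
    participates in at most one swap (an assumption of this sketch).
    """
    prev2, prev1 = 0, 0  # costs for the prefixes of length i-2 and i-1
    for i in range(1, len(pattern) + 1):
        # Option 1: align position i-1 directly, paying 1 on a mismatch.
        cur = prev1 + (pattern[i - 1] != window[i - 1])
        # Option 2: swap pattern positions i-2 and i-1 (one operation).
        if (i >= 2
                and pattern[i - 1] != pattern[i - 2]
                and pattern[i - 1] == window[i - 2]
                and pattern[i - 2] == window[i - 1]):
            cur = min(cur, prev2 + 1)
        prev2, prev1 = prev1, cur
    return prev1


def naive_k_swap_mismatch(text, pattern, k):
    """Return every text location where the pattern has at most k errors."""
    n, m = len(text), len(pattern)
    return [i for i in range(n - m + 1)
            if swap_mismatch_distance(pattern, text[i:i + m]) <= k]
```

For example, with text "bacabd", pattern "abc" and k = 1, location 0 qualifies through a single swap ("abc" to "bac") and location 3 through a single mismatch ("abc" to "abd").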
References
Amir, A., Aumann, Y., Landau, G.M., Lewenstein, M., Lewenstein, N.: Pattern Matching with Swaps. In: Proc. 38th IEEE FOCS, pp. 144–153 (1997)
Amir, A., Cole, R., Hariharan, R., Lewenstein, M., Porat, E.: Overlap matching. Inf. Comput. 181(1), 57–74 (2003)
Amir, A., Eisenberg, E., Porat, E.: Swap and Mismatch Edit Distance. In: Albers, S., Radzik, T. (eds.) ESA 2004. LNCS, vol. 3221, pp. 16–27. Springer, Heidelberg (2004)
Amir, A., Lewenstein, M., Porat, E.: Approximate swapped matching. Inf. Process. Lett. 83(1), 33–39 (2002)
Amir, A., Lewenstein, M., Porat, E.: Faster algorithms for string matching with k mismatches. J. Algorithms 50(2), 257–275 (2004)
Cole, R., Hariharan, R.: Approximate string matching: A faster simpler algorithm. In: Proc. 9th ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 463–472 (1998)
Dombb, Y., Lipsky, O., Porat, B., Porat, E., Zur, A.: Approximate Swap and Mismatch Edit Distance. In: Proc. 14th String Processing and Information Retrieval Symposium (SPIRE) (to appear, 2007)
Karloff, H.: Fast algorithms for approximately counting mismatches. Information Processing Letters 48(2), 53–60 (1993)
Landau, G.M., Vishkin, U.: Efficient String Matching with k Mismatches. Theoretical Computer Science 43, 239–249 (1986)
Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions and reversals. Soviet Phys. Dokl. 10, 707–710 (1966)
Lowrance, R., Wagner, R.A.: An extension of the string-to-string correction problem. J. of the ACM, 177–183 (1975)
Wagner, R.A.: On the complexity of the extended string-to-string correction problem. In: Proc. 7th ACM STOC, pp. 218–223 (1975)
Yanai, I., DeLisi, C.: The society of genes: networks of functional links between genes from comparative genomics. Genome Biol. 3(64), 1–12 (2002)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lipsky, O., Porat, B., Porat, E., Shalom, B.R., Tzur, A. (2007). Approximate String Matching with Swap and Mismatch. In: Tokuyama, T. (eds) Algorithms and Computation. ISAAC 2007. Lecture Notes in Computer Science, vol 4835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77120-3_75
DOI: https://doi.org/10.1007/978-3-540-77120-3_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77118-0
Online ISBN: 978-3-540-77120-3
eBook Packages: Computer Science, Computer Science (R0)