Skip to main content

Approximate String Matching with Swap and Mismatch

  • Conference paper
Algorithms and Computation (ISAAC 2007)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4835))

Included in the following conference series:

Abstract

Finding the similarity between two sequences is a major problem in computer science. It is motivated by many issues from computational biology as well as from information retrieval and image processing. These fields take into account possible corruptions of the data caused by genome rearrangements, typing mistakes, and more. Therefore, many applications do not require merely complete resemblance of the sequences, but rather an approximated matching. We consider mismatches and swaps as natural mistakes which are allowed in a meagre number. The edit distance problem with swap and mismatch operations was discussed by Amir et. al. [3]. They solved the problem in \(O(n\sqrt{m}\log m)\) time. From then on the problem of string matching with at most k swaps and mismatches errors was open.

In this paper we present an algorithm that finds all locations where the pattern has at most k mismatch and swap errors in time \(O(n\sqrt{k\log m})\).

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Amir, A., Aumann, Y., Landau, G.M., Lewenstein, M., Lewenstein, N.: Pattern Matching with Swaps. In: Proc.38th IEEE FOCS, pp. 144–153 (1997)

    Google Scholar 

  2. Amir, A., Cole, R., Hariharan, R., Lewenstein, M., Porat, E.: Overlap matching. Inf. Comput. 181(1), 57–74 (2003)

    Article  MATH  MathSciNet  Google Scholar 

  3. Amir, A., Eisenberg, E., Porat, E.: Swap and Mismatch Edit Distance. In: Albers, S., Radzik, T. (eds.) ESA 2004. LNCS, vol. 3221, pp. 16–27. Springer, Heidelberg (2004)

    Google Scholar 

  4. Amir, A., Lewenstein, M., Porat, E.: Approximate swapped matching. Inf. Process. Lett. 83(1), 33–39 (2002)

    Article  MATH  MathSciNet  Google Scholar 

  5. Amir, A., Lewenstein, M., Porat, E.: Faster algorithms for string matching with k mismatches. J. Algorithms 50(2), 257–275 (2004)

    Article  MATH  MathSciNet  Google Scholar 

  6. Cole, R., Hariharan, R.: Approximate string matching: A faster simpler algorithm. In: Proc. 9th ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 463–472 (1998)

    Google Scholar 

  7. Dombb, Y., Lipsky, O., Porat, B., Porat, E., Zur, A.: Approximate Swap and Mismatch Edit Distance. In: the 14th String Processing and Information Retrievial Symposion (SPIRE) (to appear, 2007)

    Google Scholar 

  8. Karloff, H.: Fast algorithms for approximately counting mismatches. Information Processing Letters 48(2), 53–60 (1993)

    Article  MATH  MathSciNet  Google Scholar 

  9. Landau, G.M., Vishkin, U.: Efficient String Matching with k Mismatches. Theoretical Computer Science 43, 239–249 (1986)

    Article  MATH  MathSciNet  Google Scholar 

  10. Levenshtein, V.I.: Binary codes capable of correcting, deletions, insertions and reversals. Soviet Phys. Dokl. 10, 707–710 (1966)

    MathSciNet  Google Scholar 

  11. Lowrance, R., Wagner, R.A.: An extension of the string-to-string correction problem. J. of the ACM, 177–183 (1975)

    Google Scholar 

  12. Wagner, R.A.: On the complexity of the extended string-to-string correction problem. In: Proc. 7th ACM STOC, pp. 218–223 (1975)

    Google Scholar 

  13. Yanai, I., DeLisi, C.: The society of genes: networks of functionallinks between genes from comparative genomics. Genome Biol. 3(64), 1–12 (2002)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Takeshi Tokuyama

Rights and permissions

Reprints and permissions

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lipsky, O., Porat, B., Porat, E., Shalom, B.R., Tzur, A. (2007). Approximate String Matching with Swap and Mismatch. In: Tokuyama, T. (eds) Algorithms and Computation. ISAAC 2007. Lecture Notes in Computer Science, vol 4835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77120-3_75

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-77120-3_75

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77118-0

  • Online ISBN: 978-3-540-77120-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics