Bit-Parallel Approximate String Matching Algorithms with Transposition

  • Conference paper
String Processing and Information Retrieval (SPIRE 2003)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2857))

Abstract

Using bit-parallelism has resulted in fast and practical algorithms for approximate string matching under the Levenshtein edit distance, which permits a single edit operation to insert, delete or substitute a character. Depending on the parameters of the search, currently the fastest non-filtering algorithms in practice are the O(kn⌈m/w⌉) algorithm of Wu & Manber, the O(⌈km/w⌉n) algorithm of Baeza-Yates & Navarro, and the O(⌈m/w⌉n) algorithm of Myers, where m is the pattern length, n is the text length, k is the error threshold and w is the computer word size. In this paper we discuss a uniform way of modifying each of these algorithms to permit also a fourth type of edit operation: transposing two adjacent characters in the pattern. This type of edit distance is also known as the Damerau edit distance. In the end we also present an experimental comparison of the resulting algorithms.
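
The setting described in the abstract can be illustrated with a minimal Python sketch. It is not code from the paper, whose body is not reproduced on this page, and the function names `damerau_search_dp` and `damerau_search_bitparallel` are hypothetical. The sketch assumes the restricted form of the Damerau distance, in which a transposed pair of characters is not edited further. The first function is a plain O(mn) dynamic-programming search used as a reference. The second follows the general shape of a Myers-style bit-vector column update with one extra term for transpositions. Python integers stand in for machine words here, so the ⌈m/w⌉ blocking is not modeled.

```python
def damerau_search_dp(pattern, text, k):
    """Reference O(mn) search. Returns the 0-based end positions j such that
    some substring of text ending at j is within distance k of pattern."""
    m = len(pattern)
    prev2 = None                   # column for text position j-2
    prev = list(range(m + 1))      # column before any text: D[i][0] = i
    hits = []
    for j, c in enumerate(text):
        cur = [0] * (m + 1)        # D[0][j] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            cur[i] = min(prev[i - 1] + cost,   # match or substitution
                         prev[i] + 1,          # extra text character
                         cur[i - 1] + 1)       # missing pattern character
            if (i >= 2 and prev2 is not None
                    and pattern[i - 1] == text[j - 1]
                    and pattern[i - 2] == c):
                cur[i] = min(cur[i], prev2[i - 2] + 1)   # transposition
        if cur[m] <= k:
            hits.append(j)
        prev2, prev = prev, cur
    return hits


def damerau_search_bitparallel(pattern, text, k):
    """Bit-vector sketch (assumed formulation): one column update per text
    character, with the whole pattern held in a single Python integer."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    mask = (1 << m) - 1            # keeps values to m bits
    top = 1 << (m - 1)             # bit of the last pattern row
    pm = {}                        # pm[ch]: bit i set iff pattern[i] == ch
    for i, ch in enumerate(pattern):
        pm[ch] = pm.get(ch, 0) | (1 << i)
    vp, vn = mask, 0               # vertical +1 / -1 deltas of the column
    d0, pm_prev = 0, 0             # previous diagonal-zero vector and match vector
    score = m                      # current value of the last row, D[m][j]
    hits = []
    for j, ch in enumerate(text):
        x = pm.get(ch, 0)
        # Transposition term: row i-1 matches the current character, row i
        # matches the previous one, and the previous diagonal step cost 1.
        tr = ((~d0 & x) << 1) & pm_prev
        d0 = ((((x & vp) + vp) ^ vp) | x | vn | tr) & mask
        hp = (vn | ~(d0 | vp)) & mask          # horizontal +1 deltas
        hn = d0 & vp                           # horizontal -1 deltas
        if hp & top:
            score += 1
        elif hn & top:
            score -= 1
        hp_shift = (hp << 1) & mask            # row 0 has delta 0 when searching
        vp = ((hn << 1) | ~(d0 | hp_shift)) & mask
        vn = hp_shift & d0
        pm_prev = x
        if score <= k:
            hits.append(j)
    return hits
```

For example, with the pattern `"abcd"`, the text `"acbd"` and `k = 1`, both functions return `[3]`: the only occurrence ends at the final `d`, and it needs one transposition of `b` and `c`. Without transposition the same occurrence would cost two edits.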

References

  1. Baeza-Yates, R., Navarro, G.: Faster approximate string matching. Algorithmica 23(2), 127–158 (1999)

  2. Damerau, F.: A technique for computer detection and correction of spelling errors. Comm. of the ACM 7(3), 171–176 (1964)

  3. Du, M.W., Chang, S.C.: A model and a fast algorithm for multiple errors spelling correction. Acta Informatica 29, 281–302 (1992)

  4. Harman, D.: Overview of the Third Text REtrieval Conference. In: Proc. Third Text REtrieval Conference (TREC-3), pp. 1–19. NIST Special Publication 500-207 (1995)

  5. Kukich, K.: Techniques for automatically correcting words in text. ACM Computing Surveys 24(4), 377–439 (1992)

  6. Levenshtein, V.: Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady 10(8), 707–710 (1966); Original in Russian in Doklady Akademii Nauk SSSR 163(4), 845–848 (1965)

  7. Myers, G.: A fast bit-vector algorithm for approximate string matching based on dynamic programming. Journal of the ACM 46(3), 395–415 (1999)

  8. Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31–88 (2001)

  9. Navarro, G.: NR-grep: a fast and flexible pattern matching tool. Software Practice and Experience 31, 1265–1312 (2001)

  10. Navarro, G., Baeza-Yates, R.: Improving an algorithm for approximate pattern matching. Algorithmica 30(4), 473–502 (2001)

  11. Sellers, P.: The theory and computation of evolutionary distances: pattern recognition. J. of Algorithms 1, 359–373 (1980)

  12. Ukkonen, E.: Algorithms for approximate string matching. Information and Control 64, 100–118 (1985)

  13. Ukkonen, E.: Finding approximate patterns in strings. J. of Algorithms 6, 132–137 (1985)

  14. Wright, A.: Approximate string matching using within-word parallelism. Software Practice and Experience 24(4), 337–362 (1994)

  15. Wu, S., Manber, U.: Fast text searching allowing errors. Comm. of the ACM 35(10), 83–91 (1992)

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hyyrö, H. (2003). Bit-Parallel Approximate String Matching Algorithms with Transposition. In: Nascimento, M.A., de Moura, E.S., Oliveira, A.L. (eds) String Processing and Information Retrieval. SPIRE 2003. Lecture Notes in Computer Science, vol 2857. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39984-1_8

  • DOI: https://doi.org/10.1007/978-3-540-39984-1_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20177-9

  • Online ISBN: 978-3-540-39984-1

  • eBook Packages: Springer Book Archive
