Boyer-Moore approach to approximate string matching

  • Conference paper
SWAT 90 (SWAT 1990)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 447)

Abstract

The Boyer-Moore idea applied in exact string matching is generalized to approximate string matching. Two versions of the problem are considered. The k mismatches problem is to find all approximate occurrences of a pattern string (length m) in a text string (length n) with at most k mismatches. Our generalized Boyer-Moore algorithm solves the problem in expected time O(kn(1/(m − k) + k/c)), where c is the size of the alphabet. A related algorithm is developed for the k differences problem, where the task is to find all approximate occurrences of a pattern in a text with ≤ k differences (insertions, deletions, changes).
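
To make the k mismatches problem concrete, the following Python sketch may help. It is not taken from the paper: the function names, the table layout, and the shift rule are assumptions chosen for illustration. The first function checks the definition directly at every alignment. The second scans each alignment right to left and then skips ahead using the last k + 1 text characters of the current window, in the spirit of a Boyer-Moore-Horspool shift. The skip is safe because any alignment with at most k mismatches must match at least one of those k + 1 characters, provided all of them still lie under the pattern, which is why the shift is capped at m − k.

```python
def naive_k_mismatches(pattern, text, k):
    """Start indices where pattern matches text with at most k mismatches."""
    m, n = len(pattern), len(text)
    return [
        pos
        for pos in range(n - m + 1)
        if sum(p != t for p, t in zip(pattern, text[pos:pos + m])) <= k
    ]


def bm_style_k_mismatches(pattern, text, k):
    """Same result as the naive check, but with Boyer-Moore-style skipping.

    Illustrative sketch only; the preprocessing here is a simple O(k*m)
    table and is not claimed to be the paper's preprocessing.
    """
    m, n = len(pattern), len(text)
    if not 0 <= k < m:
        raise ValueError("this sketch assumes 0 <= k < len(pattern)")

    # Pattern indices aligned with the last k + 1 characters of a window.
    tail = range(m - k - 1, m)

    # shift_table[i][a] = smallest s >= 1 such that pattern[i - s] == a.
    shift_table = {i: {} for i in tail}
    for i in tail:
        for s in range(1, i + 1):
            shift_table[i].setdefault(pattern[i - s], s)

    hits = []
    pos = 0
    while pos + m <= n:
        # Compare right to left; give up once more than k mismatches are seen.
        mismatches = 0
        for i in range(m - 1, -1, -1):
            if pattern[i] != text[pos + i]:
                mismatches += 1
                if mismatches > k:
                    break
        if mismatches <= k:
            hits.append(pos)

        # Smallest shift that lets at least one tail character match,
        # capped at m - k so that no candidate alignment is skipped.
        shift = m - k
        for i in tail:
            shift = min(shift, shift_table[i].get(text[pos + i], m))
        pos += shift
    return hits
```

With k = 0 the tail shrinks to the last window character and the shift reduces to the familiar Horspool rule. Larger k shortens the maximum shift, which is consistent with the k-dependence of the expected-time bound quoted in the abstract.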

Extended Abstract

Editor information

John R. Gilbert, Rolf Karlsson

Copyright information

© 1990 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tarhio, J., Ukkonen, E. (1990). Boyer-Moore approach to approximate string matching. In: Gilbert, J.R., Karlsson, R. (eds) SWAT 90. SWAT 1990. Lecture Notes in Computer Science, vol 447. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-52846-6_103

  • DOI: https://doi.org/10.1007/3-540-52846-6_103

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-52846-3

  • Online ISBN: 978-3-540-47164-6

  • eBook Packages: Springer Book Archive
