
Classes of cost functions for string edit distance

Algorithmica

Abstract

Finding a sequence of edit operations that transforms one string of symbols into another with the minimum cost is a well-known problem. The minimum cost, or edit distance, is a widely used measure of the similarity of two strings. An important parameter of this problem is the cost function, which specifies the cost of each insertion, deletion, and substitution. We show that cost functions having the same ratio of the sum of the insertion and deletion costs divided by the substitution cost yield the same minimum cost sequences of edit operations. This leads to a partitioning of the universe of cost functions into equivalence classes. Also, we show the relationship between a particular set of cost functions and the longest common subsequence of the input strings.
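As an illustration of the two claims above, here is a minimal sketch, not the authors' own code. It assumes the standard dynamic-programming recurrence for edit distance with one constant cost per operation type. The names `edit_distance`, `lcs_length`, `ins_cost`, `del_cost`, and `sub_cost` are hypothetical and chosen for this example only. The sketch also relies on a well-known special case: with unit insertion and deletion costs and a substitution cost of 2, the distance between strings of lengths m and n equals m + n minus twice the length of their longest common subsequence. Whether this is exactly the "particular set of cost functions" the abstract refers to is an assumption.

```python
def edit_distance(a, b, ins_cost, del_cost, sub_cost):
    """Minimum cost of transforming string a into string b.

    Assumes constant, non-negative costs per operation type
    (a simplification made for this sketch).
    """
    m, n = len(a), len(b)
    # dist[i][j] = minimum cost of transforming a[:i] into b[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i * del_cost      # delete every symbol of a[:i]
    for j in range(1, n + 1):
        dist[0][j] = j * ins_cost      # insert every symbol of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # A matching pair of symbols costs nothing; otherwise substitute.
            diagonal = dist[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else sub_cost)
            dist[i][j] = min(
                diagonal,
                dist[i - 1][j] + del_cost,   # delete a[i-1]
                dist[i][j - 1] + ins_cost,   # insert b[j-1]
            )
    return dist[m][n]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


if __name__ == "__main__":
    a, b = "kitten", "sitting"
    # Insertion = deletion = 1, substitution = 2: distance tracks the LCS.
    print(edit_distance(a, b, 1, 1, 2), len(a) + len(b) - 2 * lcs_length(a, b))
    # Two cost functions with the same ratio (ins + del) / sub = 2.
    print(edit_distance(a, b, 2, 2, 2), edit_distance(a, b, 1, 3, 2))
```

The two cost functions in the last line share the ratio (insertion + deletion) / substitution = 2. Every edit sequence from a to b has exactly len(b) - len(a) more insertions than deletions, so shifting cost between insertion and deletion while keeping their sum fixed changes the cost of every sequence by the same constant. The ranking of sequences, and hence the set of minimum cost sequences, is unchanged. This reasoning is offered as intuition only and is not necessarily the argument used in the article.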

Additional information

Communicated by A. C.-C. Yao.

This work was supported in part by the U.S. Department of Defense and the U.S. Department of Energy.

About this article

Cite this article

Rice, S.V., Bunke, H. & Nartker, T.A. Classes of cost functions for string edit distance. Algorithmica 18, 271–280 (1997). https://doi.org/10.1007/BF02526038

  • DOI: https://doi.org/10.1007/BF02526038
