Abstract
Finding a sequence of edit operations that transforms one string of symbols into another with the minimum cost is a well-known problem. The minimum cost, or edit distance, is a widely used measure of the similarity of two strings. An important parameter of this problem is the cost function, which specifies the cost of each insertion, deletion, and substitution. We show that cost functions with the same ratio of the sum of the insertion and deletion costs to the substitution cost yield the same minimum-cost sequences of edit operations. This leads to a partitioning of the universe of cost functions into equivalence classes. We also show the relationship between a particular set of cost functions and the longest common subsequence of the input strings.
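As an illustration of the quantities the abstract refers to, the following is a minimal sketch of the standard dynamic-programming edit distance with a configurable cost function. It assumes constant costs per operation type, which may be narrower than the cost functions treated in the article. The function names `edit_distance` and `lcs_length` and the parameters `ins`, `dele`, and `sub` are hypothetical, chosen only for this example. The check at the end uses the well-known identity that, with unit insertion and deletion costs and a substitution cost of 2, the edit distance equals the combined length of the two strings minus twice the length of their longest common subsequence; it is not claimed to be the exact formulation used in the article.

```python
def edit_distance(a, b, ins=1, dele=1, sub=1):
    """Minimum cost of transforming a into b (assumes constant costs per operation)."""
    m, n = len(a), len(b)
    # d[i][j] holds the minimum cost of transforming a[:i] into b[:j].
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * dele          # delete every symbol of a[:i]
    for j in range(1, n + 1):
        d[0][j] = j * ins           # insert every symbol of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Aligning equal symbols is free; unequal symbols cost one substitution.
            diag = 0 if a[i - 1] == b[j - 1] else sub
            d[i][j] = min(
                d[i - 1][j] + dele,      # delete a[i-1]
                d[i][j - 1] + ins,       # insert b[j-1]
                d[i - 1][j - 1] + diag,  # match or substitute
            )
    return d[m][n]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    m, n = len(a), len(b)
    t = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                t[i][j] = t[i - 1][j - 1] + 1
            else:
                t[i][j] = max(t[i - 1][j], t[i][j - 1])
    return t[m][n]


if __name__ == "__main__":
    x, y = "kitten", "sitting"
    print(edit_distance(x, y))                        # unit costs
    print(edit_distance(x, y, ins=1, dele=1, sub=2))  # substitution = insertion + deletion
    print(len(x) + len(y) - 2 * lcs_length(x, y))     # LCS-based value for comparison
```

One way to see why only the ratio matters under this constant-cost assumption: any edit sequence from a string of length m to a string of length n uses exactly n - m more insertions than deletions, so the insertion and deletion costs influence the choice between sequences only through their sum.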
Additional information
Communicated by A. C.-C. Yao.
This work was supported in part by the U.S. Department of Defense and the U.S. Department of Energy.
Cite this article
Rice, S.V., Bunke, H. & Nartker, T.A. Classes of cost functions for string edit distance. Algorithmica 18, 271–280 (1997). https://doi.org/10.1007/BF02526038
DOI: https://doi.org/10.1007/BF02526038