Abstract:
The edit distance is a basic string similarity measure for many applications such as string searching, text mining, signal processing, bioinformatics and so on. However, ...Show MoreMetadata
Abstract:
The edit distance is a basic string similarity measure for many applications such as string searching, text mining, signal processing, bioinformatics and so on. However, its high computational cost often prevents it from being used for a large set of strings like similar string searching. A promising solution for the problem is to approximate the edit distance with low computational cost. However, although there are many methods for approximating the edit distance, most of them are analyzed only theoretically. In fact, most of the methods can evaluate the edit distance only in terms of order notations, and do not conduct any experiment. This is a large obstacle for applying them to real applications. In this study we will first list up existing edit distance approximation methods. Then we compare them by: (i) approximation performances shown by the original authors, (ii) approximation performances re-analyzed by us (concrete values instead of the order notations) and (iii) experimental performances by our implementations.
Published in: 2011 IEEE International Conference on Granular Computing
Date of Conference: 08-10 November 2011
Date Added to IEEE Xplore: 05 January 2012
ISBN Information: