Edit distance for a run-length-encoded string and an uncompressed string

https://doi.org/10.1016/j.ipl.2007.07.006

Abstract

We propose a new algorithm for computing the edit distance between an uncompressed string and a run-length-encoded string. For an uncompressed string of length n and a compressed string with M runs, the algorithm computes their edit distance in O(Mn) time. This result directly implies an O(min{mN, Mn}) time algorithm for two strings of lengths m and n with M and N runs, respectively: compress whichever string gives the smaller product and leave the other uncompressed. This improves on the previous best known time bound of O(mN + Mn).
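To make the quantities in the abstract concrete, the following Python sketch shows run-length encoding, a baseline edit distance, and the choice behind the min{mN, Mn} bound. It is not the O(Mn) algorithm of the paper, whose details are not given here: `edit_distance` is the classical quadratic dynamic program on uncompressed strings, and unit edit costs are an assumption. The names `run_length_encode`, `edit_distance`, and `cheaper_orientation` are illustrative only.

```python
from itertools import groupby


def run_length_encode(s):
    """Split s into maximal runs of identical letters, as (letter, length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def edit_distance(x, y):
    """Classical unit-cost edit distance (assumed cost model).

    Baseline only: runs in O(len(x) * len(y)) time on uncompressed strings,
    keeping one row of the dynamic-programming table at a time.
    """
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        curr = [i]  # deleting the first i letters of x
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cheaper_orientation(x, y):
    """Say which string to keep compressed so the bound is min(m*N, M*n).

    m, n are the lengths of x, y; M, N are their numbers of runs.
    """
    m, n = len(x), len(y)
    M, N = len(run_length_encode(x)), len(run_length_encode(y))
    return "compress x" if M * n <= m * N else "compress y"
```

For example, with x = "aaaaaa" (m = 6, M = 1) and y = "abab" (n = 4, N = 4), Mn = 4 while mN = 24, so treating x as the compressed string gives the smaller bound.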

Cited by (18)

  • Binary image encryption in a joint transform correlator scheme by aid of run-length encoding and QR code

    2018, Optics and Laser Technology
    Citation Excerpt:

    Meanwhile, the scrambling appends an additional security level on the cryptosystem. Run-length encoding (RLE) is known as an important coding approach achieving lossless data compression [28]. In RLE, a string will be divided into several runs, and each run consists of identical letters.

  • Efficient merged longest common subsequence algorithms for similar sequences

    2018, Theoretical Computer Science
    Citation Excerpt:

    It is essential in many applications to measure the similarity of two sequences, such as computational biology, pattern matching, plagiarism detection, voice recognition, and so on. The most well-known methods for measuring the sequence similarity in computer science are the algorithms for the longest common subsequence (LCS) problem [1–4,6,7,12,19,32–35] and the edit distance problem [22–25,28]. There are some applications for the MLCS and BMLCS problems.

  • Hardness of comparing two run-length encoded strings

    2010, Journal of Complexity
    Citation Excerpt:

    Since then this has been an active research field, and several papers have delved into compressed pattern matching problems under various compression schemes (e.g., rle compression, LZ-family compression, or straight-line programs). For the rle scheme, some papers took one step further by considering problems of comparing two rle strings using different cost functions, such as the LCS metric [5,17,19], the Levenshtein distance [16,18], and arbitrary alignment scores [10,14]. In this paper, we investigate the hardness of comparing (or approximately matching) two strings both compressed into the rle format.

  • Approximating Dynamic Time Warping Distance between Run-Length Encoded Strings

    2022, Leibniz International Proceedings in Informatics, LIPIcs

This work was supported in part by the National Science Council of the Republic of China under Contract NSC 95-2221-E260-025.
