Elsevier

Journal of Complexity

Volume 24, Issue 2, April 2008, Pages 173-184
Finding a longest common subsequence between a run-length-encoded string and an uncompressed string

https://doi.org/10.1016/j.jco.2007.06.003

Abstract

In this paper, we propose an $O(\min\{mN, Mn\})$ time algorithm for finding a longest common subsequence of strings $X$ and $Y$ with lengths $M$ and $N$, respectively, and run-length-encoded lengths $m$ and $n$, respectively. We propose a new recursive formula for finding a longest common subsequence of $Y$ and $X$, where $X$ is in run-length-encoded format. That is, $Y = y_1 y_2 \cdots y_N$ and $X = r_1^{l_1} r_2^{l_2} \cdots r_m^{l_m}$, where $r_i$ is the repeated character of run $i$ and $l_i$ is the number of its repetitions. The proposed recursive formula has three cases: two for $r_i$ matching $y_j$ and a third for $r_i$ mismatching $y_j$. We focus on the first two cases, in which $r_i$ matches $y_j$. To determine which of these two cases applies, we have to find a specific value, which can be obtained by using another of our proposed recursive formulas.
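
To make the notation concrete, the following is a minimal sketch of the problem setting only. It is not the recursive formula proposed in the paper, which the abstract does not state. It shows a run-length encoding of X into (r_i, l_i) pairs and the classic O(MN) dynamic program on the uncompressed strings as a baseline. The function names `rle_encode`, `rle_decode`, and `lcs_length` are illustrative assumptions.

```python
from itertools import groupby


def rle_encode(s):
    """Return the runs of s as (character r_i, run length l_i) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def rle_decode(runs):
    """Expand (character, run length) pairs back into the plain string."""
    return "".join(ch * length for ch, length in runs)


def lcs_length(x, y):
    """Length of a longest common subsequence of x and y.

    Baseline only: the textbook O(MN) dynamic program on uncompressed
    strings, keeping one previous row. It does not exploit the runs of x.
    """
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


runs = rle_encode("aaabbc")  # M = 6 characters, m = 3 runs
baseline = lcs_length(rle_decode(runs), "abcab")
```

Here X = "aaabbc" has length M = 6 but only m = 3 runs, which is the gap between the baseline's O(MN) cost and the O(min{mN, Mn}) bound stated in the abstract.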

Keywords

Longest common subsequence
Run-length encoding
String compression

This work was supported in part by the National Science Council of the Republic of China under Contract NSC 95-2221-E260-025.