Abstract
In the Minimum Common String Partition problem (MCSP) we are given two strings as input, and we wish to partition them into the same collection of substrings, minimizing the number of substrings in the partition. Even the special case denoted 2-MCSP, where each letter occurs at most twice in each input string, is NP-hard. We study a greedy algorithm for MCSP that at each step extracts a longest common substring from the given strings. We show that the approximation ratio of this algorithm is between Ω(n^0.43) and O(n^0.69). In the case of 2-MCSP, we show that the approximation ratio is equal to 3. For 4-MCSP, we give a lower bound of Ω(log n).
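The greedy rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `greedy_mcsp`, the quadratic dynamic program used to find a longest common substring among the still-unmatched positions, and the tie-breaking rule (first longest match in scan order) are all assumptions made for illustration. The sketch also assumes the two inputs contain the same multiset of letters, since otherwise no common partition exists.

```python
from collections import Counter


def greedy_mcsp(a: str, b: str) -> list[str]:
    """Greedy common partition: repeatedly extract a longest common
    substring of the still-unmatched parts of a and b.

    Illustrative sketch only; ties are broken by scan order (an assumption).
    """
    if Counter(a) != Counter(b):
        raise ValueError("inputs must contain the same multiset of letters")

    used_a = [False] * len(a)  # positions of a already placed in a block
    used_b = [False] * len(b)  # positions of b already placed in a block
    blocks: list[str] = []
    remaining = len(a)

    while remaining:
        best_len, best_i, best_j = 0, 0, 0
        prev = [0] * (len(b) + 1)
        for i in range(1, len(a) + 1):
            # cur[j] = length of the longest common run of unmatched
            # letters ending at a[i-1] and b[j-1]
            cur = [0] * (len(b) + 1)
            if not used_a[i - 1]:
                for j in range(1, len(b) + 1):
                    if not used_b[j - 1] and a[i - 1] == b[j - 1]:
                        cur[j] = prev[j - 1] + 1
                        if cur[j] > best_len:
                            best_len, best_i, best_j = cur[j], i, j
            prev = cur

        # Mark the chosen occurrence as matched in both strings.
        start_a, start_b = best_i - best_len, best_j - best_len
        for k in range(best_len):
            used_a[start_a + k] = True
            used_b[start_b + k] = True
        blocks.append(a[start_a:best_i])
        remaining -= best_len

    return blocks
```

Calling `greedy_mcsp("abab", "baba")` returns a list of two blocks, and the length of the returned list is the size of the partition that the greedy rule produces. Each round costs time proportional to the product of the string lengths, which is adequate for small experiments but not tuned for long inputs.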
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chrobak, M., Kolman, P., Sgall, J. (2004). The Greedy Algorithm for the Minimum Common String Partition Problem. In: Jansen, K., Khanna, S., Rolim, J.D.P., Ron, D. (eds) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. RANDOM 2004, APPROX 2004. Lecture Notes in Computer Science, vol 3122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27821-4_8
DOI: https://doi.org/10.1007/978-3-540-27821-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22894-3
Online ISBN: 978-3-540-27821-4
eBook Packages: Springer Book Archive