Abstract
Given a set $S=\{S_1,\dots,S_k\}$ of finite strings, the k-Longest Common Subsequence Problem (k-LCSP) seeks a string $L^*$ of maximum length such that $L^*$ is a subsequence of each $S_i$ for $i=1,\dots,k$. This paper presents a large neighborhood search technique that provides high-quality solutions to large k-LCSP instances. The heuristic runs in time linear in both the length of the sequences and the number of sequences. Computational results are provided.
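To make the problem definition concrete, the following minimal Python sketch checks whether a candidate string is a common subsequence of every input string, and finds an optimal solution by exhaustive search on tiny instances. It is not the large neighborhood search heuristic of the paper, whose details are not given in the abstract. The function names `is_subsequence`, `is_common_subsequence`, and `brute_force_lcs` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from typing import Sequence


def is_subsequence(candidate: str, s: str) -> bool:
    """Return True if candidate can be obtained from s by deleting characters."""
    it = iter(s)
    # `ch in it` consumes the iterator up to the first match,
    # so later characters must be found further to the right.
    return all(ch in it for ch in candidate)


def is_common_subsequence(candidate: str, strings: Sequence[str]) -> bool:
    """Feasibility check for k-LCSP: candidate must be a subsequence of each string.

    The check takes time linear in the total length of the input strings.
    """
    return all(is_subsequence(candidate, s) for s in strings)


def brute_force_lcs(strings: Sequence[str]) -> str:
    """Exhaustive search for a longest common subsequence.

    Illustrative only (an assumption of this sketch, not the paper's method):
    the running time is exponential in the length of the shortest string,
    so it is usable only on very small, non-empty instances.
    """
    shortest = min(strings, key=len)
    # Try candidate lengths from longest to shortest; the first hit is optimal.
    for length in range(len(shortest), -1, -1):
        for positions in combinations(range(len(shortest)), length):
            candidate = "".join(shortest[i] for i in positions)
            if is_common_subsequence(candidate, strings):
                return candidate
    return ""


if __name__ == "__main__":
    instance = ["ABCBDAB", "BDCABA", "BCADB"]
    best = brute_force_lcs(instance)
    print(best, len(best))
```

A heuristic for large instances would replace the exhaustive loop with a search strategy, but any solution it returns can still be validated with `is_common_subsequence`.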
Cite this article
Easton, T., Singireddy, A. A large neighborhood search heuristic for the longest common subsequence problem. J Heuristics 14, 271–283 (2008). https://doi.org/10.1007/s10732-007-9038-y
DOI: https://doi.org/10.1007/s10732-007-9038-y