Abstract
The longest common subsequence (LCS) problem asks for the longest sequence that is a subsequence of every member of a given set of strings; for an arbitrary number of strings it is a well-known NP-hard optimization problem. In computational biology, sequence alignment is a fundamental technique for measuring the similarity of biological sequences, such as DNA and genome sequences. High sequence similarity often implies molecular structural as well as functional similarity and can be used to determine whether (and how) sequences are related. Finding the LCS is one way to measure the similarity of sequences. It also has applications in data compression, FPGA circuit minimization, and bioinformatics. Exact algorithms are impractical because they cannot solve the problem in polynomial time when there are many long strings. Several approximation, heuristic, and metaheuristic methods have been proposed to solve the problem. Chemical reaction optimization (CRO) is a recent metaheuristic that carries the behavior of chemical reactions over to optimization problems. In this paper, we propose a CRO-based technique to solve the LCS problem for multiple strings (MLCS). We redesign the four elementary operators of CRO for the LCS problem; these operators explore the search space both locally and globally. We also design a novel correction method that runs after each search operator to ensure that the changes made by the operator yield a valid solution. Both solution quality and execution time were considered in designing the operators and the correction method. The proposed system is thus robust, efficient, and effective in solving the MLCS problem. Our approach is compared with hyper-heuristic, ant colony optimization, beam ant colony optimization, and memory-bound anytime algorithms. Measured by the lengths of the returned common subsequences, the experimental results show that our proposed algorithm gives the same or better results than all the other algorithms in less execution time.
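To make the problem statement concrete, the following is a minimal Python sketch, not the method proposed in the paper. It assumes plain Python strings as sequences, and the function names `is_subsequence`, `is_common_subsequence`, and `lcs_two` are hypothetical. The first two functions check whether a candidate is a valid common subsequence, which is the kind of validity any candidate solution must satisfy. The third is the textbook dynamic program for the two-string case, which runs in polynomial time; the hardness discussed above concerns an arbitrary number of strings.

```python
from typing import Sequence


def is_subsequence(candidate: str, text: str) -> bool:
    """Return True if candidate can be obtained from text by deleting characters."""
    it = iter(text)
    # `ch in it` consumes the iterator up to the first match, so the
    # characters of candidate must appear in text in the same order.
    return all(ch in it for ch in candidate)


def is_common_subsequence(candidate: str, strings: Sequence[str]) -> bool:
    """Return True if candidate is a subsequence of every input string."""
    return all(is_subsequence(candidate, s) for s in strings)


def lcs_two(a: str, b: str) -> str:
    """Exact LCS of two strings by dynamic programming, O(len(a) * len(b))."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the start to recover one optimal subsequence.
    out = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)
```

In a metaheuristic setting, a check like `is_common_subsequence` could serve as the validity test and the candidate's length as the objective value; how the paper's correction method actually repairs invalid candidates is not described in the abstract and is not modeled here.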
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Cite this article
Islam, M.R., Saifullah, C.M.K., Asha, Z.T. et al. Chemical reaction optimization for solving longest common subsequence problem for multiple string. Soft Comput 23, 5485–5509 (2019). https://doi.org/10.1007/s00500-018-3200-3
DOI: https://doi.org/10.1007/s00500-018-3200-3