research-article

Quick-MLCS: a new algorithm for the multiple longest common subsequence problem

Published: 27 June 2012

ABSTRACT

Finding the longest common subsequence (LCS) of multiple strings is a well-known problem with applications in fields such as computational biology and computational genomics. A number of researchers have studied the problem, and over the years the complexity of its solutions has been improved in several respects. This paper presents a new algorithm for the general case of multiple LCS (MLCS), based on one of the fastest existing algorithms. The proposed algorithm is founded on the dominant point approach and uses a linear sorting technique to minimize the set of dominant points. The main idea is that, once the dominant points have been linearly sorted, a one-pass linear algorithm can minimize the set. Theoretical and experimental evaluations indicate that the proposed algorithm is more efficient than the fastest existing algorithm in different scenarios.
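
The sort-then-one-pass idea can be illustrated with a minimal sketch for the two-string case only. This is not the Quick-MLCS algorithm itself, which the abstract says handles the general multi-string case and whose details are not given here. The sketch assumes that a point p dominates q when p is no larger than q in every coordinate and differs from it, and it reads "linear sorting" as a bucket sort on the first coordinate; both are assumptions. The function name `minimal_points_2d` and the parameter `max_coord` are hypothetical.

```python
def minimal_points_2d(points, max_coord):
    """Return the non-dominated (minimal) points of a 2D integer point set.

    Assumption: p dominates q if p[0] <= q[0] and p[1] <= q[1] and p != q.
    Coordinates are integers in the range 0..max_coord.
    """
    # Bucket sort on x: one bucket per x value, keeping only the smallest y,
    # because any other point with the same x is dominated by that one.
    smallest_y = [None] * (max_coord + 1)
    for x, y in points:
        if smallest_y[x] is None or y < smallest_y[x]:
            smallest_y[x] = y

    # Single pass in increasing x: a point survives only if its y is
    # strictly below every y seen so far; otherwise an earlier point
    # (with smaller x and no larger y) dominates it.
    minima = []
    lowest_y = None
    for x, y in enumerate(smallest_y):
        if y is None:
            continue
        if lowest_y is None or y < lowest_y:
            minima.append((x, y))
            lowest_y = y
    return minima


points = [(2, 5), (1, 7), (3, 3), (3, 6), (4, 4), (2, 5)]
print(minimal_points_2d(points, max_coord=7))
```

Both loops touch each point or bucket once, so the running time is linear in the number of points plus the coordinate range. In more than two dimensions a single pass over one sorted coordinate is not sufficient in general, so this sketch should not be taken as a description of the paper's method.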


      • Published in

        C3S2E '12: Proceedings of the Fifth International C* Conference on Computer Science and Software Engineering
        June 2012
        139 pages
        ISBN:9781450310840
        DOI:10.1145/2347583

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States



        Acceptance Rates

        Overall Acceptance Rate: 12 of 42 submissions, 29%
