Skip to main content

Aligning Multiple Protein Sequences by Hybrid Clonal Selection Algorithm with Insert-Remove-Gaps and BlockShuffling Operators

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4163))

Abstract

Multiple sequence alignment (MSA) is one of the most important tasks in biological sequence analysis. This paper will primarily focus on on protein alignments, but most of the discussion and methodology also applies to DNA alignments. A novel hybrid clonal selection algorihm, called an aligner, is presented. It searches for a set of alignments amongst the population of candidate alignments by optimizing the classical weighted sum of pairs objective function. Benchmarks from BaliBASE library (v.1.0 and v.2.0) are used to validate the algorithm. Experimental results of BaliBASE v.1.0 benchmarks show that the proposed algorithm is superior to PRRP, ClustalX, SAGA, DIALIGN, PIMA, MULTIALIGN, and PILEUP8. On BaliBASE v.2.0 benchmarks the algorithm shows interesting results in terms of SP score with respect to established and leading methods, i.e. ClustalW, T-Coffee, MUSCLE, PRALINE, ProbCons, and Spem.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Eidhammer, I., Jonassen, I., Taylor, W.R.: Protein Bioinformatics. Wiley, Chichester (2004)

    Google Scholar 

  2. Durbin, R., Eddy, S., Krogh, A., Mitchison, G.: Biological sequence analysis. Cambridge University Press, Cambridge (2004)

    Google Scholar 

  3. Thompson, J.D., Plewniak, F., Ripp, R., Thierry, J.C., Poch, O.: Towards a Reliable Objective Function for Multiple Sequence Alignments. J. Mol. Biol. 301, 937–951 (2001)

    Article  Google Scholar 

  4. Altschul, S.F., Lipman, D.J.: Trees stars and multiple biological sequence alignment. SIAM Journal on Applied Mathematics 49, 197–209 (1989)

    Article  MATH  MathSciNet  Google Scholar 

  5. Altschul, S.F., Carroll, R.J., Lipman, D.J.: Weights for data related by a tree. Journal on Molecular Biology 207, 647–653 (1989)

    Article  Google Scholar 

  6. Bonizzoni, P., Della Vedova, G.: The Complexity of Multiple Sequence Alignment with SP-score that is a Metric. Theoretical Computer Science 259(1), 63–79 (2001)

    Article  MATH  MathSciNet  Google Scholar 

  7. Wang, L., Jiang, T.: On the complexity of multiple sequence alignment. Journal of Computational Biology 1, 337–348 (1994)

    Article  Google Scholar 

  8. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979)

    MATH  Google Scholar 

  9. Gupta, S.K., Kececioglu, J.D., Schaffer, A.: Improving the practical space and time efficiency of the shortest-paths approach to sum-of-pairs multiple sequence alignment. Journal of Computational Biology 2, 459–472 (1995)

    Article  Google Scholar 

  10. Corpet, F.: Multiple sequence alignment with hierarchical clustering. Nucleic Acids Research 16, 10881–10890 (1988)

    Article  Google Scholar 

  11. Wisconsin Package v.8; Genetics Computer Group, Madison, WI, http://www.gcg.com

  12. Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G.: The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research 24, 4876–4882 (1997)

    Article  Google Scholar 

  13. Thompson, J.D., Higgins, D.G., Gibson, T.J.: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22, 4673–4680 (1994)

    Article  Google Scholar 

  14. Notredame, C., Higgins, D.G., Heringa, J.: T-Coffee: a novel method for fast and accurate Multiple Sequence Alignment. Journal Molecular Biology 302, 205–217 (2000)

    Article  Google Scholar 

  15. Zhou, H., Zhou, Y.: SPEM: Improving multiple sequence alignment with sequence profiles and predicted secondary structures. Bioinformatics 21, 3615–3621 (2005)

    Article  Google Scholar 

  16. Do, C.B., Mahabhashyam, M.S.P., Brudno, M., Batzoglou, S.: ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Research 15, 330–340 (2005)

    Article  Google Scholar 

  17. Smith, R.F., Smith, T.F.: Pattern-induced multi-sequence alignment (PIMA) algorithm employing secondary structure-dependent gap penalties for use in comparative protein modelling. Protein Engineering 5, 35–41 (1992)

    Article  Google Scholar 

  18. Carrillo, H., Lipman, D.J.: The Multiple Sequence Alignment Problem in Biology. SIAM Journal on Applied Mathematics 48, 1073–1082 (1988)

    Article  MATH  MathSciNet  Google Scholar 

  19. Stoye, J., Moulton, V., Dress, A.W.: DCA: an efficient implementation of the divide-and conquer approach to simultaneous multiple sequence alignment. Bioinformatics 13(6), 625–626 (1997)

    Article  Google Scholar 

  20. Morgenstern, B., Frech, K., Dress, A., Werner, T.: DIALIGN: Finding local similarities by multiple sequence alignment. Bioinformatics 14, 290–294 (1998)

    Article  Google Scholar 

  21. Morgenstern, B., Frech, K., Dress, A., Werner, T.: DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15, 211–218 (1999)

    Article  Google Scholar 

  22. Gotoh, O.: Further improvement in methods of group-to-group sequence alignment with generalized profile operations. Bioinformatics 10(4), 379–387 (1994)

    Article  MathSciNet  Google Scholar 

  23. Eddy, S.R.: Multiple alignment using hidden Markov models. In: 3rd International Conference on Intelligent Systems for Molecular Biology, vol. 3, pp. 114–120 (1995)

    Google Scholar 

  24. Edgar, R.C.: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32, 1792–1797 (2004)

    Article  Google Scholar 

  25. Notredame, C., Higgins, D.G.: SAGA: sequence alignment by genetic algorithm. Nucleic Acids Research 24, 1515–1539 (1996)

    Article  Google Scholar 

  26. Notredame, C.: COFFEE: an objective function for multiple sequence alignments. Bioinformatics 14, 407–422 (1998)

    Article  Google Scholar 

  27. Simossis, V.A., Heringa, J.: PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information. Nucleic Acids Research 33, 289–294 (2005)

    Article  Google Scholar 

  28. Shyu, C., Sheneman, L., Foster, J.A.: Multiple Sequence Alignment with Evolutionary Computation. Genetic Programming and Evolvable Machines 5, 121–144 (2004)

    Article  Google Scholar 

  29. Nguyen, H.D., Yoshihara, I., Yamamori, K., Yasunaga, M.: Aligning Multiple Protein Sequences by Parallel Hybrid Geneti Algorithm. Genome Informatics 13, 123–132 (2002)

    Google Scholar 

  30. Cutello, V., Nicosia, G.: An Immunological Approach to Combinatorial Optimization Problems. In: Garijo, F.J., Riquelme, J.-C., Toro, M. (eds.) IBERAMIA 2002. LNCS, vol. 2527, pp. 361–370. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  31. Nicosia, G.: Immune Algorithms for Optimization and Protein Structure Prediction. Ph.D. Dissertation, Department of Mathematics and Computer Science, University of Catania, Italy (2004)

    Google Scholar 

  32. Cutello, V., Narzisi, G., Nicosia, G., Pavone, M.: Clonal selection algorithms: A comparative case study using effective mutation potentials. In: Jacob, C., Pilat, M.L., Bentley, P.J., Timmis, J.I. (eds.) ICARIS 2005. LNCS, vol. 3627, pp. 13–28. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  33. Cutello, V., Nicosia, G., Pavone, M., Timmis, J.: An Immune Algorithm for Protein Structure Prediction on Lattice Models. IEEE Transaction on Evolutionary Computation (to appear)

    Google Scholar 

  34. Cutello, V., Nicosia, G., Pavone, M.: Exploring the capability of immune algorithms: A characterization of hypermutation operators. In: Nicosia, G., Cutello, V., Bentley, P.J., Timmis, J. (eds.) ICARIS 2004. LNCS, vol. 3239, pp. 263–276. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  35. Taylor, W.R.: A flexible method to align a large number of sequences. J. Mol. Evol. 28, 161–169 (1988)

    Article  Google Scholar 

  36. Thompson, J.D., Plewniak, F., Poch, O.: BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15, 87–88 (1999)

    Article  Google Scholar 

  37. Bahr, A., Thompson, J.D., Thierry, J.C., Poch, O.: BAliBASE (Benchmark Alignment dataBASE): Enhancements for Repeats, Transmembrane Sequences and Circular Permuations. Nucleic Acids Research 29(1), 232–326 (2001)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cutello, V., Lee, D., Nicosia, G., Pavone, M., Prizzi, I. (2006). Aligning Multiple Protein Sequences by Hybrid Clonal Selection Algorithm with Insert-Remove-Gaps and BlockShuffling Operators. In: Bersini, H., Carneiro, J. (eds) Artificial Immune Systems. ICARIS 2006. Lecture Notes in Computer Science, vol 4163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823940_25

Download citation

  • DOI: https://doi.org/10.1007/11823940_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-37749-8

  • Online ISBN: 978-3-540-37751-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics