Abstract
Multiple Sequence Alignment (MSA) is a powerful tool for important biological applications such as phylogenetic analysis, identification of conserved motifs and domains, and structure prediction. Despite the improvements in speed and accuracy introduced by MSA programs, the computational requirements of large-scale alignments call for high-performance computing and parallel applications. In this paper we present an improvement to a parallel implementation of T-Coffee, a widely used MSA package. Our approach resolves the bottleneck of the progressive alignment stage of MSA. It does so by balancing the guide tree that drives the progressive alignment process, which increases the degree of parallelism. The experimental results show improvements in execution time of over 68% while maintaining biological accuracy.
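To illustrate why the shape of the guide tree limits parallelism, the following minimal Python sketch compares a fully chained tree with a balanced one over the same sequences. It is not the balancing method of the paper: the tree representation (nested tuples), the helper names `caterpillar`, `balanced`, `critical_path` and `internal_nodes`, and the assumptions of unit-cost alignments and unlimited workers are all hypothetical, and the sketch ignores sequence similarity entirely.

```python
from functools import reduce


def caterpillar(leaves):
    """Fully chained tree: (((a, b), c), d), ... (hypothetical worst case)."""
    return reduce(lambda left, right: (left, right), leaves)


def balanced(leaves):
    """Balanced tree built by recursively halving the leaf list."""
    if len(leaves) == 1:
        return leaves[0]
    mid = len(leaves) // 2
    return (balanced(leaves[:mid]), balanced(leaves[mid:]))


def critical_path(tree):
    """Alignment steps that must run one after another.

    Each internal node is one pairwise (profile) alignment that can only
    start once both of its children are done, so the tree height bounds
    the run time even with unlimited workers.
    """
    if not isinstance(tree, tuple):
        return 0  # a leaf is a single sequence, nothing to align
    left, right = tree
    return 1 + max(critical_path(left), critical_path(right))


def internal_nodes(tree):
    """Total number of alignments the tree requires."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + internal_nodes(tree[0]) + internal_nodes(tree[1])


seqs = [f"seq{i}" for i in range(8)]
chain = caterpillar(seqs)
bal = balanced(seqs)

print("alignments:", internal_nodes(chain), internal_nodes(bal))
print("sequential steps (chained): ", critical_path(chain))
print("sequential steps (balanced):", critical_path(bal))
```

Both trees require the same number of alignments, but the chained tree forces them to run one after another, while in the balanced tree independent subtrees can be aligned concurrently. A practical balancing scheme would also have to respect sequence similarity in order to preserve alignment accuracy, which this sketch does not attempt.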
Cite this article
Orobitg, M., Guirado, F., Notredame, C. et al. Exploiting parallelism on progressive alignment methods. J Supercomput 58, 186–194 (2011). https://doi.org/10.1007/s11227-009-0359-5
DOI: https://doi.org/10.1007/s11227-009-0359-5