Multiple biological sequence alignment in heterogeneous multicore clusters with user-selectable task allocation policies

The Journal of Supercomputing

Abstract

Multiple Sequence Alignment (MSA) is an important problem in Bioinformatics that aims to align more than two sequences in order to highlight regions of similarity. The problem is known to be NP-hard, so heuristic methods are used to solve it. DIALIGN-TX is an iterative heuristic method for MSA that generates alignments by concatenating ungapped regions with high similarity. Usually, the first phase of MSA algorithms is parallelized by distributing several independent tasks among the nodes. Even though heterogeneous multicore clusters are now very common, very few task allocation policies have been proposed for this type of architecture. This paper proposes an MPI/OpenMP master/slave parallel strategy to run DIALIGN-TX in heterogeneous multicore clusters, with several allocation policies. We show that an appropriate choice of the master node has a great impact on overall system performance. In addition, results obtained with real sequence sets on a heterogeneous multicore cluster composed of 4 nodes (30 cores) show that the execution time can be drastically reduced when the appropriate allocation policy is used.
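To make the idea of allocating independent tasks across unequal nodes concrete, the following is a minimal sketch and not the policies evaluated in the paper. It assumes that the independent tasks are the pairwise comparisons of the input sequences and that each node should receive a share proportional to its core count. The function names `pairwise_tasks` and `proportional_split`, and the per-node core counts in the usage note, are hypothetical; the abstract only states that the cluster has 4 nodes and 30 cores in total.

```python
from itertools import combinations


def pairwise_tasks(num_sequences):
    """Enumerate all unordered sequence pairs (i, j) with i < j.

    Assumption: each pair is one independent task of the first phase.
    """
    return list(combinations(range(num_sequences), 2))


def proportional_split(num_tasks, weights):
    """Split num_tasks among nodes in proportion to their weights.

    Uses the largest-remainder method so that the counts always add up
    to num_tasks. Here the weights stand for core counts (an assumption;
    any measure of relative node speed could be used instead).
    """
    total = sum(weights)
    shares = [num_tasks * w / total for w in weights]
    counts = [int(s) for s in shares]  # round each share down first
    leftover = num_tasks - sum(counts)
    # Give the remaining tasks to the nodes with the largest fractional parts.
    by_remainder = sorted(
        range(len(weights)),
        key=lambda i: shares[i] - counts[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts
```

For example, with 30 sequences there are 435 pairs, and `proportional_split(435, [8, 8, 8, 6])` assigns each hypothetical 8-core node more tasks than the 6-core node. A static split like this is the simplest option; dynamic self-scheduling schemes hand out tasks in smaller chunks at run time, which can adapt better when task durations vary.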


Author information

Corresponding author

Correspondence to Alba Cristina Magalhaes Alves de Melo.

About this article

Cite this article

de Araujo Macedo, E., Magalhaes Alves de Melo, A.C., Pfitscher, G.H. et al. Multiple biological sequence alignment in heterogeneous multicore clusters with user-selectable task allocation policies. J Supercomput 63, 740–756 (2013). https://doi.org/10.1007/s11227-012-0768-8

  • DOI: https://doi.org/10.1007/s11227-012-0768-8
