A Parallel Algorithm for Multiple Biological Sequence Alignment

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7223)

Abstract

The search for a multiple sequence alignment (MSA) is a well-known problem in bioinformatics that consists of finding an alignment of three or more biological sequences. In this paper, we propose a parallel iterative algorithm for the global alignment of multiple biological sequences. In this algorithm, a number of processes work independently and concurrently, each searching for the best MSA of a set of sequences. The algorithm uses a Longest Common Subsequence (LCS) technique to generate an initial MSA. An iterative process then improves the MSA by applying a number of operators designed to produce more accurate alignments. Simulations were performed using sequences from the UniProtKB protein database. A preliminary performance analysis and a comparison with several common MSA methods show promising results. The implementation was developed on a cluster platform using the standard Message Passing Interface (MPI) library.
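
The abstract does not say how the LCS technique is turned into an alignment, so the following is only a minimal sketch of the general idea for two sequences: the textbook dynamic-programming LCS, followed by a traceback that pairs the characters of the common subsequence and pads everything else with gaps. The function names `lcs_table` and `lcs_align` and the gap symbol `-` are assumptions made for illustration and are not taken from the paper.

```python
def lcs_table(a, b):
    """Return the DP table where table[i][j] is the LCS length of a[:i] and b[:j]."""
    rows, cols = len(a), len(b)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if a[i - 1] == b[j - 1]:
                # Matching characters extend the common subsequence by one.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise keep the better of skipping a character in a or in b.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def lcs_align(a, b, gap="-"):
    """Build a gapped pairwise alignment in which only LCS characters share a column."""
    table = lcs_table(a, b)
    i, j = len(a), len(b)
    row_a, row_b = [], []
    # Trace back from the bottom-right corner of the table.
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            row_a.append(a[i - 1])
            row_b.append(gap)
            i -= 1
        else:
            row_a.append(gap)
            row_b.append(b[j - 1])
            j -= 1
    # Flush whatever prefix remains in either sequence against gaps.
    while i > 0:
        row_a.append(a[i - 1])
        row_b.append(gap)
        i -= 1
    while j > 0:
        row_a.append(gap)
        row_b.append(b[j - 1])
        j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b))


if __name__ == "__main__":
    top, bottom = lcs_align("AGGTAB", "GXTXAYB")
    print(top)
    print(bottom)
```

Both returned rows have the same length, and removing the gap symbols recovers the input sequences. An alignment built this way never places two different residues in the same column, so it is best read as a rough starting point of the kind an iterative refinement step might then improve, not as a scored alignment.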

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Andalon-Garcia, I.R., Chavoya, A., Meda-Campaña, M.E. (2012). A Parallel Algorithm for Multiple Biological Sequence Alignment. In: Lones, M.A., Smith, S.L., Teichmann, S., Naef, F., Walker, J.A., Trefzer, M.A. (eds) Information Processing in Cells and Tissues. IPCAT 2012. Lecture Notes in Computer Science, vol 7223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28792-3_31

  • DOI: https://doi.org/10.1007/978-3-642-28792-3_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28791-6

  • Online ISBN: 978-3-642-28792-3

  • eBook Packages: Computer Science, Computer Science (R0)