Adaptation of the method of musical composition for solving the multiple sequence alignment problem

Computing

Abstract

The multiple sequence alignment (MSA) problem is relevant to several areas of bioinformatics, including identifying sequence families, detecting structural homologies among protein/DNA sequences, determining the functions of protein/DNA sequences, and predicting patients' diseases by comparing their DNA in disease discovery. MSA is an NP-hard problem. In this paper, two new methods for solving the MSA problem are introduced, both based on a cultural algorithm, namely the method of musical composition (MMC). The performance of the first and second versions was evaluated and analyzed on 26 and 12 different benchmark alignments, respectively. Test instances were taken from BAliBASE 3.0. Alignment accuracies were computed using QSCORE, a quality scoring program that compares two multiple sequence alignments. Numerical results on the tackled instances indicate that the performance of the proposed versions of the MMC is promising. In particular, the experimental results show that the second version found the best alignment reported in the specialized literature in 25% of the tested instances. In addition, for 50% of the tested instances, the second version achieved the second-best alignment. Finally, the significance of the numerical results was analyzed with the Wilcoxon rank-sum test, which indicated that the second proposed version is statistically similar to some state-of-the-art techniques for the MSA problem.
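
To make the idea of scoring one alignment against a reference concrete, the following is a minimal sketch of a pairwise agreement score: the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. It is not the QSCORE program itself. The function names `aligned_pairs` and `pair_score`, the use of `-` as the gap character, and the choice to count every column (rather than, for example, only reference core blocks) are assumptions made for illustration.

```python
from itertools import combinations

GAP = "-"  # assumed gap character


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length strings, one per sequence.
    A residue is identified by (sequence index, position among the
    non-gap characters of that sequence), so the identification does
    not depend on where gaps were inserted.
    """
    counters = [0] * len(alignment)  # next residue index per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        residues = []
        for s, row in enumerate(alignment):
            if row[col] != GAP:
                residues.append((s, counters[s]))
                counters[s] += 1
        # Every two residues in the same column form one aligned pair.
        pairs.update(combinations(residues, 2))
    return pairs


def pair_score(test, reference):
    """Fraction of reference-aligned residue pairs recovered by `test`."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(aligned_pairs(test) & ref_pairs) / len(ref_pairs)


# Hypothetical toy data, not taken from BAliBASE.
reference = ["AC-G", "ACTG"]
test = ["ACG-", "ACTG"]
score = pair_score(test, reference)
```

In this toy case the reference aligns three residue pairs and the test alignment recovers two of them, so the score falls between 0 (no agreement) and 1 (identical pairings).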



Corresponding author

Correspondence to Roman Anselmo Mora-Gutiérrez.

Cite this article

Mora-Gutiérrez, R.A., Lárraga-Ramírez, M.E., Rincón-García, E.A. et al. Adaptation of the method of musical composition for solving the multiple sequence alignment problem. Computing 97, 813–842 (2015). https://doi.org/10.1007/s00607-014-0436-3
