Parallel protein multiple sequence alignment approaches: a systematic literature review

The Journal of Supercomputing

Abstract

Multiple sequence alignment approaches are algorithmic solutions for aligning biological sequences. Because exact multiple sequence alignment by dynamic programming has a time complexity that grows exponentially with the number of sequences, a substantial number of parallel computing approaches have been implemented in the last two decades to improve alignment performance. In this paper, we present a systematic literature review of parallel computing approaches applied to multiple sequence alignment algorithms for proteins, published in the open literature from 1988 to 2022. We extracted articles from four scientific databases (ACM Digital Library, IEEE Xplore, ScienceDirect, and SpringerLink) and four journals (Bioinformatics, PLOS Computational Biology, PLOS ONE, and Scientific Reports). To cover other potential databases and journals, we also performed a transversal search through Google Scholar. Our selection process yielded 106 research articles; we then analyzed these articles and defined a classification framework. Finally, we point out some directions and trends for parallel computing approaches to multiple sequence alignment, as well as some unsolved problems.
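
To make the complexity claim concrete, the following minimal sketch counts the work of a textbook exact dynamic programming formulation. It assumes k sequences of equal length n, a k-dimensional lattice of (n + 1)^k cells, and 2^k - 1 predecessor cells per lattice cell. The function name `dp_lattice_cost` is hypothetical and is not taken from the reviewed article.

```python
from typing import Tuple


def dp_lattice_cost(num_sequences: int, seq_length: int) -> Tuple[int, int]:
    """Return (cells, predecessors_per_cell) for exact DP alignment.

    Assumes all sequences have the same length. Each cell of the
    k-dimensional lattice is reached from 2**k - 1 neighbouring cells,
    one for every non-empty subset of sequences that advances by one residue.
    """
    cells = (seq_length + 1) ** num_sequences
    predecessors_per_cell = 2 ** num_sequences - 1
    return cells, predecessors_per_cell


if __name__ == "__main__":
    # Illustrative sizes only: the cell count grows exponentially with k.
    for k in (2, 3, 5, 10):
        cells, preds = dp_lattice_cost(k, 100)
        print(f"k={k:2d}  cells={cells:.3e}  predecessors per cell={preds}")
```

Under these assumptions, adding a single sequence multiplies the number of lattice cells by n + 1, which is consistent with the interest in parallel and heuristic approaches once more than a handful of sequences are aligned.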

Data availability

All data generated or analyzed during this study are included in this published article and its supplementary information file.

Funding

Sergio H. Almanza-Ruiz is receiving a full-time scholarship for his graduate studies from the Mexican National Council for Science and Technology (CONACyT).

Author information

Corresponding author

Correspondence to Arturo Chavoya.

Ethics declarations

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (XLSX 67 kb)

About this article

Cite this article

Almanza-Ruiz, S.H., Chavoya, A. & Duran-Limon, H.A. Parallel protein multiple sequence alignment approaches: a systematic literature review. J Supercomput 79, 1201–1234 (2023). https://doi.org/10.1007/s11227-022-04697-9

  • DOI: https://doi.org/10.1007/s11227-022-04697-9