Using mixed mode programming to parallelize an indicator-based evolutionary algorithm for inferring multiobjective phylogenetic histories

  • Focus
Soft Computing

Abstract

Many problems in bioinformatics research involve optimizing time-consuming objective functions over exponentially growing search spaces. Modern parallel systems composed of clustered multicore multiprocessors offer an opportunity to address such difficult problems. A suitable paradigm for exploiting these systems lies in the combination of mixed mode programming and evolutionary computation. This research focuses on the reconstruction of multiobjective phylogenetic hypotheses by using an indicator-based evolutionary algorithm. To overcome the main sources of complexity in the problem, we propose a parallel adaptation of this algorithm based on master–worker principles. Experimental results on six real data sets show that the design efficiently exploits a shared–distributed memory hybrid system composed of 48 processing cores, with improved scalability in comparison with other parallel proposals. In addition, the inferred Pareto fronts attest to the relevance of the indicator-based design, showing significant solution quality under different multiobjective metrics and biological testing procedures.
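
As a rough illustration of what "indicator-based" means here, the following sketch computes fitness in the style of the general IBEA scheme of Zitzler and Künzli (2004). It is not the authors' implementation: the abstract does not say which quality indicator or parameters the paper uses, so the additive epsilon indicator, the scaling factor `kappa`, the function names, and the assumption that all objectives are minimized and already scaled to comparable ranges are choices made only for this example.

```python
import math

def eps_indicator(a, b):
    """Additive epsilon indicator I({a}, {b}) for minimized objectives:
    the smallest amount by which a must be shifted to weakly dominate b.
    It is <= 0 when a already weakly dominates b."""
    return max(ai - bi for ai, bi in zip(a, b))

def ibea_fitness(objs, kappa=0.05):
    """Fitness of each solution: sum over all other solutions j of
    -exp(-I({j}, {i}) / (c * kappa)), with c the largest |I| value.
    kappa is an assumed setting, not a value taken from the paper."""
    n = len(objs)
    ind = [[eps_indicator(objs[j], objs[i]) for j in range(n)] for i in range(n)]
    c = max(abs(v) for row in ind for v in row) or 1.0  # avoid division by zero
    return [sum(-math.exp(-ind[i][j] / (c * kappa)) for j in range(n) if j != i)
            for i in range(n)]

def select_survivors(objs, size, kappa=0.05):
    """Simplified environmental selection: repeatedly drop the solution with
    the worst fitness. Fitness is recomputed from scratch each round, whereas
    the original scheme updates it incrementally."""
    alive = list(range(len(objs)))
    while len(alive) > size:
        fit = ibea_fitness([objs[i] for i in alive], kappa)
        alive.pop(fit.index(min(fit)))
    return alive

# Hypothetical two-objective scores; the third point is dominated by both others.
scores = [(0.1, 0.9), (0.9, 0.1), (1.0, 1.0)]
print(select_survivors(scores, 2))
```

In a master–worker arrangement such as the one the abstract describes, the expensive step would presumably be scoring each candidate tree, which workers can do independently, while a ranking step like the one above needs the whole population and would naturally sit with the master. That division is an assumption of this sketch rather than a detail stated in the abstract.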

Acknowledgments

This work was partially funded by the Spanish Ministry of Economy and Competitiveness and the ERDF (European Regional Development Fund), under the contract TIN2012-30685 (BIO project). Sergio Santander-Jiménez was supported by the Grant FPU12/04101 from the Spanish Government. Currently, he is supported by the postdoc research grant ACCION-III-04 from the University of Extremadura, Spain.

Author information

Corresponding author

Correspondence to Sergio Santander-Jiménez.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by C. M. Vide and A. H. Dediu.

About this article

Cite this article

Santander-Jiménez, S., Vega-Rodríguez, M.A. Using mixed mode programming to parallelize an indicator-based evolutionary algorithm for inferring multiobjective phylogenetic histories. Soft Comput 21, 5601–5620 (2017). https://doi.org/10.1007/s00500-016-2219-6

  • DOI: https://doi.org/10.1007/s00500-016-2219-6
