Abstract
Many problems in bioinformatics research involve optimizing time-consuming objective functions over exponentially growing search spaces. Modern parallel systems composed of clustered multicore multiprocessors offer an opportunity to address such difficult problems. A suitable paradigm for exploiting these systems lies in the combination of mixed mode programming and evolutionary computation. This research focuses on the reconstruction of multiobjective phylogenetic hypotheses by using an indicator-based evolutionary algorithm. To overcome the main sources of complexity in the problem, we propose a parallel adaptation of this algorithm based on master–worker principles. Experimental results on six real data sets show that the design efficiently exploits a hybrid shared–distributed memory system composed of 48 processing cores, with improved scalability in comparison with other parallel proposals. In addition, the inferred Pareto fronts confirm the relevance of the indicator-based design, showing significant solution quality under different multiobjective metrics and biological testing procedures.
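To make the idea of indicator-based selection concrete, the following is a minimal sketch of how an indicator-based evolutionary algorithm can rank and prune a population. It is not the implementation evaluated in this work: the abstract does not state which quality indicator is used, so the sketch assumes the additive epsilon indicator and the fitness assignment described by Zitzler and Künzli (2004), with all objectives minimized and no objective normalization. The function names and the `kappa` value are illustrative assumptions.

```python
import math


def eps_indicator(a, b):
    """Additive epsilon indicator between two objective vectors (minimization).

    Smallest shift eps such that a - eps is no worse than b in every objective.
    """
    return max(ai - bi for ai, bi in zip(a, b))


def ibea_fitness(points, kappa=0.05):
    """Return (fitness, indicator matrix, scaling constant) for a population.

    kappa is an assumed scaling factor, not a value taken from the article.
    """
    n = len(points)
    ind = [[eps_indicator(points[i], points[j]) for j in range(n)]
           for i in range(n)]
    # Scale by the largest absolute indicator value (fall back to 1.0 if all are 0).
    c = max(abs(v) for row in ind for v in row) or 1.0
    fitness = [
        sum(-math.exp(-ind[j][i] / (c * kappa)) for j in range(n) if j != i)
        for i in range(n)
    ]
    return fitness, ind, c


def environmental_selection(points, size, kappa=0.05):
    """Repeatedly drop the worst individual until `size` individuals remain."""
    fitness, ind, c = ibea_fitness(points, kappa)
    alive = list(range(len(points)))
    while len(alive) > size:
        worst = min(alive, key=lambda i: fitness[i])
        alive.remove(worst)
        # Remove the dropped individual's contribution from the survivors.
        for i in alive:
            fitness[i] += math.exp(-ind[worst][i] / (c * kappa))
    return [points[i] for i in alive]


if __name__ == "__main__":
    # Hypothetical two-objective scores; (3, 3) is dominated by (2, 2).
    population = [(1, 4), (2, 2), (4, 1), (3, 3)]
    print(environmental_selection(population, 3))
```

In a master–worker setting, the expensive part is typically the evaluation of the objective functions for each candidate tree, which workers can carry out independently, while a selection step of this kind needs the whole population and would naturally run on the master. That division of labor is an assumption of this sketch, not a detail given in the abstract.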




Acknowledgments
This work was partially funded by the Spanish Ministry of Economy and Competitiveness and the ERDF (European Regional Development Fund), under the contract TIN2012-30685 (BIO project). Sergio Santander-Jiménez was supported by the Grant FPU12/04101 from the Spanish Government. Currently, he is supported by the postdoc research grant ACCION-III-04 from the University of Extremadura, Spain.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by C. M. Vide and A. H. Dediu.
Cite this article
Santander-Jiménez, S., Vega-Rodríguez, M.A. Using mixed mode programming to parallelize an indicator-based evolutionary algorithm for inferring multiobjective phylogenetic histories. Soft Comput 21, 5601–5620 (2017). https://doi.org/10.1007/s00500-016-2219-6
DOI: https://doi.org/10.1007/s00500-016-2219-6