ABSTRACT
With advances in biology and computer science, the volume of available DNA sequence data has grown rapidly, giving rise to phylogenetic analyses of trees with many taxa. Maximum likelihood analysis is commonly considered the best approach to phylogenetic inference, but it is extremely computationally intensive. The availability of computing resources and the application of modern technologies are therefore key factors in determining whether such analyses are practical. This paper presents a parallel, GPU-accelerated implementation of maximum likelihood inference of phylogenetic trees, built on the DNAml program of the PHYLIP package. The improved DNAml program uses both GPU and CPU processing to perform the compute-intensive tasks in phylogenetic analysis. The evaluation results show a speedup of 2.94x for the GPU-accelerated DNAml program over the existing program. The results also show that the processing time saved relative to the existing program increases with the number of taxa.
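To illustrate why maximum likelihood inference lends itself to GPU acceleration, the following is a minimal Python sketch of the per-site likelihood computation (Felsenstein's pruning algorithm) on a toy three-taxon tree. It is not the authors' implementation and not DNAml code: the Jukes-Cantor (JC69) substitution model, the fixed tree shape `((a,b),c)`, the single shared branch length `t`, and all function names are assumptions chosen for brevity. The point it shows is that each alignment column is evaluated independently, so columns could in principle be assigned to separate GPU threads.

```python
import math

BASES = "ACGT"

def jc69_matrix(t):
    """4x4 Jukes-Cantor transition probabilities for branch length t (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def leaf_partial(base):
    """Conditional likelihood vector for an observed base at a leaf."""
    return [1.0 if b == base else 0.0 for b in BASES]

def combine(left, t_left, right, t_right):
    """One pruning step: conditional likelihoods of a parent from its two children."""
    pl, pr = jc69_matrix(t_left), jc69_matrix(t_right)
    return [
        sum(pl[x][y] * left[y] for y in range(4))
        * sum(pr[x][z] * right[z] for z in range(4))
        for x in range(4)
    ]

def site_likelihood(a, b, c, t=0.1):
    """Likelihood of one alignment column on the hypothetical toy tree ((a,b),c)."""
    inner = combine(leaf_partial(a), t, leaf_partial(b), t)
    root = combine(inner, t, leaf_partial(c), t)
    # Uniform base frequencies (0.25 each) at the root, as in JC69.
    return sum(0.25 * v for v in root)

def log_likelihood(seq_a, seq_b, seq_c, t=0.1):
    """Sum of per-site log-likelihoods; every column is independent of the others."""
    return sum(
        math.log(site_likelihood(a, b, c, t))
        for a, b, c in zip(seq_a, seq_b, seq_c)
    )
```

Because the loop in `log_likelihood` carries no dependency between columns, the work grows with both alignment length and the number of tree nodes, which is consistent with the abstract's observation that the benefit of acceleration increases with the number of taxa.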
Index Terms
- GPU Accelerated Maximum Likelihood Analysis for Phylogenetic Inference