Abstract
Inferences about the evolutionary history of a gene can indicate whether findings made for that gene in one species can be extrapolated to other species, including humans. They can also help explain morphological evolution or account for unexpected results in gene expression suppression experiments, among other uses. Because the large amount of sequence data already available is predicted to increase dramatically in the next few years, life science researchers need efficient, automated ways of analyzing it. Moreover, especially for divergent sequences, inferences can be affected by the choice of alignment and tree-building algorithms, so the same dataset should be analyzed in several ways, which reinforces this need. We therefore present auto-phylo, a simple pipeline maker for phylogenetic studies, and provide two examples of its utility. The first uses a small, already formatted sequence dataset (41 CDS) to determine, in an automated way, the impact of using different alignment and tree-building algorithms. The second involves the automated identification and processing of the sequences of interest, starting from 16550 bacterial CDS FASTA files downloaded from the NCBI Assembly RefSeq database, followed by alignment and tree-building inferences.
Acknowledgments
This research was financed by National Funds through FCT (Fundação para a Ciência e a Tecnologia, I.P.), under the project UIDB/04293/2020, and by the Consellería de Educación, Universidades e Formación Profesional (Xunta de Galicia), under the scope of the strategic funding ED431C2018/55-GRC and ED431C 2022/03-GRC Competitive Reference Group. PD is supported by a PhD fellowship from Fundação para a Ciência e a Tecnologia (SFRH/BD/145515/2019), co-financed by the European Social Fund through the Norte Portugal Regional Operational Programme (NORTE 2020).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
López-Fernández, H., Pinto, M., Vieira, C.P., Duque, P., Reboiro-Jato, M., Vieira, J. (2023). Auto-phylo: A Pipeline Maker for Phylogenetic Studies. In: Rocha, M., Fdez-Riverola, F., Mohamad, M.S., Gil-González, A.B. (eds) Practical Applications of Computational Biology and Bioinformatics, 17th International Conference (PACBB 2023). PACBB 2023. Lecture Notes in Networks and Systems, vol 743. Springer, Cham. https://doi.org/10.1007/978-3-031-38079-2_3
DOI: https://doi.org/10.1007/978-3-031-38079-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38078-5
Online ISBN: 978-3-031-38079-2
eBook Packages: Intelligent Technologies and Robotics (R0)