Abstract
This research investigates the use of genetic algorithms to solve problems from cladistics, a technique used by biologists to hypothesize the evolutionary relationships between organisms. Since exhaustive search is not practical in this domain, typical cladistics software packages use heuristic search methods to navigate the space of possible trees in an attempt to find one or more “best” solutions. We have developed a system called Gaphyl, which uses a genetic algorithm as the search technique for finding cladograms and takes its tree evaluation metric from a common cladistics software package (Phylip). On a nontrivial problem (49 species with 61 attributes), Gaphyl finds more of the best known trees than Phylip does, and with less computational effort; each additional tree corresponds to another equally plausible evolutionary hypothesis.
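To make the scale of the search problem concrete, the following is a minimal Python sketch, not Gaphyl's or Phylip's code. It counts the unrooted bifurcating trees on a given number of species using the standard (2n-5)!! formula, and scores a candidate tree with Fitch parsimony. Fitch parsimony is an assumption chosen for illustration: the abstract does not say which Phylip metric Gaphyl uses. The function names, the nested-tuple tree encoding, and the four-species character matrix are all hypothetical.

```python
from math import prod


def num_unrooted_trees(n_species: int) -> int:
    """Count unrooted bifurcating trees on n labeled species: (2n-5)!!."""
    if n_species < 3:
        return 1
    # Product of the odd numbers 3, 5, ..., 2n-5 (empty product is 1).
    return prod(range(3, 2 * n_species - 4, 2))


def fitch(tree, states):
    """Fitch pass for one character.

    `tree` is a species name (leaf) or a pair (left, right) of subtrees.
    Returns (possible ancestral states, number of state changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    # Disjoint state sets imply one extra change at this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony(tree, matrix):
    """Sum Fitch changes over all characters (lower is better)."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        column = {species: row[i] for species, row in matrix.items()}
        total += fitch(tree, column)[1]
    return total


if __name__ == "__main__":
    # Hypothetical toy data: 4 species, 2 binary attributes each.
    matrix = {"A": "00", "B": "01", "C": "11", "D": "11"}
    print(parsimony((("A", "B"), ("C", "D")), matrix))
    print(parsimony((("A", "C"), ("B", "D")), matrix))
    # Number of digits in the tree count for the 49-species problem.
    print(len(str(num_unrooted_trees(49))))
```

A genetic algorithm in this setting would treat each candidate tree as an individual and use a score such as `parsimony` as its fitness, so the search never needs to enumerate the full tree space.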
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Congdon, C.B. (2001). Gaphyl: A Genetic Algorithms Approach to Cladistics. In: De Raedt, L., Siebes, A. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2001. Lecture Notes in Computer Science, vol 2168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44794-6_6
DOI: https://doi.org/10.1007/3-540-44794-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42534-2
Online ISBN: 978-3-540-44794-8
eBook Packages: Springer Book Archive