Abstract
Constructing parsimonious phylogenetic trees from species data is a central problem in phylogenetics, with diverse applications even outside biology. Many variations of the problem, including the cladistic Camin-Sokal (CCS) version, are NP-complete. We present Answer Set Programming (ASP) models for the binary CCS problem and for a simpler perfect phylogeny version, along with experimental results from applying the models to biological data. Our contribution is three-fold. First, we solve phylogeny problems that have not previously been tackled with ASP. Second, we report on variants of our CCS model that significantly affect run time, including the interesting case of making the program “slightly tighter”. This variant exhibits some of the best performance, in contrast with a tight version of the model, which performed poorly. Third, we find proven-optimal solutions for larger instances of the CCS problem than the widely used branch-and-bound-based PHYLIP package does.
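To make the simpler perfect phylogeny variant concrete, the following is a minimal sketch in Python, not the ASP model presented in the paper. It assumes binary characters with ancestral state 0 and changes only from 0 to 1 (the Camin-Sokal assumption), and applies the standard pairwise test: such data admit a perfect phylogeny exactly when, for every pair of characters, the sets of species showing state 1 are either disjoint or nested. The function names and the example matrices are hypothetical, chosen for illustration only.

```python
from itertools import combinations


def ones(matrix, j):
    """Set of species (row indices) showing the derived state 1 for character j."""
    return {i for i, row in enumerate(matrix) if row[j] == 1}


def admits_directed_perfect_phylogeny(matrix):
    """Pairwise test, assuming an all-zero ancestor and 0 -> 1 changes only.

    Returns True when every pair of characters has 1-sets that are
    either disjoint or nested (one contained in the other).
    """
    if not matrix:
        return True
    one_sets = [ones(matrix, j) for j in range(len(matrix[0]))]
    for a, b in combinations(one_sets, 2):
        overlapping = bool(a & b)
        nested = a <= b or b <= a
        if overlapping and not nested:
            return False
    return True


# Hypothetical data: rows are species, columns are binary characters.
compatible = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
conflicting = [
    [1, 1],
    [1, 0],
    [0, 1],
]

print(admits_directed_perfect_phylogeny(compatible))
print(admits_directed_perfect_phylogeny(conflicting))
```

When the test fails, no tree explains the data with one change per character, and the CCS problem asks instead for a tree that minimizes the total number of 0 to 1 changes. That optimization is the NP-complete part and is not attempted in this sketch.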
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kavanagh, J., Mitchell, D., Ternovska, E., Maňuch, J., Zhao, X., Gupta, A. (2006). Constructing Camin-Sokal Phylogenies Via Answer Set Programming. In: Hermann, M., Voronkov, A. (eds) Logic for Programming, Artificial Intelligence, and Reasoning. LPAR 2006. Lecture Notes in Computer Science, vol 4246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11916277_31
DOI: https://doi.org/10.1007/11916277_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-48281-9
Online ISBN: 978-3-540-48282-6
eBook Packages: Computer Science (R0)