Abstract
We develop techniques to calculate important measures in evolutionary biology by encoding them as CNF formulas and solving these with powerful SAT solvers. Comparing evolutionary trees is a necessary step in tree reconstruction algorithms, in locating recombination and lateral gene transfer, and in analyzing and visualizing sets of trees. We focus on two popular comparison measures for trees: the hybridization number and the rooted subtree-prune-and-regraft (rSPR) distance. Both have recently been shown to be NP-hard, so efficient algorithms are needed to compute and approximate them. We encode each measure as a Boolean formula such that two trees have hybridization number k (or rSPR distance k) if and only if the corresponding formula is satisfiable, and we use state-of-the-art SAT solvers to determine whether that formula has a satisfying assignment. Our encoding also provides a rich source of real-world SAT instances, and we include a comparison of several recent solvers (minisat, adaptg2wsat, novelty+p, Walksat, March KS and SATzilla).
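The general workflow described in the abstract (build a CNF formula for a candidate value k, then ask a SAT solver whether it is satisfiable) can be sketched as follows. This is a minimal illustration and not the paper's encoding: `toy_encode` is a hypothetical stand-in (a small pigeonhole formula that is satisfiable exactly when k is at least 3), `dpll` is a tiny textbook solver used in place of the solvers named above, and `smallest_k` assumes, for illustration only, that the first satisfiable k is the value of interest. The `to_dimacs` helper writes the standard DIMACS CNF text format that SAT solvers commonly read.

```python
from itertools import combinations


def dpll(clauses, assignment=None):
    """Return a satisfying assignment (dict var -> bool) or None if unsatisfiable."""
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        satisfied = False
        remaining = []
        for lit in clause:
            var, want = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == want:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None  # every literal is false: conflict
        simplified.append(remaining)
    if not simplified:
        return assignment  # all clauses satisfied
    # Branch on a unit-clause literal if one exists, otherwise on the first literal.
    lit = next((c[0] for c in simplified if len(c) == 1), simplified[0][0])
    for value in (lit > 0, not (lit > 0)):
        result = dpll(simplified, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None


def toy_encode(k, pigeons=3):
    """Hypothetical stand-in encoding: put `pigeons` pigeons into k holes.

    Satisfiable exactly when k >= pigeons. It has nothing to do with trees;
    it only plays the role of "the formula for candidate value k".
    """
    def var(i, j):
        return i * k + j + 1  # DIMACS variables are numbered from 1

    clauses = [[var(i, j) for j in range(k)] for i in range(pigeons)]
    for j in range(k):
        for a, b in combinations(range(pigeons), 2):
            clauses.append([-var(a, j), -var(b, j)])  # no two pigeons share hole j
    return clauses


def smallest_k(encode, k_max):
    """Try k = 0, 1, ..., k_max and return the first k whose formula is satisfiable."""
    for k in range(k_max + 1):
        if dpll(encode(k)) is not None:
            return k
    return None


def to_dimacs(clauses):
    """Serialize clauses in DIMACS CNF format for use with an external solver."""
    num_vars = max((abs(lit) for clause in clauses for lit in clause), default=0)
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    lines += [" ".join(map(str, clause)) + " 0" for clause in clauses]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(smallest_k(toy_encode, 5))
    print(to_dimacs(toy_encode(2)), end="")
```

For instances of realistic size, the text produced by `to_dimacs` would be handed to a dedicated solver rather than to the toy `dpll` routine above.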
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bonet, M.L., John, K.S. (2009). Efficiently Calculating Evolutionary Tree Measures Using SAT. In: Kullmann, O. (ed.) Theory and Applications of Satisfiability Testing - SAT 2009. SAT 2009. Lecture Notes in Computer Science, vol 5584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02777-2_3
DOI: https://doi.org/10.1007/978-3-642-02777-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02776-5
Online ISBN: 978-3-642-02777-2
eBook Packages: Computer Science (R0)