Abstract
Haplotyping, also known as haplotype phase prediction, is the problem of predicting likely haplotypes from genotype data. One fast haplotyping method is based on an evolutionary model in which a perfect phylogenetic tree is sought that explains the observed data. Unfortunately, when data entries are missing, as is often the case in laboratory data, the resulting incomplete perfect phylogeny haplotyping problem (IPPH) is NP-complete, and no theoretical results are known concerning its approximability, its fixed-parameter tractability, or exact algorithms for it. Even radically simplified versions, such as the restriction to phylogenetic trees consisting of just two directed paths from a given root, are still NP-complete, although a fixed-parameter algorithm is known for this restricted case. We show that such drastic and ad hoc simplifications are not necessary to make IPPH fixed-parameter tractable: we present the first theoretical analysis of an algorithm, which we develop in the course of the paper, that works for arbitrary instances of IPPH. On the negative side, we show that restricting the topology of perfect phylogenies does not always reduce the computational complexity: while the incomplete directed perfect phylogeny problem is well known to be solvable in polynomial time, we show that the same problem restricted to path topologies is NP-complete.
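To make the problem statement concrete, the following minimal Python sketch illustrates the two ingredients named in the abstract. It is not the algorithm developed in the paper. The encoding is an assumption chosen for illustration: haplotype entries are 0 or 1, genotype entries are 0 or 1 (homozygous), 2 (heterozygous), or "?" (missing). The function names `explains` and `admits_perfect_phylogeny` are hypothetical. The second function uses the standard four-gamete test, which characterizes when a complete binary haplotype matrix admits an (undirected) perfect phylogeny; it does not handle missing entries.

```python
from itertools import combinations

MISSING = "?"  # assumed marker for a missing genotype entry


def explains(h1, h2, genotype):
    """Return True if the haplotype pair (h1, h2) is consistent with genotype.

    Missing entries impose no constraint, which is what makes the
    incomplete variant of the problem harder to solve.
    """
    for a, b, g in zip(h1, h2, genotype):
        if g == MISSING:
            continue  # any pair of values is allowed at this site
        if g == 2:
            if a == b:  # heterozygous site needs two different values
                return False
        elif not (a == b == g):  # homozygous site needs both equal to g
            return False
    return True


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on a complete binary haplotype matrix.

    The matrix admits a perfect phylogeny exactly when no pair of
    sites (columns) shows all four combinations 00, 01, 10, 11.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True
```

In this sketch, solving an IPPH instance would mean choosing, for every genotype, a pair of haplotypes accepted by `explains` such that the full set of chosen haplotypes passes `admits_perfect_phylogeny`. Enumerating all such choices takes exponential time in general, which is consistent with the NP-completeness stated above.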
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Elberfeld, M., Schnoor, I., Tantau, T. (2009). Influence of Tree Topology Restrictions on the Complexity of Haplotyping with Missing Data. In: Chen, J., Cooper, S.B. (eds) Theory and Applications of Models of Computation. TAMC 2009. Lecture Notes in Computer Science, vol 5532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02017-9_23
DOI: https://doi.org/10.1007/978-3-642-02017-9_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02016-2
Online ISBN: 978-3-642-02017-9
eBook Packages: Computer Science, Computer Science (R0)