Abstract
We address the problem of reconstructing haplotypes in a population, given a sample of genotypes and assumptions about the underlying population. The problem is of major interest in genetics because haplotypes are more informative than genotypes when searching for trait genes, but haplotypes are difficult to obtain directly by sequencing. After showing that simple resolution-based inference can be badly wrong in some natural types of population, we propose a different combinatorial approach that exploits intersections of sampled genotypes (considered as sets of candidate haplotypes). For populations with perfect phylogeny we obtain an inference algorithm which is both sound and efficient. With high probability, it yields the complete set of haplotypes showing up in the sample, for a sample size close to the trivial lower bound. The perfect phylogeny assumption is often justified, but we also believe that the ideas can be extended to populations obeying relaxed structural assumptions. The ideas are quite different from those of other existing practical algorithms for the problem.
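To make the phrase "genotypes considered as sets of candidate haplotypes" concrete, the following minimal Python sketch may help. It is an illustration only, not the algorithm of the paper. It assumes a common encoding that the abstract itself does not specify: a haplotype is a string over 0/1, and a genotype is a string over 0/1/2, where 2 marks a heterozygous site. The function names are hypothetical, and the enumeration is exponential in the number of heterozygous sites, so it is suitable only for tiny examples.

```python
from itertools import product


def candidate_haplotypes(genotype):
    """Return every haplotype consistent with the genotype.

    Assumed encoding (not taken from the paper): '0' and '1' are
    homozygous sites, '2' is a heterozygous site that may be 0 or 1.
    """
    options = [("0", "1") if site == "2" else (site,) for site in genotype]
    return {"".join(choice) for choice in product(*options)}


def complement(genotype, haplotype):
    """Return the partner haplotype that, paired with `haplotype`,
    explains `genotype`: flip the allele at heterozygous sites and
    copy it at homozygous sites."""
    flip = {"0": "1", "1": "0"}
    return "".join(
        flip[h] if g == "2" else h for g, h in zip(genotype, haplotype)
    )


def common_candidates(g1, g2):
    """Intersect the candidate sets of two genotypes: the haplotypes
    that the two sampled individuals could share."""
    return candidate_haplotypes(g1) & candidate_haplotypes(g2)


if __name__ == "__main__":
    g1, g2 = "0212", "2010"
    shared = common_candidates(g1, g2)
    print(shared)
    for h in shared:
        # A shared candidate determines a partner haplotype in each genotype.
        print(h, complement(g1, h), complement(g2, h))
```

In this toy case the two candidate sets intersect in a single haplotype, which then fixes the partner haplotype in each genotype. When the intersection contains several candidates, additional assumptions about the population, such as perfect phylogeny, are needed to decide between them.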
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Damaschke, P. (2003). Fast Perfect Phylogeny Haplotype Inference. In: Lingas, A., Nilsson, B.J. (eds) Fundamentals of Computation Theory. FCT 2003. Lecture Notes in Computer Science, vol 2751. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45077-1_18
DOI: https://doi.org/10.1007/978-3-540-45077-1_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40543-6
Online ISBN: 978-3-540-45077-1