Discrete Mathematics

Volume 309, Issue 18, 28 September 2009, Pages 5610-5617

On the complexity of SNP block partitioning under the perfect phylogeny model

https://doi.org/10.1016/j.disc.2008.04.002
Under an Elsevier user license
open archive

Abstract

Recent technologies for typing single nucleotide polymorphisms (SNPs) across a population are producing genome-wide genotype data for tens of thousands of SNP sites. The emergence of such large data sets underscores the importance of algorithms for large-scale haplotyping. Common haplotyping approaches first partition the SNPs into blocks of high linkage disequilibrium, and then infer haplotypes for each block separately. We investigate an integrated haplotyping approach in which a partition of the SNPs into a minimum number of non-contiguous subsets is sought, such that each subset can be haplotyped under the perfect phylogeny model. We show that finding an optimum partition is NP-hard even if we are guaranteed that two subsets suffice. On the positive side, we show that a variant of the problem, in which each subset is required to admit a perfect path phylogeny haplotyping, is solvable in polynomial time.
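
As a rough illustration of the perfect phylogeny condition mentioned above, the sketch below applies the classical four-gamete test to already phased, binary haplotype data: a set of SNP sites fits an (unrooted) perfect phylogeny exactly when no pair of sites exhibits all four combinations 00, 01, 10 and 11. This is a simplification and not the method of the paper, which works with unphased genotypes; the function name and the input layout (rows are haplotypes, columns are SNP sites) are assumptions made for the example.

```python
from itertools import combinations


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on a binary haplotype matrix.

    `haplotypes` is a list of equal-length rows with entries 0 or 1
    (hypothetical layout: one row per haplotype, one column per SNP site).
    Returns True if no pair of sites shows all four gametes.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct (site i, site j) value pairs over all rows.
        gametes = {(row[i], row[j]) for row in haplotypes}
        if len(gametes) == 4:
            # Sites i and j conflict: 00, 01, 10 and 11 all occur.
            return False
    return True
```

For instance, the three haplotypes `[0, 0]`, `[0, 1]`, `[1, 0]` pass the test, while adding `[1, 1]` introduces the fourth gamete and makes the two sites incompatible, so they would have to be placed in different subsets of a partition.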

Keywords

Perfect phylogeny haplotyping
Perfect path phylogeny
Partitioning problems

A preliminary version of this paper appeared in [Jens Gramm, Tzvika Hartman, Till Nierhoff, Roded Sharan, Till Tantau, On the complexity of SNP block partitioning under the perfect phylogeny model, in: Proc. of the sixth Workshop on Algorithms in Bioinformatics, WABI’06, 2006, pp. 92–102].