Article

Haplotypes and informative SNP selection algorithms: don't block out information

Authors:
Vineet Bafna, The Center for Advancement of Genomics, Rockville, MD
Bjarni V. Halldorsson, Applied Biosystems, Rockville, MD
Russell Schwartz, Carnegie Mellon University, Pittsburgh, PA
Andrew G. Clark, Cornell University, Ithaca, NY
Sorin Istrail, Applied Biosystems, Rockville, MD

RECOMB '03: Proceedings of the seventh annual international conference on Research in computational molecular biology, April 2003, Pages 19–27. https://doi.org/10.1145/640075.640078

Published: 10 April 2003

ABSTRACT

It is widely hoped that variation in the human genome will provide a means of predicting risk of a variety of complex, chronic diseases. A major stumbling block to the successful identification of associations between human DNA polymorphisms (SNPs) and variability in risk of complex diseases is the enormous number of SNPs in the human genome (4, 9). The large number of SNPs results in unacceptably high costs for exhaustive genotyping, and so there is a broad effort to determine ways to select SNPs so as to maximize the informativeness of a subset.

In this paper we contrast two methods for reducing the complexity of SNP variation: haplotype tagging, i.e., typing a subset of SNPs to identify segments of the genome that appear to be nearly unrecombined (haplotype blocks), and a new block-free model that we develop in this report. We present a statistic for comparing haplotype blocks and show that, while the concept of haplotype blocks is reasonably robust, there is substantial variability among block partitions. We develop a measure for selecting an informative subset of SNPs in a block-free model. We show that the general version of this problem is NP-hard and give efficient algorithms for two important special cases.
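To make the idea of typing only a subset of SNPs concrete, here is a minimal sketch of one common formalization of tag SNP selection: pick SNP positions so that every pair of distinct haplotypes in a sample differs at some chosen position, in the spirit of the test cover problem cited in the references. This is an illustration only and is not the informativeness measure or the algorithms developed in the paper. The function name `greedy_tag_snps`, the 0/1 string encoding, and the sample haplotypes are all assumptions made for the example.

```python
from itertools import combinations


def greedy_tag_snps(haplotypes):
    """Greedily pick SNP positions that tell apart every pair of distinct haplotypes.

    `haplotypes` is a list of equal-length strings over {'0', '1'}, one
    character per SNP (a hypothetical encoding chosen for this sketch).
    Returns the chosen SNP indices in increasing order.
    """
    distinct = sorted(set(haplotypes))
    if not distinct:
        return []
    n_snps = len(distinct[0])
    if any(len(h) != n_snps for h in distinct):
        raise ValueError("all haplotypes must cover the same SNPs")

    # Pairs of distinct haplotypes that no chosen SNP separates yet.
    unresolved = set(combinations(range(len(distinct)), 2))
    chosen = []
    while unresolved:
        def separated(snp):
            # Unresolved pairs whose alleles differ at this SNP.
            return {(a, b) for a, b in unresolved
                    if distinct[a][snp] != distinct[b][snp]}

        # Greedy step: take the SNP separating the most unresolved pairs.
        # Distinct haplotypes always differ somewhere, so progress is guaranteed.
        best = max(range(n_snps), key=lambda s: len(separated(s)))
        unresolved -= separated(best)
        chosen.append(best)
    return sorted(chosen)


# Made-up sample: SNPs 0 and 1 are perfectly correlated, as are SNPs 2 and 3.
sample = ["0000", "0011", "1100", "1111"]
print(greedy_tag_snps(sample))  # [0, 2]
```

In this toy sample, typing one SNP from each correlated pair is enough to recover all four haplotypes. The greedy rule is a heuristic and does not guarantee a minimum-size subset.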

References

Goncalo R. Abecasis, Stacey S. Cherny, William O. Cookson, and Lon R. Cardon. Merlin - rapid analysis of dense genetic maps using sparse gene flow trees. Nature Genetics, 30:97–101, 2002.
Hadar I. Avi-Itzhak, Xiaoping Su, and Francisco M. De La Vega. Selection of minimum subsets of single nucleotide polymorphism to capture haplotype block diversity. In Proceedings of Pacific Symposium on Biocomputing, pages 466–477, 2003.
V. Bafna, D. Gusfield, G. Lancia, and S. Yooseph. Haplotyping as a perfect phylogeny: a direct approach. Journal of Computational Biology, 2003. To appear.
K.M.J. De Bontridder, B.V. Halldorsson, M.M. Halldorsson, C.A.J. Hurkens, J.K. Lenstra, R. Ravi, and L. Stougie. Approximation algorithms for the minimum test cover problem. Mathematical Programming-B, 2003. To appear.
K.M.J. De Bontridder, B.J. Lageweg, J.K. Lenstra, J.B. Orlin, and L. Stougie. Branch and bound algorithms for the test cover problem. In Proceedings of the 10th Annual European Symposium on Algorithms (ESA), pages 223–233, 2002.
D. Clayton. Choosing a set of haplotype tagging SNPs from a larger set of diallelic loci. www.nature.com/ng/journal/v29/n2/extref/ng1001-233-S10.pdf, 2001.
D. Gusfield. Algorithms on Strings, Trees, and Sequences. Cambridge University Press, 1997.
M.J. Daly, J.D. Rioux, S.F. Schaffner, T.J. Hudson, and E.S. Lander. High-resolution haplotype structure in the human genome. Nature Genetics, 29:229–232, 2001.
B. Devlin and N. Risch. A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics, 29:311–322, 1995.
D.E. Reich et al. Linkage disequilibrium in the human genome. Nature, 2001.
S.B. Gabriel, S.F. Schaffner, H. Nguyen, J.M. Moore, J. Roy, B. Blumenstiel, J. Higgins, M. DeFelice, A. Lochner, M. Faggart, S.N. Liu-Cordero, C. Rotimi, A. Adeyemo, R. Cooper, R. Ward, E.S. Lander, M.J. Daly, and D. Altshuler. The structure of haplotype blocks in the human genome. Science, 296:2225–2229, 2002.
B.V. Halldorsson, M.M. Halldorsson, and R. Ravi. On the approximability of the test collection problem. In Proceedings of the 9th Annual European Symposium on Algorithms (ESA), pages 158–169, 2001.
D.S. Hirschberg. A linear space algorithm for computing maximal common subsequence. Communications of the ACM, 18:341–343, 1975.
R.R. Hudson and N.L. Kaplan. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics, 111:147–164, 1985.
A.J. Jeffreys, L. Kauppi, and R. Neumann. Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nature Genetics, 29:217–222, 2001.
R. Judson, B. Salisbury, J. Schneider, A. Windemuth, and J.C. Stephens. How many SNPs does a genome-wide haplotype map require? Pharmacogenomics, 3:379–391, 2002.
L. Kruglyak. Prospects for whole-genome linkage mapping of common disease genes. Nature Genetics, 22:139–144, 1999.
G. Lancia, V. Bafna, S. Istrail, R. Lippert, and R. Schwartz. SNPs problems, complexity and algorithms. In Proceedings of the 9th Annual European Symposium on Algorithms (ESA), pages 182–193, 2001.
R. Lippert, R. Schwartz, G. Lancia, and S. Istrail. Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem. Briefings in Bioinformatics, 3(1):23–31, 2002.
D.A. Nickerson, S.L. Taylor, S.M. Fullerton, K.M. Weiss, A.G. Clark, J.H. Stengaard, V. Salomaa, E. Boerwinkle, and C.F. Sing. Sequence diversity and large-scale typing of SNPs in the human apolipoprotein E gene. Genome Research, 10:1532–1545, 2000.
N. Patil et al. Blocks of limited haplotype diversity revealed by high resolution scanning of human chromosome 21. Science, 294:1719–1722, 2001.
R. Rizzi, V. Bafna, S. Istrail, and G. Lancia. Practical algorithms for the single individual SNP haplotyping problem. In Workshop on Algorithms in Bioinformatics, pages 29–43, 2002.
F.M. De La Vega, X. Su, H. Avi-Itzhak, B.V. Halldorsson, D. Gordon, A. Collins, R.A. Lippert, R. Schwartz, C. Scafe, Y. Wang, M. Laig-Webster, R.T. Koehler, J. Ziegle, L. Wogan, J.F. Stevens, K.M. Leinen, S.J. Olson, K.J. Guegler, X. You, L. Xu, H.G. Hemken, F. Kalush, A.G. Clark, S. Istrail, M.W. Hunkapiller, E.G. Spier, and D.A. Gilbert. The profile of linkage disequilibrium across human chromosomes 6, 21, and 22 in African-American and Caucasian populations. In preparation, 2003.
K. Weiss and A. Clark. Linkage disequilibrium and the mapping of complex human traits. Trends in Genetics, 18(1):19–24, 2002.
K. Zhang, M. Deng, T. Chen, M.S. Waterman, and F. Sun. A dynamic programming algorithm for haplotype block partitioning. Proceedings of the National Academy of Sciences, 99(11):7335–7339, 2002.

Published in
RECOMB '03: Proceedings of the seventh annual international conference on Research in computational molecular biology
April 2003
352 pages
ISBN: 1581136358
DOI: 10.1145/640075
Editors:
Martin Vingron, Max-Planck-Institute for Molecular Genetics, Germany
Sorin Istrail, Celera Genomics/Applied Biosystems
Pavel Pevzner, University of California at San Diego, CA
Michael Waterman, University of Southern California, CA
Program Chair:
Webb Miller, The Pennsylvania State University
Copyright © 2003 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
SNPs
haplotype blocks
haplotype tagging
Qualifiers
- Article
Conference

Acceptance Rates
RECOMB '03 Paper Acceptance Rate: 35 of 175 submissions, 20%
Overall Acceptance Rate: 148 of 538 submissions, 28%