skip to main content
10.1145/640075.640078acmconferencesArticle/Chapter ViewAbstractPublication PagesrecombConference Proceedingsconference-collections
Article

Haplotypes and informative SNP selection algorithms: don't block out information

Published:10 April 2003Publication History

ABSTRACT

It is widely hoped that variation in the human genome will provide a means of predicting risk of a variety of complex, chronic diseases. A major stumbling block to the successful identification of association between human DNA polymorphisms (SNPs) and variability in risk of complex diseases is the enormous number of SNPs in the human genome (4,9). The large number of SNPs results in unacceptably high costs for exhaustive genotyping, and so there is a broad effort to determine ways to select SNPs so as to maximize the informativeness of a subset.In this paper we contrast two methods for reducing the complexity of SNP variation: haplotype tagging, i.e. typing a subset of SNPs to identify segments of the genome that appear to be nearly unrecombined (haplotype blocks), and a new block-free model that we develop in this report. We present a statistic for comparing haplotype blocks and show that while the concept of haplotype blocks is reasonably robust there is substantial variability among block partitions. We develop a measure for selecting an informative subset of SNPs in a block free model. We show that the general version of this problem is NP-hard and give efficient algorithms for two important special cases of this problem.

References

  1. Goncalo R. Abecasis, Stacey S. Cherny, William O. Cookson, and Lon R. Cardon. Merlin - rapid analysis of dense genetic maps using sparse gene flow trees. Nature Genetics, 30:97--101, 2002.]]Google ScholarGoogle ScholarCross RefCross Ref
  2. Hadar I. Avi-Itzhak, Xiaoping Su, and Francisco M. De La Vega. Selection of minimum subsets of single nucleotide polymorphism to capture haplotype block diversity. In Proceedings of Pacific Symposium on Biocomputing, pages 466--477, 2003.]]Google ScholarGoogle Scholar
  3. V. Bafna, D. Gusfield, G. Lancia, and S. Yooseph. Haplotyping as a perfect phylogeny. a direct approach. Journal of Computational Biology, 2003. To appear.]]Google ScholarGoogle ScholarCross RefCross Ref
  4. K.M.J. De Bontridder, B.V. Halldorsson, M.M. Halldorsson, C.A.J. Hurkens, J.K. Lenstra, R. Ravi, and L. Stougie. Approximation algorithms for the minimum test cover problem. Mathematical Programming-B, 2003. To Appear.]]Google ScholarGoogle Scholar
  5. K.M.J. De Bontridder, B.J. Lageweg, J.K. Lenstra, J.B. Orlin, and L. Stougie. Branch and bound algorithms for the test cover problem. In Proceedings of the 10th Annual European Symposium on Algorithms (ESA), pages 223--233, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. D. Clayton. Choosing a set of haplotype tagging SNPs from a larger set of diallelic loci. www.nature.com/ng/journal/v29/n2/extref/ng1001-233-S10.pdf, 2001.]]Google ScholarGoogle Scholar
  7. Gusfield D. Algorithms on Strings, Trees, and Sequences. Cambridge University Press, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M.J. Daly, J.D. Rioux, S.F. Schaffner, T.J. Hudson, and E. S. Lander. High-resolution haplotype structure in the human genome. Nature Genetics, 29:229--232, 2001.]]Google ScholarGoogle ScholarCross RefCross Ref
  9. B. Devlin and N. Risch. A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics, 29:311--322, 1995.]]Google ScholarGoogle ScholarCross RefCross Ref
  10. D. E. Reich et al. Linkage disequiblirium in the human genome. Nature, 2001.]]Google ScholarGoogle Scholar
  11. S.B. Gabriel, S.F. Schaffner, H. Nguyen, J.M. Moore, J. Roy, B. Blumenstiel, J. Higgins, M. DeFelice, A. Lochner, M. Faggart, S.N. Liu-Cordero, C. Rotimi, A. Adeyemo, R. Cooper, R. Ward, E.S. Lander, M.J. Daly, and D. Altschuler. The structure of haplotype blocks in the human genome. Science, 296:2225--2229, 2002.]]Google ScholarGoogle ScholarCross RefCross Ref
  12. B.V. Halldorsson, M.M. Halldorsson, and R. Ravi. On the approximability of the test collection problem. In Proceedings of the 9th Annual European Symposium on Algorithms (ESA), pages 158--169, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. D. S. Hirschberg. A linear space algorithm for computing maximal common subsequence. Communications of the ACM, 18:341--343, 1975.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. R.R. Hudson and N.L. Kaplan. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics, 111:147--164, 1985.]]Google ScholarGoogle ScholarCross RefCross Ref
  15. A.J. Jeffreys, L. Kauppi, and R. Neumann. Intensely punctute meiotic recombination in the class II region of the major histocompatibility complex. Nature Genetics, 29:217--222, 2001.]]Google ScholarGoogle ScholarCross RefCross Ref
  16. R. Judson, B. Salisbury, J. Schneider, A. Windemuth, and J. C. Stephens. How many SNPs does a genome-wide haplotype map require? Pharmacogenomics, 3:379--391, 2002.]]Google ScholarGoogle ScholarCross RefCross Ref
  17. L. Kruglyak. Prospects for whole-genome linkage mapping of common disease genes. Nature Genetics, 22:139--144, 1999.]]Google ScholarGoogle ScholarCross RefCross Ref
  18. G. Lancia, V. Bafna, S. Istrail, R. Lippert, and R. Schwartz. SNPs problems, complexity and algorithms. In Proceedings of the 9th Annual European Symposium on Algorithms (ESA), pages 182--193, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. R. Lippert, R. Schwartz, G. Lancia, and S. Istrail. Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem. Briefings in Bioinformatics, 3(1):23--31, 2002.]]Google ScholarGoogle ScholarCross RefCross Ref
  20. D. A. Nickerson, S. L. Taylor, S. M. Fullerton, K. M. Weiss, A. G. Clark, J. H. Stengaard, V. Salomaa, E. Boerwinkle, and C. F. Sing. Sequence diversity and large-scale typing of SNPs in the human apolipoprotein E gene. Genome Research, 10:1532--1545, 2000.]]Google ScholarGoogle ScholarCross RefCross Ref
  21. N. Patil et al. Blocks of limited haplotype diversity revealed by high resolution scanning of human chromosome 21. Science, 294:1719--1722, 2001.]]Google ScholarGoogle ScholarCross RefCross Ref
  22. R. Rizzi, V. Bafna, S. Istrail, and G. Lancia. Practical algorithms for the single individual SNP haplotyping problem. In Workshop on Algorithms in Bioinformatics, pages 29--43, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. F. M. De La Vega, X. Su, H. Avi-Itzhak, B. V. Halldorsson, D. Gordon, A. Collins, R. A. Lippert, R. Schwartz, C. Scafe, Y. Wang, M. Laig-Webster, R. T. Koehler, J. Ziegle, L. Wogan, J.F. Stevens, K.M. Leinen, S.J. Olson, K.J. Guegler, X. You, L. Xu., H.G. Hemken, F. Kalush, A. G. Clark, S. Istrail, M. W. Hunkapiller, E. G. Spier, and D. A. Gilbert. The profile of linkage disequilibrium across human chromosomes 6, 21, and 22 in African-American and Caucasian populations. In preparation, 2003.]]Google ScholarGoogle Scholar
  24. K. Weiss and A. Clark. Linkage diseuilibrium and the mapping of comples human traits. Trends in Genetics, 18(1):19--24, 2002.]]Google ScholarGoogle ScholarCross RefCross Ref
  25. K. Zhang, M. Deng, T. Chen, M.S. Waterman, and F. Sun. A dynamic programming algorithm for haplotype block partitioning. Proceedings of the National Academy of Sciences, 99(11):7335--7339, 2002.]]Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Haplotypes and informative SNP selection algorithms: don't block out information

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            RECOMB '03: Proceedings of the seventh annual international conference on Research in computational molecular biology
            April 2003
            352 pages
            ISBN:1581136358
            DOI:10.1145/640075

            Copyright © 2003 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 10 April 2003

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • Article

            Acceptance Rates

            RECOMB '03 Paper Acceptance Rate35of175submissions,20%Overall Acceptance Rate148of538submissions,28%

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader