Skip to main content

On the Approximability of Some Haplotyping Problems

  • Conference paper
Algorithmic Aspects in Information and Management (AAIM 2009)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 5564))

Included in the following conference series:

Abstract

In this paper, we study several versions of optimization problems related to haplotype reconstruction/identification. The input to the first problem is a set C 1 of haplotypes, a set C 2 of haplotypes, and a set G of genotypes. The objective is to select the minimum number of haplotypes from C 2 so that together with haplotypes in C 1 they resolve all (or the maximum number of) genotypes in G. We show that this problem has a factor-O(logn) polynomial time approximation. We also show that this problem does not admit any approximation with a factor better than O(logn) unless P=NP. For the corresponding reconstruction problem, i.e., when C 2 is not given, the same approximability results hold.

The other versions of the haplotype identification problem are based on single individual haplotyping, including the well-known Minimum Fragment Removal (MFR) and Minimum SNP Removal (MSR), which have both shown to be APX-hard previously. We show in this paper that MFR has a polynomial time O(logn)-factor approximation. We also consider Maximum Fragment Identification (MFI), which is the complementary version of MFR; and Maximum SNP Identification (MSI), which is the complementary version of MSR. We show that, for any positive constant ε< 1, neither MFI nor MSI has a factor-n 1 − ε polynomial time approximation algorithm unless P=NP.

This research is partially supported by NSF Career Award 0845376.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Bafna, V., Istrail, S., Lancia, G., Rizzi, R.: Polynomial and APX-hard cases of the individual haplotyping problem. Theoretical Computer Science 335, 109–125 (2005)

    Article  MATH  MathSciNet  Google Scholar 

  2. Chvátal, V.: A greedy heuristic for the set-covering problem. Math. Oper. Res. 4, 233–235 (1979)

    Article  MATH  MathSciNet  Google Scholar 

  3. Chen, Z., Fu, B., Sweller, R., Yang, B., Zhao, Z., Zhu, B.: Linear probabilistic algorithms for the singular haplotype reconstruction problem from SNP fragments. J. Computational Biology 15, 535–546 (2008)

    Article  MathSciNet  Google Scholar 

  4. Cilibrasi, R., van Iersel, L., Kelk, S., Tromp, J.: The complexity of the single individual SNP haplotyping problem. Algorithmica 49, 13–36 (2007)

    Article  MATH  MathSciNet  Google Scholar 

  5. Clark, A.: Inference of haplotypes from PCR-amplified samples of diploid populations. Molecular Biology Evolution 7, 111–122 (1990)

    Google Scholar 

  6. Douglas, J., Boehnke, M., Gillanders, E., Trent, J., Gruber, S.: Experimentally-driven haplotypes substantially increase the efficiency of linkage disequillibrium studies. Nat. Genetics 28, 361–364 (2001)

    Article  Google Scholar 

  7. Duh, R.-c., Fürer, M.: Approximation of k-set cover by semi-local optimization. In: Proc. 29th ACM Symp. on Theory of Comput (STOC 1997), pp. 256–264 (1997)

    Google Scholar 

  8. Garg, N., Vazirani, V.V., Yannakakis, M.: Multiway cuts in directed and node weighted graphs. In: Shamir, E., Abiteboul, S. (eds.) ICALP 1994. LNCS, vol. 820, pp. 103–111. Springer, Heidelberg (1994)

    Google Scholar 

  9. Gusfield, D.: A practical algorithm for optimal inference of haplotype from diploid populations. In: ISMB 2000, pp. 183–189 (2000)

    Google Scholar 

  10. Gusfield, D.: Inference of haplotypes from samples of diploid populations: complexity and algorithms. J. Computational Biology 8, 305–323 (2001)

    Article  Google Scholar 

  11. Gusfield, D.: Haplotyping as perfect phylogeny: Conceptual framework and efficient solutions. In: RECOMB 2002, pp. 166–175 (2002)

    Google Scholar 

  12. Gusfield, D.: Haplotype inference by pure parsimony. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 144–155. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  13. Hästad, J.: Clique is hard to approximate within n 1 − ε. Acta Mathematica 182, 105–142 (1999)

    Article  MathSciNet  Google Scholar 

  14. Huang, Y.-T., Chao, K.-M., Chen, T.: An approximation algorithm for haplotype inference by maximum parsimony. J. Computational Biology 12, 1261–1274 (2005)

    Article  Google Scholar 

  15. Johnson, D.: Approximation algorithms for combinatorial problems. J. Comput. System Sci. 9, 256–278 (1974)

    Article  MATH  Google Scholar 

  16. Lancia, G., Pinotti, M.C., Rizzi, R.: Haplotyping populations by pure parsimony: complexity and algorithms. INFORMS Journal on computing 16, 348–359 (2004)

    Article  MATH  MathSciNet  Google Scholar 

  17. Lancia, G., Bafna, V., Istrail, S., Lippert, R., Schwartz, R.: SNPs Problems, Complexity and Algorithms. In: Meyer auf der Heide, F. (ed.) ESA 2001. LNCS, vol. 2161, pp. 182–193. Springer, Heidelberg (2001)

    Chapter  Google Scholar 

  18. Lancia, G., Rizzi, R.: A polynomial solution to a special case of the parsimony haplotyping problem. Operations Research letters 34, 289–295 (2006)

    Article  MATH  MathSciNet  Google Scholar 

  19. Lippert, R., Schwartz, R., Lancia, G., Istrail, S.: Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem. Briefings in bioinformatics 3, 23–31 (2002)

    Article  Google Scholar 

  20. Lóvasz, L.: On the ratio of optimal integral and fractional covers. Discrete Mathematics 13, 383–390 (1975)

    Article  MATH  MathSciNet  Google Scholar 

  21. Lund, C., Yannakakis, M.: On the hardness of approximating minimization problems. J. ACM 41, 960–981 (1994)

    Article  MATH  MathSciNet  Google Scholar 

  22. Panconesi, A., Sozio, M.: Fast Hare: A fast heuristic for single individual SNP haplotype reconstruction. In: Jonassen, I., Kim, J. (eds.) WABI 2004. LNCS (LNBI), vol. 3240, pp. 266–277. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  23. Raz, R., Safra, S.: A sub-constant error-probability low-degree test, and sub-constant error-probability PCP characterization of NP. In: Proc. 29th ACM Symp. on Theory of Comput. (STOC 1997), pp. 475–484 (1997)

    Google Scholar 

  24. Rizzi, R., Bafna, V., Istrail, S., Lancia, G.: Practical algorithms and fixed-parameter tractability for the single individual SNP haplotyping problem. In: Guigó, R., Gusfield, D. (eds.) WABI 2002. LNCS, vol. 2452, pp. 29–43. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  25. Wang, R.S., Wu, L.Y., Li, Z.P., Zhang, X.S.: Haplotype reconstruction from SNP fragments by minimum error correction. Bioinformatics 21, 2456–2462 (2005)

    Article  Google Scholar 

  26. Wang, L., Xu, Y.: Haplotype inference by maximum parsimony. Bioinformatics 19, 1773–1780 (2003)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Abraham, J., Chen, Z., Fowler, R., Fu, B., Zhu, B. (2009). On the Approximability of Some Haplotyping Problems. In: Goldberg, A.V., Zhou, Y. (eds) Algorithmic Aspects in Information and Management. AAIM 2009. Lecture Notes in Computer Science, vol 5564. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02158-9_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-02158-9_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02157-2

  • Online ISBN: 978-3-642-02158-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics