HIV-1 Drug Resistance Prediction and Therapy Optimization: A Case Study for the Application of Classification and Clustering Methods

  • Chapter
Similarity-Based Clustering

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5400)

Abstract

This chapter reviews the challenges that machine-learning specialists face when trying to assist virologists by automatically predicting the outcome of an HIV therapy.

Optimizing HIV therapies is crucial because the virus rapidly develops mutations to evade drug pressure. Modern anti-HIV regimens combine multiple drugs in order to prevent, or at least delay, the development of resistance mutations. In recent years, large databases have been collected to allow automatic analysis of the relations between the virus genome, other clinical and demographic information, and the failure or success of a therapy. The EuResist integrated database (EID), collected from about 18500 patients and 65000 different therapies, is probably one of the largest clinical genomic databases. Only one third of the therapies in the EID contain therapy response information, and only 5% of the therapy records have response information as well as genotypic data. This leads to two specific challenges: (a) semi-supervised learning, a setting where many samples are available but only a small proportion of them are labeled, and (b) missing data.
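To make the scale of the labeling problem concrete, the fractions quoted above can be turned into rough record counts. This is only back-of-the-envelope arithmetic: the abstract gives the totals as approximate ("about"), so the variable names and the resulting counts below are illustrative, not figures reported by the authors.

```python
# Approximate totals quoted in the abstract (stated there as "about").
therapies = 65_000

# "Only one third of the therapies ... contain therapy response information."
with_response = round(therapies / 3)

# "Only 5% of the therapy records have response information as well as genotypic data."
with_response_and_genotype = round(therapies * 0.05)

# Therapies without any response label: the unlabeled pool in a semi-supervised setting.
without_response = therapies - with_response

print(with_response, with_response_and_genotype, without_response)
```

Under these rounded figures, only a few thousand records are fully usable for supervised training on genotype, while tens of thousands carry no response label at all.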

We review a solution for the first setting: a novel dimensionality reduction framework that combines information-theoretic considerations with geometrical constraints over the simplex. The framework is formulated to find an optimal low-dimensional geometric embedding of the simplex that preserves pairwise distances. This similarity-based clustering solution was tested on toy data and textual data. We show that this solution, although it outperforms other methods and provides good results on a small sample of the EuResist data, is impractical for the large EuResist dataset. In addition, we review a generative-discriminative prediction system that successfully overcomes the missing-value challenge.
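The abstract does not spell out the framework itself, so the following is not the chapter's method. It is a minimal generic sketch of the underlying idea, embedding points of the probability simplex in a low-dimensional space while approximately preserving pairwise distances. As assumptions, it uses the square root of the Jensen-Shannon divergence as the distance and classical multidimensional scaling (MDS) as the embedding step; the function names and toy data are hypothetical.

```python
import numpy as np

def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence (log base 2)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute 0 by convention
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    # Clamp tiny negative values caused by floating-point rounding.
    return np.sqrt(max(0.5 * kl(p, m) + 0.5 * kl(q, m), 0.0))

def classical_mds(dist, dim=2):
    """Classical MDS: coordinates whose Euclidean distances approximate dist."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]            # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Toy data (illustrative only): four distributions over three outcomes.
points = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
])
n = len(points)
dist = np.array([[js_distance(points[i], points[j]) for j in range(n)]
                 for i in range(n)])
embedding = classical_mds(dist, dim=2)
print(embedding)
```

Classical MDS needs the full pairwise distance matrix and an eigendecomposition, so memory grows quadratically with the number of samples. That cost is a property of this sketch, not a statement about why the reviewed framework is impractical for the full EuResist dataset.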

Apart from a review of the EuResist project and related challenges, this chapter provides an overview of recent developments in the field of machine learning-based prediction methods for HIV drug resistance.



Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Rosen-Zvi, M., Aharoni, E., Selbig, J. (2009). HIV-1 Drug Resistance Prediction and Therapy Optimization: A Case Study for the Application of Classification and Clustering Methods. In: Biehl, M., Hammer, B., Verleysen, M., Villmann, T. (eds) Similarity-Based Clustering. Lecture Notes in Computer Science, vol 5400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01805-3_10

  • DOI: https://doi.org/10.1007/978-3-642-01805-3_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01804-6

  • Online ISBN: 978-3-642-01805-3

  • eBook Packages: Computer Science (R0)
