short-paper

Sequence-based prediction of HIV-1 coreceptor usage: utility of n-grams for representing gp120 V3 loops

Author:
Majid Masso

George Mason University, Manassas, Virginia

BCB '11: Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine, August 2011, Pages 309–314, https://doi.org/10.1145/2147805.2147841

Published: 01 August 2011

ABSTRACT

Human immunodeficiency virus type 1 (HIV-1) infects host cells that express both the CD4 surface membrane receptor, which binds the viral envelope glycoprotein gp120, and either the CCR5 (R5) or CXCR4 (X4) chemokine coreceptor, which principally interacts with the V3 loop region of gp120. Coreceptor selectivity, or tropism, depends on the sequence patterns encoding HIV-1 viral strains, and medications designed to bind and inhibit each coreceptor are currently on the market and in development. Because HIV-1 coreceptor usage must be determined before such a drug is administered, and because the experimental assays for doing so are costly and time-consuming, there is now considerable interest in applying machine learning algorithms directly to classify HIV-1 coreceptor usage from the V3 loop region of gp120. Here, for the first time, several n-gram approaches (an n-gram being a subsequence formed by a sliding window of size n) are described for representing as feature vectors two large datasets of V3 loop peptide sequences obtained from HIV-1 viruses with known coreceptor usage, and the random forest algorithm is implemented for classification. These datasets were previously retrieved and used to develop combined sequence-structure based classifiers and sequence-based string kernel classifiers, respectively. Comparing the accuracy reported for those complex classifiers with the performance achieved here using the simpler and more computationally efficient n-grams reveals significant advantages while highlighting limitations.
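As a rough illustration of the sliding-window idea in the abstract, the following minimal Python sketch turns a peptide into overlapping n-grams and then into a fixed-length count vector. It is not the paper's implementation: the function names, the use of raw counts, the 20-letter alphabet ordering, and the example peptide are all assumptions made here for illustration, and the abstract does not state which values of n or which encodings were actually used.

```python
from collections import Counter
from itertools import product

# Standard 20 amino acid one-letter codes (ordering is an arbitrary choice here).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def ngrams(sequence, n):
    """Return all overlapping length-n subsequences (sliding window of size n)."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def ngram_feature_vector(sequence, n):
    """Count every possible n-gram over the 20-letter alphabet, in a fixed order.

    The vector has 20**n entries. N-grams containing characters outside
    AMINO_ACIDS (gaps, ambiguity codes) are silently ignored in this sketch.
    """
    counts = Counter(ngrams(sequence, n))
    vocabulary = ("".join(p) for p in product(AMINO_ACIDS, repeat=n))
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical 35-residue V3-like peptide, used only for illustration;
# it is not taken from the datasets described in the paper.
v3 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"

bigrams = ngrams(v3, 2)                    # 34 overlapping 2-grams
features = ngram_feature_vector(v3, 2)     # 400-dimensional count vector
```

Vectors built this way have the same length for every sequence regardless of peptide length, which is what allows them to be passed to a standard classifier such as a random forest without a multiple alignment step.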

Index Terms

1. Applied computing
  1. Life and medical sciences

Published in
BCB '11: Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
August 2011, 688 pages
ISBN: 9781450307963
DOI: 10.1145/2147805
General Chairs: Robert Grossman (University of Chicago), Andrey Rzhetsky (University of Chicago)
Program Chairs: Sun Kim (Indiana University Bloomington and Seoul National University), Wei Wang (University of North Carolina at Chapel Hill)
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
CCR5
CXCR4
HIV
V3
classifier
n-grams
random forest
Acceptance Rates
Overall Acceptance Rate: 254 of 885 submissions, 29%
