Abstract
The Human Immunodeficiency Virus (HIV) encodes an enzyme, HIV protease, which cleaves the viral polypeptides and is thereby responsible for generating infectious viral particles. Considerable effort has been devoted to accurately predicting whether HIV protease cleaves a given peptide, with the aim of designing effective inhibitor drugs. Over the last decade, linear and nonlinear supervised learning methods have been used extensively to discriminate between protease-cleavable and non-cleavable peptides. In this paper we consider four different protein encoding schemes and apply a discrete variant of linear support vector machines to predict the HIV protease-cleavable status of the encoded peptides. Empirical results indicate the effectiveness of the proposed method, which classifies the cleavable and non-cleavable peptides in two publicly available benchmark datasets with the highest accuracy. Moreover, the optimal classification rules generated show strong generalization capability, as demonstrated by their accuracy in predicting the HIV protease-cleavable status of peptides in out-of-sample datasets.
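The abstract does not name the four encoding schemes. As an illustration of what encoding a peptide for a linear classifier can look like, the following minimal sketch uses orthogonal (one-hot) encoding, which is an assumption made here and not necessarily one of the schemes used in the paper. The function name `one_hot_encode`, the eight-residue peptide length, and the example sequence are likewise chosen only for illustration.

```python
# Illustrative sketch only: orthogonal (one-hot) encoding is an assumed example
# scheme, not necessarily one of the four schemes evaluated in the paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(peptide):
    """Map a peptide string to a flat 0/1 vector with 20 entries per residue."""
    vector = []
    for residue in peptide.upper():
        if residue not in INDEX:
            raise ValueError(f"unknown residue: {residue!r}")
        block = [0] * len(AMINO_ACIDS)
        block[INDEX[residue]] = 1  # exactly one position is set per residue
        vector.extend(block)
    return vector


# Example eight-residue peptide, used here only to show the output shape.
features = one_hot_encode("SQNYPIVQ")
print(len(features), sum(features))  # prints: 160 8
```

Under this encoding each peptide becomes a fixed-length binary vector (8 residues × 20 positions = 160 features), which is the kind of input a linear classifier such as a support vector machine can consume directly.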
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Orsenigo, C., Vercellis, C. (2007). Predicting HIV Protease-Cleavable Peptides by Discrete Support Vector Machines. In: Marchiori, E., Moore, J.H., Rajapakse, J.C. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2007. Lecture Notes in Computer Science, vol 4447. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71783-6_19
DOI: https://doi.org/10.1007/978-3-540-71783-6_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71782-9
Online ISBN: 978-3-540-71783-6
eBook Packages: Computer Science, Computer Science (R0)