Abstract
HIV-1 protease has been studied intensively for decades as a key to understanding the HIV-1 replication process. Knowledge of the substrate specificity of HIV-1 protease can guide the development of HIV-1 protease inhibitors. Various feature encoding techniques and machine learning algorithms have been used to predict HIV-1 protease cleavage sites. In this paper, a new amino acid feature encoding scheme is proposed for predicting HIV-1 protease cleavage sites. The proposed method combines orthonormal encoding with Taylor’s Venn diagram. We used linear support vector machines as the classifier in the tests and compared our technique against several other feature encoding techniques. The tests were carried out on the PR-1625 and PR-3261 datasets. Experimental results show that our amino acid encoding technique leads to better classification performance than the other encoding techniques on a standalone classifier.
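To make the idea of combining orthonormal encoding with property-group membership concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state how the two encodings are combined, so simple concatenation is assumed here. The names `EXAMPLE_GROUPS`, `orthonormal`, `property_bits`, and `encode_peptide` are hypothetical, and the three property groups are only an illustrative subset of physicochemical classes, not the full table used in the paper. The input is assumed to be a short fixed-length peptide window such as an octamer.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Illustrative subset of property groups (an assumption for this sketch,
# not the property table from the paper).
EXAMPLE_GROUPS = {
    "aliphatic": set("ILV"),
    "positive": set("HKR"),
    "negative": set("DE"),
}


def orthonormal(residue):
    """Return a 20-element one-hot vector for a single residue."""
    vec = [0] * len(AMINO_ACIDS)
    vec[AMINO_ACIDS.index(residue)] = 1  # raises ValueError on unknown letters
    return vec


def property_bits(residue, groups=EXAMPLE_GROUPS):
    """Return one membership bit per property group, in dict order."""
    return [1 if residue in members else 0 for members in groups.values()]


def encode_peptide(peptide, groups=EXAMPLE_GROUPS):
    """Concatenate one-hot and property bits for every residue (assumed scheme)."""
    features = []
    for residue in peptide:
        features.extend(orthonormal(residue))
        features.extend(property_bits(residue, groups))
    return features


# Example input: an octamer window; each residue contributes 20 + 3 features.
features = encode_peptide("SQNYPIVQ")
print(len(features))
```

A feature vector built this way could then be passed to a linear support vector machine, which is the classifier the abstract reports using.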
Acknowledgments
This work was supported by the Sakarya University BAP Project (Grant 2010-50-02-007).
Additional information
Reproducible research: The MATLAB code and datasets used to obtain the empirical results in this paper are available at http://dl.dropbox.com/u/70054715/codeHIV1p.zip.
About this article
Cite this article
Gök, M., Özcerit, A.T. A new feature encoding scheme for HIV-1 protease cleavage site prediction. Neural Comput & Applic 22, 1757–1761 (2013). https://doi.org/10.1007/s00521-012-0967-5
DOI: https://doi.org/10.1007/s00521-012-0967-5