Abstract
Proteins can be classified into four structural classes (all-α, all-β, α/β, α+β) according to their secondary structure composition. In this paper, we predict the structural class of a protein from its Amino Acid Composition (AAC) using Support Vector Machines (SVM). A protein can be represented by a 20-dimensional vector according to its AAC. In addition to the AAC, we use a second feature set, the Trio Amino Acid Composition (Trio AAC), which takes amino acid neighborhood information into account. We evaluate both feature sets, the AAC and the Trio AAC, in each case using an SVM as the classifier to predict the structural class of a protein. According to the Jackknife test results, the Trio AAC feature set gives better classification performance than the AAC feature set.
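The two feature sets described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the Trio AAC precisely, so the code assumes it is the relative frequency of each run of three consecutive residues (20^3 = 8000 dimensions); the function names `aac` and `trio_aac`, the residue ordering, and the handling of non-standard residues are likewise illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AMINO_SET = set(AMINO_ACIDS)

# All 8000 ordered triplets, fixing the layout of the Trio AAC vector.
TRIOS = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]


def aac(sequence):
    """Return the 20-dimensional amino acid composition of a sequence."""
    residues = [r for r in sequence.upper() if r in AMINO_SET]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[a] / len(residues) for a in AMINO_ACIDS]


def trio_aac(sequence):
    """Return triplet frequencies (assumed definition of Trio AAC).

    Windows containing a non-standard residue are skipped rather than
    bridged, so no artificial neighbors are created.
    """
    seq = sequence.upper()
    windows = [seq[i:i + 3] for i in range(len(seq) - 2)]
    valid = [w for w in windows if all(c in AMINO_SET for c in w)]
    if not valid:
        raise ValueError("sequence has no valid residue triplets")
    counts = Counter(valid)
    return [counts[t] / len(valid) for t in TRIOS]
```

Either vector could then be passed as the input features to an SVM classifier. Unlike the AAC, the triplet vector changes when the same residues appear in a different order, which is one way to read "neighborhood information".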
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Isik, Z., Yanikoglu, B., Sezerman, U. (2004). Protein Structural Class Determination Using Support Vector Machines. In: Aykanat, C., Dayar, T., Körpeoğlu, İ. (eds) Computer and Information Sciences - ISCIS 2004. ISCIS 2004. Lecture Notes in Computer Science, vol 3280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30182-0_9
DOI: https://doi.org/10.1007/978-3-540-30182-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23526-2
Online ISBN: 978-3-540-30182-0
eBook Packages: Springer Book Archive