Abstract
Knowledge of protein secondary structure is a useful step toward predicting the 3D structure of a protein. This paper describes in detail a support vector machine (SVM) based method for secondary structure prediction. Protein sequence data are encoded in a hybrid representation that combines the Position-Specific Scoring Matrix (PSSM), the Hydrophobicity Sequence Feature (HSF), and the Structural Sequence Feature (SSF). Protein sequences are taken from the CB513 dataset, the corresponding PSSM profiles are generated with the PSI-BLAST program, and the sequence features are computed from amino acid scales provided by the ExPASy website (http://web.expasy.org/protscale/). As a baseline, PSSM profiles alone are used as input to an SVM-PSSM classifier for secondary structure prediction. To construct more accurate classifiers, more than 40 sequence features (SFs) are then examined as additional input vectors to the SVM-PSSM classifier for feature selection. The most accurate classifier in this study is built from a combination of the PSSM and a few relevant sequence features. The experimental results show that relevant sequence features extracted from hydrophobicity indices and structural conformational parameters can improve the SVM-PSSM classifier for predicting protein secondary structure elements. The proposed final SVM-PSSM-SF method achieved an overall accuracy of 78%.
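The hybrid input described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window sizes, the zero padding at the sequence ends, and the function names `hydropathy_profile` and `window_features` are assumptions made for illustration only. The sketch smooths the Kyte-Doolittle hydropathy scale (one of the amino acid scales available from ProtScale) along the sequence, then appends that value to a sliding window of PSSM rows so that each residue gets one flat feature vector.

```python
from typing import List, Sequence

# Kyte-Doolittle hydropathy scale (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 5) -> List[float]:
    """Average the hydropathy scale over a centered window.

    The window is truncated at the sequence ends (an assumption).
    """
    half = window // 2
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    profile = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile


def window_features(
    pssm: Sequence[Sequence[float]],
    extra: Sequence[float],
    window: int = 13,
) -> List[List[float]]:
    """Build one flat feature vector per residue.

    Each vector holds the PSSM rows inside a centered window
    (zero-padded beyond the sequence ends, an assumption) followed
    by one scalar sequence feature for the central residue.
    """
    half = window // 2
    pad = [0.0] * len(pssm[0])
    features = []
    for i in range(len(pssm)):
        vec: List[float] = []
        for j in range(i - half, i + half + 1):
            vec.extend(pssm[j] if 0 <= j < len(pssm) else pad)
        vec.append(extra[i])
        features.append(vec)
    return features


if __name__ == "__main__":
    seq = "AIL"
    # Placeholder PSSM: 20 columns per residue, all ones (not real data).
    toy_pssm = [[1.0] * 20 for _ in seq]
    feats = window_features(toy_pssm, hydropathy_profile(seq, window=3), window=3)
    print(len(feats), len(feats[0]))
```

Each resulting vector, paired with the residue's secondary structure label, could then be passed to any SVM implementation. Adding further sequence features would amount to appending more scalars per residue in the same way.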
Acknowledgements
This research was supported by the National Natural Science Foundation of China (Grant No. 61375013) and the Natural Science Foundation of Shandong Province, China (Grant No. ZR2013FM020).
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Chen, Y., Cheng, J., Liu, Y., Park, P.S. (2018). A Novel Approach of Protein Secondary Structure Prediction by SVM Using PSSM Combined by Sequence Features. In: Bi, Y., Kapoor, S., Bhatia, R. (eds) Proceedings of SAI Intelligent Systems Conference (IntelliSys) 2016. IntelliSys 2016. Lecture Notes in Networks and Systems, vol 15. Springer, Cham. https://doi.org/10.1007/978-3-319-56994-9_74
DOI: https://doi.org/10.1007/978-3-319-56994-9_74
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56993-2
Online ISBN: 978-3-319-56994-9
eBook Packages: Engineering, Engineering (R0)