HYBP_PSSP: a hybrid back propagation method for predicting protein secondary structure

  • Original Article
Neural Computing and Applications

Abstract

The prediction of protein secondary structure is an important topic in bioinformatics, even though the methods have matured and algorithm development is far less active than it was a decade ago. Accurate prediction is useful to biologists in its own right, and it is also an essential component of tertiary structure prediction, which in contrast is far from solved and remains a highly active area of research. In addition, sequence comparison methods have more recently incorporated local structure tracks, and the extra information used by these methods has led to considerable improvements in fold recognition and alignment accuracy. In this paper, a novel method for protein secondary structure prediction is presented. Using the evolutionary information contained in amino acid physicochemical properties, position-specific scoring matrices generated by PSI-BLAST, and HMMER3 profiles as input to a hybrid back propagation system, secondary structure can be predicted with significantly increased accuracy. Based on the knowledge discovery theory based on inner cognitive mechanism (KDTICM), we have constructed a compound pyramid model (CPM), which is composed of four layers of intelligent interface and integrates several methods, such as the hybrid back propagation method (HBP), modified knowledge discovery in databases (KDD*) and the hybrid SVM method (HSVM). Experiments on three standard datasets (RS126, CB513 and CASP8) show that CPM produces higher Q3 and SOV scores than existing widely used schemes such as PSIPRED, PHD and Predator, as well as previously developed prediction methods. On the RS126 and CB513 datasets, its Q3 and SOV99 scores are considerably higher than the best reported scores. It was also tested on target proteins from the Critical Assessment of protein Structure Prediction experiment (CASP8), where it achieves better overall prediction accuracy than traditional methods, including the popular PSIPRED method. Available: http://www.kdd.ustb.edu.cn/protein_Web.
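
The Q3 score reported above is conventionally the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `q3_score` and the example strings are hypothetical and chosen only for illustration.

```python
def q3_score(observed: str, predicted: str) -> float:
    """Return the percentage of residues whose three-state label
    (H = helix, E = strand, C = coil) matches the observed label."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state equals the observed state.
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


# Hypothetical 10-residue example (labels are illustrative, not real data).
observed = "HHHHCCEEEC"
predicted = "HHHCCCEEEE"
print(q3_score(observed, predicted))  # 8 of 10 residues agree -> 80.0
```

Q3 treats every residue independently, which is why segment-based measures such as SOV are usually reported alongside it.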

Acknowledgments

We are grateful for the support of the National Natural Science Foundation (60675030, 60875029), the Beijing Key Discipline Development Program (XK100080537) and the Beijing Key Discipline Development Program-Computer Architecture.

Author information

Corresponding author

Correspondence to Wu Qu.

About this article

Cite this article

Qu, W., Yang, B., Jiang, W. et al. HYBP_PSSP: a hybrid back propagation method for predicting protein secondary structure. Neural Comput & Applic 21, 337–349 (2012). https://doi.org/10.1007/s00521-011-0739-7

  • DOI: https://doi.org/10.1007/s00521-011-0739-7
