Protein fold recognition using Deep Kernelized Extreme Learning Machine and linear discriminant analysis

  • Original Article
  • Neural Computing and Applications

Abstract

Protein fold recognition is an essential step in bioinformatics for determining the tertiary structure of proteins. The main challenges in this problem are the high dimensionality of the feature vectors and the diversity of the protein fold classes. In this paper, two frameworks are proposed to address them. The first is a two-level framework whose main components are a Deep Kernelized Extreme Learning Machine (DKELM) and linear discriminant analysis. The second framework consists of three levels. In the first level, the dataset is initialized for use in the next level. In the second level, OVADKELM and OVODKELM are applied independently to extract four and six new features, respectively, which are added to the basic datasets in the third level. In the third level, DKELM serves as the final classifier that assigns the instances to folds. The proposed frameworks are evaluated on the DD and TG datasets, both of which are SCOP datasets. The experimental results indicate that the proposed methods can improve classification accuracy on both datasets.
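The data flow of the three-level framework can be sketched as follows. This is a minimal illustration under several assumptions that the abstract does not confirm: a single-layer kernel ELM with the standard closed form beta = (I/C + K)^-1 T stands in for DKELM; "OVA" and "OVO" are read as one-versus-all and one-versus-one; the four and six new features are taken to be decision scores over four coarse classes (4 one-versus-all scores and 4·3/2 = 6 pairwise scores); and level 1 is assumed to be simple standardization. The data are synthetic, and all names (`KernelELM`, `ova_features`, `ovo_features`, `coarse`) are hypothetical.

```python
import itertools
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # Pairwise squared distances, then Gaussian kernel.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

class KernelELM:
    """Single-layer kernel ELM (a stand-in, not the paper's DKELM)."""
    def __init__(self, C=10.0, gamma=0.5):
        self.C, self.gamma = C, gamma

    def fit(self, X, T):
        self.X = X
        K = rbf_kernel(X, X, self.gamma)
        # Closed-form output weights: beta = (I/C + K)^-1 T
        self.beta = np.linalg.solve(np.eye(len(X)) / self.C + K, T)
        return self

    def decision(self, X):
        return rbf_kernel(X, self.X, self.gamma) @ self.beta

def ova_features(X_train, y_train, X, classes):
    # One +1/-1 target column per class -> len(classes) scores per sample.
    T = np.where(y_train[:, None] == classes[None, :], 1.0, -1.0)
    return KernelELM().fit(X_train, T).decision(X)

def ovo_features(X_train, y_train, X, classes):
    # One binary model per class pair -> k*(k-1)/2 scores per sample.
    cols = []
    for a, b in itertools.combinations(classes, 2):
        mask = (y_train == a) | (y_train == b)
        t = np.where(y_train[mask] == a, 1.0, -1.0)[:, None]
        cols.append(KernelELM().fit(X_train[mask], t).decision(X))
    return np.hstack(cols)

# Synthetic stand-in data: 8 "folds", grouped two per coarse class (assumed).
rng = np.random.default_rng(0)
n_folds, per_fold, dim = 8, 15, 5
centers = rng.normal(scale=3.0, size=(n_folds, dim))
fold = np.repeat(np.arange(n_folds), per_fold)
X = centers[fold] + rng.normal(scale=0.5, size=(fold.size, dim))
coarse = fold // 2
classes = np.unique(coarse)

# Level 1 (assumed): standardize the features.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Level 2: extract 4 OVA and 6 OVO score features and append them.
ova = ova_features(X, coarse, X, classes)
ovo = ovo_features(X, coarse, X, classes)
X_aug = np.hstack([X, ova, ovo])

# Level 3: final classifier over folds on the augmented features.
T = np.where(np.eye(n_folds)[fold] > 0, 1.0, -1.0)
pred = KernelELM().fit(X_aug, T).decision(X_aug).argmax(axis=1)
```

The sketch scores the same samples it was trained on, so it shows only how the features move between levels. A real evaluation would compute the level-2 scores for training samples out of fold (for example by cross-validation) and report accuracy on held-out data.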

Author information

Corresponding author

Correspondence to Mohammad Saniee Abadeh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Ibrahim, W., Abadeh, M.S. Protein fold recognition using Deep Kernelized Extreme Learning Machine and linear discriminant analysis. Neural Comput & Applic 31, 4201–4214 (2019). https://doi.org/10.1007/s00521-018-3346-z

  • DOI: https://doi.org/10.1007/s00521-018-3346-z
