Abstract
Protein fold recognition is an essential step in determining the tertiary structure of proteins in bioinformatics. The main challenges in this problem are the high dimensionality of the feature vectors and the diversity of the protein fold classes. In this paper, two frameworks are proposed to address these challenges. The first is a two-level framework whose main components are the Deep Kernelized Extreme Learning Machine (DKELM) and linear discriminant analysis (LDA). The second framework consists of three levels. In the first level, the dataset is prepared for use in the next level. In the second level, OVADKELM and OVODKELM are independently employed to extract four and six new features, respectively, which are added to the basic datasets in the third level. In the third level, DKELM is applied as the final classifier to assign the instances to folds. The proposed frameworks are evaluated on the DD and TG datasets, both of which are SCOP-based datasets. The experimental results indicate that the proposed methods improve the classification accuracy on both datasets.
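The three-level idea can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: OVA and OVO are read as one-versus-all and one-versus-one; the counts of four and six new features are taken to be consistent with k = 4 coarse classes (k one-versus-all scores and k(k-1)/2 = 6 pairwise scores); and a plain single-layer kernel ELM with an RBF kernel stands in for DKELM, whose deep architecture is not described here. The names `KELM`, `ova_features`, and `ovo_features`, the synthetic data, and the parameter values are all hypothetical.

```python
import numpy as np
from itertools import combinations


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KELM:
    """Single-layer kernel ELM (a stand-in, not the paper's DKELM).

    Output weights solve (K + I/C) beta = T; scores are K(x, X_train) @ beta.
    """

    def __init__(self, C=10.0, gamma=0.1):
        self.C, self.gamma = C, gamma

    def fit(self, X, T):
        self.X = X
        K = rbf_kernel(X, X, self.gamma)
        self.beta = np.linalg.solve(K + np.eye(len(X)) / self.C, T)
        return self

    def decision(self, X):
        return rbf_kernel(X, self.X, self.gamma) @ self.beta


def ova_targets(y, k):
    """+1 in the column of the true class, -1 elsewhere."""
    return np.where(y[:, None] == np.arange(k)[None, :], 1.0, -1.0)


def ova_features(X_train, y_train, X, k):
    """One score per class: k new features."""
    return KELM().fit(X_train, ova_targets(y_train, k)).decision(X)


def ovo_features(X_train, y_train, X, k):
    """One score per pair of classes: k * (k - 1) / 2 new features."""
    cols = []
    for a, b in combinations(range(k), 2):
        mask = (y_train == a) | (y_train == b)
        t = np.where(y_train[mask] == a, 1.0, -1.0)[:, None]
        cols.append(KELM().fit(X_train[mask], t).decision(X))
    return np.hstack(cols)


# Level 1: prepare a (synthetic, hypothetical) dataset with k = 4 classes.
rng = np.random.default_rng(0)
k, d = 4, 5
y = np.repeat(np.arange(k), 20)
X = rng.normal(scale=5.0, size=(k, d))[y] + rng.normal(size=(len(y), d))

# Level 2: extract OVA and OVO scores as new features.
# Caveat: scores are computed in-sample here, purely for illustration.
ova = ova_features(X, y, X, k)
ovo = ovo_features(X, y, X, k)

# Level 3: append the new features and train the final classifier.
aug = np.hstack([X, ova, ovo])
pred = KELM().fit(aug, ova_targets(y, k)).decision(aug).argmax(axis=1)
```

In this sketch the augmented matrix has d + 4 + 6 columns. A real evaluation would generate the second-level scores on held-out data rather than on the training instances themselves.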
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Ibrahim, W., Abadeh, M.S. Protein fold recognition using Deep Kernelized Extreme Learning Machine and linear discriminant analysis. Neural Comput & Applic 31, 4201–4214 (2019). https://doi.org/10.1007/s00521-018-3346-z
DOI: https://doi.org/10.1007/s00521-018-3346-z