Protein Fold Prediction Problem Using Ensemble of Classifiers

Dehzangi, Abdollah; Phon Amnuaisuk, Somnuk; Ng, Keng Hoong; Mohandesi, Ehsan

doi:10.1007/978-3-642-10684-2_56

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5864)

Abstract

Predicting the tertiary structure of a protein from its primary structure (its amino acid sequence) without relying on sequence similarity is a challenging task in bioinformatics and the biological sciences. The protein fold prediction problem can be expressed as a classification problem that can be solved by machine learning techniques. In this paper, a new method based on an ensemble of five classifiers (Naïve Bayes, Multi-Layer Perceptron (MLP), Support Vector Machine (SVM), LogitBoost, and AdaBoost.M1) is proposed for the protein fold prediction problem. The data used in the experiments come from the standard dataset provided by Ding and Dubchak. Experimental results show that the proposed method raises the prediction accuracy up to 64% on an independent test dataset, which is the highest prediction accuracy compared with other methods reported in the literature.
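
The abstract does not state how the outputs of the five classifiers are combined. As a minimal sketch only, assuming plain majority voting over predicted fold labels (one common combination rule, not necessarily the one used in the paper), the ensemble step could look like the following. The function names, the tie-breaking rule, and the example labels are hypothetical and serve only as illustration.

```python
from collections import Counter

# Classifier names as listed in the abstract. How each one is trained is
# outside this sketch; only the combination step is shown.
CLASSIFIERS = ["NaiveBayes", "MLP", "SVM", "LogitBoost", "AdaBoost.M1"]


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:  # scan in input order so ties are deterministic
        if counts[label] == best:
            return label


def ensemble_predict(per_classifier_predictions):
    """Combine per-classifier label lists into one label per protein.

    per_classifier_predictions maps a classifier name to its list of
    predicted fold labels, one per protein, all lists of equal length.
    """
    columns = [per_classifier_predictions[name] for name in CLASSIFIERS]
    return [majority_vote(list(votes)) for votes in zip(*columns)]


# Made-up predictions for three proteins (assumption, not data from the paper).
example = {
    "NaiveBayes": ["globin", "TIM-barrel", "ferredoxin"],
    "MLP": ["globin", "TIM-barrel", "OB-fold"],
    "SVM": ["TIM-barrel", "TIM-barrel", "ferredoxin"],
    "LogitBoost": ["globin", "ferredoxin", "OB-fold"],
    "AdaBoost.M1": ["TIM-barrel", "ferredoxin", "globin"],
}
combined = ensemble_predict(example)
```

With five voters and many fold classes, ties such as 2-2-1 can occur; this sketch resolves them in favor of the classifier listed first, which is an arbitrary choice. Weighted or probability-based combination rules are equally plausible alternatives.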

References

Stanley Shi, Y.M., Suganthan, P.N.: Multiclass protein fold recognition using multiobjective evolutionary algorithms. In: Computational Intelligence in Bioinformatics and Computational Biology (2004), 0-7803-8728-7
Ding, C., Dubchak, I.: Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics 17(4), 349–358 (2001)
Bologna, G., Appel, R.D.: A comparison study on protein fold recognition. In: Ninth International Conference on Neural Information Processing, November 2002, vol. 5, pp. 2492–2496 (2002)
Bittencourt, V.G., Abreu, M.C.C., de Souto, M.C.P., Canuto, A.M.P.: An empirical comparison of individual machine learning techniques and ensemble approaches in protein structural class prediction. In: International Joint Conference on Neural Networks (2005), 0-7803-9048-2
Krishnaraj, Y., Reddy, C.K.: Boosting methods for protein fold recognition: an empirical comparison. In: IEEE International Conference on Bioinformatics (2008), 978-0-7695-3452-7
Hobohm, U., Scharf, M., Schneider, R., Sander, C.: Selection of a representative set of structures from the Brookhaven Protein Bank. Protein Science 1, 409–417 (1992)
Lo Conte, L., Ailey, B., Hubbard, T.J.P., Brenner, S.E., Murzin, A.G., Chothia, C.: SCOP: a structural classification of proteins database. Nucleic Acids Research 28(1), 257–259 (2000)
Huang, C.D., Lin, C.T., Pal, N.R.: Hierarchical learning architecture with automatic feature selection for multiclass protein fold classification. IEEE Transactions on NanoBioscience 2(4), 221–232 (2003)
Duwairi, R., Kassawneh, A.: A Framework for Predicting Proteins 3D Structures. In: Computer Systems and Applications, AICCSA 2008 (2008), 978-1-4244-1968
Miller, D.J., Pal, S.: Transductive Methods for the Distributed Ensemble Classification Problem. Neural Computation 19, 856–884 (2007)
Kuncheva, L.I.: Combining Pattern Classifiers: Methods and Algorithms, pp. 1045–9227 (1997)
Platt, J.: Fast Training of Support Vector Machines using Sequential Minimal Optimization. In: Schoelkopf, B., Burges, C., Smola, A. (eds.) Advances in Kernel Methods-Support Vector Learning, p. 185. MIT Press, Cambridge (1998)
Friedman, J., Hastie, T., Tibshirani, R.: Additive Logistic Regression: a Statistical View of Boosting. Annals of Statistics 28(2), 337–407 (2001) (Published version)
Friedman, N., Goldszmidt, M.: Learning Bayesian networks with local structure. In: Proc. UAI 1996, pp. 252–262 (1996)
Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. (2001), 978-0-471-05669-0
Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn. (1998), 978-0-471-05669-0
Viola, P., Jones, M.: Rapid Object Detection using a Boosted Cascade of Simple Features. Computer Vision and Pattern Recognition (2001), 0-7695-1272-0
Schapire, R.E.: The strength of weak learnability. Machine Learning 5, 197–227 (1990)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. Annals of Statistics 28(2), 337–407 (2000)
Breiman, L.: Bagging Predictors. Machine Learning 24(2), 123–140 (1996)
Witten, I.H., Frank, E.: Data Mining: Practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
MacKay, D.J.C.: Information Theory, Inference, and Learning Algorithms. Cambridge University Press, Cambridge (2003)

Author information

Authors and Affiliations

Artificial Intelligence and Intelligent Computing Center, Faculty of Information Technology, Multimedia University Cyberjaya, Selangor, Malaysia
Abdollah Dehzangi, Somnuk Phon Amnuaisuk, Keng Hoong Ng & Ehsan Mohandesi

Editor information

Editors and Affiliations

Department of Electronic Engineering, City University of Hong Kong, Hong Kong
Chi Sing Leung
School of Electrical Engineering and Computer Science, Kyungpook National University, 1370 Sankyuk-Dong, Puk-Gu, 702-701, Taegu, Korea
Minho Lee
School of Information Technology, King Mongkut’s University of Technology Thonburi, 126 Pracha-U-Thit Rd., Bangmod, Thungkru, 10140, Bangkok, Thailand
Jonathan H. Chan

Cite this paper

Dehzangi, A., Phon Amnuaisuk, S., Ng, K.H., Mohandesi, E. (2009). Protein Fold Prediction Problem Using Ensemble of Classifiers. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5864. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10684-2_56

DOI: https://doi.org/10.1007/978-3-642-10684-2_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10682-8
Online ISBN: 978-3-642-10684-2
eBook Packages: Computer Science, Computer Science (R0)
