Protein Fold Prediction Problem Using Ensemble of Classifiers

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5864)

Abstract

Predicting the tertiary structure of a protein from its primary structure (its amino acid sequence) without relying on sequence similarity is a challenging task in bioinformatics and biological science. The protein fold prediction problem can be framed as a prediction problem that machine learning techniques can solve. In this paper, a new method based on an ensemble of five classifiers (Naïve Bayes, Multi-Layer Perceptron (MLP), Support Vector Machine (SVM), LogitBoost and AdaBoost.M1) is proposed for the protein fold prediction problem. The dataset used in the experiments is taken from the standard dataset provided by Ding and Dubchak. Experimental results show that the proposed method raises the prediction accuracy to as much as 64% on an independent test dataset, which is the highest prediction accuracy among the methods reported in the literature.
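
The abstract does not say how the outputs of the five classifiers are combined. As a minimal sketch, assuming simple plurality voting (one common combination rule, not necessarily the one used in the paper), the idea of an ensemble can be illustrated as follows. The function names `majority_vote` and `accuracy`, the fold labels, and all predictions are hypothetical and made up for illustration; they are not data from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine label lists (one list per classifier) by plurality vote.

    Assumption: ties go to the tied label seen first in classifier order,
    which is how Counter.most_common orders equal counts.
    """
    if not predictions_per_classifier:
        raise ValueError("need at least one classifier")
    n_samples = len(predictions_per_classifier[0])
    if any(len(p) != n_samples for p in predictions_per_classifier):
        raise ValueError("all classifiers must predict the same samples")
    combined = []
    # zip(*...) yields, for each sample, the tuple of votes it received.
    for votes in zip(*predictions_per_classifier):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up fold labels for four proteins, one row per base classifier.
base_predictions = [
    ["a", "b", "c", "a"],  # stand-in for Naive Bayes
    ["a", "b", "b", "d"],  # stand-in for MLP
    ["a", "c", "c", "d"],  # stand-in for SVM
    ["b", "b", "c", "a"],  # stand-in for LogitBoost
    ["a", "b", "a", "d"],  # stand-in for AdaBoost.M1
]
true_folds = ["a", "b", "c", "a"]

ensemble = majority_vote(base_predictions)
print(ensemble, accuracy(ensemble, true_folds))
```

In this toy input the vote corrects isolated mistakes by individual classifiers on the first three proteins, but it cannot help on the fourth, where three of the five classifiers agree on the wrong label.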

References

  1. Stanley Shi, Y.M., Suganthan, P.N.: Multiclass protein fold recognition using multiobjective evolutionary algorithms. In: Computational Intelligence in Bioinformatics and Computational Biology (2004), 0-7803-8728-7

  2. Ding, C., Dubchak, I.: Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics 17(4), 349–358 (2001)

  3. Bologna, G., Appel, R.D.: A comparison study on protein fold recognition. In: Ninth International Conference on Neural Information Processing, November 2002, vol. 5, pp. 2492–2496 (2002)

  4. Bittencourt, V.G., Abreu, M.C.C., de Souto, M.C.P., Canuto, A.M.P.: An empirical comparison of individual machine learning techniques and ensemble approaches in protein structural class prediction. In: International Joint Conference on Neural Networks, 0-7803-9048-2 (2005)

  5. Krishnaraj, Y., Reddy, C.K.: Boosting methods for Protein Fold Recognition: An Empirical Comparison. In: IEEE International Conference on Bioinformatics (2008) 978-0-7695-3452-7

  6. Hobohm, U., Scharf, M., Schneider, R., Sander, C.: Selection of a representative set of structures from the Brookhaven Protein Data Bank. Protein Science 1, 409–417 (1992)

  7. Lo Conte, L., Ailey, B., Hubbard, T.J.P., Brenner, S.E., Murzin, A.G., Chothia, C.: SCOP: a structural classification of proteins database. Nucleic Acids Research 28(1), 257–259 (2000)

  8. Huang, C.D., Lin, C.T., Pal, N.R.: Hierarchical learning architecture with automatic feature selection for multiclass protein fold classification. IEEE Transactions on NanoBioscience 2(4), 221–232 (2003)

  9. Duwairi, R., Kassawneh, A.: A Framework for Predicting Proteins 3D Structures. In: Computer Systems and Applications, AICCSA 2008 (2008), 978-1-4244-1968

  10. Miller, D.J., Pal, S.: Transductive Methods for the Distributed Ensemble Classification Problem. Neural Computation 19, 856–884 (2007)

  11. Kuncheva, L.I.: Combining Pattern Classifiers: Methods and Algorithms, pp. 1045–9227 (1997)

  12. Platt, J.: Fast Training of Support Vector Machines using Sequential Minimal Optimization. In: Schoelkopf, B., Burges, C., Smola, A. (eds.) Advances in Kernel Methods-Support Vector Learning, p. 185. MIT Press, Cambridge (1998)

  13. Friedman, J., Hastie, T., Tibshirani, R.: Additive Logistic Regression: a Statistical View of Boosting. Annals of Statistics 28(2), 337–407 (2001) (Published version)

  14. Friedman, N., Goldszmidt, M.: Learning Bayesian networks with local structure. In: Proc. UAI 1996, pp. 252–262 (1996)

  15. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. (2001), 978-0-471-05669-0

  16. Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn. (1998), 978-0-471-05669-0

  17. Viola, P., Jones, M.: Rapid Object Detection using a Boosted Cascade of Simple Features. Computer Vision and Pattern Recognition (2001), 0-7695-1272-0

  18. Schapire, R.E.: The strength of weak learnability. Machine Learning 5, 197–227 (1990)

  19. Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. Annals of Statistics 28(2), 337–407 (2000)

  20. Breiman, L.: Bagging Predictors. Machine Learning 24(2), 123–140 (1996)

  21. Witten, I.H., Frank, E.: Data Mining: Practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)

  22. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)

  23. MacKay, D.J.C.: Information Theory, Inference, and Learning Algorithms. Cambridge University Press, Cambridge (2003)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dehzangi, A., Phon Amnuaisuk, S., Ng, K.H., Mohandesi, E. (2009). Protein Fold Prediction Problem Using Ensemble of Classifiers. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5864. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10684-2_56

  • DOI: https://doi.org/10.1007/978-3-642-10684-2_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10682-8

  • Online ISBN: 978-3-642-10684-2

  • eBook Packages: Computer Science, Computer Science (R0)
