A New Approach to Multi-class SVM-Based Classification Using Error Correcting Output Codes

  • Conference paper

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 95)

Abstract

Protein fold classification is the prediction of a protein's tertiary structure (fold) from its amino acid sequence without relying on sequence similarity. Predicting the fold from the amino acid sequence is regarded as a great challenge in computational biology and bioinformatics. The support vector machine (SVM) classifier has been applied to this problem. However, the SVM is a binary classifier, whereas protein fold recognition is a multi-class problem. A method of addressing this mismatch based on error correcting output codes (ECOC) is therefore proposed. The key problem in this approach is how to construct the optimal ECOC codewords. This paper presents three strategies for doing so, based on the recognition ratios obtained by binary classifiers on the training data set. An SVM classifier using ECOC codewords constructed with these strategies was applied to a real-world data set. The obtained results (57.1% - 62.6%) are better than the best results published in the literature.
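To make the ECOC idea concrete, the following minimal sketch shows how a multi-class decision can be recovered from the outputs of several binary classifiers by Hamming-distance decoding. It is only an illustration under stated assumptions: the class names, the 7-bit exhaustive codewords, and the helper functions are hypothetical, the binary SVM outputs are assumed to be already thresholded to 0/1, and the sketch does not reproduce the three codeword construction strategies of the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical codeword table: one row per class, one column per binary classifier.
# This is an exhaustive code for 4 classes (length 2**(4-1) - 1 = 7), chosen for
# illustration only; it is NOT the codeword design proposed in the paper.
CODEWORDS: Dict[str, List[int]] = {
    "fold_a": [1, 1, 1, 1, 1, 1, 1],
    "fold_b": [0, 0, 0, 0, 1, 1, 1],
    "fold_c": [0, 0, 1, 1, 0, 0, 1],
    "fold_d": [0, 1, 0, 1, 0, 1, 0],
}


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("bit sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_code_distance(codewords: Dict[str, List[int]]) -> int:
    """Smallest Hamming distance between any two distinct codewords."""
    names = list(codewords)
    return min(
        hamming_distance(codewords[p], codewords[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    )


def decode(bits: Sequence[int], codewords: Dict[str, List[int]] = CODEWORDS) -> str:
    """Return the class whose codeword is closest to the observed bits."""
    return min(codewords, key=lambda name: hamming_distance(bits, codewords[name]))


if __name__ == "__main__":
    # Assumed outputs of seven binary classifiers for one sample; the sixth bit
    # is flipped relative to the "fold_c" codeword to simulate one classifier error.
    observed = [0, 0, 1, 1, 0, 1, 1]
    print(decode(observed))
    print(min_code_distance(CODEWORDS))
```

With a minimum distance of d between codewords, nearest-codeword decoding tolerates up to floor((d - 1) / 2) wrong binary decisions, which is why the choice of codewords matters for the final recognition ratio.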

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chmielnicki, W., Stąpor, K. (2011). A New Approach to Multi-class SVM-Based Classification Using Error Correcting Output Codes. In: Burduk, R., Kurzyński, M., Woźniak, M., Żołnierek, A. (eds) Computer Recognition Systems 4. Advances in Intelligent and Soft Computing, vol 95. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20320-6_52

  • DOI: https://doi.org/10.1007/978-3-642-20320-6_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20319-0

  • Online ISBN: 978-3-642-20320-6

  • eBook Packages: Engineering, Engineering (R0)
