Comparing Methods for Multilabel Classification of Proteins Using Machine Learning Techniques

  • Conference paper
Advances in Bioinformatics and Computational Biology (BSB 2009)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5676)

Abstract

Multilabel classification is an important problem in bioinformatics and machine learning. In a conventional classification problem, each example belongs to exactly one of several classes. When an example can belong to more than one class simultaneously, the problem is called a multilabel classification problem. Protein function classification is a typical example, since a protein may have more than one function. This paper describes the main characteristics of several multilabel classification methods and applies five of them to protein classification problems. Traditional machine learning techniques are used in the experimental comparison of these methods. The paper also compares different evaluation metrics used in multilabel problems.
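
As a rough illustration of the single-label versus multilabel distinction described in the abstract, the sketch below represents each protein by a set of function labels, splits the problem into one binary target per label, and scores a prediction. The protein IDs, label names, and helper functions are all hypothetical. Binary relevance and Hamming loss are common choices in the multilabel literature and are used here only as examples; the abstract does not say which five methods or which metrics the paper evaluates.

```python
from typing import Dict, List, Set

# Hypothetical toy data: each protein ID maps to its set of function labels.
proteins: Dict[str, Set[str]] = {
    "P1": {"transport"},
    "P2": {"transport", "metabolism"},
    "P3": {"metabolism", "signaling"},
}


def binary_relevance_targets(examples: Dict[str, Set[str]]) -> Dict[str, List[int]]:
    """Split a multilabel problem into one 0/1 target vector per label."""
    labels = sorted(set().union(*examples.values()))
    ids = sorted(examples)
    return {label: [int(label in examples[i]) for i in ids] for label in labels}


def hamming_loss(true_sets: List[Set[str]], pred_sets: List[Set[str]], n_labels: int) -> float:
    """Fraction of (example, label) pairs that are predicted incorrectly."""
    # The symmetric difference counts labels that are missed or wrongly added.
    errors = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return errors / (len(true_sets) * n_labels)


targets = binary_relevance_targets(proteins)

# Hypothetical predictions for P1, P2, P3 (not results from the paper).
true_sets = [proteins[i] for i in sorted(proteins)]
pred_sets = [{"transport"}, {"transport"}, {"metabolism", "transport"}]
loss = hamming_loss(true_sets, pred_sets, n_labels=len(targets))
```

Each entry of `targets` could be handed to any conventional binary classifier, which is one way a traditional learner can be reused in a multilabel setting. Other transformations, such as treating each distinct label set as a single class, trade off differently between label dependence and the number of classes.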

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cerri, R., da Silva, R.R.O., de Carvalho, A.C.P.L.F. (2009). Comparing Methods for Multilabel Classification of Proteins Using Machine Learning Techniques. In: Guimarães, K.S., Panchenko, A., Przytycka, T.M. (eds) Advances in Bioinformatics and Computational Biology. BSB 2009. Lecture Notes in Computer Science, vol 5676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03223-3_10

  • DOI: https://doi.org/10.1007/978-3-642-03223-3_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03222-6

  • Online ISBN: 978-3-642-03223-3

  • eBook Packages: Computer Science, Computer Science (R0)
