MLRF: Multi-label Classification Through Random Forest with Label-Set Partition

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9227)

Abstract

Although random forest is one of the best ensemble learning algorithms for single-label classification, exploiting it for multi-label classification remains challenging, and few methods have been investigated in the literature. This paper proposes MLRF, a multi-label classification method based on a variation of random forest. The algorithm introduces a new label-set partition method that transforms multi-label data sets into multiple single-label data sets and can effectively discover correlated labels to optimize the label-subset partition. For each generated single-label subset, a random forest classifier is learned by an improved random forest algorithm that employs a kNN-like on-line instance sampling method. Experimental results on ten benchmark data sets demonstrate that MLRF outperforms other state-of-the-art multi-label classification algorithms on a variety of evaluation criteria widely used for multi-label classification.
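The two transformation steps described in the abstract (group correlated labels into subsets, then encode each subset as a single-label problem) can be sketched as below. This is an illustrative sketch only: the Jaccard co-occurrence measure, the `threshold` parameter, and the greedy grouping are assumptions standing in for MLRF's actual partition criterion, which the abstract does not specify in detail.

```python
import numpy as np

def partition_labels(Y, threshold=0.3):
    """Greedily group correlated labels using pairwise co-occurrence.

    Y: (n_samples, n_labels) binary label matrix.
    Returns a list of label-index groups.
    NOTE: hypothetical stand-in for MLRF's partition method, for
    illustration only.
    """
    n = Y.shape[1]
    # Jaccard similarity between label columns as a correlation proxy
    inter = Y.T @ Y
    counts = Y.sum(axis=0)
    union = counts[:, None] + counts[None, :] - inter
    jacc = np.where(union > 0, inter / np.maximum(union, 1), 0.0)
    groups, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        group = [i]
        assigned.add(i)
        for j in range(i + 1, n):
            if j not in assigned and jacc[i, j] >= threshold:
                group.append(j)
                assigned.add(j)
        groups.append(group)
    return groups

def to_single_label(Y, group):
    """Label-powerset encoding of one subset: each distinct bit
    pattern over the grouped labels becomes one class, yielding a
    single-label data set for that subset."""
    patterns = ["".join(map(str, row)) for row in Y[:, group]]
    classes = sorted(set(patterns))
    return np.array([classes.index(p) for p in patterns]), classes
```

A standard single-label random forest could then be trained on each encoded subset; the paper's further contribution, the kNN-like on-line instance sampling inside the forest, is not reproduced here.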



Acknowledgement

Yunming Ye’s work was supported in part by National Key Technology R&D Program of MOST China under Grant No. 2014BAL05B06, Shenzhen Science and Technology Program under Grant No. JCYJ20140417172417128, and the Shenzhen Strategic Emerging Industries Program under Grant No. JCYJ20130329142551746. Yan Li’s work was supported in part by NSFC under Grant No. 61303103, and the Shenzhen Science and Technology Program under Grant No. JCY20130331150354073.

Author information

Correspondence to Yunming Ye.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Liu, F., Zhang, X., Ye, Y., Zhao, Y., Li, Y. (2015). MLRF: Multi-label Classification Through Random Forest with Label-Set Partition. In: Huang, D.S., Han, K. (eds.) Advanced Intelligent Computing Theories and Applications. ICIC 2015. Lecture Notes in Computer Science, vol. 9227. Springer, Cham. https://doi.org/10.1007/978-3-319-22053-6_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22052-9

  • Online ISBN: 978-3-319-22053-6

  • eBook Packages: Computer Science (R0)
