Multi-label Classification Using Random Label Subset Selections

Breskvar, Martin; Kocev, Dragi; Džeroski, Sašo

doi:10.1007/978-3-319-67786-6_8

Martin Breskvar^17,18,
Dragi Kocev^17,18 &
Sašo Džeroski^17,18

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 10558))

Included in the following conference series:

International Conference on Discovery Science

962 Accesses
3 Citations
1 Altmetric

Abstract

In this work, we address the task of multi-label classification (MLC). There are two main groups of methods addressing the task of MLC: problem transformation and algorithm adaptation. Methods from the former group transform the dataset to simpler local problems and then use off-the-shelf methods to solve them. Methods from the latter group change and adapt existing methods to directly address this task and provide a global solution. There is no consensus on when to apply a given method (local or global) to a given dataset. In this work, we design a method that builds on the strengths of both groups of methods. We propose an ensemble method that constructs global predictive models on randomly selected subsets of labels. More specifically, we extend the random forests of predictive clustering trees (PCTs) to consider random output subspaces. We evaluate the proposed ensemble extension on 13 benchmark datasets. The results give parameter recommendations for the proposed method and show that the method yields models with competitive performance as compared to three competing methods.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Bauer, E., Kohavi, R.: An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Mach. Learn. 36(1), 105–139 (1999)
Article Google Scholar
Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
MathSciNet MATH Google Scholar
Joly, A., Geurts, P., Wehenkel, L.: Random forests with random projections of the output space for high dimensional multi-label classification. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014. LNCS, vol. 8724, pp. 607–622. Springer, Heidelberg (2014). doi:10.1007/978-3-662-44848-9_39
Google Scholar
Kocev, D., Vens, C., Struyf, J., Džeroski, S.: Tree ensembles for predicting structured outputs. Pattern Recogn. 46(3), 817–833 (2013)
Article Google Scholar
Madjarov, G., Gjorgjevikj, D., Dimitrovski, I., Džeroski, S.: The use of data-derived label hierarchies in multi-label classification. J. Intel. Inf. Syst. 47(1), 57–90 (2016)
Article Google Scholar
Madjarov, G., Kocev, D., Gjorgjevikj, D., Džeroski, S.: An extensive experimental comparison of methods for multi-label learning. Pattern Recogn. 45(9), 3084–3104 (2012)
Article Google Scholar
Szymański, P., Kajdanowicz, T., Kersting, K.: How is a data-driven approach better than random choice in label space division for multi-label classification? Entropy 18(8), 282 (2016)
Article Google Scholar
Tsoumakas, G., Vlahavas, I.: Random k-labelsets: an ensemble method for multilabel classification. In: Kok, J.N., Koronacki, J., Mantaras, R.L., Matwin, S., Mladenič, D., Skowron, A. (eds.) ECML 2007. LNCS, vol. 4701, pp. 406–417. Springer, Heidelberg (2007). doi:10.1007/978-3-540-74958-5_38
Chapter Google Scholar

Download references

Acknowledgements

We acknowledge the financial support of the Slovenian Research Agency via the grants P2-0103,L2-7509, and a young researcher grant to MB, as well as the European Commission, through grants ICT-2013-612944 MAESTRA and ICT-2013-604102 HBP SGA1.

Author information

Authors and Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
Martin Breskvar, Dragi Kocev & Sašo Džeroski
Jožef Stefan International Postgraduate School, Ljubljana, Slovenia
Martin Breskvar, Dragi Kocev & Sašo Džeroski

Authors

Martin Breskvar
View author publications
You can also search for this author in PubMed Google Scholar
Dragi Kocev
View author publications
You can also search for this author in PubMed Google Scholar
Sašo Džeroski
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Martin Breskvar .

Editor information

Editors and Affiliations

Kyoto University, Kyoto, Japan
Akihiro Yamamoto
Hokkaido University, Sapporo, Japan
Takuya Kida
National Institute of Informatics, Tokyo, Japan
Takeaki Uno
Gakushuin University, Tokyo, Japan
Tetsuji Kuboyama

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Breskvar, M., Kocev, D., Džeroski, S. (2017). Multi-label Classification Using Random Label Subset Selections. In: Yamamoto, A., Kida, T., Uno, T., Kuboyama, T. (eds) Discovery Science. DS 2017. Lecture Notes in Computer Science(), vol 10558. Springer, Cham. https://doi.org/10.1007/978-3-319-67786-6_8

Download citation

DOI: https://doi.org/10.1007/978-3-319-67786-6_8
Published: 16 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67785-9
Online ISBN: 978-3-319-67786-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics