Abstract
RAndom k-labELsets (RAkEL) is an effective ensemble multi-label classification (MLC) model in which each base classifier is trained on a small random subset of k labels. However, the model construction does not fully benefit from the diversity of the ensemble, and the label probability estimates obtained with RAkEL are usually poorly calibrated because of imbalanced label representation. In this paper, we propose three practical solutions to overcome these drawbacks. The first is to increase the diversity of the base classifiers in the ensemble, the second is to smooth the label powerset probability estimates during the ensemble aggregation process, and the third is to calibrate the label decision thresholds. Experimental results on various benchmark data sets indicate that the proposed approach significantly outperforms recent state-of-the-art MLC algorithms, including RAkEL and its variants.
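For readers unfamiliar with the baseline, the following is a minimal sketch of plain RAkEL as it is commonly described, not the calibrated variant proposed in this paper. It assumes a 1-nearest-neighbour stand-in for the base classifier and a fixed voting threshold of 0.5; the function names (`sample_labelsets`, `fit_label_powerset`, `predict_label_powerset`, `rakel_predict`) are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_labelsets(n_labels, k, m, seed=0):
    """Draw m distinct random subsets of k label indices (the k-labelsets)."""
    if m > math.comb(n_labels, k):
        raise ValueError("m exceeds the number of distinct k-labelsets")
    rng = random.Random(seed)
    seen = set()
    while len(seen) < m:
        seen.add(tuple(sorted(rng.sample(range(n_labels), k))))
    return sorted(seen)


def fit_label_powerset(X, Y, labelset):
    """Label powerset view: each combination of the k labels is one class.

    The 'model' is just the stored training set, because the stand-in
    base classifier below is 1-nearest-neighbour (an assumption).
    """
    return [(x, tuple(y[j] for j in labelset)) for x, y in zip(X, Y)]


def predict_label_powerset(model, x):
    """Return the label combination of the closest training point."""
    _, combo = min(
        model,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    return combo


def rakel_predict(X, Y, x, k=2, m=3, threshold=0.5, seed=0):
    """Average the per-label votes of m label powerset base classifiers."""
    n_labels = len(Y[0])
    votes = [0] * n_labels
    counts = [0] * n_labels
    for labelset in sample_labelsets(n_labels, k, m, seed):
        model = fit_label_powerset(X, Y, labelset)
        combo = predict_label_powerset(model, x)
        for j, bit in zip(labelset, combo):
            votes[j] += bit
            counts[j] += 1
    # Labels never covered by any labelset get a score of 0.0.
    scores = [v / c if c else 0.0 for v, c in zip(votes, counts)]
    return [int(s > threshold) for s in scores], scores


# Toy data: 4 instances, 2 features, 3 labels.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [1, 1, 0]]
prediction, scores = rakel_predict(X, Y, [1, 1], k=2, m=3)
```

In this sketch the averaged votes in `scores` play the role of the label probability estimates, and `threshold` is the single fixed decision threshold; these are the two quantities the abstract says the proposed approach smooths and calibrates.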
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Gharroudi, O., Elghazel, H., Aussem, A. (2015). Calibrated k-labelsets for Ensemble Multi-label Classification. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9489. Springer, Cham. https://doi.org/10.1007/978-3-319-26532-2_63
DOI: https://doi.org/10.1007/978-3-319-26532-2_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26531-5
Online ISBN: 978-3-319-26532-2
eBook Packages: Computer Science (R0)