Elsevier

Neurocomputing

Volume 394, 21 June 2020, Pages 51-60

Comparison of base classifiers for multi-label learning

https://doi.org/10.1016/j.neucom.2020.01.102

Abstract

Multi-label learning methods can be categorised into algorithm adaptation, problem transformation and ensemble methods. Some of these methods depend on a base classifier and the relationship is not well understood. In this paper the sensitivity of five problem transformation and two ensemble methods to four types of classifiers is studied. Their performance across 11 benchmark datasets is measured using 16 evaluation metrics. The best classifier is shown to depend on the method: Support Vector Machines (SVM) for binary relevance, classifier chains, calibrated label ranking, quick weighted multi-label learning and RAndom k-labELsets; k-Nearest Neighbours (k-NN) and Naïve Bayes (NB) for Hierarchy Of Multilabel classifiERs; and Decision Trees (DT) for ensemble of classifier chains. The statistical performance of a classifier is also found to be generally consistent across the metrics for any given method. Overall, DT and SVM have the best performance–computational time trade-off followed by k-NN and NB.

Introduction

Multi-label learning is a supervised learning problem where each training example is associated with multiple labels (binary or multi-class). In contrast, the traditional problem involves learning from single-label data. The multi-label learning problem has attracted interest from a wide range of domains such as text classification [1], [2], [3], [4]; scene classification [5] and annotation of images [6]; emotion detection in music [7]; detection of semantic concepts in videos [8]; gene functional classification [9]; and more recently recommendation of food trucks to customers based on their information and preferences [10].

The different methods for multi-label learning can be categorised into [11], [12]: algorithm adaptation, problem transformation and ensemble methods. Algorithm adaptation involves modifying the algorithm to make multi-label predictions. In problem transformation, the multi-label problem is transformed into single-label problems and the standard classifier is applied; the results are then transformed into multi-label predictions. Ensemble methods combine multiple algorithm adaptation or problem transformation methods to make a prediction.
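As a concrete illustration of problem transformation, binary relevance splits a problem with d labels into d independent binary problems and applies a standard classifier to each. The sketch below is a minimal pure-Python version; `MajorityClassifier` is a hypothetical stand-in for any real base classifier (k-NN, decision tree, Naïve Bayes, SVM), not a method from the paper.

```python
# A minimal sketch of binary relevance (problem transformation):
# one independent binary classifier is trained per label.
# MajorityClassifier is a hypothetical stand-in for any real base
# classifier (k-NN, decision tree, Naive Bayes, SVM).

class MajorityClassifier:
    def fit(self, X, y):
        self.pred = int(sum(y) * 2 >= len(y))  # majority class (ties -> 1)
        return self

    def predict(self, X):
        return [self.pred for _ in X]

def binary_relevance_fit(X, Y, base=MajorityClassifier):
    """Y is a list of binary label vectors; fit one model per label."""
    models = []
    for j in range(len(Y[0])):
        y_j = [row[j] for row in Y]        # slice out label j
        models.append(base().fit(X, y_j))  # independent single-label problem
    return models

def binary_relevance_predict(models, X):
    per_label = [m.predict(X) for m in models]
    return [list(row) for row in zip(*per_label)]  # recombine label vectors

X = [[0], [1], [2], [3]]
Y = [[1, 0], [1, 0], [0, 1], [1, 0]]
models = binary_relevance_fit(X, Y)
print(binary_relevance_predict(models, [[5]]))  # -> [[1, 0]]
```

Because each label is modelled independently, binary relevance ignores label correlations; the ensemble and chaining methods discussed below exist precisely to recover them.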

An excellent review of the paradigm formalism and algorithmic details of eight representative multi-label learning algorithms can be found in [13]. A preliminary analysis in [14] showed that the performance of a multi-label learning method depends on the choice of base classifier; however, that analysis was limited to just three datasets, three multi-label learning methods and four evaluation metrics. An extensive comparison of 12 multi-label learning methods using 16 evaluation metrics over 11 benchmark datasets was made in [12]; however, a single base classifier was assumed for all the problem transformation methods and for the ensemble methods RAndom k-labELsets (RAkEL) [15] and Ensemble of Classifier Chains (ECC), which may not be optimal [16]. Similarly, a default base classifier is usually assumed in the literature (see, e.g., [4], [16], [17]).

In this paper we compare four base classifiers (k-nearest neighbours, decision trees, Naïve Bayes and support vector machines) across 11 datasets, 7 multi-label learning methods and 16 evaluation metrics. The technical contribution is that, to the best of our knowledge, this is the most extensive study of the performance of multi-label learning methods conducted from the perspective of the choice of base classifier, covering 4928 possible combinations. These results allow us to formulate a set of robust recommendations on the best choice of base classifier for each multi-label learning method, independent of the dataset and evaluation metric. Note that the scope of this paper is limited to the study of the single-label base classifier, not the methods used to transform the multi-label problem into one or more single-label problems. Algorithm adaptation methods are not considered as they inherently do not depend on a base classifier; however, a comparison of such methods would be an interesting avenue for future research as they may have specific classifier advantages. We also developed one of the problem transformation methods (quick weighted multi-label learning [18], [19]), without which the modelling of certain datasets was not possible.
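The combination count quoted above can be checked directly from the factors named in the paragraph: 11 datasets × 7 methods × 4 base classifiers × 16 metrics.

```python
# Sanity check of the "4928 possible combinations" figure:
# 11 datasets x 7 multi-label methods x 4 base classifiers x 16 metrics.
datasets, methods, classifiers, metrics = 11, 7, 4, 16
combinations = datasets * methods * classifiers * metrics
print(combinations)  # -> 4928
```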

The paper is organised as follows: Section 2 is a background on the multi-label learning methods used in this study. Section 3 provides details on the evaluation metrics, datasets, and setup and method used. The key results and discussion are in Section 4 followed by the conclusions in Section 5. The complete set of results may be found in the Supplementary Material.

Section snippets

Background

In this section, we provide a brief description of the five problem transformation and two ensemble methods used in this study which all depend on a base classifier. From here on, it is assumed that binary labels are assigned to each training example. A summary of each method as well as the advantages and disadvantages may be found in Table 1.
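To make the dependence on a base classifier concrete, the following hedged sketch implements a classifier chain, one of the methods summarised in Table 1: the binary model for label j also receives labels 0..j-1 as extra features, so label correlations can be exploited. `MeanSum` is a hypothetical toy base learner introduced only for this illustration, not one of the classifiers compared in the paper.

```python
# Hedged sketch of a classifier chain: the model for label j also sees
# labels 0..j-1 as extra features. MeanSum is a hypothetical toy base
# learner, not one of the four classifiers compared in the study.

class MeanSum:
    """Predicts 1 iff the feature sum exceeds the mean training sum."""
    def fit(self, X, y):
        sums = [sum(x) for x in X]
        self.t = sum(sums) / len(sums)
        return self

    def predict(self, X):
        return [int(sum(x) > self.t) for x in X]

def chain_fit(X, Y, base=MeanSum):
    models, Xa = [], [list(x) for x in X]
    for j in range(len(Y[0])):
        models.append(base().fit(Xa, [row[j] for row in Y]))
        for i, row in enumerate(Xa):
            row.append(Y[i][j])  # true label j feeds the next link
    return models

def chain_predict(models, X):
    Xa, preds = [list(x) for x in X], []
    for m in models:
        p = m.predict(Xa)
        preds.append(p)
        for i, row in enumerate(Xa):
            row.append(p[i])  # predicted label feeds the next link
    return [list(t) for t in zip(*preds)]

X = [[0], [1], [2], [3]]
Y = [[0, 0], [0, 0], [1, 1], [1, 1]]
models = chain_fit(X, Y)
print(chain_predict(models, [[2.5]]))  # -> [[1, 1]]
```

At training time the chain is augmented with the true labels; at prediction time the model's own predictions are propagated instead, which is why errors early in the chain can cascade.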

Methodology

In this section, the evaluation metrics used to assess the performance of the multi-label learning methods are presented, followed by the datasets and the computational setup used.

Results and discussion

The complete set of results for the different datasets, multi-label learning methods and base classifiers across the different evaluation metrics (including training time) may be found in Supplementary Material Tables S1 to S17. Unless stated otherwise, the discussion is based only on the datasets where a complete set of results is obtained for all multi-label learning methods and base classifiers.

Conclusions

The performance of seven multi-label learning methods in relation to four types of base classifiers was studied using 11 benchmark datasets and 16 evaluation metrics. The corrected Friedman test and the corresponding Nemenyi post-hoc test were used to compare base classifiers over multiple datasets; statistical significance was determined at the 0.05 and 0.1 significance levels. As not all the methods were able to finish within the given memory and time constraints, the analysis was first
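For readers unfamiliar with the comparison procedure, a plain (uncorrected) Friedman test over per-dataset scores can be run with SciPy. The scores below are invented purely for illustration, and the Nemenyi post-hoc step used in the paper is omitted.

```python
# Illustrative only: plain Friedman test over made-up per-dataset scores
# of four base classifiers (the paper uses a corrected Friedman test
# plus the Nemenyi post-hoc test; these numbers are not from the paper).
from scipy.stats import friedmanchisquare

knn = [0.71, 0.65, 0.80, 0.72, 0.68]
dt  = [0.74, 0.66, 0.79, 0.75, 0.70]
nb  = [0.62, 0.60, 0.71, 0.64, 0.61]
svm = [0.78, 0.70, 0.83, 0.77, 0.73]

stat, p = friedmanchisquare(knn, dt, nb, svm)
print(f"Friedman statistic = {stat:.2f}, p = {p:.4f}")
print("significant at the 0.05 level:", p < 0.05)
```

The test ranks the classifiers within each dataset and asks whether the mean ranks differ more than chance would allow; only if it rejects the null does a post-hoc test such as Nemenyi's identify which pairs differ.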

CRediT authorship contribution statement

Edward K. Y. Yapp: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing - original draft, Visualization, Project administration. Xiang Li: Writing - review & editing, Supervision. Wen Feng Lu: Writing - review & editing. Puay Siew Tan: Writing - review & editing, Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to thank Peter Reutemann and Jesse Read for their support in the development of MEKA. This work was supported by the A*STAR Computational Resource Centre through the use of its high performance computing facilities; and the SERC Strategic Funding (A1718g0040).

Dr Edward K. Y. Yapp is a Scientist at the Singapore Institute of Manufacturing Technology (SIMTech). He received his B.Eng. and B.Fin. degrees from The University of Adelaide in 2010, and his Ph.D. degree from the University of Cambridge in 2016. His research interests are in combustion and artificial intelligence.

References (41)

  • K. Trohidis et al., Multi-label classification of music by emotion, EURASIP J. Audio Speech Music Process. (2011)
  • C.G.M. Snoek et al., The challenge problem for automated detection of 101 semantic concepts in multimedia, in: Proceedings of the 14th ACM International Conference on Multimedia (MM'06) (2006)
  • A. Elisseeff et al., A kernel method for multi-labelled classification
  • A. Rivolli et al., Food truck recommendation using multi-label classification
  • G. Tsoumakas et al., Multi-label classification: An overview, Int. J. Data Warehous. Min. (2007)
  • M.-L. Zhang et al., A review on multi-label learning algorithms, IEEE Trans. Knowl. Data Eng. (2014)
  • G. Tsoumakas et al., Multi-label classification
  • G. Tsoumakas et al., Random k-labelsets: An ensemble method for multilabel classification
  • J. Read et al., Classifier chains for multi-label classification
  • S.-H. Park et al., Efficient pairwise classification

Cited by (30)

    • Active k-labelsets ensemble for multi-label classification

      2021, Pattern Recognition
      Citation Excerpt:

      To address the first issue, we apply a measurement based on the Fisher’s linear discriminant ratio [45] to evaluate the separability of LP classes, and then we use joint entropy to describe the imbalance level of data. Because separability will be easier to analyze in a feature space with kernel technique, we adopt kernel support vector machine (SVM) as the base learner [41,50]. For the second issue, we should iteratively control the label-selection process via designed measurements.

    • Multi-label classification based ensemble learning for human activity recognition in smart home

      2020, Internet of Things (Netherlands)
      Citation Excerpt:

      The main motives of using several base classifiers is to compare, review and evaluate their performance on a real smart home dataset. The base classifiers chosen for the experiments performed in this study were decided based on the literature i.e. the comparison study of several base classifiers for multi-label classification [15]. The study depicted that Decision Tree classifier is an overall good base classifier, and other base classifiers KNN, Naïve Bayes and so on are good classifiers depending on what application is being targeted.

    • Boosting Multi-Label Classification Performance Through Meta-Model

      2024, International Journal of Pattern Recognition and Artificial Intelligence


    Dr Xiang Li is currently a Senior Scientist and Team Lead at Singapore Institute of Manufacturing Technology (SIMTech). She has more than 20 years of experience in research on machine learning, data mining and artificial intelligence. Her research interests include big data analytics, machine learning, deep learning, data mining, decision support systems, and knowledge-based systems.

    Dr Wen Feng Lu is currently the Associate Professor of Department of Mechanical Engineering at the National University of Singapore (NUS). He has about 30 years of research experience in intelligent manufacturing, including using machine learning in data analytics. His research interests include machine learning, data analytics, intelligent manufacturing, engineering design technology, and 3D printing.

    Dr Puay Siew Tan leads the Manufacturing Control Tower™ (MCT™) responsible for the setup of Model Factory@SIMTech. Her research has been in the cross-field disciplines of Computer Science and Operations Research for cyber physical production system (CPPS) collaboration, in particular sustainable complex manufacturing and supply chain operations. To this end, she has been active in using context-aware and services techniques.
