Boosting label weighted extreme learning machine for classifying multi-label imbalanced data
Introduction
In supervised learning, single-label learning, in which each instance is associated with only one unique label, is the most studied paradigm. In real-world applications, however, an object may be simultaneously related to multiple labels: for example, an image may contain mountain, lake, cascade, tree, cloud, sky and sun labels simultaneously (see Fig. 1); a news report may cover several different topics, such as economy, politics and sport; and a protein may hold several different biological functions at once. We call this type of data multi-label data, and the modeling of multi-label data multi-label learning. Over the past decade, multi-label learning has gradually become one of the research hotspots in the field of machine learning [1].
Most existing multi-label learning studies focus on improving the recognition rate, either by modifying single-label learning models to adapt them to multi-label data [2], [3], or by mining the correlations among labels to improve model quality [4], [5], [6]; in general, however, they ignore the issue of class imbalance. In fact, multi-label learning faces a greater threat from class imbalance than single-label learning does, as for each label in multi-label data, the instances might have a seriously skewed distribution [7]. We note that, as another research hotspot, a wealth of class imbalance learning methods already exists, including sampling [8], [9], [10], [11], [12], cost-sensitive learning [13], [14], [15], [16], threshold strategies [17], [18], [19], [20], one-class learning [21], [22], metric learning [23], [24] and ensemble learning [25], [26], [27], [28], [29]. However, most of them are designed to address only the single-label classification problem, and it is difficult to adapt them directly to multi-label imbalanced data.
In recent years, some researchers have noted the multi-label imbalanced classification problem and have proposed several effective solutions [7], [30], [31], [32], [33], [34], [35], [36]. With these techniques, the impact of imbalanced class distributions can be alleviated to some extent; however, each has its inherent drawbacks, e.g., low robustness caused by empirical or random manipulation, or high time complexity due to the adoption of complex calculations.
Label-weighted extreme learning machine (LW-ELM) [34] is an efficient class imbalance learning algorithm that can be used to classify multi-label data. As a cost-sensitive learning algorithm, LW-ELM not only inherits robustness and fast training speed from the extreme learning machine (ELM), but also offers higher flexibility than the other cost-sensitive extreme learning machine algorithm, WELM [37]. Although LW-ELM holds several remarkable merits, its modeling quality is not yet satisfactory because, in general, the label costs are designated empirically. Since these empirical costs depend only on the class imbalance ratio and neglect the specific data distribution, the quality of the model constructed by LW-ELM can be further improved.
In this paper, we borrow the idea of the Boosting WELM algorithm [38], which integrates the WELM model into the Boosting ensemble learning framework, and propose a novel algorithm named BLW-ELM. BLW-ELM first empirically designates the initial label costs based on the class imbalance ratios, then adjusts the cost of each label of each instance according to the feedback of the current model, and finally combines all trained single models to make decisions by weighted voting. The advantage of BLW-ELM is that it significantly improves the generalization ability of LW-ELM while avoiding the need to explore the prior distribution of the multi-label data.
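The boosting procedure described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the AdaBoost-style cost update, the base-learner interface (`train_fn`, `predict_fn`) and all function names are our assumptions; the actual BLW-ELM update rule should be taken from the paper itself.

```python
import numpy as np

def boost_label_weighted(train_fn, predict_fn, X, Y, imbalance_weights, n_rounds=10):
    """AdaBoost-style wrapper around a label-weighted base learner (sketch).

    Y: (n, L) binary label matrix. imbalance_weights: (L,) initial per-label
    costs derived from the class imbalance ratios. train_fn(X, Y, C) must
    accept a per-instance, per-label cost matrix C; predict_fn(model, X)
    returns an (n, L) binary prediction.
    """
    n, L = Y.shape
    # Initial costs: each instance-label pair starts from its label's cost.
    C = np.tile(imbalance_weights, (n, 1)).astype(float)
    C /= C.sum()
    models, alphas = [], []
    for _ in range(n_rounds):
        model = train_fn(X, Y, C)
        pred = predict_fn(model, X)
        wrong = (pred != Y)                        # per instance-label errors
        err = np.clip((C * wrong).sum() / C.sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)      # vote weight of this model
        # Raise the cost of the instance-label pairs the current model missed.
        C *= np.exp(alpha * np.where(wrong, 1.0, -1.0))
        C /= C.sum()
        models.append(model)
        alphas.append(alpha)
    return models, np.array(alphas)

def boosted_predict(models, alphas, predict_fn, X):
    # Weighted vote over the {-1, +1} mapped predictions of the single models.
    score = sum(a * (2.0 * predict_fn(m, X) - 1.0) for m, a in zip(models, alphas))
    return (score > 0).astype(int)
```

Any base learner that accepts a per-instance, per-label cost matrix (such as LW-ELM) can be plugged in through `train_fn`.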
The remainder of this paper is organized as follows. In Section 2, we provide a basic description of the multi-label class imbalance problem. Section 3 presents related work on multi-label class imbalance learning. Section 4 describes the proposed BLW-ELM algorithm in detail. Then, in Section 5, the experimental results and the corresponding discussions are presented. Finally, Section 6 summarizes the contributions of this paper and indicates future work.
What is class imbalance in multi-label data?
As mentioned in Section 1, multi-label data means that an instance is associated with multiple labels. Suppose there is a multi-label data set D = {x1, x2, …, x|D|} and the corresponding label space O = {1, 2, …, |O|}, where |D| denotes the number of instances in the data set, |O| indicates the number of labels, and xi denotes a specific instance associated with a label set that is a subset of O. In theory, there exist 2^|O| − 1 different label sets, considering that each instance is associated with at least one label in O.
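Given a binary label matrix, the per-label skew that makes this setting a class imbalance problem can be quantified directly. The helper below is an illustrative sketch (the function name and the majority/minority ratio convention are our assumptions, not notation from the paper):

```python
import numpy as np

def label_imbalance_ratios(Y):
    """Per-label imbalance ratio for a binary label matrix Y of shape (n, L).

    For each label, ratio = (# majority-class instances) / (# minority-class
    instances); a perfectly balanced label therefore has ratio 1.0. The
    minority count is floored at 1 to avoid division by zero for labels
    that are all-positive or all-negative.
    """
    pos = Y.sum(axis=0).astype(float)
    neg = Y.shape[0] - pos
    return np.maximum(pos, neg) / np.maximum(np.minimum(pos, neg), 1.0)
```

Note that the number of theoretically possible label sets, 2**L - 1, grows exponentially in the number of labels L, which is why per-label statistics like this are used instead of per-label-set ones.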
Extreme learning machine
ELM, which was proposed by Huang et al. [40], [41], [42], is a learning algorithm for the single-hidden-layer feedforward neural network (SLFN) (see Fig. 3). The main characteristic of ELM that distinguishes it from conventional SLFN learning algorithms is the random generation of hidden nodes. Therefore, ELM does not need to iteratively tune these parameters toward optimal values, which gives it a faster learning speed and better generalization ability.
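The random-hidden-node idea can be sketched in a few lines: the hidden-layer weights and biases are drawn at random and frozen, and only the output weights are solved in closed form via the Moore-Penrose pseudo-inverse. The sigmoid activation, the uniform initialization range and the function names below are illustrative assumptions, not the exact configuration used in the paper.

```python
import numpy as np

def elm_train(X, T, n_hidden=50, rng=None):
    """Train a basic ELM: random hidden layer, closed-form output weights.

    X: (n_samples, n_features) inputs; T: (n_samples, n_outputs) targets.
    """
    rng = np.random.default_rng(rng)
    # Hidden-layer parameters are generated randomly and never tuned.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden activations
    beta = np.linalg.pinv(H) @ T             # Moore-Penrose least-squares fit
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Because the only trained quantity is `beta`, obtained by a single pseudo-inverse, training reduces to one linear solve rather than an iterative optimization.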
The description about the data sets
In this paper, we collected 12 multi-label data sets from the MLC Toolbox [39] to validate the effectiveness of the proposed BLW-ELM algorithm. These data sets differ in their numbers of instances, features and labels, and in their label cardinalities and label densities. They also cover several different fields, including image, text and biology. Detailed information about these data sets is provided in Table 1.
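The two data-set statistics mentioned above have standard definitions: label cardinality is the average number of labels per instance, and label density is the cardinality normalized by the number of labels. A minimal sketch (function name is our own):

```python
import numpy as np

def label_stats(Y):
    """Return (label cardinality, label density) for a binary matrix Y (n, L).

    Cardinality = mean number of relevant labels per instance;
    density = cardinality divided by the total number of labels L.
    """
    cardinality = Y.sum(axis=1).mean()
    return cardinality, cardinality / Y.shape[1]
```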
Experimental settings
To validate the effectiveness and superiority of
Concluding remarks
In this paper, we improved the LW-ELM algorithm by integrating it into the Boosting learning framework, and designed a novel algorithm named BLW-ELM to address the multi-label class imbalance learning problem. The merit of the BLW-ELM algorithm is that it avoids directly exploring the complex data distribution and instead adaptively assigns appropriate label weights. Extensive experimental results indicated that the proposed BLW-ELM algorithm is a robust, efficient and universal
Acknowledgements
This work was supported by the Natural Science Foundation of Jiangsu Province of China under grant No. BK20191457, the Open Project of the Artificial Intelligence Key Laboratory of Sichuan Province under grant No. 2019RYJ02, the National Natural Science Foundation of China under grants No. 61305058 and No. 61572242, and the China Postdoctoral Science Foundation under grants No. 2013M540404 and No. 2015T80481.
References (46)
- Hierarchical partitioning of the output space in multi-label data, Data Knowl. Eng., 2018.
- Multi-label classification by exploiting label correlations, Expert Syst. Appl., 2014.
- Addressing imbalance in multi-label classification: measures and random resampling algorithms, Neurocomputing, 2015.
- ACOSampling: an ant colony optimization-based undersampling method for classifying imbalanced DNA microarray data, Neurocomputing, 2013.
- Imbalanced enterprise credit evaluation with DTE-SBD: decision tree ensemble based on SMOTE and bagging with differentiated sampling rates, Inf. Sci., 2018.
- A synthetic informative minority over-sampling (SIMO) algorithm leveraging support vector machine to enhance learning from imbalanced datasets, Decis. Support Syst., 2018.
- Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data, Fuzzy Sets Syst., 2015.
- Near-Bayesian support vector machines for imbalanced data classification with equal or unequal misclassification costs, Neural Networks, 2015.
- A simple plug-in bagging ensemble based on threshold-moving for classifying binary and multiclass imbalanced data, Neurocomputing, 2018.
- ODOC-ELM: optimal decision outputs compensation-based extreme learning machine for classifying imbalanced data, Knowl.-Based Syst., 2016.
- Support vector machine-based optimized decision threshold adjustment strategy for classifying imbalanced data, Knowl.-Based Syst.
- A general tensor representation framework for cross-view gait recognition, Pattern Recogn.
- A novel ensemble method for classifying imbalanced data, Pattern Recogn.
- MLSMOTE: approaching imbalanced multilabel learning through synthetic instance generation, Knowl.-Based Syst.
- Inverse random under sampling for class imbalance problem and its application to multi-label classification, Pattern Recogn.
- Weighted extreme learning machine for imbalance learning, Neurocomputing.
- Boosting weighted ELM for imbalanced learning, Neurocomputing.
- Extreme learning machine: theory and applications, Neurocomputing.
- Trends in extreme learning machines: a review, Neural Networks.
- Advanced non-parametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power, Inf. Sci.
- A review on multi-label learning algorithms, IEEE Trans. Knowl. Data Eng.
- Editing training data for multi-label classification with the k-nearest neighbor rule, Pattern Anal. Appl.
- Toward non-intrusive load monitoring via multi-label classification, IEEE Trans. Smart Grid.
Ke Cheng was born in Chizhou, Anhui, China, in 1972. He received the B.S. degree in mining engineering from Hunan University of Science and Technology, Xiangtan, China, in 1996, and received M.S. degree in agricultural process engineering from Jiangsu University, Zhenjiang, China, in 1999, and Ph.D. degree in computer science from Nanjing University of Science and Technology, Nanjing, China, in 2006. Since 2008, he has been an Associate Professor in School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China. From 2002 to 2003, he was a senior visiting scholar in School of Computer Science, Southeast University, Nanjing, China. He has authored or co-authored more than 20 research papers. His research interests include machine learning, data mining and bioinformatics.
Shang Gao received his B.S. degree in System Engineering from Air Force Engineering University, Xi'an, China, in 1993, the M.S. degree in Military Equipment from Air Force Engineering University, Xi'an, China, in 1996, and the Ph.D. degree in Pattern Recognition and Intelligent Systems from Nanjing University of Science and Technology, Nanjing, China, in 2006.
Since 2009, he has been a Professor in the School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China. Since 2017, he has been the dean of the School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China. He has published over 80 research articles in professional journals and conferences. His research interests include optimization theory, swarm intelligence and machine learning.
Wenlu Dong was born in Henan, China, in 1995. She received the B.S. degree in Computer Science and Technology from Henan Institute of Engineering, Zhengzhou, in 2018.
Since 2018, she has been working toward the M.S. degree in the School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China. Her research interests mainly include machine learning and data mining.
Xibei Yang received the B.S. degree in computer science from Xuzhou Normal University, Xuzhou, China, in 2002, the M.S. degree in computer applications from Jiangsu University of Science and Technology, Zhenjiang, China, in 2006, and the Ph.D. degree in Pattern Recognition and Intelligent Systems from Nanjing University of Science and Technology, Nanjing, China, in 2010. Since 2018, he has been a Professor in the School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China. He has published over 100 research articles in professional journals and conferences. His research interests include granular computing and rough set theory. Dr. Yang is a reviewer for over 10 high-quality international journals and a member of the organizing committees of several international conferences.
Qi Wang received his M.S. degree in electrical and computer engineering from Sungkyunkwan University, South Korea, in 2011, and the Ph.D. degree in information and communications technology from the University of Trento, Italy, in 2015. He was a visiting scholar at North Carolina State University from 2013 to 2014.
Dr. Wang is currently a lecturer with the School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China. His research interests include wireless sensor networks, wireless communications in smart grids, the architectural reliability of smart grids with renewable energy systems, and fast power charging stations for electric vehicles.
Hualong Yu was born in Harbin, China, in 1982. He received the B.S. degree in computer science from Heilongjiang University, Harbin, China, in 2005, and the M.S. and Ph.D. degrees in computer science from Harbin Engineering University, Harbin, China, in 2008 and 2010, respectively. Since 2010, he has been an Associate Professor in the School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China. From 2013 to 2017, he was a Post-Doctoral Fellow in the School of Automation, Southeast University, Nanjing, China. From 2017 to 2018, he was a senior visiting scholar in the Faculty of Information Technology, Monash University, Melbourne, Australia. He has authored or co-authored more than 70 journal and conference papers, and 4 monographs, including publications in IEEE TNNLS, IEEE TFS, IEEE TCBB, IEEE Access, Information Sciences, KBS and Neurocomputing, etc. His research interests include machine learning, data mining and bioinformatics. Dr. Yu is an Associate Editor of IEEE Access, an active reviewer for more than 20 high-quality international journals, including IEEE TNNLS, TCYB, TKDE, TCBB and ACM TKDD, and a member of the organizing committees of several international conferences. He is also a member of ACM, the China Computer Federation (CCF) and the Youth Committee of the Chinese Association of Automation (CAA).