Pattern Recognition Letters

Volume 116, 1 December 2018, Pages 101-106

Hierarchical ensemble of Extreme Learning Machine

https://doi.org/10.1016/j.patrec.2018.06.015

Highlights

  • A novel hierarchical ensemble of ELM that integrates representation learning and ensemble learning.

  • A deep cascade structure is used to re-represent features.

  • Sparse connection and feature bagging are used to encourage the diversity of individual learners.

  • HE-ELM significantly outperforms many existing ensemble and representation learning methods.

Abstract

Extreme Learning Machine (ELM), which was proposed for generalized single-hidden-layer feedforward neural networks, has become a popular research topic due to its unique characteristics. However, the random nature of ELM’s hidden layer leads to unstable performance and requires a large number of hidden neurons, which increases the risk of overfitting. In this paper, we propose a simple but effective ensemble approach, called the Hierarchical Ensemble of Extreme Learning Machine (HE-ELM), to improve ELM. To encourage diversity among the component ELMs, two strategies are adopted: sparse connection of the component ELMs and feature bagging. The resulting architecture integrates representation learning and ensemble learning with relatively few parameters, and consists of independent component ELMs, making it easy to implement, train, and apply in practice. We compare the proposed HE-ELM with existing methods on 22 classification problems, showing that HE-ELM achieves significant improvements in classification accuracy with a reduced risk of overfitting the training data.

Introduction

Extreme Learning Machine (ELM) was first proposed by Huang et al. [1] for generalized single-hidden-layer feedforward neural networks (SLFNs). In contrast to traditional neural networks, which require great effort in hyper-parameter tuning, ELM first generates its hidden weights and biases randomly, and then computes its output weights analytically by solving a ridge regression problem [2], [3]. Due to its unique characteristics, i.e., fast learning speed, ease of implementation, and universal approximation capability [4], ELM has been widely applied in image recognition [5], [6], [7], remote sensing image classification [8], [9], [10], and protein structure prediction [11].
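
To make this two-step training concrete, the following is a minimal NumPy sketch of a basic ELM classifier; the sigmoid activation, regularization value, and one-hot target encoding are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def train_elm(X, T, n_hidden=200, reg=1e-3, seed=None):
    """Minimal ELM sketch: random hidden layer + ridge-regression output weights.
    X: (N, n) inputs; T: (N, c) one-hot targets."""
    rng = np.random.default_rng(seed)
    # Step 1: hidden weights and biases are drawn randomly and never tuned.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid hidden-layer output
    # Step 2: output weights via the closed-form ridge regression solution.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta  # class scores; argmax along axis 1 gives labels
```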

Actually, ELM and its variants can also be regarded as Randomized Neural Networks [12]. Owing to the random nature of its hidden layer, ELM potentially yields unstable predictions, so a large number of hidden neurons is required to guarantee its performance. To address this problem, many approaches have been proposed to improve ELM. A straightforward idea is to optimize ELM’s hidden-layer parameters by heuristic search, such as Differential Evolution [13], Memetic Algorithms [3], and Evolutionary Multi-objective Algorithms [14]; however, this generally implies a high computational cost. In [15], the Kernel-based ELM (KELM) was proposed by treating ELM’s hidden mapping as unknown to users, and it has been shown to outperform the Support Vector Machine (SVM) in many applications.

In recent years, deep learning [16], which learns feature representations via a hierarchical structure, has achieved remarkable success in various fields. Inspired by this, developing deep representation methods based on ELM has attracted increasing attention [4], [17], [18], [19]. For example, the multi-layer ELM (ML-ELM) [4] constructs a deep representation by stacking a series of ELM autoencoders (ELM-AEs) sequentially. In a different direction, Zhou and Feng [20] proposed the Deep Forest method (gcForest), a random forest ensemble approach, showing the power of integrating representation learning and ensemble learning, and providing a way to build deep representations with traditional methods.
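
As a rough illustration of the ML-ELM idea (a sketch of the general ELM-AE stacking scheme, not the authors' implementation; layer sizes, activation, and regularization are assumptions), each ELM-AE learns output weights that reconstruct its input, and the transpose of those weights projects the data into the next representation:

```python
import numpy as np

def elm_ae_layer(X, n_hidden=100, reg=1e-3, seed=None):
    """One ELM autoencoder: learn beta so that H @ beta reconstructs X,
    then use beta^T to map X into the new feature space."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    return X @ beta.T  # (N, n_hidden) re-representation of X

def ml_elm_features(X, layer_sizes=(100, 100)):
    # Stack ELM-AEs sequentially to build a deep representation.
    for k in layer_sizes:
        X = elm_ae_layer(X, n_hidden=k)
    return X
```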

It is well known that ensemble learning is effective at combining multiple learners to yield better performance. Although ensemble methods have been used to improve ELM, e.g., the Voting-based ELM (V-ELM) [21], the integration of representation learning and ensemble learning has not drawn enough attention. In [22], a hierarchical ELM ensemble (H-ELM-E), an ensemble of ensembles, was used to fuse different image features. Similarly, in [12], a trained combiner was used to integrate component ELMs; however, it is essentially weighted voting and cannot perform representation learning.

In this paper, we aim to develop a novel hierarchical ensemble of ELM (HE-ELM) for representation learning. Unlike traditional ensemble methods, which make the final decision from a shallow ensemble, the proposed method includes multiple re-representation layers and adopts two diversity-encouraging strategies to avoid overfitting.

This paper is structured as follows. Section 2 reviews related work and gives the main motivation. Section 3 describes the proposed method and the two diversity-encouraging strategies. Experimental results are given in Section 4, and conclusions are drawn in Section 5.

Section snippets

Briefs of ELM

Theorem 1

Learning can be made without iteratively tuning (artificial) hidden nodes (or hundred types of biological neurons) even though the modeling of biological neurons may be unknown as long as they are nonlinear piecewise continuous, and such a network can approximate any continuous target function with any small error and can also separate any disjoint regions without tuning hidden neurons [23].

Consider a data set with training samples $X = \{x_i\}_{i=1}^{N}$ in $\mathbb{R}^n$ (the n-dimensional feature space) and class labels

Hierarchical ensemble of ELM

In this section, we first introduce the overall architecture of the proposed method, and then describe the two diversity-encouraging strategies, sketched below.
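
As a preview of the two strategies (a minimal NumPy sketch based on the descriptions elsewhere in this paper; the drop rate and subspace ratio are assumed hyper-parameters, not values from the paper):

```python
import numpy as np

def sparse_connection_mask(n_features, n_hidden, drop_rate=0.5, seed=None):
    """Sparse connection: randomly disconnect a percentage of the
    input-to-hidden connections of a component ELM."""
    rng = np.random.default_rng(seed)
    return (rng.random((n_features, n_hidden)) >= drop_rate).astype(float)

def feature_bag(X, subspace_ratio=0.7, seed=None):
    """Feature bagging: train each component ELM on a random feature
    subspace, increasing diversity across the ensemble."""
    rng = np.random.default_rng(seed)
    k = max(1, int(subspace_ratio * X.shape[1]))
    idx = rng.choice(X.shape[1], size=k, replace=False)
    return X[:, idx], idx  # keep idx to apply the same subspace at test time
```

In a component ELM, the mask would be multiplied elementwise into the random hidden weight matrix before computing the hidden-layer output.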

Experiment setup

We evaluate the performance of the proposed HE-ELM on 22 benchmark data sets taken from the UCI repository.1 All data sets are normalized using max-min normalization in preprocessing, and 60% of the labeled samples of each data set are selected randomly for training, with the rest used for testing. We run a group of experiments to compare HE-ELM with the basic Extreme Learning Machine (ELM) [1], the Multilayer Perceptron (MLP) [26], the Multiple-Layer Extreme
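
For reference, the preprocessing and splitting described above amount to the following (a sketch; the function names are ours, not the paper's):

```python
import numpy as np

def max_min_normalize(X):
    # Scale each feature to [0, 1], guarding against constant columns.
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / np.where(mx > mn, mx - mn, 1.0)

def random_split(X, y, train_frac=0.6, seed=None):
    # 60% of the labeled samples for training, the remaining 40% for testing.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(train_frac * len(X))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]
```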

Conclusions

We have introduced a novel hierarchical ensemble method based on ELM for classification, which uses two re-representation layers to re-represent features by appending the predictions of the component ELMs to the initial features. To encourage the diversity of the component ELMs, we introduce two strategies: sparse connection, which randomly disconnects a percentage of the hidden connections, and feature bagging, which increases the number of sub-samples by subspace sampling. Simulations
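
A minimal sketch of one such re-representation layer, reusing train_elm and predict_elm from the sketch in the introduction (the number of component ELMs is an assumed hyper-parameter, and the authors' exact layer composition may differ):

```python
import numpy as np

def re_representation_layer(X, T, n_components=5, **elm_kwargs):
    """Train several component ELMs and append their class-score
    predictions to the initial features, producing the augmented
    input for the next layer (or for the final decision layer)."""
    augmented = [X]
    for i in range(n_components):
        W, b, beta = train_elm(X, T, seed=i, **elm_kwargs)
        augmented.append(predict_elm(X, W, b, beta))
    return np.hstack(augmented)
```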

Acknowledgments

We thank the reviewers for their valuable comments and suggestions. This work was partially supported by the National Natural Science Foundation of China under grant nos. 61773355 and 61603355, the Fundamental Research Funds for National University, China University of Geosciences (Wuhan), under grant no. G1323541717, and the Natural Science Foundation of Hubei Province, China, under grant no. 2018CFB528.

References (28)

  • Y. Zeng et al., Traffic sign recognition using kernel extreme learning machines with deep perceptual features, IEEE Trans. Intell. Transp. Syst. (2017)

  • C. Chen et al., Spectral-spatial classification of hyperspectral image based on kernel extreme learning machine, Remote Sens. (2014)

  • M. Pal et al., Kernel-based extreme learning machine for remote-sensing image classification, Remote Sens. Lett. (2013)

  • Q. Lv et al., Classification of hyperspectral remote sensing image using hierarchical local-receptive-field-based extreme learning machine, IEEE Geosci. Remote Sens. Lett. (2016)