Neurocomputing
Volume 111, 2 July 2013, Pages 90-103

A multi-objective micro genetic ELM algorithm

https://doi.org/10.1016/j.neucom.2012.11.035

Abstract

The extreme learning machine (ELM) is a methodology for learning single-hidden layer feedforward neural networks (SLFN) which has been proved to be extremely fast and to provide very good generalization performance. ELM works by randomly choosing the weights and biases of the hidden nodes and then analytically obtaining the output weights and biases for a SLFN with a previously fixed number of hidden nodes. In this work, we develop a multi-objective micro genetic ELM (μG-ELM) which provides the appropriate number of hidden nodes for the problem being solved, as well as the weights and biases which minimize the mean square error (MSE). The multi-objective algorithm is guided by two criteria: the number of hidden nodes and the MSE. Furthermore, as a novelty, μG-ELM incorporates a regression device in order to decide whether the number of hidden nodes of the individuals of the population should be increased, decreased, or left unchanged. In general, the proposed algorithm achieves lower errors while also requiring fewer hidden nodes, over the data sets and competitors considered.

Introduction

Over the last twenty years, artificial neural networks (ANN) have allowed researchers to model and solve real problems that cannot be easily solved with classical mathematical tools (due to a large number of data and/or variables, non-linear relations, etc.).

Historically, back-propagation (BP) algorithms have been the most widely used for training SLFNs. In order to set all the weights and biases, from input to hidden nodes and from hidden to output nodes, they require many parameters, they are slow, and the training phase frequently has to be repeated.

However, the ELM does not need to adjust the hidden weights and biases in a computationally expensive way, since it chooses them at random. Then, with these weights and biases, the first part of the SLFN is evaluated. Because the activation function in the output layer is linear, the resulting values allow us to define a linear system of equations whose solutions are the output weights and biases of the SLFN. This linear system is easily solved through a simple generalized inverse calculation. To justify the correctness of the ELM algorithm, it has been proved in [1] that, under certain regularity conditions, given a set of patterns with N records and any small positive value ϵ>0, there exists a number of hidden nodes Ñ, with Ñ ≤ N, such that, after applying the above process, the learning approximation error of the SLFN is less than ϵ with probability one. Since the publication of the original ELM, a great number of different alternatives have appeared: I-ELM [1], OS-ELM [2], EI-ELM [3], OP-ELM [4], EM-ELM [5], EOS-ELM [6], CEOS-ELM [7], TS-ELM [8], TROP-ELM [9] and EELM [10], some of which we will use in order to carry out our comparisons in Section 5.
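As a minimal sketch of this training scheme (not the authors' implementation), the hidden-layer parameters can be drawn at random and the output weights recovered with the Moore-Penrose generalized inverse; the sigmoid activation and the uniform sampling range used below are illustrative assumptions.

```python
# Minimal ELM sketch: random hidden weights/biases, analytic output weights.
import numpy as np

def elm_train(X, T, n_hidden, rng=np.random.default_rng(0)):
    """Fit a SLFN with n_hidden sigmoid hidden nodes on inputs X and targets T."""
    n_features = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                             # solve H @ beta = T
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta                                          # linear output layer
```

With X of shape (N, d) and targets T of shape (N, m), the training MSE that the genetic search later minimizes would be np.mean((elm_predict(X, W, b, beta) - T) ** 2).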

On the other hand, evolutionary algorithms (EAs) are search and optimization methods, based on the principles of natural evolution and genetics, that try to approximate the optimal solution of a given problem [11]. The general schema of an evolutionary algorithm is shown in Procedure 1 (see also the sketch after it). Its main sub-processes are the variation, the evaluation and the selection of solutions. EAs in general, and multi-objective EAs [12] in particular, have shown their great potential during the last two decades. A list of more than 6500 references on evolutionary multi-objective optimization is maintained by Coello [13].

Procedure 1

General schema of an evolutionary algorithm.
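As a hedged illustration of this schema, the loop can be written as follows; the operators are generic placeholders (assumed to take and return lists), not the ones used later by μG-ELM.

```python
# Generic evolve/evaluate/select loop corresponding to Procedure 1.
def evolutionary_algorithm(init_population, evaluate, select, vary, n_iterations):
    population = init_population()
    fitness = [evaluate(ind) for ind in population]
    for _ in range(n_iterations):
        parents = select(population, fitness)       # selection of solutions
        offspring = vary(parents)                   # variation: crossover and mutation
        off_fitness = [evaluate(ind) for ind in offspring]
        # Survival selection: keep the best individuals overall,
        # assuming a scalar fitness to be minimized.
        pool = sorted(zip(population + offspring, fitness + off_fitness),
                      key=lambda pf: pf[1])
        population, fitness = map(list, zip(*pool[:len(population)]))
    return population, fitness
```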

In this paper we develop an improved version of our micro genetic algorithm, μG-ELM. We introduced the original version in [14], where our main contribution was the development of a bi-objective genetic algorithm which simultaneously determined the appropriate number of hidden nodes of the artificial neural network as well as the corresponding set of weights and biases, unlike other approaches, for instance [15] and [16], which only use the approximation error to guide the search for the appropriate weights and biases given a fixed number of hidden nodes. In order to conduct the search for the appropriate number of hidden nodes, we proposed a new strategy based on regression, by means of which the algorithm decides, in each iteration, to increase, to decrease, or to leave unchanged the number of hidden nodes of the artificial neural networks in the current population being evolved. Furthermore, we developed a new mutation operator for the number of hidden nodes and a survival selection operator, both specially designed for our algorithm.

The main goal of this paper is to examine the development of the μG-ELM algorithm in greater depth. Since the regression strategy uses the information of certain solutions, and this information can be used with or without smoothing, in this paper we propose and study new alternatives for the original regression device, resulting from the combination of three ways of selecting the solutions and three ways of preprocessing their information. We determine the best one, and the selected alternative is then compared with other usual ELM algorithms in the literature. Furthermore, several elements of the original algorithm have been improved: the regression device considers more types of relations (Section 4), and the tournament selection process used to build the micro population to be evolved now depends on the result of the regression device (Section 3.4).

The paper is organized as follows. In Section 2 we present the preliminaries and notation for working with SLFNs and we introduce the bi-objective optimization problem whose solutions we approximate. Section 3 presents the μG-ELM algorithm with a detailed description of its steps. Section 4 is devoted to describing the regression device under several alternatives. Section 5 shows two experiments: the first is a detailed study of the performance of our algorithm under several choices; in the second, we compare our μG-ELM, under the best implementation selected as a result of the first experiment, with the chosen competitors. Summary and concluding remarks are provided in Section 6.

Section snippets

Preliminaries and notation

In this section we present the basic elements and notation for understanding the ELM methodology, and we introduce the bi-objective optimization problem which is used to guide the search for the "best SLFNs".

The micro genetic ELM algorithm

In this section we develop our micro genetic ELM algorithm. Procedure 2 shows the schema of the algorithm, where lines 4–9 constitute its core. Most steps of this schema require several lines of code, which are summarized in the corresponding procedures.

Procedure 2 starts by building an initial population P1, whose elements are defined in Section 3.1 and whose construction is described in Section 3.2. This population is evaluated on both the training and the test sets using Eq. (7) (the division of the original…
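A hedged, illustrative sketch of this outer loop (not Procedure 2 itself) is given below, reusing elm_train and elm_predict from the earlier sketch. The decide_direction stub stands in for the regression device of Section 4; the mutation step, the survival selection and the parameter values are all assumptions for illustration.

```python
import numpy as np

def decide_direction(nodes, errors):
    # Stub for the regression device of Section 4 (sketched in that section).
    return False, False

def micro_g_elm(X_train, T_train, n_iter=25, pop_size=25, max_nodes=100,
                rng=np.random.default_rng(0)):
    # Each individual is summarized here by its number of hidden nodes; the
    # weights and biases are drawn and resolved by elm_train (earlier sketch).
    population = [int(rng.integers(1, max_nodes + 1)) for _ in range(pop_size)]

    def evaluate(n_hidden):
        W, b, beta = elm_train(X_train, T_train, n_hidden, rng)
        return float(np.mean((elm_predict(X_train, W, b, beta) - T_train) ** 2))

    errors = [evaluate(n) for n in population]
    for _ in range(n_iter):
        add, subtract = decide_direction(population, errors)
        drift = 1 if add else (-1 if subtract else 0)
        # Mutate the number of hidden nodes, biased by the regression decision.
        offspring = [max(1, min(max_nodes, n + drift + int(rng.integers(-2, 3))))
                     for n in population]
        off_errors = [evaluate(n) for n in offspring]
        # Survival selection (illustrative): prefer lower MSE, then fewer nodes.
        pool = sorted(zip(population + offspring, errors + off_errors),
                      key=lambda t: (t[1], t[0]))
        population, errors = map(list, zip(*pool[:pop_size]))
    return population, errors
```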

Setting the variables add and subtract

The only thing that we have omitted in the description of the algorithm is the way in which the variables add and subtract are set. In this section we present the regression device used for this purpose.

In each iteration the algorithm has to decide whether the current population, as a whole, should tend towards neural networks with a higher number of hidden nodes, with a lower number of hidden nodes, with a specific number of hidden nodes, or with no changes in that number. All these…
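As an illustration only, one plausible minimal form of such a device is sketched below: fit a linear regression of the individuals' MSE on their number of hidden nodes, and let the sign of the slope set add and subtract. The linear form, the tolerance tol and the degenerate-case handling are assumptions; the actual device also considers other types of relations (Section 4).

```python
import numpy as np

def decide_direction(nodes, errors, tol=1e-6):
    """Set (add, subtract) from the slope of a linear fit of MSE vs. hidden nodes."""
    if len(set(nodes)) < 2:           # degenerate population: no trend to read
        return False, False
    slope = np.polyfit(nodes, errors, deg=1)[0]
    if slope < -tol:
        return True, False            # MSE falls as nodes grow: add hidden nodes
    if slope > tol:
        return False, True            # MSE rises as nodes grow: remove hidden nodes
    return False, False               # flat relation: leave the number unchanged
```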

The experiments

We now assess the performance of the proposed methodology by carrying out two experiments. The first experiment is devoted to deciding which combination of the two factors considered in the previous section exhibits better performance; in the second experiment, we compare the behavior of our algorithm with some ELM competitors.

For both experiments we use the data sets shown, together with their main characteristics, in Table 1. These sets are classified according to their type of…

Conclusions

In this paper we have proposed a new bi-objective genetic algorithm, μG-ELM, which obtains the appropriate number of hidden nodes as well as the appropriate weights and biases in one execution of the algorithm. For carrying out this goal it solves a bi-objective optimization problem which simultaneously minimizes the MSE and the number of hidden nodes. Furthermore, as a novelty, a regression device which combines the MSE value and the number of hidden nodes of the individuals of the population…

Acknowledgment

The authors are grateful to both the editor and referees for their comments and suggestions, which greatly improved the presentation of this paper. This research has been supported by the University of Zaragoza under Grant UZ2010-CIE06 and research groups of Gobierno de Aragón E58 and E22.


References (25)

  • G. Feng et al., Error minimized extreme learning machine with growth of hidden nodes and incremental learning, IEEE Trans. Neural Networks (2009).
  • Y. Lan, Y.C. Soh, G.-B. Huang, A constructive enhancement for online sequential extreme learning machine, in: ...

David Lahoz received the B.Sc. degree in Mathematics from the University of Zaragoza, Spain. He is a Ph.D. student and an Assistant Professor in the School of Engineering and Architecture of the University of Zaragoza. His research interests include artificial neural networks and multi-objective evolutionary algorithms.

Beatriz Lacruz is an Associate Professor of Statistics in the University of Zaragoza, Spain. She obtained her Ph.D. degree in Mathematics from Zaragoza University, in 1998. Her research interests include the development of statistical techniques for data mining, pattern recognition and machine learning.

Pedro M. Mateo received his Ph.D. degree in Sciences from the University of Zaragoza, Spain, in 1995. Since 1998 he has been an Associate Professor in the Department of Statistical Methods of the University of Zaragoza, Spain. His current research interests include machine learning, multi-objective evolutionary algorithms and simulation modeling.
