Neurocomputing
Volume 111, 2 July 2013, Pages 90-103

A multi-objective micro genetic ELM algorithm

https://doi.org/10.1016/j.neucom.2012.11.035

Abstract

The extreme learning machine (ELM) is a methodology for learning single-hidden layer feedforward neural networks (SLFN) which has been proved to be extremely fast and to provide very good generalization performance. ELM works by randomly choosing the weights and biases of the hidden nodes and then analytically obtaining the output weights and biases for a SLFN with a previously fixed number of hidden nodes. In this work, we develop a multi-objective micro genetic ELM (μG-ELM) which provides the appropriate number of hidden nodes for the problem being solved, as well as the weights and biases which minimize the mean square error (MSE). The multi-objective algorithm is guided by two criteria: the number of hidden nodes and the MSE. Furthermore, as a novelty, μG-ELM incorporates a regression device in order to decide whether the number of hidden nodes of the individuals of the population should be increased, decreased, or left unchanged. In general, the proposed algorithm achieves lower errors while also requiring fewer hidden nodes, over the data sets and competitors considered.

Introduction

Over the last twenty years, artificial neural networks (ANN) have allowed researchers to model and solve real problems that cannot be easily solved with classical mathematical tools (due to a large number of data and/or variables, non-linear relations, etc.).

Historically, back-propagation (BP) algorithms have been the most widely used for training SLFNs. In order to set all the weights and biases, from input to hidden nodes and from hidden to output nodes, they require many parameters, they are slow, and the training phase frequently has to be repeated.

However, the ELM does not need to adjust the hidden weights and biases in a computationally expensive way, since it chooses them at random. Then, with these weights and biases, the first part of the SLFN is evaluated. Because the activation function in the output layer is linear, the resulting values allow us to define a linear system of equations whose solutions are the output weights and biases of the SLFN. This linear system is easily solved through a simple generalized inverse calculation. To justify the correctness of the ELM algorithm, it has been proved in [1] that, under certain regularity conditions, given a set of patterns with N records and any small positive value ϵ>0, there exists a number of hidden nodes Ñ, with Ñ ≤ N, such that, after applying the above process, the learning approximation error of the SLFN is less than ϵ with probability one. Since the publication of the original ELM, a great number of different alternatives have appeared: I-ELM [1], OS-ELM [2], EI-ELM [3], OP-ELM [4], EM-ELM [5], EOS-ELM [6], CEOS-ELM [7], TS-ELM [8], TROP-ELM [9] and EELM [10], some of which we will use in order to carry out our comparisons in Section 5.
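As a minimal sketch of this training scheme (not the authors' implementation), the hidden-layer parameters can be drawn at random and the output weights recovered with the Moore-Penrose generalized inverse; the sigmoid activation and the uniform sampling range used below are illustrative assumptions.

```python
# Minimal ELM sketch: random hidden weights/biases, analytic output weights.
import numpy as np

def elm_train(X, T, n_hidden, rng=np.random.default_rng(0)):
    """Fit a SLFN with n_hidden sigmoid hidden nodes on inputs X and targets T."""
    n_features = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                             # solve H @ beta = T
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta                                          # linear output layer
```

With X of shape (N, d) and targets T of shape (N, m), the training MSE that the genetic search later minimizes would be np.mean((elm_predict(X, W, b, beta) - T) ** 2).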

On the other hand, evolutionary algorithms (EAs) are search and optimization methods, based on the principles of natural evolution and genetics, that try to approximate the optimal solution of a given problem [11]. The general schema of an evolutionary algorithm is shown in Procedure 1 (see also the sketch after it). Its main sub-processes are the variation, the evaluation and the selection of solutions. EAs in general, and multi-objective EAs [12] in particular, have shown their great potential during the last two decades. A list of more than 6500 references on evolutionary multi-objective optimization is maintained by Coello [13].

Procedure 1

General schema of an evolutionary algorithm.
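As a hedged illustration of this schema, the loop can be written as follows; the operators are generic placeholders (assumed to take and return lists), not the ones used later by μG-ELM.

```python
# Generic evolve/evaluate/select loop corresponding to Procedure 1.
def evolutionary_algorithm(init_population, evaluate, select, vary, n_iterations):
    population = init_population()
    fitness = [evaluate(ind) for ind in population]
    for _ in range(n_iterations):
        parents = select(population, fitness)       # selection of solutions
        offspring = vary(parents)                   # variation: crossover and mutation
        off_fitness = [evaluate(ind) for ind in offspring]
        # Survival selection: keep the best individuals overall,
        # assuming a scalar fitness to be minimized.
        pool = sorted(zip(population + offspring, fitness + off_fitness),
                      key=lambda pf: pf[1])
        population, fitness = map(list, zip(*pool[:len(population)]))
    return population, fitness
```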

In this paper we develop an improved version of our micro genetic algorithm, μG-ELM. We introduced the original version in [14], where our main contribution was the development of a bi-objective genetic algorithm which simultaneously determined the appropriate number of hidden nodes of the artificial neural network as well as the corresponding set of weights and biases, unlike other approaches, for instance [15] and [16], which only use the approximation error to guide the search for the appropriate weights and biases given a fixed number of hidden nodes. In order to conduct the search for the appropriate number of hidden nodes, we proposed a new strategy based on regression, by means of which the algorithm decides, in each iteration, to increase, to decrease, or to leave unchanged the number of hidden nodes of the artificial neural networks in the current population being evolved. Furthermore, we developed a new mutation operator for the number of hidden nodes and a survival selection operator, both specially designed for our algorithm.

The main goal of this paper is to examine the development of the μG-ELM algorithm in greater depth. Since the regression strategy uses the information of certain solutions, and this information can be used with or without smoothing, in this paper we propose and study new alternatives for the original regression device, resulting from the combination of three ways of selecting the solutions and three ways of preprocessing their information. We determine the best one, and the selected alternative is then compared with other usual ELM algorithms in the literature. Furthermore, several elements of the original algorithm have been improved: the regression device considers more types of relations (Section 4), and the tournament selection process used to build the micro population to be evolved now depends on the result of the regression device (Section 3.4).

The paper is organized as follows. In Section 2 we present the preliminaries and notation for working with SLFNs and we introduce the bi-objective optimization problem whose solutions we approximate. Section 3 presents the μG-ELM algorithm with a detailed description of its steps. Section 4 is devoted to describing the regression device under several alternatives. Section 5 shows two experiments: the first is a detailed study of the performance of our algorithm under several choices; in the second, we compare our μG-ELM, under the best implementation selected as a result of the first experiment, with the chosen competitors. Summary and concluding remarks are provided in Section 6.

Section snippets

Preliminaries and notation

In this section we present the basic elements and notation for understanding the ELM methodology, and we introduce the bi-objective optimization problem which is used to guide the search for the "best SLFNs".

The micro genetic ELM algorithm

In this section we develop our micro genetic ELM algorithm. Procedure 2 shows the schema of the algorithm, where lines 4–9 constitute its core. Most steps of this schema require several lines of code, which are summarized in the corresponding procedures.

Procedure 2 starts by building an initial population P1, whose elements are defined in Section 3.1 and whose construction is described in Section 3.2. This population is evaluated on both the training and the test sets using Eq. (7) (the division of the original…
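A hedged, illustrative sketch of this outer loop (not Procedure 2 itself) is given below, reusing elm_train and elm_predict from the earlier sketch. The decide_direction stub stands in for the regression device of Section 4; the mutation step, the survival selection and the parameter values are all assumptions for illustration.

```python
import numpy as np

def decide_direction(nodes, errors):
    # Stub for the regression device of Section 4 (sketched in that section).
    return False, False

def micro_g_elm(X_train, T_train, n_iter=25, pop_size=25, max_nodes=100,
                rng=np.random.default_rng(0)):
    # Each individual is summarized here by its number of hidden nodes; the
    # weights and biases are drawn and resolved by elm_train (earlier sketch).
    population = [int(rng.integers(1, max_nodes + 1)) for _ in range(pop_size)]

    def evaluate(n_hidden):
        W, b, beta = elm_train(X_train, T_train, n_hidden, rng)
        return float(np.mean((elm_predict(X_train, W, b, beta) - T_train) ** 2))

    errors = [evaluate(n) for n in population]
    for _ in range(n_iter):
        add, subtract = decide_direction(population, errors)
        drift = 1 if add else (-1 if subtract else 0)
        # Mutate the number of hidden nodes, biased by the regression decision.
        offspring = [max(1, min(max_nodes, n + drift + int(rng.integers(-2, 3))))
                     for n in population]
        off_errors = [evaluate(n) for n in offspring]
        # Survival selection (illustrative): prefer lower MSE, then fewer nodes.
        pool = sorted(zip(population + offspring, errors + off_errors),
                      key=lambda t: (t[1], t[0]))
        population, errors = map(list, zip(*pool[:pop_size]))
    return population, errors
```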

Setting the variables add and subtract

The only thing that we have omitted in the description of the algorithm is the way in which the variables add and subtract are set. In this section we present the regression device used for this purpose.

In each iteration the algorithm has to decide whether the current population, as a whole, should tend towards neural networks with a higher number of hidden nodes, with a lower number of hidden nodes, with a specific number of hidden nodes, or with no changes in that number. All these…
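As an illustration only, one plausible minimal form of such a device is sketched below: fit a linear regression of the individuals' MSE on their number of hidden nodes, and let the sign of the slope set add and subtract. The linear form, the tolerance tol and the degenerate-case handling are assumptions; the actual device also considers other types of relations (Section 4).

```python
import numpy as np

def decide_direction(nodes, errors, tol=1e-6):
    """Set (add, subtract) from the slope of a linear fit of MSE vs. hidden nodes."""
    if len(set(nodes)) < 2:           # degenerate population: no trend to read
        return False, False
    slope = np.polyfit(nodes, errors, deg=1)[0]
    if slope < -tol:
        return True, False            # MSE falls as nodes grow: add hidden nodes
    if slope > tol:
        return False, True            # MSE rises as nodes grow: remove hidden nodes
    return False, False               # flat relation: leave the number unchanged
```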

The experiments

We now assess the performance of the proposed methodology by carrying out two experiments. The first experiment is devoted to deciding which combination of the two factors considered in the previous section exhibits better performance; in the second experiment, we compare the behavior of our algorithm with some ELM competitors.

For both experiments we use the data sets shown, together with their main characteristics, in Table 1. These sets are classified according to their type of…

Conclusions

In this paper we have proposed a new bi-objective genetic algorithm, μG-ELM, which obtains the appropriate number of hidden nodes as well as the appropriate weights and biases in one execution of the algorithm. For carrying out this goal it solves a bi-objective optimization problem which simultaneously minimizes the MSE and the number of hidden nodes. Furthermore, as a novelty, a regression device which combines the MSE value and the number of hidden nodes of the individuals of the population…

Acknowledgment

The authors are grateful to both the editor and referees for their comments and suggestions, which greatly improved the presentation of this paper. This research has been supported by the University of Zaragoza under Grant UZ2010-CIE06 and research groups of Gobierno de Aragón E58 and E22.


References (25)

  • G. Feng et al., Error minimized extreme learning machine with growth of hidden nodes and incremental learning, IEEE Trans. Neural Networks (2009).
  • Y. Lan, Y.C. Soh, G.-B. Huang, A constructive enhancement for online sequential extreme learning machine, in: ...

David Lahoz received the B.Sc. degree in Mathematics from the University of Zaragoza, Spain. He is a Ph.D. student and an Assistant Professor in the School of Engineering and Architecture of the University of Zaragoza. His research interests include artificial neural networks and multi-objective evolutionary algorithms.

Beatriz Lacruz is an Associate Professor of Statistics in the University of Zaragoza, Spain. She obtained her Ph.D. degree in Mathematics from Zaragoza University, in 1998. Her research interests include the development of statistical techniques for data mining, pattern recognition and machine learning.

Pedro M. Mateo received his Ph.D. degree in Sciences from the University of Zaragoza, Spain, in 1995. Since 1998 he has been an Associate Professor in the Department of Statistical Methods of the University of Zaragoza, Spain. His current research interests include machine learning, multi-objective evolutionary algorithms and simulation modeling.
