Neurocomputing

Volume 261, 25 October 2017, Pages 144-152

Online sequential ELM algorithm with forgetting factor for real applications

https://doi.org/10.1016/j.neucom.2016.09.121

Abstract

Sequential learning algorithms are a good choice for learning data one-by-one or chunk-by-chunk. Liang et al. proposed the OS-ELM algorithm based on the ordinary ELM algorithm; it produces better generalization performance than other well-known sequential learning algorithms. One deficiency of OS-ELM is that all observations are weighted equally regardless of their acquisition time. However, in many real industrial applications the training data have timeliness. In this paper, we propose a modified online sequential learning algorithm with a forgetting factor (named the WOS-ELM algorithm) that weights new observations more heavily. A convergence analysis is then presented to show that the estimate of the output weights converges at an exponential rate as new observations arrive. The value of the forgetting factor changes automatically with the forecast error, which avoids excessive human intervention. In the simulation part we employ several applications, including time-series prediction, time-variant system identification and a weather forecast problem. The simulation results show that WOS-ELM is more accurate and robust than other sequential learning algorithms.

Introduction

Extreme learning machine (ELM), proposed by Huang in 2006, is a fast machine learning algorithm based on generalized single-hidden-layer feedforward networks (SLFNs) [1]. The key advantage of ELM compared with other well-known neural network algorithms is that the learning parameters of the hidden nodes are generated randomly, without human tuning or iterative optimization [2], [3]. The output weights are then determined by the method of least squares (LS). ELM has been widely used in many real applications, covering both regression and classification problems [4], [5], [6], [7].

In many real applications, data are obtained one by one or chunk by chunk. Online sequential machine learning is a model of induction that learns one instance or a few instances at a time [8], [9]. Liang et al. proposed a fast and accurate online sequential learning algorithm (OS-ELM) for SLFNs based on the ELM network with additive or radial basis function (RBF) hidden nodes [10]. In OS-ELM, newly arriving observations can be trained one-by-one or chunk-by-chunk with fixed or varying chunk size, while the output weights are updated analytically at the same time. Many modified OS-ELM algorithms have since been proposed, such as EOS-ELM [11], OS-ELMK [12] and OL-ELM-TV [13]. However, the online sequential learning methods listed above do not take the timeliness of the data into consideration. Timeliness problems exist extensively in daily life, for example in weather and stock forecasting [14], [15]. As time passes, the distribution of the data changes and exhibits strongly nonstationary behavior. In such cases, old data should contribute less and less, so that the model represents the most recent behavior [16]. Broadly speaking, when training an ELM model we should allocate high weights to new data and low weights to old data.
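
For reference, the analytic update in OS-ELM is the recursive least-squares recursion of Liang et al. [10]. When the (k+1)-th data chunk arrives, with hidden-layer output matrix $H_{k+1}$ and target matrix $T_{k+1}$, the output weights are updated as

$$P_{k+1}=P_k-P_kH_{k+1}^{\top}\left(I+H_{k+1}P_kH_{k+1}^{\top}\right)^{-1}H_{k+1}P_k,$$

$$\beta^{(k+1)}=\beta^{(k)}+P_{k+1}H_{k+1}^{\top}\left(T_{k+1}-H_{k+1}\beta^{(k)}\right),$$

with $P_0=(H_0^{\top}H_0)^{-1}$ computed from an initialization chunk. Note that every past chunk enters this recursion with equal weight, which is precisely the limitation addressed in this paper.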

There are many ELM-related online learning algorithms aimed at nonstationary applications. FOS-ELM learns sequential data with timeliness, employing a sliding window to limit the active region of the data during acquisition [17]. Zhou employed the same forgetting mechanism in regularized and kernelized ELM algorithms [18]. In addition, Wang proposed the OS-ELMK algorithm and combined it with a sliding window for nonstationary time-series prediction [19]. As new observations arrive, the sliding window moves forward in order to forget the 'old' samples. Another strategy for dealing with nonstationary data is the introduction of a forgetting factor; in the limit, the sliding-window method can be seen as a special case of the forgetting-factor method. Matias introduced the forgetting factor into the OS-ELM algorithm [20]; however, the authors did not present a method for choosing an appropriate value for it. Lim presented a relatively complex mechanism for determining the value of the forgetting factor based on gradient descent [21]. Its additional computational cost grows rapidly with the number of hidden nodes, which is time-consuming and cannot meet the needs of online implementation.
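
To make the contrast concrete, the standard exponentially weighted least-squares objective (shown here as a generic sketch; the paper's exact formulation is not reproduced in this excerpt) down-weights a sample acquired at time $i$ by $\lambda^{k-i}$ at the current time $k$:

$$J_k(\beta)=\sum_{i=1}^{k}\lambda^{k-i}\left\|t_i-h(x_i)^{\top}\beta\right\|^2,\qquad 0<\lambda\le 1,$$

where $h(x_i)$ is the hidden-layer output vector for sample $x_i$. Setting $\lambda=1$ recovers the equal weighting of ordinary OS-ELM, while a sliding window of length $W$ corresponds to the limiting weighting that assigns 1 to the last $W$ samples and 0 to all earlier ones.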

In this paper, we propose a novel modified online sequential ELM algorithm named WOS-ELM. The WOS-ELM algorithm introduces a forgetting factor into the performance index: during online sequential learning, old data are gradually forgotten, while newly arriving data receive more emphasis. We then present a convergence theorem which ensures that the estimate of the output weights converges to the true value at an exponential rate as new observations arrive. In addition, the forgetting factor can be made variable, adapting automatically to the output prediction error so that the model keeps the output error fluctuating around a set point. This automatic updating strategy is simple and cheap to implement, so it does not compromise the rapid training speed of the ELM algorithm. Finally, inspired by the introduction of the forgetting factor, we present a mechanism for dealing with contaminated industrial data. More details are discussed in Section 4.

In the simulation section, we carry out three experiments to verify the performance of the WOS-ELM algorithm. First, WOS-ELM is applied to a time-series prediction problem (the Mackey-Glass time-series application) alongside two other well-known online learning algorithms (GGAP-RBF [22] and MRAN [23]). Second, we employ the WOS-ELM algorithm in a time-variant system identification problem. Finally, a real-world application, weather forecasting, is considered, where WOS-ELM is employed to forecast the next day's average temperature. The simulations show that the WOS-ELM algorithm produces more accurate and robust results, and its fast training speed can satisfy the demands of real online implementation.

The paper is organized as follows: Section 2 reviews the ordinary ELM and OS-ELM algorithms. The basic theory of the WOS-ELM algorithm is presented in Section 3, and some discussions are given in Section 4. Sections 5 and 6 present the simulation results and conclusions, respectively.

Section snippets

ELM

The ELM algorithm was originally proposed by Huang for generalized single-hidden-layer feedforward networks. ELM avoids human tuning by randomly initializing the SLFN learning parameters; the output weights can then be determined by the least-squares method [1], [24].

Given a training set consisting of $N$ arbitrary distinct samples $S=\{(x_i,t_i)\mid x_i\in\mathbb{R}^n,\ t_i\in\mathbb{R}^m,\ i=1,2,\ldots,N\}$, the SLFN network function with $\tilde{N}$ hidden nodes can be formulated as
$$f_{\tilde{N}}(x_i)=\sum_{j=1}^{\tilde{N}}\beta_j G(a_j,b_j,x_i)=t_i,\qquad i=1,2,\ldots,N,$$
where $a_j$ and $b_j$ are the randomly generated hidden-node learning parameters and $\beta_j$ is the output weight of the $j$-th hidden node.
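
As a concrete illustration, here is a minimal batch-ELM sketch in Python/NumPy. This is our illustration rather than the paper's code; the function names and the choice of a sigmoid activation are assumptions.

```python
import numpy as np

def elm_train(X, T, n_hidden, seed=0):
    """Basic ELM training: random hidden parameters, least-squares output weights.

    X : (N, n) input matrix, T : (N, m) target matrix.
    Returns (A, b, beta): hidden weights a_j, biases b_j, output weights beta_j.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[1], n_hidden))  # random input weights a_j
    b = rng.standard_normal(n_hidden)                # random biases b_j
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))           # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # LS solution via pseudoinverse
    return A, b, beta

def elm_predict(X, A, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))
    return H @ beta
```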

WOS-ELM

In this section, we present a brief introduction to the proposed WOS-ELM algorithm. WOS-ELM aims at the following three aspects (a minimal sketch of the resulting sequential update is given after the list):

  1.

    Place more emphasis on new observations and gradually forget old ones. Generally, as new observations arrive, the contribution of old samples to the model becomes smaller and smaller, so the trained model can closely track changes in the data distribution.

  2.

    Improve computational efficiency and reduce training time and manual intervention.
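
As referenced above, here is a minimal sketch of a WOS-ELM-style sequential update. It assumes the standard exponentially weighted RLS recursion applied to the ELM hidden-layer outputs; the paper's exact derivation is not reproduced in this snippet, and the class and parameter names are ours.

```python
import numpy as np

class WOSELM:
    """Sketch of a forgetting-factor sequential ELM update (assumed standard
    exponentially weighted RLS on the hidden-layer outputs; illustrative only)."""

    def __init__(self, A, b, beta0, P0, lam=0.98):
        self.A, self.b = A, b   # fixed random hidden-node parameters
        self.beta = beta0       # initial output weights, e.g. from a small batch
        self.P = P0             # inverse correlation matrix estimate
        self.lam = lam          # forgetting factor, 0 < lam <= 1

    def _hidden(self, x):
        return 1.0 / (1.0 + np.exp(-(x @ self.A + self.b)))

    def update(self, x, t):
        h = self._hidden(x)                             # hidden output, (n_hidden,)
        Ph = self.P @ h
        k = Ph / (self.lam + h @ Ph)                    # gain vector
        e = t - h @ self.beta                           # a-priori prediction error
        self.beta = self.beta + np.outer(k, e)          # correct output weights
        self.P = (self.P - np.outer(k, Ph)) / self.lam  # inflate P: forget old data
        return e
```

Dividing P by lam at every step is what keeps the weight of old samples decaying geometrically, so the estimator stays responsive to changes in the data distribution.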

Further discussion

A) In WOS-ELM, the forgetting factor plays a significant role in the online learning network. One should set a suitable value for the forgetting factor before training the model. Here we make the forgetting factor vary with the forecast error ef. When the norm of the forecast error ‖ef‖ becomes smaller than the setpoint error εfe, the trained model is adapting well to the new data; consequently, the forgetting factor λ should tend to 1 along a special curve. On the contrary, larger
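
The snippet above is cut off mid-sentence, but the described behavior can be summarized as follows: λ is pushed toward 1 when the forecast error is below the setpoint and reduced when it is above. One illustrative realization of such a rule (an assumption on our part; the 'special curve' mentioned in the text is not given in this excerpt):

```python
import numpy as np

def update_lambda(lam, e_f, eps_fe, lam_min=0.90, gamma=0.05):
    """Illustrative error-driven adaptation of the forgetting factor.

    lam    : current forgetting factor
    e_f    : current forecast error vector
    eps_fe : setpoint error threshold
    """
    if np.linalg.norm(e_f) < eps_fe:
        lam += gamma * (1.0 - lam)       # model tracks well: relax toward 1
    else:
        lam -= gamma * (lam - lam_min)   # large error: forget old data faster
    return lam
```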

Simulation results

The performance of the proposed WOS-ELM algorithm is evaluated in three applications in this section. First, WOS-ELM is verified on the Mackey-Glass chaotic time-series problem. Second, we employ WOS-ELM in the Narendra system identification problem. In these two experiments, some measures have been taken to make the data series time-variant and nonstationary. Finally, WOS-ELM is applied to a real-world application: weather forecasting. The sigmoidal additive activation
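
For reference, the Mackey-Glass benchmark is generated by the delay differential equation $\dot{x}(t)=a\,x(t-\tau)/(1+x(t-\tau)^{10})-b\,x(t)$. A simple Euler-discretized generator is sketched below; the parameter values are the common defaults and not necessarily those used in the paper.

```python
import numpy as np

def mackey_glass(n_steps, tau=17.0, a=0.2, b=0.1, dt=1.0, x0=1.2):
    """Generate a Mackey-Glass time series by Euler integration.

    dx/dt = a*x(t - tau) / (1 + x(t - tau)**10) - b*x(t)
    """
    history = int(tau / dt)             # number of delayed steps
    x = np.full(n_steps + history, x0)  # constant initial history
    for t in range(history, n_steps + history - 1):
        x_tau = x[t - history]
        x[t + 1] = x[t] + dt * (a * x_tau / (1.0 + x_tau**10) - b * x[t])
    return x[history:]
```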

Conclusions

This paper introduces the forgetting factor into the OS-ELM algorithm and proposes the WOS-ELM algorithm. Compared with ordinary online sequential ELM learning algorithms, the WOS-ELM algorithm can be employed effectively for nonstationary processes in the real world. In order to eliminate human interference, the forgetting factor varies automatically based on the output prediction error. We employ several datasets, including time-series prediction, time-variant system identification and a


References (35)

  • H.G. Zhang et al.

    An improved ELM algorithm for the measurement of hot metal temperature in blast furnace

    Neurocomputing

    (2016)
  • G.B. Huang et al.

    Extreme learning machine for regression and multiclass classification

    IEEE Trans. Syst. Man Cybern. Part B Cybern.

    (2012)
  • G.B. Huang

    An insight into extreme learning machines: random neurons, random features and kernels

    Cognit. Comput.

    (2014)
  • J. Yu et al.

    An enhanced online sequential extreme learning machine algorithm

    Proceedings of the 2008 China Control and Decision Conference, Shandong, China, 2–4 July

    (2008)
  • N. Liu et al.

    Ensemble based extreme learning machine

    IEEE Signal Process. Lett.

    (2010)
  • N.Y. Liang et al.

    A fast and accurate online sequential learning algorithm for feedforward networks

    IEEE Trans. Neural Netw.

    (2006)
  • C. Cingolani et al.

    An extreme learning machine approach for training time variant neural networks

    Proceedings of the 2008 IEEE Asia Pacific Conference on Circuits and Systems, Macao, 30 November–3 December

    (2008)

Haigang Zhang received the B.S. degree from the School of Electronic and Information Engineering, University of Science and Technology Liaoning, in 2012. He is now a Ph.D. candidate in control science and engineering at the University of Science and Technology Beijing. His research interests include machine learning and its applications to control.

Sen Zhang received the Ph.D. degree in Electrical Engineering from Nanyang Technological University in 2005. She worked as a postdoctoral research fellow at the National University of Singapore and as a lecturer-in-charge at Singapore Polytechnic. She is currently an associate professor in the School of Automation and Electrical Engineering at the University of Science and Technology Beijing. Her research interests include ELM, target tracking and estimation theory.

Yixin Yin received the Ph.D. degree in Electrical Engineering from the University of Science and Technology Beijing in 2002. He is a full professor in the School of Automation and Electrical Engineering at the University of Science and Technology Beijing. His research interests include control theory and its applications.

This work has been supported by the National Natural Science Foundation of China (NSFC grant nos. 61333002, 61673056, 61673055 and 61671054).
