Neurocomputing, Volume 128, 27 March 2014, Pages 50-58

An incremental extreme learning machine for online sequential learning problems

https://doi.org/10.1016/j.neucom.2013.03.055

Abstract

A fast and accurate incremental learning algorithm is required by online applications, where data arrive one by one or chunk by chunk, to avoid retraining and save precious time. Although many interesting research results have been achieved, real applications still face many difficulties because of unsatisfying generalization performance or intensive computational cost. This paper presents an Incremental Extreme Learning Machine (IELM) developed from the Extreme Learning Machine (ELM), a unified framework of LS-SVM and PSVM presented by Huang et al. (2011) in [15]. Under different application demands, computational costs and efficiencies, three alternative solutions of IELM are derived. Detailed comparisons of the IELM algorithm with other incremental algorithms are carried out by simulation on benchmark problems and on a real critical dimension (CD) prediction problem in lithography from an actual semiconductor production line. The results show that the kernel-based IELM solution performs best, while the least-squares IELM solution is the fastest of the three alternative solutions when the number of training data is huge. All the results show that the presented IELM algorithms perform better than other incremental algorithms such as the online sequential ELM (OS-ELM) presented by Liang et al. (2006) [8] and the fixed-size LSSVM presented by Espinoza et al. (2006) [11].

Introduction

Over the past few decades, batch learning algorithms have been discussed and investigated thoroughly, and plenty of interesting research work has been developed, such as the Back-Propagation Neural Network (BP-NN) [1], the Support Vector Machine (SVM) [2], the Least Squares Support Vector Machine (LSSVM) [3], the Proximal Support Vector Machine (PSVM) [4], the Extreme Support Vector Machine (ESVM) [5] and the Extreme Learning Machine (ELM) [6]. Despite these results, real applications still face many difficulties because of the typical character of data obtained from an actual environment: the training data often arrive one by one or chunk by chunk. Retraining on all the data with a batch learning algorithm whenever a new datum arrives is very time consuming and cannot meet the time demands of actual applications. Intensive computational cost and strict time requirements severely restrict these batch learning algorithms in real applications.

Building on these batch learning methods, many online sequential algorithms have been presented to meet actual application demands. In [7], a generalized growing and pruning RBF (GGAP-RBF) network is presented; the authors introduce the significance of the neurons and develop an online sequential algorithm through generalized growing and pruning. In [8], an online sequential extreme learning machine (OS-ELM) is introduced which is much faster and produces better generalization performance than other sequential learning algorithms such as GGAP-RBF. Based on the batch learning algorithm of SVM, an incremental support vector machine is presented in [9]; its basic idea is that a new SVM is built from the newly arrived data together with the previously trained support vectors. The method is called the SV-incremental algorithm in [10]. The fixed-size LSSVM (FS-LSSVM) is developed in [11] for large-scale regression problems: based on the quadratic Renyi entropy criterion, an active support vector selection method is developed, and compared with LSSVM, FS-LSSVM needs far fewer support vectors and produces better performance. In [12], an incremental learning algorithm for ESVM (IESVM) is developed by He et al., and a parallel version of IESVM (PIESVM) is also presented based on the powerful parallel programming framework of MapReduce. The presented PIESVM is much more efficient than ESVM, while the solutions obtained are exactly the same as those of ESVM.

GGAP-RBF and OS-ELM may over-fit because they are modeled on the Empirical Risk Minimization principle. The SV-incremental algorithm and FS-LSSVM are constructed on the Structural Risk Minimization principle, which avoids over-fitting, and they have been very popular in recent years. However, in the SV-incremental algorithm the previous support vectors may have only a little influence on the new SVM [10], so the performance can be unsatisfactory. Also, the computational cost of FS-LSSVM is very high because of the eigenvector and eigenvalue computations. The PIESVM algorithm can efficiently solve large-scale problems and online problems at the same time and has shown great promise in real applications. A new online sequential learning algorithm based on an enhanced extreme learning machine, using the left or right pseudo-inverse, has been presented recently in [13]; it performs much better than other sequential algorithms and is very promising.

Joining ELM and SVM together is an important way to overcome the over-fitting problem of ELM. The first important contribution in this direction is presented in [5] by Liu et al.: the proposed ESVM algorithm is much faster than nonlinear SVM algorithms and obtains better generalization performance than ELM. Similar work on the standard SVM is done by Frénay and Verleysen in [14]. It has been shown that better performance can be achieved by simply replacing the SVM kernels with random ELM kernels in SVMs [5], [14].

The general algorithm is proposed by Huang et al. in [15], where a constrained-optimization-based Extreme Learning Machine is developed as a unified learning framework for LS-SVM, PSVM and other regularization algorithms. Huang et al. also prove in that paper, with both theoretical and simulation results, that LS-SVM and PSVM obtain only suboptimal solutions compared with ELM. Different from other kernel-based algorithms, ELM provides a unified platform with a special mapping referred to as the ELM feature mapping, which is independent of the target values and can be constituted by almost any nonlinear piecewise continuous function. The ELM feature mapping can be regarded as a good dimensionality reduction: its dimension stays fixed while the dimension of the input data increases. The kernel-based ELM is also given in [15] for the case where the feature mapping is unknown.

In this paper, based on the batch-learning idea of ELM [15], an Incremental Extreme Learning Machine (IELM) is presented to meet the demand of actual applications where data come one by one or chunk by chunk. Under different computational costs and efficiencies, three alternative solutions of IELM are derived, referred to as the Minimum Norm Incremental Extreme Learning Machine (MN-IELM), the Least Squares Incremental Extreme Learning Machine (LS-IELM) and the Kernel-Based Incremental Extreme Learning Machine (KB-IELM). All three alternative solutions of IELM are proposed for online industrial applications to avoid retraining on all the data when new data arrive, thus improving the training speed and efficiency.

However, the computational cost differs sharply among the three proposed algorithms under different application conditions, so different algorithms should be selected for different applications; the same suggestion is also made in [15]. When the number of training data is not huge and the dimensionality of the ELM feature space is very large, MN-IELM is preferred, to reduce the computational cost. When the number of training data is very large, for example much larger than the feature space dimensionality, LS-IELM is the better option, to improve computational efficiency. When the feature mapping is unknown, so that MN-IELM and LS-IELM can no longer be used, KB-IELM should be selected. A sketch of this decision rule is given below.
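As an illustration only, the rule of thumb above can be written as a tiny helper. The function name and its exact form are hypothetical; the paper states these criteria in prose, not as code.

```python
def choose_ielm_variant(n_samples, feature_dim=None):
    """Hypothetical helper encoding the selection guidance above.

    feature_dim is the dimensionality L of the ELM feature space;
    pass None when only a kernel (no explicit mapping) is available.
    """
    if feature_dim is None:
        return "KB-IELM"   # feature mapping unknown: use the kernel variant
    if n_samples <= feature_dim:
        return "MN-IELM"   # few samples relative to a large feature space
    return "LS-IELM"       # many samples relative to the feature dimension
```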

The rest of the paper is organized as follows. Section 2 gives a brief review of the unified framework of ELM and the three alternative batch learning algorithms developed from it. Section 3 presents our three incremental ELM algorithms. Simulations are carried out and results are analyzed in Section 4. Conclusions are drawn in Section 5.

Section snippets

Brief review of the unified framework of ELM

Huang et al. propose a unified framework of ELM by introducing a constrained optimization problem, thus bridging the gap between ELM and SVMs. Given the input training dataset $X=\{x_i\}_{i=1}^{N}$ and the corresponding output training dataset $Y=\{y_i\}_{i=1}^{N}$, where $N$ is the total number of training data, and given the ELM mapping $h(x_i)$ and the output weight $W$, Huang et al. formulate the multi-class, multi-output constrained-optimization-based ELM as

Minimize: $L_{P_{\mathrm{ELM}}} = \frac{1}{2}\|W\|^{2} + \nu\,\frac{1}{2}\sum_{i=1}^{N}\|\xi_i\|^{2}$
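For orientation, the closed-form batch solutions of this problem given in [15] can be written in the present notation as follows, with $H=[h(x_1)^{T},\dots,h(x_N)^{T}]^{T}$ the $N\times L$ hidden-layer output matrix. This is a transcription of the standard results from [15], not a new derivation; the pairing with the MN-/LS-/KB-IELM variants follows the selection guidance in the Introduction.

$$W = H^{T}\Big(\tfrac{I}{\nu} + HH^{T}\Big)^{-1}Y \qquad (N\times N\ \text{inverse; the form behind MN-IELM})$$
$$W = \Big(\tfrac{I}{\nu} + H^{T}H\Big)^{-1}H^{T}Y \qquad (L\times L\ \text{inverse; the form behind LS-IELM})$$
$$f(x) = \big[K(x,x_1),\dots,K(x,x_N)\big]\Big(\tfrac{I}{\nu} + \Omega\Big)^{-1}Y,\quad \Omega_{ij}=K(x_i,x_j) \qquad (\text{kernel form; behind KB-IELM})$$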

Incremental ELM

In actual applications, data may come one by one or chunk by chunk, so it is necessary to develop effective online incremental learning algorithms. The three alternative ELM batch learning solutions presented by Huang et al. have been proved better than other algorithms, such as SVM, LS-SVM and PSVM, in both training time and generalization performance. We develop the online incremental learning algorithm based on the batch learning ELM. Three alternative incremental ELM solutions are also
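Since the snippet above is cut off, here is a minimal, self-contained sketch of the kind of chunk-by-chunk update an LS-IELM-style method performs: regularized recursive least squares on a random ELM feature map, using the Woodbury identity so that each update solves only a chunk-sized system. The class and its details (sigmoid map, uniform initialization) are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

class SequentialRegularizedELM:
    """Illustrative LS-IELM-style learner (assumed form, not the paper's code):
    a random ELM feature map plus regularized recursive least squares."""

    def __init__(self, n_inputs, n_hidden, nu=1000.0, seed=None):
        rng = np.random.default_rng(seed)
        self.A = rng.uniform(-1.0, 1.0, (n_inputs, n_hidden))  # random input weights
        self.b = rng.uniform(-1.0, 1.0, n_hidden)              # random biases
        # P tracks (I/nu + H^T H)^{-1}; with no data yet it equals nu * I.
        self.P = nu * np.eye(n_hidden)
        self.W = None                                          # output weights

    def _map(self, X):
        # Sigmoid ELM feature mapping h(x).
        return 1.0 / (1.0 + np.exp(-(np.atleast_2d(X) @ self.A + self.b)))

    def partial_fit(self, X, Y):
        """Absorb one new chunk (X, Y) without revisiting earlier data."""
        H = self._map(X)
        Y = np.atleast_2d(Y).reshape(H.shape[0], -1)
        if self.W is None:
            self.W = np.zeros((H.shape[1], Y.shape[1]))
        # Woodbury identity: only a (chunk x chunk) system is solved per update.
        G = np.linalg.solve(np.eye(H.shape[0]) + H @ self.P @ H.T, H @ self.P)
        self.P -= self.P @ H.T @ G
        self.W += self.P @ H.T @ (Y - H @ self.W)
        return self

    def predict(self, X):
        return self._map(X) @ self.W
```

Calling `partial_fit` once per arriving chunk yields, for this regularized least-squares objective, the same output weights as batch retraining on all data seen so far, which is the point of an incremental formulation.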

Simulation and results

In this section the performance of the presented IELM algorithms is shown by comparison with the OS-ELM algorithm presented by Liang et al. [8] and the fixed-size LS-SVM (FS-LSSVM) presented by Espinoza et al. [11] on a number of benchmark problems and on a real critical dimension (CD) prediction problem in lithography from a semiconductor production line. The benchmark problems are well known and can be conveniently downloaded from the UCI Repository of machine learning databases [19]. The real CD prediction

Conclusions

In this paper, an Incremental Extreme Learning Machine for online sequential learning problems is presented based on ELM. Comparisons with OS-ELM and FS-LSSVM on classification and regression problems are given by simulations on benchmark problems and on a real CD prediction problem from a semiconductor production line. Clearly, all three alternative IELM algorithms achieve better performance than the OS-ELM presented by Liang et al. [8] and the FS-LSSVM presented by Espinoza et al. [11]. KB-IELM

Acknowledgments

The authors would like to thank Guang-Bin Huang of Nanyang Technological University for his constructive suggestions and inspiring tutoring during this research.

This work was supported by the National Natural Science Foundation of China (Nos. 61025018, 60834004, 61021063, 61104172), the National Key Basic Research and Development Program of China (2009CB320602) and the National Science and Technology Major Project of China (2011ZX02504-008).


References (20)

  • G.-B. Huang et al., Extreme learning machine: theory and applications, Neurocomputing (2006).
  • Q. He et al., A parallel incremental extreme SVM classifier, Neurocomputing (2011).
  • S. Haykin, Neural Networks: A Comprehensive Foundation (1999).
  • C. Cortes et al., Support vector networks, Mach. Learn. (1995).
  • J.A.K. Suykens et al., Least squares support vector machine classifiers, Neural Process. Lett. (1999).
  • G. Fung, O.L. Mangasarian, Proximal support vector machine classifiers, in: The 7th ACM SIGKDD International...
  • Q. Liu, Q. He, Z. Shi, Extreme support vector machine classifier, in: Advances in Knowledge Discovery and...
  • G.-B. Huang et al., A generalized growing and pruning RBF (GGAP-RBF) neural network for function approximation, IEEE Trans. Neural Networks (2005).
  • N.-Y. Liang et al., A fast and accurate on-line sequential learning algorithm for feedforward networks, IEEE Trans. Neural Networks (2006).
  • N. Syed, H. Liu, K. Sung, Incremental learning with support vector machines, in: Proceedings of the Workshop on Support...
There are more references available in the full text version of this article.


Lu Guo received his B.Eng. degree from Harbin Institute of Technology in 2001, Harbin, China, and his M.Eng. degree from the Second Academy of China Aerospace in 2004, Beijing, China. Now he is working toward his Ph.D. degree in the Department of Automation, Tsinghua University, Beijing, China.

His research interests include scheduling optimization, extreme learning machines, machine learning and signal processing.

Jing-hua Hao received his Ph.D. degree in Control Science and Engineering from Tsinghua University. His research interests include modeling, simulation and optimization methods for complex manufacturing processes. He has published more than 10 papers and has participated in several projects of national programs of China.

Min Liu received the Ph.D. degree from Tsinghua University, Beijing, China, in 1999. He is currently a Professor with the Department of Automation, Tsinghua University; Associate Director of the Automation Science and Technology Research Department of the Tsinghua National Laboratory for Information Science and Technology; Director of Control and Optimization of Complex Industrial Processes, Tsinghua University; Director of the China National Committee for Terms in Automation Science and Technology; and Director of the Intelligent Optimization Committee of the China Artificial Intelligence Association. His main research interests are in optimization scheduling of complex manufacturing processes and intelligent operational optimization of complex manufacturing processes and equipment. He has led more than 20 important research projects, including projects of the National 973 Program of China, the National Science and Technology Major Project of China, the National Science Fund for Distinguished Young Scholars of China and the National 863 High-Tech Program of China. He has published more than 100 papers and a monograph supported by the National Defense Science and Technology Book Publishing Fund. Dr. Liu won the National Science and Technology Progress Award.
