Elsevier

Neurocomputing

Volume 174, Part A, 22 January 2016, Pages 168-178

A semi-supervised online sequential extreme learning machine method

https://doi.org/10.1016/j.neucom.2015.04.102

Abstract

This paper proposes a learning algorithm called Semi-supervised Online Sequential ELM, denoted SOS-ELM. It aims to provide a solution for streaming-data applications by learning from just the newly arrived observations, called a chunk. In addition, SOS-ELM can utilize both labeled and unlabeled training data by combining the advantages of two existing algorithms: Online Sequential ELM (OS-ELM) and Semi-Supervised ELM (SS-ELM). The rationale behind our algorithm is to exploit the optimality condition that balances empirical risk and structural risk, as used by SS-ELM, in combination with block matrix calculations similar to OS-ELM. Efficient implementation of the SOS-ELM algorithm is made viable by the additional assumption that there is negligible structural relationship between chunks from different times. Experiments have been performed on standard benchmark problems for regression, balanced binary classification, unbalanced binary classification and multi-class classification, comparing the performance of the proposed SOS-ELM with OS-ELM and SS-ELM. The experimental results show that SOS-ELM outperforms OS-ELM in generalization performance at a similar training speed, and also outperforms SS-ELM with much lower supervision overheads.

Introduction

In recent years, the Extreme Learning Machine (ELM), proposed by Huang et al. [1], [2], [3], [4], has attracted more and more attention because of its outstanding training speed, prediction accuracy and generalization ability [5], [6], [7], [8], [9], [10]. In particular, it has been shown that ELM tends to outperform the support vector machine (SVM) in both regression and classification applications, with much easier implementation [11]. However, batch ELM is still time consuming, although much faster than traditional learning algorithms including SVM. The time cost arises mainly in two aspects: (1) matrix inversion, whose computational complexity is between quadratic and cubic in the training data size [12]; (2) output weight updating, which consumes considerable time because batch ELM must recompute over the old data combined with the new data whenever a new chunk of training data is received. Online Sequential ELM (OS-ELM) [13] adapted batch ELM to overcome these problems in practical applications. OS-ELM learns the training data chunk by chunk and updates the output weights using only the new training data. Thus OS-ELM not only saves storage but also decreases the computational complexity. In addition, unlike other sequential learning algorithms that have many control parameters to tune, OS-ELM only requires the number of hidden nodes to be specified. In short, OS-ELM offers clear advantages over many traditional sequential learning algorithms, such as SGBP [14], RAN [15], RANEKF [16], MRAN [17], [18], GAP-RBF [19] and GGAP-RBF [20].
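The chunk-by-chunk weight update that lets OS-ELM avoid full recomputation can be sketched as a recursive least-squares step. The sketch below is our own illustration with toy sizes and variable names, not code from the paper; it assumes a sigmoid hidden layer with randomly fixed input weights, as is standard for ELM.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 5, 10
W = rng.standard_normal((n_in, n_hidden))   # random input weights, never retrained
b = rng.standard_normal(n_hidden)           # random hidden biases

def hidden(X):
    """Sigmoid hidden-layer output matrix H."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

# Initialization phase: batch least squares on the first chunk.
X0 = rng.standard_normal((40, n_in))
T0 = rng.standard_normal((40, 1))
H0 = hidden(X0)
P = np.linalg.inv(H0.T @ H0)        # needs at least n_hidden initial samples
beta = P @ H0.T @ T0

# Sequential phase: fold in a new chunk without storing or revisiting old data.
X1 = rng.standard_normal((15, n_in))
T1 = rng.standard_normal((15, 1))
H1 = hidden(X1)
G = np.linalg.inv(np.eye(len(X1)) + H1 @ P @ H1.T)
P = P - P @ H1.T @ G @ H1 @ P                  # Woodbury-style covariance update
beta = beta + P @ H1.T @ (T1 - H1 @ beta)      # weights updated from the new chunk only
```

Because the recursion is exact, the sequentially updated `beta` reproduces the batch least-squares fit over all data seen so far, which is why OS-ELM matches batch ELM in accuracy while touching each chunk once.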

Although OS-ELM has advantages in generalization performance and learning speed, it still cannot avoid a dependency on a large amount of labeled data, which usually involves high cost in labor and time. Compared with OS-ELM, semi-supervised learning methods benefit from utilizing unlabeled data, reducing the need for labeled data. Compared with labeled data, unlabeled data are much easier and cheaper to acquire. Semi-supervised learning therefore provides an effective solution for problems with only a small amount of labeled samples in various classification and regression tasks. To exploit unlabeled data, several semi-supervised ELM variants [21], [22], [23], [24] have been proposed. As typical examples, [22], [24] propose semi-supervised ELMs based on manifold regularization, so that the learning system can balance the empirical risk against the complexity of the learned function f; [22] improves on [24] in terms of semi-supervised ELM and achieves good prediction accuracy. However, the semi-supervised ELMs mentioned above learn in a batch manner, so their training time grows rapidly as the sample size increases.
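For concreteness, the manifold-regularized training used by this family of semi-supervised ELMs reduces to a single regularized least-squares solve. The sketch below is a simplified illustration of ours, not the exact formulation of [22] or [24]: the Gaussian-similarity graph, the sizes and the penalty layout are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_labeled, n_in, n_hidden, lam = 60, 12, 4, 10, 0.1

X = rng.standard_normal((n, n_in))          # labeled rows first, then unlabeled
T = np.zeros((n, 1))
T[:n_labeled] = rng.standard_normal((n_labeled, 1))

W = rng.standard_normal((n_in, n_hidden))   # random, fixed input weights
b = rng.standard_normal(n_hidden)
H = 1.0 / (1.0 + np.exp(-(X @ W + b)))      # hidden-layer outputs for all samples

# Graph Laplacian L = D - S over ALL samples: unlabeled data enters only here.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
S = np.exp(-d2 / (2.0 * d2.mean()))
L = np.diag(S.sum(axis=1)) - S

# Empirical-risk penalty applies to the labeled rows only.
C = np.zeros((n, n))
C[:n_labeled, :n_labeled] = np.eye(n_labeled)

# Output weights balance empirical risk (C term) against manifold smoothness (L term).
beta = np.linalg.solve(np.eye(n_hidden) + H.T @ C @ H + lam * H.T @ L @ H,
                       H.T @ C @ T)
```

The H.T @ L @ H term is also what makes batch training expensive: the Laplacian couples every pair of samples, so the cost grows quickly with the dataset size, which motivates the sequential variant proposed in this paper.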

To solve the manual labeling cost problem and meet the demand for sequential learning in many real applications, we propose a new type of online sequential ELM, which we name the semi-supervised online sequential ELM (SOS-ELM). It inherits not only the training and testing speed of OS-ELM but also the prediction accuracy of SS-ELM. The experimental results show that, using the same number of labeled samples, our proposed SOS-ELM achieves higher prediction accuracy than OS-ELM and much faster training than SS-ELM.

The details of our proposed SOS-ELM are elaborated in the remainder of the paper, which is organized as follows. Section 2 gives a brief review of batch ELM, OS-ELM and SS-ELM. Section 3 presents the derivation of SOS-ELM. Section 4 presents the experimental results and discussion on benchmark problems in regression and classification. Conclusions and future work are given in Section 5.

Section snippets

Review of ELM, SS-ELM and OS-ELM

This section briefly reviews batch ELM, OS-ELM and SS-ELM, which are the foundations of our extended algorithm, SOS-ELM. It covers their motivation, modeling and algorithm steps.

Proposed SOS-ELM

Both SS-ELM and OS-ELM improve the performance of basic ELM from different points of view. However, SS-ELM trains with both labeled and unlabeled data in a batch manner, whilst OS-ELM utilizes only labeled data. In practical applications it is clearly desirable both to process sequential data and to make use of unlabeled data. Therefore, this paper proposes to integrate both advantages by modifying the SS-ELM algorithm to suit sequential learning; we refer to the new algorithm as SOS-ELM.
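Under the simplifying assumption stated in the abstract (negligible structural relationship between chunks from different times), the combined update can be sketched as a per-chunk accumulation of both risk terms. This is our own illustrative reconstruction, not the paper's published algorithm: the per-chunk Gaussian-similarity Laplacian, the chunk sizes and the label layout are placeholders.

```python
import numpy as np

def chunk_laplacian(X):
    """Gaussian-similarity graph Laplacian built within one chunk only."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    S = np.exp(-d2 / (2.0 * d2.mean()))
    return np.diag(S.sum(axis=1)) - S

rng = np.random.default_rng(2)
n_in, n_hidden, lam = 3, 8, 0.1
W = rng.standard_normal((n_in, n_hidden))   # random, fixed input weights
b = rng.standard_normal(n_hidden)
hidden = lambda X: 1.0 / (1.0 + np.exp(-(X @ W + b)))

K = np.eye(n_hidden)            # accumulated regularized Gram matrix (ridge term included)
q = np.zeros((n_hidden, 1))     # accumulated right-hand side, kept here for checking
beta = np.zeros((n_hidden, 1))

for _ in range(3):              # three arriving chunks, each only partly labeled
    X = rng.standard_normal((20, n_in))
    T = np.zeros((20, 1))
    T[:5] = rng.standard_normal((5, 1))      # only 5 rows of each chunk carry labels
    C = np.zeros((20, 20))
    C[:5, :5] = np.eye(5)                    # empirical risk on labeled rows only
    H = hidden(X)
    # Chunk's combined empirical + structure risk; no cross-chunk Laplacian terms,
    # which is exactly what the negligible-inter-chunk-structure assumption buys.
    A = H.T @ C @ H + lam * H.T @ chunk_laplacian(X) @ H
    K = K + A
    q = q + H.T @ C @ T
    beta = beta + np.linalg.solve(K, H.T @ C @ T - A @ beta)
```

A short induction shows the incremental update keeps `beta` equal to the direct solution `K⁻¹q` of the accumulated system after every chunk, so old chunks never need to be stored, mirroring the storage and speed benefits of OS-ELM while retaining the SS-ELM-style structure-risk term.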

Introduction of datasets used for performance evaluation

In this section, we systematically evaluate the performance of our proposed SOS-ELM on regression and classification benchmark problems, comparing it with SS-ELM and OS-ELM. The benchmark datasets used in the paper are listed in Table 1, together with the numbers of labeled, unlabeled and testing samples and other specifications used in our experiments: the number of attributes representing the input and, for classification problems, the number of classes.

Conclusion and future work

In this paper, a new algorithm in the ELM family called semi-supervised OS-ELM (SOS-ELM) is proposed. This algorithm not only handles data arriving chunk by chunk, like OS-ELM, but also reduces the requirement for labeled data and improves performance by utilizing unlabeled data. The performance of SOS-ELM is evaluated against OS-ELM and SS-ELM on real-world benchmark datasets for regression and classification problems. The results demonstrate that the proposed SOS-ELM outperforms OS-ELM in generalization performance at a similar training speed, and outperforms SS-ELM with much lower supervision overheads.

Acknowledgments

This research is partly supported by the Natural Science Foundation of China (Nos. 61375059, 61175115), the Beijing Natural Science Foundation (Nos. 4122004, 4152005), the Specialized Research Fund for the Doctoral Program of Higher Education (20121103110031), the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions (CIT&TCD201304035), and the Special training program for construction of teachers of Beijing High education – 2014


References (32)

  • G.-B. Huang et al., Extreme learning machine for regression and multiclass classification, IEEE Trans. Syst. Man Cybern. Part B: Cybern. (2012)
  • G.-B. Huang, An insight into extreme learning machines: random neurons, random features and kernels, Cognit. Comput. ...
  • Z. Bai, G.-B. Huang, D. Wang, H. Wang, M.B. Westover, Sparse extreme learning machine for ...
  • N.-Y. Liang et al., A fast and accurate online sequential learning algorithm for feedforward networks, IEEE Trans. Neural Netw. (2006)
  • Y.A. LeCun, L. Bottou, G.B. Orr, K.-R. Müller, Efficient backprop, Neural Networks: Tricks of the Trade, Springer, ...
  • J. Platt, A resource-allocating network for function interpolation, Neural Comput. (1991)

    Xibin Jia, born in 1969, received Ph.D. degree in computer science and technology from Beijing University of Technology in 2007, M.S. degree in intelligent instrument from North China Institute of Technology in 1996 and B.S. degree in wireless technology from Chongqing University in 1991. She is an Associate Professor in the College of Computing at the Beijing University of Technology in Beijing, China. Her areas of interest include visual information cognition, and multi-information fusion, especially for facial expression recognition and visual speech recognition.

    Runyuan Wang, born in 1991, received B.S. degree from University of Science and Technology, Beijing in 2012. He is currently pursuing the M.S. degree in computer science at the Beijing University of Technology. His research interests include pattern recognition and applications of computer vision to multimedia.

    Junfa Liu, born in 1973, received Ph.D. degree in Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS) in 2009. He is an Associate Researcher in ICT, CAS now, and his areas of interest include pervasive computing, virtual reality and data mining.

    David M.W. Powers is a Professor of Computer Science and Director of the AI Lab and the Centre for Knowledge and Interaction Technology (KIT) at Flinders University. He specializes in applications of unsupervised learning to language and speech processing. Dr Powers undertook his Ph.D. in this area, as well as co-founding ACL's SIGNLL and CoNLL. He is also a trader and holds a Diploma in Technical Analysis, the study of how to find and exploit an edge in the financial markets.
