
Neurocomputing

Volume 219, 5 January 2017, Pages 248-262

Modeling collinear data using double-layer GA-based selective ensemble kernel partial least squares algorithm

https://doi.org/10.1016/j.neucom.2016.09.019

Abstract

Collinear and nonlinear characteristics of modeling data have to be addressed when constructing effective soft measuring models. Latent variable (LV)-based modeling approaches, such as kernel partial least squares (KPLS), can overcome these disadvantages to a certain degree. Selective ensemble (SEN) modeling can further improve the generalization performance of learning models. Nevertheless, how to select the SEN model's learning parameters remains an important open issue. In this paper, a novel SENKPLS modeling method based on double-layer genetic algorithm (DLGA) optimization is proposed. First, an outside layer adaptive GA (AGA) encoding and decoding mechanism is employed to produce initial learning parameter values for the KPLS-based candidate sub-models. Then, ensemble sub-models are selected and combined using the inside layer GA optimization toolbox (GAOT) and the adaptive weighted fusion (AWF) algorithm; thus, SEN models for all AGA populations are obtained. Finally, the outside layer AGA optimization operations, i.e., selection, crossover and mutation, are repeated until a pre-set stopping criterion is satisfied. Simulation results on synthetic data and on low- and high-dimensional benchmark datasets validate the effectiveness of the proposed method.

Introduction

Data-driven methods for constructing soft measuring models have been widely used to predict difficult-to-measure process parameters in complex industrial processes [1]. Accurate measurement of these process parameters is one of the key factors in continuously improving production quality and efficiency indices [2]. Statistical inference and machine learning techniques, such as artificial neural networks (ANN) and support vector machines (SVM), are commonly employed [3]. However, these methods suffer from long learning times; rapid incremental learning algorithms have been proposed to address this problem [4], [5], [6]. Moreover, modeling data in industrial processes are strongly collinear, and ANN and SVM cannot model high-dimensional collinear data directly. Feature extraction-based pre-processing techniques, such as principal component analysis (PCA), are used to address this problem [1], [7]. However, the low-dimensional independent features extracted by PCA may have little relation to the predicted process parameters [8]. Partial least squares (PLS) constructs a linear learning model from linear latent variables (LVs), which capture the maximal covariance between input and output data [9]. In practice, except near steady working conditions, most industrial processes are nonlinear. Thus, kernel methods have become one of the simple and elegant approaches for soft measuring model development [10], [11], [12], [13], [14], [15], [16], [17]. The kernel PLS (KPLS) method can model collinear and nonlinear data effectively, with good prediction performance, in terms of nonlinear LVs [18], [19], [20]. However, how to select its learning parameters effectively, such as the kernel parameter and the number of kernel LVs (KLVs), is still an open issue [21], [22].

Ensemble learning-based modeling methods, i.e., methods that combine the outputs of multiple ensemble sub-models, can improve the generalization, validity and reliability of soft measuring models [23], [24], [25], [26]. The initial neural network ensemble method assumes that the columns and rows of the prediction error function's correlation matrix are linearly independent [27]. The trade-off between the predictive accuracy and the diversity of the ensemble sub-models is still an important open issue [28]; for example, ensemble classifiers have been constructed with both sparsity and diversity taken into consideration [29]. Based on the popular back-propagation neural network (BPNN) approach, a genetic algorithm (GA)-based selective ensemble (SEN, i.e., selecting some of the candidate sub-models to combine), GASEN-BPNN, has been shown to outperform normal ensemble methods [30]. By replacing BPNN with KPLS, the KPLS-based GASEN (GASEN-KPLS) can model high-dimensional collinear data directly [31]. However, combining the ensemble sub-models by simple average weighting is not well suited to function estimation problems. To a certain degree, the ensemble sub-models can be regarded as multiple sensors measuring the same physical parameter [32]. Normally, the optimal observation value in a multi-sensor system can be obtained with the adaptive weighted fusion (AWF) algorithm [33].
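In its standard multi-sensor form, AWF reduces to inverse-variance weighting of the individual observations. The minimal sketch below is our illustration of that standard form (it does not reproduce the routine of [33]): the weights are computed from each sub-model's validation residuals.

```python
import numpy as np

def awf_weights(residuals):
    """Inverse-variance weights for adaptive weighted fusion (AWF).

    Each sub-model is treated as a sensor observing the same quantity;
    the minimum-variance fused estimate weights sensor i by
        w_i = (1 / sigma_i^2) / sum_j (1 / sigma_j^2).

    residuals: (n_samples, n_models) validation errors of each sub-model.
    """
    inv_var = 1.0 / np.var(residuals, axis=0)
    return inv_var / inv_var.sum()

# Example: three sub-models with increasing error variance; the most
# accurate sub-model receives the largest weight.
rng = np.random.default_rng(0)
res = rng.normal(scale=[0.1, 0.2, 0.4], size=(200, 3))
print(awf_weights(res))
```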

Optimized selection of the SEN model's learning parameters is still a difficult problem, because the model structures and model parameters of the candidate sub-models and of the SEN model must be determined jointly. The ensemble size, i.e., the number of ensemble sub-models, can be regarded as the SEN model structure, while the weighting coefficients of the ensemble sub-models are the SEN model parameters. With a pre-selected weighting method and pre-constructed candidate sub-models, the SEN modeling process can be formulated as an optimization problem whose solution process is the same as that of optimal feature selection [32]; a sketch of this formulation is given below. Generally, the SEN model's generalization performance is influenced by the diversity and accuracy of the candidate sub-models, and these two factors are in turn determined by the model structures and model parameters of the KPLS-based candidate sub-models. From another perspective, only optimized model structures and model parameters of the candidate sub-models can assure an optimal SEN model. How, then, can these learning parameters be selected simply and easily? Double-layer optimization, i.e., an inside layer optimizing the SEN model and an outside layer optimizing the candidate sub-models, may be one simple and effective method. This is the motivation of this study.
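To make the formulation concrete, the sketch below casts selective ensembling as a search over 0/1 inclusion vectors, scoring each non-empty subset by the validation RMSE of its simple average (a stand-in for the weighting methods discussed above). The exhaustive enumeration is only feasible for small candidate pools; in the paper, GAOT performs this search instead.

```python
import itertools
import numpy as np

def sen_select(preds, y_val):
    """Selective ensemble as 0/1 optimization (illustrative brute force).

    preds: (n_samples, J) validation predictions of J candidate sub-models.
    Returns the inclusion mask whose simple-average ensemble minimizes
    validation RMSE, together with that RMSE.
    """
    n, J = preds.shape
    best_rmse, best_mask = np.inf, None
    for bits in itertools.product([0, 1], repeat=J):
        if not any(bits):
            continue                      # skip the empty ensemble
        mask = np.array(bits, dtype=bool)
        y_hat = preds[:, mask].mean(axis=1)
        rmse = float(np.sqrt(np.mean((y_hat - y_val) ** 2)))
        if rmse < best_rmse:
            best_rmse, best_mask = rmse, mask
    return best_mask, best_rmse
```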

In this paper, the GA optimization toolbox (GAOT) and the AWF method are selected to construct the inside layer SEN model, based on our former studies [30], [32]. The outside layer optimization is used to identify the model structures and model parameters of all the candidate sub-models globally. Many GA-based approaches have been proposed to address the model parameter identification problem, such as multi-population parallel GA-based optimization of the structure and system latency of evolvable block-based neural networks [34], adaptive particle swarm optimization and genetic algorithm (APSO-GA)-based calibration of uncertain parameters of an energy model in an experimental greenhouse [35], GA-based feature selection and parameter optimization for the linear support higher-order tensor machine [36], and robust optimization of ANFIS [37]. These studies show that GA is a commonly used tool for model parameter identification. Recently, new optimization algorithms have been proposed to estimate unknown parameters [38], [39], which could be used in further studies. Therefore, based on our former research [40], adaptive GA (AGA) is used as the outside layer optimization tool.

Based on the above analysis, a double-layer GA-based SENKPLS (DLGA-SENKPLS) method is proposed in this paper to address the collinear and nonlinear data modeling problem. AGA and GAOT are exploited to optimize the learning parameters of the candidate sub-models and of the SEN model, respectively; the final objective is to obtain a globally optimized SEN soft measuring model. First, the outside layer AGA is applied to encode and decode initial solutions of the candidate sub-models' learning parameters. Then, training sub-samples produced by the Bootstrap approach are employed to build KPLS-based candidate sub-models with these learning parameters for each AGA population. The inside layer GAOT-based optimization and AWF-based combination methods are integrated to obtain the SEN models of the different populations. Finally, the outside layer AGA optimization operations are repeated until a pre-set stopping criterion is satisfied. Simulation results on synthetic and benchmark datasets show that the proposed method outperforms the compared methods in terms of generalization performance.
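The end-to-end sketch below mirrors this double-layer control flow under several simplifications that we name explicitly: random search stands in for the outside layer AGA, greedy forward selection with a simple average stands in for the inside layer GAOT-plus-AWF step, and kernel ridge regression stands in for the KPLS sub-models. It illustrates the nesting only; it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf_kernel(A, B, gamma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_sub_model(Xtr, ytr, gamma, lam=1e-3):
    # Kernel ridge regression stands in for a KPLS sub-model to keep
    # the sketch short; the paper builds KPLS sub-models instead.
    K = rbf_kernel(Xtr, Xtr, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(Xtr)), ytr)
    return lambda X: rbf_kernel(X, Xtr, gamma) @ alpha

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def double_layer_search(Xtr, ytr, Xval, yval, outer_iters=20, n_candidates=8):
    best_err, best_cfg = np.inf, None
    for _ in range(outer_iters):                  # outside layer (AGA in the
        gamma = 10 ** rng.uniform(-2, 1)          # paper; random search here)
        models = []
        for _ in range(n_candidates):             # Bootstrap candidate sub-models
            idx = rng.integers(0, len(Xtr), len(Xtr))
            models.append(fit_sub_model(Xtr[idx], ytr[idx], gamma))
        preds = np.column_stack([m(Xval) for m in models])
        # Inside layer (GAOT + AWF in the paper; greedy forward selection
        # with a simple average stands in here)
        chosen = [int(np.argmin([rmse(yval, preds[:, j])
                                 for j in range(n_candidates)]))]
        improved = True
        while improved:
            improved = False
            for j in range(n_candidates):
                if j not in chosen:
                    trial = chosen + [j]
                    if (rmse(yval, preds[:, trial].mean(1))
                            < rmse(yval, preds[:, chosen].mean(1))):
                        chosen, improved = trial, True
        err = rmse(yval, preds[:, chosen].mean(1))
        if err < best_err:
            best_err, best_cfg = err, (gamma, chosen)
    return best_err, best_cfg

# Toy demo on a nonlinear synthetic function.
X = rng.uniform(-1, 1, size=(120, 3))
y = np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2]
err, (gamma, chosen) = double_layer_search(X[:80], y[:80], X[80:], y[80:])
print(f"validation RMSE={err:.3f}, gamma={gamma:.3f}, ensemble={chosen}")
```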

Compared with the existing literature, the distinctive contributions of the proposed method are: (1) a new double-layer GA optimization-based SENKPLS approach is proposed that can model high-dimensional collinear and nonlinear data directly, which [30] cannot; (2) the common trade-off between the diversity and accuracy of multiple models [32] is addressed by global double-layer optimized selection of the learning parameters; and (3) the relations among the learning parameters of the candidate sub-models and the SEN model are studied for the first time.

The remainder of this paper is organized as follows. Section 2 provides an overview of the KPLS algorithm, the analytical solution of the optimized weights for ensemble sub-models, the GA-based SEN (GASEN) method, and the adaptive weighted fusion (AWF) algorithm. In Section 3 and Section 4, the strategy and realization of the proposed approach are presented in detail. Simulation results on synthetic, low-dimensional and high-dimensional benchmark datasets are used to validate the proposed method in Section 5. Finally, conclusions and future work are given in Section 6.

Section snippets

Kernel partial least squares (KPLS)

Partial least squares (PLS) constructs a linear multivariable regression model from latent variables (LVs) extracted from the original input/output data space. Especially for high-dimensional collinear spectral data, the number of LVs is much lower than the number of original input features; thus, PLS has been widely used in collinear data modeling. The objective of PLS is to search for weight vectors $\mathbf{w}$ and $\mathbf{c}$ that maximize the covariance between the latent scores $\mathbf{t}$ and $\mathbf{u}$. PLS decomposes…
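KPLS carries this score-extraction idea into a kernel-induced feature space. The sketch below implements the standard kernel-space NIPALS iteration (after Rosipal and Trejo, 2001) as one plausible rendering of the routine; the kernel matrix is assumed to be already centered, and this is not necessarily the paper's exact formulation.

```python
import numpy as np

def kpls_fit(K, Y, n_components, tol=1e-10, max_iter=100):
    """Kernel-space NIPALS iteration for KPLS (after Rosipal & Trejo, 2001).

    K: centered (n x n) training kernel matrix; Y: (n x m) centered outputs.
    Returns the score matrices T and U used for prediction.
    """
    n = K.shape[0]
    Kd, Yd = K.copy(), Y.copy()
    T, U = [], []
    for _ in range(n_components):
        u = Yd[:, [0]]
        for _ in range(max_iter):
            t = Kd @ u
            t /= np.linalg.norm(t)           # latent score of the inputs
            c = Yd.T @ t                     # output weight vector
            u_new = Yd @ c
            u_new /= np.linalg.norm(u_new)   # latent score of the outputs
            if np.linalg.norm(u_new - u) < tol:
                break
            u = u_new
        T.append(t)
        U.append(u_new)
        P = np.eye(n) - t @ t.T              # deflate K and Y
        Kd, Yd = P @ Kd @ P, Yd - t @ (t.T @ Yd)
    return np.hstack(T), np.hstack(U)

def kpls_predict(K_test, K_train, T, U, Y):
    # Dual-form regression: Y_hat = K_test U (T' K U)^{-1} T' Y
    return K_test @ (U @ np.linalg.inv(T.T @ K_train @ U) @ (T.T @ Y))
```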

SEN modeling strategy based on double-layer GA optimization

Based on the above analysis, the proposed double-layer GA-based SENKPLS (DLGA-SENKPLS) algorithm consists of five modules, i.e., outside layer AGA-based learning parameter encoding, outside layer AGA-based learning parameter decoding, KPLS-based candidate sub-model construction, inside layer GAOT-based SEN modeling, and outside layer AGA operation, as shown in Fig. 1.

In Fig. 1, $j_{\mathrm{GA}}=1,\dots,J_{\mathrm{GA}}$, where $J_{\mathrm{GA}}$ is the population size of the outside layer AGA-based optimization; $\{(\gamma_{\mathrm{KPLS}}^{j_{\mathrm{GA}}})_{\mathrm{Bin}}\}_{j_{\mathrm{GA}}=1}^{J_{\mathrm{GA}}}$…

Outside layer AGA-based learning parameters encoding

There are three learning parameters to be optimized by the outside layer AGA: the kernel parameter ($\gamma_{\mathrm{KPLS}}^{\mathrm{sel}}$) and the number of KLVs ($h_{\mathrm{KPLS}}^{\mathrm{sel}}$) of the KPLS-based candidate sub-models, and the population size of the inside layer GAOT optimization ($J_{\mathrm{LSEN}}^{\mathrm{sel}}$). The ranges of these learning parameters are denoted as:

$$\begin{cases} \gamma_{\mathrm{KPLS}}^{\mathrm{sel}} \in [\gamma_{\mathrm{KPLS}}^{\min},\ \gamma_{\mathrm{KPLS}}^{\max}] \\ h_{\mathrm{KPLS}}^{\mathrm{sel}} \in [1,\ \operatorname{rank}(\mathbf{K}(\cdot))] \\ J_{\mathrm{LSEN}}^{\mathrm{sel}} \in [10,\ J_{\mathrm{LSEN}}^{\max}] \end{cases}$$

where $\gamma_{\mathrm{KPLS}}^{\min}$ and $\gamma_{\mathrm{KPLS}}^{\max}$ are the minimum and maximum values of $\gamma_{\mathrm{KPLS}}^{\mathrm{sel}}$, $\mathbf{K}(\cdot)$ is the kernel matrix of the input data, and $J_{\mathrm{LSEN}}^{\max}$ is the…
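A conventional binary GA encoding would quantize each of the three parameters onto a fixed-length gene and concatenate the genes into one chromosome. The sketch below illustrates this; the bit width and the numeric ranges are our assumptions for illustration, not values taken from the paper.

```python
BITS = 12  # assumed gene length; the paper's bit widths are not shown here

def encode(value, lo, hi, bits=BITS):
    """Quantize a value in [lo, hi] onto a fixed-length bit string."""
    level = round((value - lo) / (hi - lo) * (2 ** bits - 1))
    return format(level, f"0{bits}b")

def decode(bitstr, lo, hi):
    """Map a bit string back onto [lo, hi]."""
    return lo + int(bitstr, 2) / (2 ** len(bitstr) - 1) * (hi - lo)

# A chromosome concatenates the three genes; ranges are illustrative.
chrom = encode(0.5, 0.01, 10.0) + encode(8, 1, 30) + encode(40, 10, 100)
gamma   = decode(chrom[:BITS], 0.01, 10.0)                # kernel parameter
h_klv   = int(round(decode(chrom[BITS:2*BITS], 1, 30)))   # number of KLVs
j_inner = int(round(decode(chrom[2*BITS:], 10, 100)))     # GAOT population size
```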

Simulation results

In this section, the performance of the proposed approach is compared with the PLS and KPLS methods on synthetic, low-dimensional and high-dimensional benchmark datasets. Each dataset is divided into five equally sized parts, and the third part is selected as the testing samples. The other parts are used as training and validation samples, of which two thirds are used as training data and one third as validation data. The popular radial basis function (RBF) kernel is used for all datasets. The…
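One reading of this splitting scheme, written out as a sketch (contiguous blocks are assumed; the paper's exact partitioning may differ):

```python
import numpy as np

def split_dataset(X, y):
    """Five contiguous blocks of equal size, the third block held out for
    testing, and the remaining samples split 2/3 training / 1/3 validation."""
    blocks = np.array_split(np.arange(len(X)), 5)
    test = blocks[2]
    rest = np.concatenate([blocks[i] for i in (0, 1, 3, 4)])
    cut = int(len(rest) * 2 / 3)
    train, val = rest[:cut], rest[cut:]
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])
```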

Conclusion

A new double-layer genetic algorithm (DLGA) nested optimization procedure for searching the learning parameters of a selective ensemble kernel partial least squares (SENKPLS) model is proposed in this paper. It can be used to model collinear and nonlinear data directly, without an additional dimension-reduction pre-processing step. The outside layer optimization is realized with adaptive GA (AGA), whose objective is the optimized selection of the population size of the inside layer optimization, the kernel parameter and…

Acknowledgment

This work is partially supported by the China Postdoctoral Science Foundation (2013M532118, 2015T81082, 2015M581355), the National Natural Science Foundation of China (61573364, 61273177, 61305029, 61503066, 61573249), the State Key Laboratory of Synthetical Automation for Process Industries (PAL-N201504), the China National 863 Projects (2015AA043802), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), Jiangsu Collaborative…


References (44)

  • B.M. Nicolaï et al.

Kernel PLS regression on wavelet transformed NIR spectra for prediction of sugar content of apple

    Chemom. Intell. Lab. Syst.

    (2007)
  • M. Wang et al.

Kernel PLS based prediction model construction and simulation on theoretical cases

    Neurocomputing

    (2015)
  • M. Jin et al.

Reliable fault diagnosis method using ensemble fuzzy ARTMAP based on improved Bayesian belief method

    Neurocomputing

    (2014)
  • S. Faußer et al.

    Selective neural network ensembles in reinforcement learning: taking the advantage of many agents

    Neurocomputing

    (2015)
  • P.M. Granitto et al.

    Neural network ensembles: evaluation of aggregation algorithms

    Artif. Intell.

    (2005)
  • X.C. Yin et al.

    A novel classifier ensemble method with sparsity and diversity

    Neurocomputing

    (2014)
  • Z.H. Zhou et al.

Ensembling neural networks: many could be better than all

    Artif. Intell.

    (2002)
  • V.P. Nambiar et al.

    Optimization of structure and system latency in evolvable block-based neural networks using genetic algorithm

    Neurocomputing

    (2014)
  • T. Guo et al.

A GA-based feature selection and parameter optimization for linear support higher-order tensor machine

    Neurocomputing

    (2014)
  • A. Sarkheyli et al.

Robust optimization of ANFIS based on a new modified GA

    Neurocomputing

    (2015)
  • J. Hu et al.

    A variance-constrained approach to recursive state estimation for time-varying complex networks with missing measurements

    Automatica

    (2016)
  • J. Tang et al.

    Feature extraction and selection based on vibration spectrum with application to estimate the load parameters of ball mill in grinding process

    Control Eng. Pract.

    (2012)
Cited by (28)

    • Stock returns prediction using kernel adaptive filtering within a stock market interdependence approach

      2020, Expert Systems with Applications
Citation Excerpt:

      In practice, as NN models are retrained at regular intervals they require both significant computing and storage resources (Cui et al., 2016). Note that non-parametric kernel approaches have proven useful in identifying non-linear systems (Orabona, Keshet, & Caputo, 2009; Zhao, Hoi, & Jin, 2011; Tang et al., 2017), showing that their convex optimization helps to reduce the computational complexity in sequential learning environments (Liu, Principe, & Haykin, 2011). The KAFs-based approaches can start learning the model without having the entire training set in advance, as their learning scheme is a combination of memory-based learning and error-correction, meaning that the model is updated sequentially in real-time while predictions are obtained.

    • Forward and backward input variable selection for polynomial echo state networks

      2020, Neurocomputing
Citation Excerpt:

      In the field of machine learning, the regression problems have been widely studied, whose objective is to find the relationship between two sets of variables by approximating mathematical models [1]. To solve regression problems, many techniques have been proposed, such as the nonlinear least squares methods [2], kernel machines [3], tree-based ensembles [4], autoregressive models [5], feedforward neural networks [6], recurrent neural networks (RNNs) [7], and so on. Among these methods, the RNNs are known for their approximation ability of arbitrary nonlinear systems [8].

    • Dual-layer optimized selective information fusion using multi-source multi-component mechanical signals for mill load parameters forecasting

      2020, Mechanical Systems and Signal Processing
Citation Excerpt:

      In nature, SEN process can be formulated as an optimization problem with the pre-built sub-models and pre-selected combination method. The optimized learning parameters of SENKPLS can be searched with multiple optimization algorithms [42]. However, it cannot be used to construct MLPF model based on a multi-source multi-scale frequency spectrum in terms of selective information fusion.

    • Learning from data streams using kernel least-mean-square with multiple kernel-sizes and adaptive step-size

      2019, Neurocomputing
Citation Excerpt:

      In other words, unlike NNs, the model is learned using a single pass through the entire training set in KAF algorithms. As a result, KAFs have been shown to be an efficient alternative to identifying non-linear systems in online sequential learning [16–18]. Algorithms based on KAFs operate in a very special Hilbert space of functions called a reproducing kernel Hilbert space (RKHS).

    • Spatial partial least squares autoregression: Algorithm and applications

      2019, Chemometrics and Intelligent Laboratory Systems
Citation Excerpt:

      Via projecting the original variables onto latent variables, PLS can not only solve the collinearity problem, but also reduce the dimension, and accordingly decrease the computation complexity. Moreover, PLS is a supervised approach, which can guarantee that the extracted latent variables have maximum covariance with response variables during all the iteration process [17]. Consequently, the extracted latent variables possess significant explanatory ability to the response variables.


Jian Tang received the bachelor's degree from the Naval College of Engineering, Wuhan, China, in 1998, and the M.S. and Ph.D. degrees in control theory and control engineering from Northeastern University in 2006 and 2012, respectively. His current research interests are integrated automation of industrial processes and data-driven soft sensor modeling.

Jian Zhang received the M.S. degree in applied mathematics from Liaoning University in 2008 and the Ph.D. degree in pattern recognition and intelligent systems from Northeastern University, China, in 2012. Since 2012, he has been working as a lecturer at Nanjing University of Information Science & Technology, China. His research interests include wireless sensor networks, big data, and mobile computing.

Zhiwei Wu received the B.S. degree in electronic and information engineering from Dalian Nationalities University, Dalian, China, in 2004, the M.S. degree in control theory and engineering from Shenyang University of Chemical Technology, Shenyang, China, in 2007, and the Ph.D. degree in control theory and engineering from Northeastern University, Shenyang, China, in 2015. He is currently a Lecturer with the State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, China. His research interests include operational control of complex industrial processes and industrial embedded control systems.

Zhuo Liu is a Ph.D. candidate at Northeastern University. She received the bachelor's degree in industrial automation from Northeastern University in 2002. Her research interests cover soft sensor modeling for complex industries.

Tianyou Chai (M'90-SM'97-F'08) received the Ph.D. degree in control theory and engineering in 1985 from Northeastern University, Shenyang, China, where he became a Professor in 1988. He is the founder and Director of the Center of Automation, which became a National Engineering and Technology Research Center and a State Key Laboratory. He is a member of the Chinese Academy of Engineering, an IFAC Fellow and an IEEE Fellow, and director of the Department of Information Science of the National Natural Science Foundation of China. His current research interests include modeling, control, optimization and integrated automation of complex industrial processes. He has published 144 peer-reviewed international journal papers and has developed control technologies with applications to various industrial processes. For his contributions, he has won four prestigious National Science and Technology Progress and National Technological Innovation awards, as well as the 2007 Industry Award for Excellence in Transitional Control Research from the IEEE Multi-conference on Systems and Control.

Wen Yu (M'97-SM'04) received the B.S. degree from Tsinghua University, Beijing, China, in 1990, and the M.S. and Ph.D. degrees, both in electrical engineering, from Northeastern University, Shenyang, China, in 1992 and 1995, respectively. From 1995 to 1996, he served as a Lecturer in the Department of Automatic Control at Northeastern University, Shenyang, China. Since 1996, he has been with the Centro de Investigación y de Estudios Avanzados, National Polytechnic Institute (CINVESTAV-IPN), Mexico City, Mexico, where he is currently a Professor with the Departamento de Control Automatico. From 2002 to 2003, he held research positions with the Instituto Mexicano del Petroleo. He was a Senior Visiting Research Fellow with Queen's University Belfast, Belfast, U.K., from 2006 to 2007, and a Visiting Associate Professor with the University of California, Santa Cruz, from 2009 to 2010. Since 2006, he has also held a visiting professorship at Northeastern University, China. Dr. Yu serves as an associate editor of IEEE Transactions on Cybernetics, Neurocomputing, and the Journal of Intelligent and Fuzzy Systems. He is a member of the Mexican Academy of Sciences.
