Computers & Chemical Engineering

Volume 71, 4 December 2014, Pages 77-93
Adaptive soft sensor modeling framework based on just-in-time learning and kernel partial least squares regression for nonlinear multiphase batch processes

https://doi.org/10.1016/j.compchemeng.2014.07.014

Highlights

  • An adaptive soft sensing framework is proposed for multiphase batch processes.

  • New hybrid similarity measure and adaptive sample selection for JIT learning.

  • Phase identification using Gaussian mixture model and Bayesian inference.

  • Partial mutual information criterion for input variable selection.

  • Application to industrial CTC fermentation process with satisfactory results.

Abstract

Batch processes are characterized by inherent nonlinearity, multiple phases and time-varying behavior that pose great challenges for accurate state estimation. A multiphase just-in-time (MJIT) learning based kernel partial least squares (KPLS) method is proposed for multiphase batch processes. A Gaussian mixture model is estimated to identify the different operating phases, and a separate JIT-KPLS framework is built for each phase. By applying a Bayesian inference strategy, the query data is classified into the phase with the maximal posterior probability, and the corresponding JIT-KPLS framework is chosen for online prediction. To further improve the predictive accuracy of the MJIT-KPLS algorithm, a hybrid similarity measure and an adaptive selection strategy are proposed for selecting local modeling samples. Moreover, a maximal similarity replacement rule is proposed to update the database. A procedure for input variable selection based on partial mutual information is also presented. The effectiveness of the MJIT-KPLS algorithm is demonstrated through application to an industrial fed-batch chlortetracycline fermentation process.

Introduction

Batch or semibatch processes have been widely used to produce special chemicals, materials for microelectronics, pharmaceutical and agricultural products (Cinar et al., 2003). As a prerequisite, reliable real-time measurements play a crucial role in process automation, monitoring and optimization (Alford, 2006). However, the lack of reliable online sensors that can accurately detect the important state variables has become one of the major challenges in controlling batch processes accurately, automatically and optimally (Chen et al., 2006, Dochain, 2008, Nicoletti and Jain, 2009).

Over the past two decades, soft sensors have received increasing attention in both academia and industry due to their inferential estimation capability. Although soft sensors have been applied in broad fields, the online prediction remains the dominant application (Kadlec et al., 2009). When online analyzers are not available, soft sensing technology aims to provide estimations of difficult-to-measure variables based on some easy-to-measure variables. Generally, soft sensors can be classified into two groups: first-principle models and data-driven models (Kadlec et al., 2009). The focus of this work is on the data-driven soft sensor modeling. Reviews of this type of soft sensor application have been published in references Fortuna et al. (2007), Haimi et al. (2013), Kadlec et al. (2009), Kano and Nakagawa (2008), Kano and Fujiwara (2013), Sliskovic et al. (2011).

The most common data-driven modeling techniques for developing soft sensors are multivariate statistical techniques such as multivariate linear regression (MLR) (Kano and Ogawa, 2010), principal component regression (PCR) (Jolliffe, 2002) and partial least squares (PLS) (Lin et al., 2007, Kano and Fujiwara, 2013, Sharmin et al., 2006). These linear modeling methods account for 90% of the soft sensors used in industry (Kano and Ogawa, 2010) due to their statistical background, ease of interpretability, and because they can deal efficiently with data collinearity, which is common among industrial datasets (Kadlec and Gabrys, 2011). Nevertheless, these linear methods cannot function well when applied to highly nonlinear processes like batch processes. Thus much effort has been devoted to nonlinear approaches, such as artificial neural networks (ANN) (Cui et al., 2012, Gonzaga et al., 2009), KPLS (Jia et al., 2013, Yu, 2012a), support vector regression (Yu, 2012b), neuro-fuzzy systems (Jang, 1993, Jassar et al., 2009) and Gaussian process regression (GPR) (Grbić et al., 2013). However, several issues remain to be solved in developing soft sensors for batch processes.

One particular drawback of many current soft sensors is their non-adaptive nature. Traditionally, the predictive models are not adaptive, and once deployed into real-life operation, the models will not change, whereas the operating environment is often changing. To cope with changes in process characteristics, model maintenance is essential to maintain high estimation performance for a long time. Without process or expert knowledge, the soft sensor model has to be updated automatically. Thus many kinds of recursive modeling methods, which update models by prioritizing newer samples, have been proposed (Kadlec et al., 2011). Although these methods can adapt the soft sensor to a new operating condition recursively, they cannot cope with abrupt changes in process characteristics, which are caused by replacement of a catalyst, cleaning of equipment, etc., because a query sample just after an abrupt change becomes significantly different from the prioritized samples. It is also common practice to enhance model adaptation through the ensemble learning framework (Grbić et al., 2013, Kadlec et al., 2011, Kadlec and Gabrys, 2011, Yu, 2012c). In the ensemble framework, data are divided into different sub-domains and local sub-models are constructed over each domain. In this way, instead of using a single global model, multiple simpler models are developed and then combined to obtain the final prediction. Some other adaptive soft sensors were developed by partitioning the process data into multiple clusters corresponding to different operating phases, where a local predictive model is built for each phase (Yu, 2012a, Yu and Qin, 2008, Yu and Qin, 2009). When implemented online, the query sample, for which an output estimation is required, is first classified into a particular phase, and then the corresponding local model representing the identical phase is adaptively chosen for prediction.
Although such phase based multi-model methods outperform single models, they are unable to effectively capture the between-phase transient dynamics. Thus, a Bayesian model averaging (BMA) based multi-model method was proposed to tackle this issue (Yu et al., 2013). In practice, however, it is difficult to determine the partition number due to the lack of quantitative and precise information on phase divisions. More importantly, the local models used in the BMA based multi-model methods are built offline and not updated once deployed online, so changes in process characteristics cannot be handled well given the time-varying nature of real-life processes.

Recently, just-in-time (JIT) learning has attracted increasing attention in process modeling and soft sensor development (Cheng and Chiu, 2004, Fujiwara et al., 2009). By applying JIT learning, a local model is constructed from the samples similar to the query sample. Thus, on one hand, a JIT based model can cope with abrupt changes as well as gradual ones. On the other hand, it can deal with nonlinearity since it builds a local model repeatedly. Compared to traditional modeling methods, which can be considered global modeling, the JIT based method exhibits a local model structure: when an estimate is required, a local model is built from the historical samples selected by some measure of similarity to the query data. Once the estimated output is obtained, the built local model is discarded. Nevertheless, the estimation accuracy of JIT based models is expected to improve further when the local regression function, input variable selection, similarity measure, database updating scheme, etc., are chosen as an optimal combination.
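
The build, predict and discard cycle described above can be illustrated with a minimal Python sketch. It is not the method of this paper: a local least-squares line on a scalar input stands in for the local regression, plain absolute distance stands in for the similarity measure, and the function name `jit_predict`, the neighbourhood size `k` and the toy database are all assumptions made for illustration.

```python
def jit_predict(database, query, k=5):
    """Just-in-time prediction for a scalar input (illustrative only).

    database: list of (x, y) pairs, the stored historical samples.
    query:    scalar input for which an output estimate is required.
    k:        number of most similar samples used for the local model.
    """
    # 1. Select the k stored samples closest to the query.
    neighbours = sorted(database, key=lambda s: abs(s[0] - query))[:k]

    # 2. Fit a local straight line to those samples by least squares.
    n = len(neighbours)
    mean_x = sum(x for x, _ in neighbours) / n
    mean_y = sum(y for _, y in neighbours) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in neighbours)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in neighbours)
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x

    # 3. Predict. Nothing about the fitted line is kept, so the local
    #    model is effectively discarded when the function returns.
    return intercept + slope * query


# Toy nonlinear process y = x^2 sampled on [0, 3] (made-up data).
database = [(i / 10, (i / 10) ** 2) for i in range(31)]
y_hat = jit_predict(database, 1.55)
```

Although each local model is linear, refitting it around every query lets the sequence of predictions follow the curvature of the toy process.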

To enhance the predictive performance and adaptability of JIT based models, some problems remain to be addressed. Although linear local model based JIT methods (Cheng and Chiu, 2004, Kim et al., 2013) can successfully address process nonlinearity, highly nonlinear variable relationships are widespread in batch processes, where a local linear model may not always function well. Thus a nonlinear modeling technique with high computational efficiency is preferable. In addition, the commonly used similarity measures rarely take process characteristics into account. In real applications, the most frequently used similarity measures are based on the Euclidean distance or the Mahalanobis distance (Cheng and Chiu, 2004, Ge and Song, 2010, Schaal et al., 2002), most of which are defined only from the perspective of the sample algebraic space, irrespective of specific process knowledge. Moreover, the optimal local modeling size is usually determined offline, as reported in (Ge and Song, 2010). An adaptive strategy is required to choose local modeling samples for each query data. Besides, a reliable database updating scheme is also crucial for JIT based models to tackle changes in process characteristics. Usually, the database is updated simply by removing the oldest samples and adding the new samples (Shigemori et al., 2011).
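
The conventional update mentioned at the end of the paragraph above, removing the oldest sample whenever a new one arrives, behaves like a fixed-size first-in-first-out buffer. A minimal Python sketch, assuming a hypothetical capacity of three samples and made-up (input, output) pairs:

```python
from collections import deque

# Fixed-capacity database: appending to a full deque silently drops
# the oldest entry, which is exactly the "remove oldest, add newest" rule.
CAPACITY = 3  # hypothetical size, chosen only for the example
database = deque(maxlen=CAPACITY)

incoming = [(0.1, 1.0), (0.2, 1.1), (0.3, 1.3), (0.4, 1.6)]  # made-up samples
for sample in incoming:
    database.append(sample)

remaining = list(database)
```

After the fourth sample arrives, the first one has been evicted regardless of how informative it was, which is the limitation that motivates similarity-aware update rules.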

The second problem encountered in soft sensor modeling is the lack of a systematic guideline for input variable selection. As reported in the literature (Cui et al., 2012, Kadlec and Gabrys, 2011, Kim et al., 2013, Pani et al., 2013), input variables are often selected based on engineers’ personal experience and prior process knowledge. However, it is time-consuming for engineers to select the input variables since trial and error is inevitable. Additionally, the selected variables may not be optimal. It also becomes very difficult even for experienced engineers to properly select input variables when a large number of variables are measured and the physical and chemical phenomena are not sufficiently understood. Consequently, various data-based methods have been proposed for selecting proper input variables.

One popular approach for reducing the input variable dimension is to project the original input space into an adequate lower dimensional space. The most popular methods for achieving such a projection are based on linear projection of the input space, such as the widely known principal component analysis (PCA) and PLS. However, the new variables resulting from such methods are difficult to interpret in terms of actual process variables (Delgado et al., 2009). More importantly, the underlying assumption of linearly structured dependence conflicts with the goal of developing statistical models of nonlinear processes.

Another approach for reducing input space dimension is based on selecting the most important variables from all potential variables according to some criteria. As a nonparametric and nonlinear measure of relevance derived from information theory, mutual information (MI) has recently been applied to nonlinear process modeling (May et al., 2008a), process monitoring (Chen et al., 2013, Rashid and Yu, 2012a, Rashid and Yu, 2012b) and soft sensor design (Grbić et al., 2013). Unlike linear methods that only consider linear relationships between variables, MI is theoretically able to identify relations of any type. It furthermore makes no assumption about the distribution of the data. However, two issues have arisen in the formulation of MI-based selection algorithms: handling the inter-dependencies between candidates, and the lack of an appropriate principle for determining when to halt the selection procedure (Chow and Huang, 2005). To tackle these issues, the partial mutual information (PMI) criterion has been developed by considering the effect of the already selected input variables when evaluating the relevance between one plausible input variable and the output variable (May et al., 2008a, May et al., 2008b, Sharma, 2000).
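
The claim that MI can detect dependence that a linear measure misses can be checked on a toy example. The Python sketch below uses a simple equal-width histogram estimator of plain MI; it is not the PMI criterion, which additionally removes the influence of already selected inputs, and the bin count, function names and data are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=8):
    """Histogram estimate of I(X;Y) in nats (illustrative estimator)."""
    def discretise(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretise(xs), discretise(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum of p(x,y) * log( p(x,y) / (p(x) p(y)) ) over occupied cells.
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def pearson(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# y depends on x deterministically but symmetrically (made-up data).
xs = [i / 100 for i in range(-100, 101)]
ys = [x * x for x in xs]

r = pearson(xs, ys)              # essentially zero: no linear relationship
mi = mutual_information(xs, ys)  # clearly positive: strong dependence
```

Histogram estimates are sensitive to the number of bins and to sample size, so the value of `mi` should be read as a qualitative indicator here rather than a calibrated quantity.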

Apart from model adaptation and input variable selection, data-driven soft sensor modeling usually pays attention only to the plant data, whereas the process characteristics are ignored. In practice, batch processes are often characterized by their multiphase nature, in which multiple operating phases are involved. Thus the accuracy and reliability of quality variable prediction can degrade heavily as the operating phase and process dynamics change. Multiphase modeling strategies with phase identification have been reported to be more efficient than conventional single-model based methods (Yu, 2012a, Yu and Qin, 2008, Yu and Qin, 2009, Yu et al., 2013). These results indicate that the multiphase nature needs to be considered when designing data-driven soft sensors for batch processes.

To address the above-mentioned issues, a novel adaptive soft sensor modeling framework is proposed for nonlinear multiphase batch processes. This soft sensing algorithm is outlined as follows:

  • (i)

    The JIT learning framework is adopted due to its capability of dealing with changes in process characteristics as well as nonlinearity. Within this learning framework, a new hybrid similarity measure is defined by integrating Euclidean distance based similarity with process phase similarity. Further, the optimal local modeling samples are adaptively determined by online cross-validation optimization. Besides, a maximal similarity replacement rule is proposed to update the sample database.

  • (ii)

    KPLS is chosen as the local modeling technique for two reasons. One reason is that it is more effective at capturing the nonlinear characteristics of batch processes than linear methods such as PLS regression. Another reason is that KPLS essentially requires only linear algebra, making it as simple as a regular linear PLS regression.

  • (iii)

    A Gaussian mixture model (GMM) is estimated to identify the operating phases of the batch process, and a JIT-KPLS modeling framework is then constructed for each phase.

  • (iv)

    The input variables are selected based on the PMI criterion between the potential input variables and the output variable by performing a stepwise selection procedure, which effectively alleviates the effect of redundancy between input variables.
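
Item (i) above names a maximal similarity replacement rule for updating the database, but this excerpt does not spell the rule out. The Python sketch below shows one plausible reading, stated here purely as an assumption: when a new sample arrives, it overwrites the stored sample that is most similar to it, so the database size stays fixed and the least redundant information is lost. The similarity function (a decaying exponential of the Euclidean distance), the function names and the data are all hypothetical.

```python
import math


def similarity(a, b):
    """Euclidean-distance based similarity in (0, 1] (assumed form)."""
    return math.exp(-math.dist(a, b))


def replace_most_similar(database, new_sample):
    """Overwrite the stored sample most similar to new_sample.

    database:   list of (input_vector, output) pairs, modified in place.
    new_sample: (input_vector, output) pair to store.
    Returns the sample that was replaced.
    """
    new_x, _ = new_sample
    idx = max(range(len(database)),
              key=lambda i: similarity(database[i][0], new_x))
    replaced = database[idx]
    database[idx] = new_sample
    return replaced


# Made-up database of three samples with two inputs each.
database = [((0.0, 0.0), 1.0), ((1.0, 1.0), 2.0), ((5.0, 5.0), 9.0)]
old = replace_most_similar(database, ((0.9, 1.2), 2.3))
```

Under this reading, samples from rarely visited operating regions, such as the one at (5.0, 5.0), survive however old they are, in contrast to a first-in-first-out update.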

The proposed MJIT-KPLS soft sensing framework performs model adaptation in four respects. First, a query sample can be automatically classified into a particular operating phase based on the estimated GMM through a Bayesian inference strategy. Second, in the JIT learning framework, a local KPLS model is constructed online when the estimation for the query sample is required. Third, an adaptive strategy is proposed to select the optimal local modeling size for local KPLS modeling. Finally, the proposed framework can adapt to new process states by adding new samples into the database from which the samples for local modeling are selected.
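
The first adaptation step, assigning a query sample to the phase with the maximal posterior probability, follows directly from Bayes' rule once the mixture parameters are known. The Python sketch below assumes a one-dimensional input and three hypothetical Gaussian components whose weights, means and standard deviations are invented for the example; in practice they would come from fitting the GMM to process data.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def phase_posteriors(x, phases):
    """Posterior P(phase k | x) = w_k N(x | m_k, s_k) / sum_j w_j N(x | m_j, s_j)."""
    weighted = [p["weight"] * gaussian_pdf(x, p["mean"], p["std"])
                for p in phases]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical mixture parameters for three operating phases.
phases = [
    {"weight": 0.3, "mean": 0.0, "std": 1.0},
    {"weight": 0.5, "mean": 5.0, "std": 1.5},
    {"weight": 0.2, "mean": 10.0, "std": 1.0},
]

posteriors = phase_posteriors(4.2, phases)
best_phase = max(range(len(phases)), key=posteriors.__getitem__)
```

The index `best_phase` would then select which phase-specific JIT-KPLS framework handles the query; the posteriors themselves remain available if a softer combination of phases is wanted.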

The rest of this paper proceeds as follows. Section 2 briefly outlines the theory of KPLS regression, JIT learning, the PMI criterion, and the GMM algorithm. The proposed adaptive soft sensing algorithm, MJIT-KPLS, is discussed in detail in Section 3. Subsequently, a case study of an industrial fed-batch chlortetracycline fermentation process is used to evaluate the proposed algorithm in Section 4. Finally, this research is concluded in Section 5.

Section snippets

Preliminaries

In this section, KPLS regression, JIT learning, PMI criterion, and GMM algorithm are briefly introduced.

Proposed adaptive soft sensor

The proposed MJIT-KPLS soft sensing algorithm builds soft sensors online by performing KPLS regression in a multiphase JIT modeling framework. The JIT learning framework for each operating phase is determined in the same way by a group of offline parameters, which differ from phase to phase. Thus the development of MJIT-KPLS soft sensors consists of building multiple similar JIT-KPLS modeling frameworks. Without considering a particular phase, the development of JIT-KPLS soft sensors can be split

Case study

In this section, the proposed soft sensing algorithm is tested on industrial fed-batch chlortetracycline fermentation process. Two indexes are used to evaluate the model performance, including root-mean-square error (RMSE) and the coefficient of determination ($R^2$) given by

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{\mathrm{test}}} \sum_{i=1}^{N_{\mathrm{test}}} \left(\hat{y}_i - y_i\right)^2}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{N_{\mathrm{test}}} \left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{N_{\mathrm{test}}} \left(y_i - \bar{y}_i\right)^2},$$

where $\hat{y}_i$ is the estimated output, $y_i$ is the actual output, and $\bar{y}_i$ is the mean of the actual output; $N_{\mathrm{test}}$ denotes the number of testing samples.
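
As a quick check of the two indexes defined above, the following Python sketch computes both on a small set of made-up values. The function names and the numbers are illustrative only and are not taken from the case study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error over the test samples."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst


# Made-up actual and estimated outputs.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

error = rmse(y_true, y_pred)      # sqrt(0.10 / 4), about 0.158
fit = r_squared(y_true, y_pred)   # 1 - 0.10 / 5.0 = 0.98
```

A smaller RMSE and an $R^2$ closer to 1 both indicate better agreement between the estimated and actual outputs.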

RMSE is

Conclusions

The MJIT-KPLS soft sensing algorithm, the main contribution of this work, provides a novel adaptive soft sensor modeling framework for multiphase batch processes. By exploiting the JIT learning framework, MJIT-KPLS can cope with changes in process characteristics as well as process nonlinearity. Moreover, the Gaussian mixture model enables automatic phase identification and multiphase modeling. Through the Bayesian inference strategy, the JIT-KPLS modeling framework representing the identical

Acknowledgements

We thank Charoen Pokphand Group for their financial support and for providing the industrial datasets of the fed-batch CTC fermentation process. We also appreciate the valuable comments and suggestions of the anonymous reviewers.

References (60)

  • H. Haimi et al.

    Data-derived soft-sensors for biological wastewater treatment plants: an overview

    Environ Model Softw

    (2013)
  • S. Jassar et al.

    Adaptive neuro-fuzzy based inferential sensor model for estimating the average air temperature in space heating systems

    Build Environ

    (2009)
  • P. Kadlec et al.

    Data-driven soft sensors in the process industry

    Comput Chem Eng

    (2009)
  • P. Kadlec et al.

    Review of adaptation mechanisms for data-driven soft sensors

    Comput Chem Eng

    (2011)
  • M. Kano et al.

    Data-based process monitoring, process control, and quality improvement: recent developments and applications in steel industry

    Comput Chem Eng

    (2008)
  • M. Kano et al.

    The state of the art in chemical process control in Japan: good practice and questionnaire survey

    J Process Control

    (2010)
  • B. Lin et al.

    A systematic approach for soft sensor development

    Comput Chem Eng

    (2007)
  • H. Liu et al.

    On-line outlier detection and data cleaning

    Comput Chem Eng

    (2004)
  • R.J. May et al.

    Non-linear variable selection for artificial neural networks using partial mutual information

    Environ Model Softw

    (2008)
  • R.J. May et al.

    Application of partial mutual information variable selection to ANN forecasting of water quality in water distribution systems

    Environ Model Softw

    (2008)
  • A.K. Pani et al.

    Development and comparison of neural network based soft sensors for online estimation of cement clinker quality

    ISA Trans

    (2013)
  • M.M. Rashid et al.

    A new dissimilarity method integrating multidimensional mutual information and independent component analysis for non-Gaussian dynamic process monitoring

    Chemom Intell Lab Syst

    (2012)
  • A. Sharma

    Seasonal to interannual rainfall probabilistic forecasts for improved water supply management: part 1—a strategy for system predictor identification

    J Hydrol

    (2000)
  • R. Sharmin et al.

    Inferential sensors for estimation of polymer quality parameters: industrial application of a PLS-based soft sensor for a LDPE plant

    Chem Eng Sci

    (2006)
  • H. Shigemori et al.

    Optimum quality design system for steel products through locally weighted regression model

    J Process Control

    (2011)
  • J. Yu

    A Bayesian inference based two-stage support vector regression framework for soft sensor development in batch bioprocesses

    Comput Chem Eng

    (2012)
  • J. Yu

    Online quality prediction of nonlinear and non-Gaussian chemical processes with shifting dynamics using finite mixture model based Gaussian process regression approach

    Chem Eng Sci

    (2012)
  • J. Yu et al.

    A Bayesian model averaging based multi-kernel Gaussian process regression framework for nonlinear state estimation and quality prediction of multiphase batch processes with transient dynamics and uncertainty

    Chem Eng Sci

    (2013)
  • C.M. Bishop

    Pattern recognition and machine learning

    (2006)
  • L.Z. Chen et al.

    Modelling and optimization of biotechnological processes: artificial intelligence approaches

    (2006)