Adaptive soft sensor modeling framework based on just-in-time learning and kernel partial least squares regression for nonlinear multiphase batch processes
Introduction
Batch or semibatch processes have been widely used to produce specialty chemicals, materials for microelectronics, and pharmaceutical and agricultural products (Cinar et al., 2003). Reliable real-time measurements are a prerequisite for process automation, monitoring and optimization (Alford, 2006). However, the lack of reliable online sensors that can accurately measure the important state variables has become one of the major challenges in controlling batch processes accurately, automatically and optimally (Chen et al., 2006, Dochain, 2008, Nicoletti and Jain, 2009).
Over the past two decades, soft sensors have received increasing attention in both academia and industry due to their inferential estimation capability. Although soft sensors have been applied in a broad range of fields, online prediction remains the dominant application (Kadlec et al., 2009). When online analyzers are not available, soft sensing technology aims to provide estimates of difficult-to-measure variables based on easy-to-measure variables. Generally, soft sensors can be classified into two groups: first-principles models and data-driven models (Kadlec et al., 2009). The focus of this work is on data-driven soft sensor modeling. Reviews of this type of soft sensor application have been published by Fortuna et al. (2007), Haimi et al. (2013), Kadlec et al. (2009), Kano and Nakagawa (2008), Kano and Fujiwara (2013), and Sliskovic et al. (2011).
The most common data-driven modeling techniques for developing soft sensors are multivariate statistical techniques such as multivariate linear regression (MLR) (Kano and Ogawa, 2010), principal component regression (PCR) (Jolliffe, 2002) and partial least squares (PLS) (Lin et al., 2007, Kano and Fujiwara, 2013, Sharmin et al., 2006). These linear modeling methods account for 90% of the soft sensors used in industry (Kano and Ogawa, 2010) because of their statistical background, ease of interpretation, and ability to deal efficiently with data collinearity, which is common in industrial datasets (Kadlec and Gabrys, 2011). Nevertheless, these linear methods do not perform well when applied to highly nonlinear processes such as batch processes. Thus much effort has been devoted to nonlinear approaches, such as artificial neural networks (ANN) (Cui et al., 2012, Gonzaga et al., 2009), kernel partial least squares (KPLS) (Jia et al., 2013, Yu, 2012a), support vector regression (Yu, 2012b), neuro-fuzzy systems (Jang, 1993, Jassar et al., 2009) and Gaussian process regression (GPR) (Grbić et al., 2013). However, some issues remain to be solved in developing soft sensors for batch processes.
One particular drawback of many current soft sensors is their non-adaptive nature. Traditionally, the predictive models are not adaptive: once deployed in real-life operation, the models do not change, whereas the operating environment often does. To cope with changes in process characteristics, model maintenance is essential to sustain high estimation performance over a long time. Without process or expert knowledge, the soft sensor model has to be updated automatically. Thus many kinds of recursive modeling methods, which update models by prioritizing newer samples, have been proposed (Kadlec et al., 2011). Although these methods can adapt the soft sensor to a new operating condition recursively, they cannot cope with abrupt changes in process characteristics, which are caused by replacement of a catalyst, cleaning of equipment, etc., because a query sample just after an abrupt change becomes significantly different from the prioritized samples. It is also common practice to enhance model adaptation through the ensemble learning framework (Grbić et al., 2013, Kadlec et al., 2011, Kadlec and Gabrys, 2011, Yu, 2012c). In the ensemble framework, data are divided into different sub-domains and local sub-models are constructed over each domain. In this way, instead of using a single global model, multiple simpler models are developed and then combined to obtain the final prediction. Some other adaptive soft sensors were developed by partitioning the process data into multiple clusters corresponding to different operating phases, where a local predictive model is built for each phase (Yu, 2012a, Yu and Qin, 2008, Yu and Qin, 2009). When implemented online, the query sample, for which an output estimate is required, is first classified into a particular phase, and then the local model representing that phase is chosen for prediction.
Although such phase based multi-model methods outperform single models, they are unable to effectively capture the between-phase transient dynamics. Thus, a Bayesian model averaging (BMA) based multi-model method was proposed to tackle this issue (Yu et al., 2013). In practice, however, it is difficult to determine the number of partitions because quantitative and precise information on phase divisions is lacking. More importantly, the local models used in the BMA based multi-model methods are built offline and not updated once deployed online, so changes in process characteristics arising from the time-varying nature of real-life processes cannot be handled well.
Recently, just-in-time (JIT) learning has attracted increasing attention in process modeling and soft sensor development (Cheng and Chiu, 2004, Fujiwara et al., 2009). In JIT learning, a local model is constructed from the samples similar to the query sample. Thus, on one hand, a JIT based model can cope with abrupt changes as well as gradual ones. On the other hand, it can deal with nonlinearity since it builds a local model repeatedly. Compared to traditional modeling methods, which can be considered global modeling, the JIT based method has a local model structure: when an estimate is required, a local model is built from historical samples selected by some measure of similarity to the query data. Once the estimated output is obtained, the local model is discarded. Nevertheless, the estimation accuracy of JIT based models can be expected to improve further if the local regression function, input variable selection, similarity measure, database updating scheme, etc., are chosen as an optimal combination rather than separately.
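The JIT procedure described above (select similar samples, fit a local model, predict, discard) can be illustrated with a minimal sketch. It assumes a single input variable, plain absolute distance as the similarity measure, and a least-squares line as the local model; the function name `jit_predict`, the sine-shaped data and the value of `k` are hypothetical choices for illustration, not the method proposed in this paper.

```python
import math

def jit_predict(database, query, k):
    """Estimate the output at `query` with a model built only for this query."""
    # Step 1: rank stored (input, output) samples by distance to the query.
    # Absolute distance is an assumed similarity measure.
    nearest = sorted(database, key=lambda s: abs(s[0] - query))[:k]
    # Step 2: fit a local least-squares line to the selected samples.
    n = len(nearest)
    mean_x = sum(x for x, _ in nearest) / n
    mean_y = sum(y for _, y in nearest) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in nearest)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in nearest)
    slope = sxy / sxx if sxx > 0 else 0.0
    # Step 3: predict; nothing about the local model is kept after returning.
    return mean_y + slope * (query - mean_x)

# Hypothetical nonlinear process y = sin(x), sampled on a regular grid.
database = [(i / 10, math.sin(i / 10)) for i in range(63)]
estimate = jit_predict(database, query=1.234, k=6)
```

Although each local model is linear, rebuilding it around every query lets the sequence of predictions follow the nonlinear curve, which is the property the paragraph above attributes to JIT learning.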
To enhance the predictive performance and adaptability of JIT based models, several problems remain to be addressed. Although JIT methods based on linear local models (Cheng and Chiu, 2004, Kim et al., 2013) can successfully address process nonlinearity, highly nonlinear variable relationships are common in batch processes, where a local linear model may not always perform well. Thus a nonlinear modeling technique with high computational efficiency is preferable. In addition, the commonly used similarity measures rarely take process characteristics into account. In real applications, the most frequently used similarity measures are based on the Euclidean distance or the Mahalanobis distance (Cheng and Chiu, 2004, Ge and Song, 2010, Schaal et al., 2002), most of which are defined only in the algebraic space of the samples, irrespective of specific process knowledge. Moreover, the optimal local modeling size is usually determined offline, as reported by Ge and Song (2010). A strategy is required to choose the local modeling samples adaptively for each query sample. Besides, a reliable database updating scheme is also crucial for JIT based models to track changes in process characteristics. Usually, the database is updated simply by removing the oldest samples and adding the new samples (Shigemori et al., 2011).
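The conventional updating scheme mentioned at the end of the paragraph above, where the oldest samples are removed as new ones arrive, amounts to a moving window. A minimal sketch follows, assuming a fixed hypothetical capacity of five samples and made-up sample values; it illustrates only this simple baseline, not the updating rule proposed in this work.

```python
from collections import deque

# Hypothetical database holding at most 5 (input, output) samples.
database = deque(maxlen=5)

def update_database(db, sample):
    """Append the newest sample; a full deque silently drops its oldest entry."""
    db.append(sample)

# Feed eight samples in time order; only the five most recent survive.
for t in range(8):
    update_database(database, (float(t), 2.0 * t))

oldest, newest = database[0], database[-1]
```

Because eviction depends only on sample age, a sample that is still informative for a rarely visited operating condition is lost as soon as it falls out of the window.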
The second problem encountered in soft sensor modeling is the lack of a systematic guideline for input variable selection. As reported in the literature (Cui et al., 2012, Kadlec and Gabrys, 2011, Kim et al., 2013, Pani et al., 2013), input variables are often selected based on engineers’ personal experience and prior process knowledge. However, this is time-consuming because trial and error is inevitable, and the selected variables may not be optimal. Moreover, even experienced engineers find it very difficult to select input variables properly when a large number of variables are measured and the physical and chemical phenomena are not sufficiently understood. Consequently, various data-based methods have been proposed for selecting proper input variables.
One popular approach for reducing the input variable dimension is to project the original input space into an adequate lower dimensional space. The most popular methods for this task are based on linear projection of the input space, such as the widely known principal component analysis (PCA) and PLS. However, the new variables resulting from such methods are difficult to interpret in terms of actual process variables (Delgado et al., 2009). More importantly, the underlying assumption of linearly structured dependence is at odds with developing statistical models of nonlinear processes.
Another approach for reducing the input space dimension is to select the most important variables from all potential variables according to some criterion. As a nonparametric and nonlinear measure of relevance derived from information theory, mutual information (MI) has recently been applied to nonlinear process modeling (May et al., 2008a), process monitoring (Chen et al., 2013, Rashid and Yu, 2012a, Rashid and Yu, 2012b) and soft sensor design (Grbić et al., 2013). Unlike linear methods, which only consider linear relationships between variables, MI is theoretically able to identify relations of any type. Furthermore, it makes no assumption about the distribution of the data. However, two issues arise in the formulation of MI-based selection algorithms: handling the inter-dependencies between candidate variables, and the lack of an appropriate principle for determining when to halt the selection procedure (Chow and Huang, 2005). To tackle these issues, the partial mutual information (PMI) criterion has been developed, which considers the effect of the already selected input variables when evaluating the relevance between a candidate input variable and the output variable (May et al., 2008a, May et al., 2008b, Sharma, 2000).
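The claim that MI can detect dependence that a linear measure misses can be checked with a small sketch. It assumes a simple equal-width histogram estimator with ten bins and a made-up quadratic relationship; the helper names `to_bins`, `mutual_information` and `pearson` are hypothetical, and the estimator is plain MI, not the PMI criterion discussed above.

```python
import math
from collections import Counter

def to_bins(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # guard against a constant variable
    return [min(int((v - lo) / span * bins), bins - 1) for v in values]

def mutual_information(x, y, bins=10):
    """Histogram estimate of I(X; Y) in nats (assumed estimator)."""
    bx, by = to_bins(x, bins), to_bins(y, bins)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    # Sum p(i, j) * log(p(i, j) / (p(i) * p(j))) over occupied cells.
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )

def pearson(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y depends on x deterministically but not linearly.
x = [(i - 100) / 100 for i in range(201)]
y = [v * v for v in x]
corr = pearson(x, y)           # close to zero by symmetry
mi = mutual_information(x, y)  # clearly positive
```

Here the linear correlation is essentially zero while the MI estimate is well above zero, so a correlation-based screen would discard an input that fully determines the output.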
Apart from model adaptation and input variable selection, data-driven soft sensor modeling usually pays attention only to the plant data, whereas the process characteristics are ignored. In practice, batch processes are often multiphase in nature, involving multiple operating phases. Thus the accuracy and reliability of quality variable prediction can degrade heavily as the operating phase and process dynamics change. Multiphase modeling strategies with phase identification have been reported to be more effective than conventional single-model based methods (Yu, 2012a, Yu and Qin, 2008, Yu and Qin, 2009, Yu et al., 2013). These results indicate that the multiphase nature needs to be considered when designing data-driven soft sensors for batch processes.
To address the above-mentioned issues, a novel adaptive soft sensor modeling framework is proposed for nonlinear multiphase batch processes. This soft sensing algorithm is outlined as follows:
- (i) The JIT learning framework is adopted because it can deal with changes in process characteristics as well as nonlinearity. Within this learning framework, a new hybrid similarity measure is defined by integrating Euclidean distance based similarity with process phase similarity. Further, the optimal local modeling samples are adaptively determined by online cross-validation optimization. Besides, a maximal similarity replacement rule is proposed to update the sample database.
- (ii) KPLS is chosen as the local modeling technique for two reasons. First, it captures the nonlinear characteristics of batch processes more effectively than linear methods such as PLS regression. Second, KPLS essentially requires only linear algebra, making it as simple as a regular linear PLS regression.
- (iii) A Gaussian mixture model (GMM) is estimated to identify the operating phases of the batch process, and a separate JIT-KPLS modeling framework is then constructed for each phase.
- (iv) The input variables are selected based on the PMI criterion between potential input variables and the output variable by performing a stepwise selection procedure, which effectively alleviates the effect of redundancy between input variables.
The proposed MJIT-KPLS soft sensing framework supports model adaptation in four respects. First, a query sample can be automatically classified into a particular operating phase based on the estimated GMM through a Bayesian inference strategy. Second, in the JIT learning framework, a local KPLS model is constructed online when an estimate for the query sample is required. Third, the optimal local modeling size is selected adaptively for each local KPLS model. Finally, the proposed framework can adapt to a new process state by adding new samples to the database from which the samples for local modeling are selected.
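The first adaptation step, assigning a query sample to an operating phase through Bayesian inference on a GMM, can be sketched as follows. The sketch assumes a univariate mixture with two components whose weights, means and standard deviations are invented for illustration; in practice the mixture would be multivariate and its parameters estimated from process data.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def phase_posteriors(x, components):
    """Posterior P(phase k | x) for components given as (weight, mean, std)."""
    # Bayes' rule: prior weight times likelihood, normalized over all phases.
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)
    return [v / total for v in weighted]

# Hypothetical two-phase mixture (parameters are assumptions, not fitted values).
components = [(0.4, 10.0, 2.0), (0.6, 30.0, 5.0)]
posteriors = phase_posteriors(12.0, components)
# Assign the query to the phase with the largest posterior probability.
phase = max(range(len(posteriors)), key=posteriors.__getitem__)
```

The posterior vector also indicates how ambiguous the assignment is, which matters for queries that fall between two phases.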
The rest of this paper proceeds as follows. Section 2 briefly outlines the theory of KPLS regression, JIT learning, the PMI criterion, and the GMM algorithm. The proposed adaptive soft sensing algorithm, MJIT-KPLS, is discussed in detail in Section 3. Subsequently, a case study of an industrial fed-batch chlortetracycline fermentation process is used to evaluate the proposed algorithm in Section 4. Finally, conclusions are drawn in Section 5.
Preliminaries
In this section, KPLS regression, JIT learning, the PMI criterion, and the GMM algorithm are briefly introduced.
Proposed adaptive soft sensor
The proposed MJIT-KPLS soft sensing algorithm builds soft sensors online by performing KPLS regression in multiphase JIT modeling framework. The JIT learning framework for each operating phase is similarly determined by a group of offline parameters which are different from phase to phase. Thus the development of MJIT-KPLS soft sensors consists of building multiple similar JIT-KPLS modeling frameworks. Without considering a particular phase, the development of JIT-KPLS soft sensors can be split
Case study
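In this section, the proposed soft sensing algorithm is tested on an industrial fed-batch chlortetracycline fermentation process. Two indexes are used to evaluate the model performance: the root-mean-square error (RMSE) and the coefficient of determination ($R^2$), given by
$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{test}} \sum_{i=1}^{N_{test}} \left(\hat{y}_i - y_i\right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N_{test}} \left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{N_{test}} \left(y_i - \bar{y}\right)^2}$$
where $\hat{y}_i$ is the estimated output, $y_i$ is the actual output, and $\bar{y}$ is the mean of the actual output; $N_{test}$ denotes the number of testing samples.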
RMSE is
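The two performance indexes can be computed as in the following sketch, which assumes the standard definitions of RMSE and the coefficient of determination over the testing samples. The function names and the four sample values are hypothetical and are not taken from the case study data.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error over the testing samples."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up actual and estimated outputs for four testing samples.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
error = rmse(y_true, y_pred)
fit = r_squared(y_true, y_pred)
```

A smaller RMSE and an $R^2$ closer to 1 both indicate better agreement between estimated and actual outputs.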
Conclusions
The MJIT-KPLS soft sensing algorithm, the main contribution of this work, provides a novel adaptive soft sensor modeling framework for multiphase batch processes. By exploiting the JIT learning framework, MJIT-KPLS can cope with changes in process characteristics as well as process nonlinearity. Moreover, the Gaussian mixture model enables automatic phase identification and multiphase modeling. Through the Bayesian inference strategy, the JIT-KPLS modeling framework representing the identical
Acknowledgements
We thank Charoen Pokphand Group for their financial support and for providing the industrial datasets of the fed-batch CTC fermentation process. We also appreciate the valuable comments and suggestions of the anonymous reviewers.
References (60)
Bioprocess control: advances and challenges. Comput Chem Eng (2006)
et al. A non-Gaussian pattern matching based dynamic process monitoring approach and its application to cryogenic air separation process. Comput Chem Eng (2013)
et al. A new data-based methodology for nonlinear process modeling. Chem Eng Sci (2004)
et al. Data-driven prediction of the product formation in industrial 2-keto-l-gulonic acid fermentation. Comput Chem Eng (2012)
SIMPLS: an alternative approach to partial least squares regression. Chemom Intell Lab Syst (1993)
et al. Selection of input variables for data driven models: an average shifted histogram partial mutual information estimator approach. J Hydrol (2009)
et al. Resampling methods for parameter-free and robust feature selection with mutual information. Neurocomputing (2007)
et al. A comparative study of just-in-time-learning based methods for online soft sensor modeling. Chemom Intell Lab Syst (2010)
et al. ANN-based soft-sensor for real-time process monitoring and control of an industrial polymerization process. Comput Chem Eng (2009)
et al. Adaptive soft sensor for online prediction and process monitoring based on a mixture of Gaussian process models. Comput Chem Eng (2013)