Robust inferential sensor development based on variational Bayesian Student’s-t mixture regression
Introduction
In the process industries, a great many measuring instruments are installed to gather data for real-time monitoring and control [1]. Nevertheless, many key quality variables, such as the melt index of polypropylene, the butane content, the endpoint of crude oil, and the oxygen concentration, to name just a few, are very difficult to measure in real time with these traditional sensors [2]. Such quality variables are usually measured either with an expensive online analyzer (for example, a mass spectrometer), which incurs high investment costs, or by laboratory testing, which introduces large time delays [3]. On the other hand, a great number of process variables, such as temperatures, flow rates, and pressures, can be acquired easily and cheaply. These easy-to-measure variables (called secondary variables) are closely related to the difficult-to-measure quality variables (called primary variables). Mathematical models can therefore be constructed to capture the dependency of the primary variables on the secondary variables, and then be applied to infer the primary variables. Such a mathematical model is referred to as an ‘inferential sensor’, and has the merits of being low in cost, easy to maintain, and free of measurement delay. Consequently, inferential sensors have been widely investigated and successfully applied to industrial processes for monitoring and control over the past several decades [1], [4], [5]. Note that the idea of the inferential sensor is also popular in non-industrial fields. For example, the orthogonal forward regression (OFR)-based proxy measurement model proposed by Guo et al. established the mathematical relationship between the vertical ground reaction forces (vGRF) and wearable accelerometer signals. The model was evaluated on datasets collected from the walking states of individual persons, and the results demonstrate that the developed dynamic model can successfully improve the estimation accuracy [6]. In fact, many applications, including the above proxy measurement and quality variable prediction, share one task, namely regression.
Generally speaking, inferential sensors can be categorized into two types: first-principle-based sensors [7] and data-driven sensors [8]. Because modern industrial processes are growing increasingly complicated, first-principle-based sensors often cannot provide an explicit representation of the process dynamics. Benefiting from distributed control systems and large-capacity database techniques, abundant process data that reveal the real status of industrial operations can be gathered through on-site instruments [8], [9]. Therefore, data-driven inferential sensors have attracted increasing attention in recent years, and many algorithms have been designed and used to build inferential sensor models. Partial least squares (PLS) [10] and principal component regression (PCR) [11] are commonly employed to describe linear relationships between the primary and secondary variables. To tackle process nonlinearities, support vector machines (SVMs) [12] and artificial neural networks (ANNs) [13] have also been systematically studied for developing nonlinear inferential sensors. In addition, comprehensive reviews of the methods and applications of inferential sensors in the process industries are available, from which one can learn more about inferential sensors [4], [14].
For reasons such as multiple product grades or operating conditions, most industrial processes work in multiple modes, which means that a single inferential sensor model may fail to achieve satisfactory performance [15], [16], [17]. These processes are usually characterized by strong nonlinearity and non-Gaussianity. To model such multimode processes for inferential sensor development, finite mixture models (FMM) have been studied systematically and used widely. The most widely adopted member of the FMM family is perhaps the Gaussian mixture model (GMM), which linearly combines a group of Gaussian distributions to approximate complex non-Gaussian distributions. The GMM and its variants have shown increasing potential in both process monitoring and quality prediction [15], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28]. For instance, Yu et al. and Liu et al. have successfully applied the GMM to multimode process monitoring and multiphase batch process monitoring [18], [19], respectively. For the purpose of estimating quality variables, GMM-based methods can be classified into two types. The first type comprises inferential sensors based on a two-step strategy: mode identification followed by regression model construction. First, the GMM is used to cluster the data into several modes in the input space; subsequently, a regression model such as kernel PLS [20] or Gaussian process regression (GPR) [21] is built in each mode. In addition, in [22], Fan et al. introduced the GMM into a just-in-time learning framework and achieved more reliable prediction performance. However, considering only the input space in the first step ignores the important information in the output space. The second type mainly comprises Gaussian mixture regression (GMR) and its many variants. In [23], Yuan et al. treated the input and output spaces jointly, rather than separately, to estimate the joint probability density function (p.d.f.). The functional dependency of the target variables on the secondary variables can then be derived directly from their joint p.d.f. This procedure has proven superior to the two-step procedure mentioned above, but its higher predictive accuracy depends on plenty of labeled samples. However, labeled samples can be scarce in inferential sensor applications because labeling is expensive and slow, which makes the performance of GMR-based inferential sensors disappointing. In contrast, many unlabeled samples can be collected easily. To tackle this problem, Shao et al. developed the semisupervised GMR (S2GMR) and the semisupervised Dirichlet process GMR (S2DPGMR), in which the useful information of unlabeled samples is taken into account [15], [26]. To improve the computational efficiency of large-scale data modeling, a scalable semisupervised GMM was designed in [27], and a significant improvement in computational efficiency was demonstrated. Furthermore, by means of a Bayesian treatment, the variational Bayesian GMR (VBGMR) was developed to determine the best number of mixture components automatically [28]. These studies have greatly enriched the family of GMR-related methods.
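To make the GMR prediction step concrete, the following is a minimal sketch (not the implementation of [23]) of how a prediction can be read off a joint Gaussian mixture fitted over the stacked vector of inputs and output. Each component contributes its conditional mean of y given x, weighted by the posterior probability of that component given x. The function names and the parameter layout (the last coordinate of each mean and covariance being the scalar output) are assumptions made purely for illustration.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian N(x | mean, cov)."""
    d = mean.shape[0]
    diff = x - mean
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("cov must be positive definite")
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def gmr_predict(x, weights, means, covs):
    """Conditional mean E[y | x] under a joint Gaussian mixture over [x, y].

    Assumed layout (illustrative): each mean has length d + 1 and each
    covariance is (d + 1) x (d + 1), with the scalar output y stored last.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    d = x.shape[0]
    log_post, cond_means = [], []
    for w, m, S in zip(weights, means, covs):
        m = np.asarray(m, dtype=float)
        S = np.asarray(S, dtype=float)
        mx, my = m[:d], m[d]
        Sxx, Syx = S[:d, :d], S[d, :d]
        # Unnormalized log posterior of this component given the input x.
        log_post.append(np.log(w) + gaussian_logpdf(x, mx, Sxx))
        # Conditional mean of y given x within this component.
        cond_means.append(my + Syx @ np.linalg.solve(Sxx, x - mx))
    log_post = np.array(log_post)
    post = np.exp(log_post - log_post.max())  # subtract max for stability
    post /= post.sum()
    return float(post @ np.array(cond_means))

# Two-mode toy model with one input and one output.
weights = [0.5, 0.5]
means = [[-2.0, -1.0], [2.0, 1.0]]
covs = [np.eye(2), np.eye(2)]
y_hat = gmr_predict([3.0], weights, means, covs)
```

Because the component posteriors depend on x, the prediction switches smoothly between the local linear models of the individual modes, which is what makes the joint-density approach suitable for multimode data.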
Although inferential sensors based on the GMR or its variants have delivered satisfactory performance, researchers have recently demonstrated that the performance of the GMM deteriorates significantly in the presence of outliers. This is because the parameter learning of the GMM is very sensitive to outliers, which leads to significant distortion of the estimated p.d.f. over the variables of interest, or to excessive components being used to explain the information associated with outliers [29], [30], [31]. Outliers are measurements that diverge noticeably from the statistical range of the gathered data [32]. In industrial datasets, outliers arise widely from measurements that are incorrectly observed, recorded, or imported [33], leading to skewed parameter estimates and plant-model mismatch in statistical analysis. Basically, there are two types of outliers, namely conspicuous outliers and indistinctive outliers. Conspicuous outliers are readily detected and removed, such as data points containing values beyond their physical limits. In contrast, indistinctive outliers are difficult to identify and address, especially for multimode processes where the outliers might be mistaken for an ‘error mode’ by the training algorithm. Worse still, normal samples can be mistakenly classified as indistinctive outliers and simply discarded. This is very likely to cause a loss of useful information and distort the original data distribution when the available samples are insufficient [34].
To be robust against outliers, researchers have put forward the Student’s-t mixture model (SMM) to overcome this shortcoming of the GMM, as the SMM offers stronger robustness to outliers through its heavier tails [35]. The heavier tails of the SMM stem from an important parameter ν (referred to as the ‘degrees of freedom’) of the Student’s-t distribution. Recently, the SMM has been shown to achieve much better performance than the GMM in various applications, such as automatic gesture recognition [36], medical image segmentation [37], and human fall detection [38]. Bayesian strategies for density estimation and clustering with mixture distributions enable automatic model selection (i.e., the determination of the optimal number of components under the FMM framework); thus, the variational Bayesian SMM (VBSMM) has been widely studied and has achieved relatively satisfactory results [39], [40], [41]. However, these SMM-based models are unsupervised: they use only the input information and neglect the supervision information. The inferential sensor model, on the other hand, is supervised, and the supervision information (i.e., the samples of the primary variables) is very important. Therefore, the outlier robustness of the unsupervised SMM or VBSMM cannot be directly exploited to develop robust inferential sensors.
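The down-weighting effect of the heavy tails can be illustrated with a small sketch. Assuming a univariate Student’s-t distribution with a fixed ν, the standard EM updates assign each sample a weight u_n = (ν + 1)/(ν + δ_n²), where δ_n² is the squared distance to the current location divided by the current scale, so that distant samples contribute little to the location estimate. The function name `student_t_location`, the choice ν = 3, and the toy data are illustrative assumptions and are not taken from the paper.

```python
def student_t_location(xs, nu=3.0, n_iter=100):
    """EM estimate of location and scale of a univariate Student's-t
    with fixed degrees of freedom nu (illustrative sketch)."""
    xs = [float(x) for x in xs]
    n = len(xs)
    ordered = sorted(xs)
    mu = ordered[n // 2]                      # start from a middle order statistic
    sigma2 = max(sum((x - mu) ** 2 for x in xs) / n, 1e-12)
    for _ in range(n_iter):
        # E-step: samples far from mu receive small weights.
        u = [(nu + 1.0) / (nu + (x - mu) ** 2 / sigma2) for x in xs]
        # M-step: weighted location and scale updates.
        mu = sum(ui * x for ui, x in zip(u, xs)) / sum(u)
        sigma2 = max(sum(ui * (x - mu) ** 2 for ui, x in zip(u, xs)) / n, 1e-12)
    return mu, sigma2

# Fifty well-behaved samples in [-1, 1] plus one gross outlier.
inliers = [-1.0 + 2.0 * i / 49.0 for i in range(50)]
data = inliers + [100.0]

gaussian_mean = sum(data) / len(data)         # Gaussian ML estimate of the mean
robust_mean, robust_scale = student_t_location(data)
```

On this toy data the Gaussian maximum-likelihood mean is pulled toward the outlier, whereas the Student’s-t location stays close to the centre of the well-behaved samples.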
The motivation of this paper is therefore to develop a supervised SMM-based inferential sensing approach, referred to as the ‘variational Bayesian Student’s-t mixture regression’ (VBSMR), such that both the merit of the FMM in dealing with multimode characteristics and the advantage of the Student’s-t distribution in addressing outliers are retained. Specifically, the VBSMR takes into consideration the functional dependency of the primary variables on the secondary variables. In addition, a variational Bayesian expectation maximization (VBEM)-based parameter learning algorithm is developed for training the VBSMR. Note that under the framework of variational Bayesian inference, all variables, including latent (hidden) variables and model parameters, are treated as random variables and are assigned corresponding prior distributions; for example, the Dirichlet distribution is chosen to govern the mixing coefficients. There are some substantial advantages in developing inferential sensors based on the VBSMR. First, the singularities that arise in inverting covariance matrices can be avoided by the Bayesian treatment. Second, the over-fitting problem can be effectively mitigated by integrating out the model parameters. Third, the Bayesian treatment is capable of automatically determining the best number of components without resorting to techniques such as cross-validation [28], [42].
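The automatic determination of the component number can be illustrated with the Dirichlet update used in the standard variational treatment of mixture models. The exact update equations of the VBSMR are not shown in this excerpt, so the following is only a sketch under that standard form: with a symmetric prior Dir(α0, ..., α0) and responsibilities r_nk, the posterior concentration of component k is α0 plus the effective count N_k = Σ_n r_nk. The function name `dirichlet_posterior` and the numbers below are hypothetical.

```python
def dirichlet_posterior(resp, alpha0):
    """Posterior Dirichlet concentrations and expected mixing coefficients.

    resp   : list of per-sample responsibility rows, each summing to 1
    alpha0 : symmetric prior concentration (assumed value, for illustration)
    """
    n_components = len(resp[0])
    # Effective number of samples assigned to each component.
    counts = [sum(row[k] for row in resp) for k in range(n_components)]
    alpha = [alpha0 + c for c in counts]
    total = sum(alpha)
    expected_pi = [a / total for a in alpha]
    return alpha, expected_pi

# Ten samples, three candidate components; the third explains no data.
resp = [[1.0, 0.0, 0.0]] * 6 + [[0.0, 1.0, 0.0]] * 4
alpha, expected_pi = dirichlet_posterior(resp, alpha0=1e-3)
```

With a small α0, a component that takes responsibility for no samples keeps an expected mixing coefficient close to zero and can be pruned, which is how redundant components are effectively switched off without cross-validation.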
The remainder of this paper is structured as follows. Section 2 briefly introduces the Student’s-t distribution and shows how it differs from the Gaussian distribution. Section 3 gives a detailed explanation of the VBSMR, of how to learn the model parameters, and of how to develop the VBSMR-based inferential sensor. Then, in Section 4, two case studies, a synthetic example and an actual industrial process, are presented to validate the effectiveness and flexibility of the VBSMR. Finally, conclusions and directions for future work are given.
Section snippets
Student’s-t distribution
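The p.d.f. of the Student’s-t distribution is given by

$$\mathrm{St}(x \mid \mu, \Lambda, \nu) = \frac{\Gamma\left(\frac{\nu + d}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)} \, \frac{|\Lambda|^{1/2}}{(\nu\pi)^{d/2}} \left[1 + \frac{(x - \mu)^{T} \Lambda (x - \mu)}{\nu}\right]^{-\frac{\nu + d}{2}}$$

where μ represents the mean vector, Λ represents the precision matrix (which plays a role similar to the inverse covariance of the Gaussian distribution), ν is called the degrees of freedom, d represents the dimensionality of the variable vector x, and Γ(·) denotes the Gamma function.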
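As a numerical companion to this definition, the sketch below evaluates the log-density of the multivariate Student’s-t distribution in its standard form, parameterized by the mean μ, the precision matrix Λ, and the degrees of freedom ν. The function name `student_t_logpdf` is an assumption for illustration; working in the log domain is a common choice to avoid overflow in the Gamma functions.

```python
import math
import numpy as np

def student_t_logpdf(x, mu, Lambda, nu):
    """Log-density of a d-dimensional Student's-t with mean mu,
    precision matrix Lambda and nu degrees of freedom."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    Lambda = np.atleast_2d(np.asarray(Lambda, dtype=float))
    d = x.shape[0]
    diff = x - mu
    delta2 = float(diff @ Lambda @ diff)      # squared Mahalanobis distance
    sign, logdet = np.linalg.slogdet(Lambda)
    if sign <= 0:
        raise ValueError("Lambda must be positive definite")
    return (math.lgamma(0.5 * (nu + d)) - math.lgamma(0.5 * nu)
            + 0.5 * logdet
            - 0.5 * d * math.log(nu * math.pi)
            - 0.5 * (nu + d) * math.log1p(delta2 / nu))

# Tail comparison at x = 5 for a scalar variable with zero mean and unit precision.
log_t_tail = student_t_logpdf(5.0, 0.0, 1.0, nu=3.0)
log_gauss_tail = -0.5 * math.log(2.0 * math.pi) - 0.5 * 5.0 ** 2
```

Far from the mean, the Student’s-t log-density with a small ν is much larger than the Gaussian one, and as ν grows the two densities approach each other.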
By taking a scalar variable x as an example, the p.d.f. of the Student’s-t
Variational Bayesian Student’s-t mixture regression
Consider the input and output data matrices, whose nth entries xn and yn are the nth samples of the input variables and the output variable, respectively, where d is the dimensionality of the input space and N is the size of the dataset. The basic idea of the SMM is that a complex non-Gaussian distribution can be approximated by a combination of a finite number of Student’s-t distributions. Therefore, the p.d.f. that is assumed to generate the data points {xn}
Case studies
In this section, two cases, a numerical example and an actual methanation furnace unit of an ammonia synthesis process, are provided to evaluate the predictive performance of the proposed approach. In addition, the performances of PLS, GMR, and VBGMR are provided as benchmarks.
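For quantitative evaluation of the predictive accuracies of these four approaches, the root mean square error (RMSE) is adopted. Over the $n$ samples being evaluated, it is given by

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}$$

where $y_i$ and $\hat{y}_i$ denote the real and predicted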
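For reference, the RMSE in its usual form can be computed as in the short sketch below. The function name `rmse` and the toy values are illustrative only and are not results from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between real and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))

# Toy values for illustration.
error = rmse([0.0, 0.0], [3.0, 4.0])
```

A smaller RMSE indicates that the predictions lie closer to the real values on average, with large individual errors penalized more heavily because of the squaring.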
Conclusions
In this paper, to develop industrial inferential sensors that estimate quality variables in real time, the variational Bayesian Student’s-t mixture regression (VBSMR) has been put forward to overcome the shortcoming of the traditional Gaussian mixture regression (GMR), i.e., its sensitivity to outliers. Our theoretical contributions are twofold: (1) the proposal of the VBSMR, which enables robust regression under the FMM framework; (2) the development of an efficient learning algorithm for
Declaration of Competing Interest
The authors declare no conflict of interest.
Acknowledgment
This work was supported by the National Natural Science Foundation of China (Grant no. 61703367) and the China Postdoctoral Science Foundation (Grant nos. 2017M621929 and 2019T120516).
Jingbo Wang received the B.Eng. degree from Liangxin College, China Jiliang University, Hangzhou, China, in 2017. He is currently a master’s degree candidate with the State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China. His current research interests include industrial process soft sensor modeling and Bayesian methods with application to classification and regression tasks.
References (44)
- et al., Data-driven soft sensors in the process industry, Comput. Chem. Eng. (2009)
- et al., Semi-supervised selective ensemble learning based on distance to model for nonlinear soft sensor development, Neurocomputing (2017)
- et al., Online soft sensor design using local partial least squares models with adaptive process state partition, Chemom. Intell. Lab. Syst. (2015)
- et al., Product property and production rate control of styrene polymerization, J. Process Control (2002)
- et al., Data-based process monitoring, process control, and quality improvement: recent developments and applications in steel industry, Comput. Chem. Eng. (2008)
- et al., Adaptive soft sensor for quality prediction of chemical processes based on selective ensemble of local partial least squares models, Chem. Eng. Res. Des. (2015)
- et al., ANN-based soft-sensor for real-time process monitoring and control of an industrial polymerization process, Comput. Chem. Eng. (2009)
- et al., The state of the art in chemical process control in Japan: good practice and questionnaire survey, J. Process Control (2010)
- et al., Sequential local-based Gaussian mixture model for monitoring multiphase batch processes, Chem. Eng. Sci. (2018)
- et al., Soft sensor model development in multiphase/multimode processes based on Gaussian mixture regression, Chemom. Intell. Lab. Syst. (2014)
- Dynamic soft sensor development based Gaussian mixture regression for fermentation processes, Chin. J. Chem. Eng.
- Quality variable prediction for chemical processes based on semisupervised Dirichlet process mixture of Gaussians, Chem. Eng. Sci.
- Robust fuzzy clustering using mixtures of Student’s-t distributions, Pattern Recognit. Lett.
- Robust Bayesian mixture modelling, Neurocomputing
- Design of inferential sensors in the process industry: a review of Bayesian methods, J. Process Control
- The infinite Student’s t-factor mixture analyzer for robust clustering and classification, Pattern Recognit.
- Data mining and analytics in the process industry: the role of machine learning, IEEE Access
- Soft Sensors for Monitoring and Control of Industrial Processes
- A new proxy measurement algorithm with application to the estimation of vertical ground reaction forces using wearable sensors, Sensors
- Process data analytics via probabilistic latent variable models: a tutorial review, Ind. Eng. Chem. Res.
- Semisupervised Bayesian method for soft sensor modeling with unlabeled data samples, AIChE J.
- Database monitoring index for adaptive soft sensors and the application to industrial process, AIChE J.
Weiming Shao received his B.Eng. and Ph.D. degrees both from the College of Information and Control Engineering, China University of Petroleum, Qingdao, China, in 2009 and 2016, respectively. He was a Visiting Research Associate with the Department of Electrical Engineering in the Petroleum Institute, Abu Dhabi, UAE, from Nov. 2014 to Nov. 2015. He is currently a Postdoctoral Research Fellow with the State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China. His research interests include machine learning and statistical learning methods and their applications to semisupervised, robust and adaptive soft sensor development.
Zhihuan Song received the B.Eng. and M.Eng. degrees in industrial automation from Hefei University of Technology, Anhui, China, in 1983 and 1986, respectively, and the Ph.D. degree in industrial automation from Zhejiang University, Hangzhou, China, in 1997. Since 1997, he has been with the Department of Control Science and Engineering, Zhejiang University, where he was first a Postdoctoral Research Fellow, then an Associate Professor, and is currently a Professor. He has published more than 200 papers in journals and conference proceedings. His research interests include the modeling and fault diagnosis of industrial processes, analytics and applications of industrial big data, and advanced process control technologies.