Quality-related fault detection using linear and nonlinear principal component regression

https://doi.org/10.1016/j.jfranklin.2016.03.021

Abstract

Quality-related fault detection has been a hot research topic in the process monitoring community over the past five years. Several modifications of partial least squares (PLS) have been proposed to solve the relevant problems for linear systems. For systems with nonlinear characteristics, some modified algorithms based on kernel partial least squares (KPLS) have also been designed very recently. However, most of the existing methods suffer from the defect that their performance is not stable when the fault intensity increases. More importantly, there is not yet a way to solve the linear and nonlinear problems within a uniform algorithm structure, which is very important for simplifying the design steps of fault detection systems. This paper proposes such approaches based on principal component regression (PCR) and kernel principal component regression (KPCR), so that the relevant problems in linear and nonlinear systems can be solved in the same way. Two literature examples are used to test the performance of the proposed approaches.

Introduction

Data-driven approaches have been receiving considerable attention in the field of process monitoring [1], [2], [3], [4] because they are easy to implement and place fewer requirements on the underlying model. In recent years, so-called quality-related fault detection has attracted wide attention in the process monitoring community. By modeling the relationship between quality and process variables, such methods classify the faults occurring in the process space into two categories, namely those that affect the final product quality and those that do not. By reducing the alarm rates of the latter, these methods can significantly reduce unnecessary plant downtime and thus bring considerable economic benefits in practical applications. Moreover, quality-related fault detection can be used to make the implementation of fault tolerant control (FTC) schemes more accurate and more targeted.

Principal component analysis (PCA) and PLS are the two most commonly used multivariate statistical analysis methods in process monitoring [5]. As PCA cannot establish the correlation between quality and process variables, it naturally cannot be used for quality-related fault detection. PLS by nature does not have this problem; however, as revealed by Li et al. [6], standard PLS performs an oblique decomposition of the process variable space, and significant process variation related to the output might also be contained in the residual part. As a result, the test statistics and fault diagnostic results offered by PLS are problematic for quality-related fault detection. To overcome this shortcoming, Zhou et al. [7] first proposed a PLS-postprocessing-based approach, named total projection to latent structures (T-PLS), which further decomposes the score and load matrices of PLS and finally divides the process space (spanned by the X matrix) into four subspaces, each having a different correlation with the output space (spanned by the Y matrix). Thus, by designing appropriate statistics in these four subspaces, faults with different correlations with Y can be classified. Soon afterwards, Yin et al. [8] proposed a different approach, called modified partial least squares (M-PLS), which first estimates the regression coefficient matrix between X and Y and then projects X onto the null space and the remaining subspace of the coefficient matrix, respectively, so that the process space is decomposed into two orthogonal subspaces. Compared with T-PLS, M-PLS realizes an orthogonal decomposition of the process space, and it is simpler and more effective than T-PLS in most cases. Taking advantage of both methods, Qin et al. [9] developed concurrent partial least squares (C-PLS), which was claimed to be more efficient. Considering the drawbacks of PLS-postprocessing-based approaches, Wang et al. [10] combined orthogonal signal correction (OSC) and M-PLS to develop an enhanced method. For nonlinear process monitoring, Peng et al. [11] and Zhang et al. [12] extended T-PLS and C-PLS into nonlinear versions, respectively. Recently, Zhang et al. [13] made a comprehensive and detailed comparison study of all these approaches.

Although the aforementioned results have advanced the field of quality-related fault detection, they still have obvious defects. For example, the methods based on modifications of PLS [7], [9] and KPLS [11], [12] are usually not stable when fault intensity increases. Besides, linear and nonlinear methods with a uniform algorithm structure, which would significantly simplify the design steps of fault detection systems, have not yet attracted attention. These defects prompted us to conduct a more in-depth study. Based on PCA decomposition, PCR can reveal the relationship between the score matrix T and Y. By reconstructing X from T, the coefficient matrix between X and Y can be obtained. Therefore, PCR can also be used to solve quality-related issues in a similar way to PLS. Such an achievement has recently been reported in [14], the idea of which is quite similar to M-PLS: the reconstructed X is projected onto the null space and the remaining subspace of the coefficient matrix to get two orthogonal subspaces. Notice that the residual part of X is obtained by a PCA decomposition and is not used in the subsequent regression steps. Thus, the correlation between this part and the output is still ambiguous. In this paper, we aim to solve this problem by developing a new decomposition algorithm based on PCR. Moreover, this paper further extends the new algorithm into a nonlinear version to realize a nonlinear quality-related fault detection approach.

The remaining sections are organized as follows. Section 2 first reviews the basic concepts and algorithms. Section 3 describes the proposed approaches in detail. In Section 4, two numerical examples are used to test the performance of the proposed approaches. Finally, conclusions are summarized in Section 5.

Section snippets

The algorithm of principal component regression

Given an industrial process, collect normal process data into input and output matrices $X=[x_1\ x_2\ \cdots\ x_n]^T\in\mathbb{R}^{n\times m}$, $Y=[y_1\ y_2\ \cdots\ y_n]^T\in\mathbb{R}^{n\times l}$, where n is the number of samples; m and l are the number of variables of input and output samples, respectively. Without loss of generality, we assume that the given process is linear and works around a desired operating point; all measurements follow normal distribution, i.e. $x\sim N(0,\Sigma_x)$, $y\sim N(0,\Sigma_y)$; process noises and measurement noises also follow normal distributions; nm

Total principal component regression (T-PCR)

From Eq. (3), we have
$$\hat{Y}=TQ^T$$

Perform PCA on $\hat{Y}$ to get its score matrix $T_y$ and load matrix $Q_y$, thus
$$T_y=\hat{Y}Q_y=TQ^TQ_y$$

Obviously, T is re-projected to $T_y$ by $Q^TQ_y$. Next, we reconstruct $X_y$ from $T_y$ by the following steps:
$$P_y^T=(T_y^TT_y)^{-1}T_y^T\hat{X}$$
thus
$$X_y=T_yP_y^T=T_y(T_y^TT_y)^{-1}T_y^TTP^T$$
and
$$X_o=X-X_y$$
As mentioned before, $X_y$ and $X_o$ are highly correlated and non-correlated with Y, respectively.
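The decomposition steps above can be sketched in NumPy as follows. This is only an illustrative sketch, not the authors' implementation. It assumes X and Y are already mean-centered, that Q^T comes from an ordinary least-squares regression of Y on the PCA scores T, and that PCA on the predicted output keeps every direction above a small numerical tolerance. The function name t_pcr_decompose, the parameter n_pc, and the tolerance value are hypothetical choices made for this example.

```python
import numpy as np


def t_pcr_decompose(X, Y, n_pc):
    """Split centered X into a part X_y tied to Y and a remainder X_o.

    Illustrative sketch only; n_pc (number of retained principal
    components of X) is an assumed, user-chosen parameter.
    """
    # PCA on X via SVD: X_hat = T P^T with n_pc components.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:n_pc].T                      # loadings, shape (m, n_pc)
    T = X @ P                            # scores, shape (n, n_pc)
    X_hat = T @ P.T                      # reconstructed X

    # Least-squares regression of Y on T gives Y_hat = T Q^T.
    Qt = np.linalg.solve(T.T @ T, T.T @ Y)
    Y_hat = T @ Qt

    # PCA on Y_hat: loadings Q_y and scores T_y = Y_hat Q_y.
    # Assumption: keep directions above a relative tolerance.
    _, sy, Vyt = np.linalg.svd(Y_hat, full_matrices=False)
    rank_y = int(np.sum(sy > 1e-10 * sy[0]))
    Q_y = Vyt[:rank_y].T
    T_y = Y_hat @ Q_y

    # Reconstruct X_y from T_y: P_y^T = (T_y^T T_y)^{-1} T_y^T X_hat.
    Pyt = np.linalg.solve(T_y.T @ T_y, T_y.T @ X_hat)
    X_y = T_y @ Pyt
    X_o = X - X_y
    return X_y, X_o, T_y
```

By construction X_y + X_o reproduces X, and in this sketch X_o is orthogonal to the scores T_y, which is one way to check the decomposition numerically on centered data.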

To monitor these two parts, we use the $T^2$ statistics. Perform PCA on $X_o$ to get its score matrix $T_o$ and load matrix $P_o$ with only the

Simulation

In this section, the effectiveness of the proposed approaches will be tested using two literature examples. An indicator, the fault detection rate (FDR), is utilized for performance evaluation, which is defined as follows:
$$\mathrm{FDR}=\frac{\text{No. of alarms}}{\text{total faulty samples}}\times 100$$
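As a small illustration of the FDR definition, the following sketch computes it from a sequence of alarm flags. It assumes that each entry corresponds to one faulty sample and is truthy when the monitoring statistic raised an alarm for that sample; the function name fault_detection_rate is hypothetical.

```python
def fault_detection_rate(alarms):
    """Return the FDR in percent for a sequence of per-sample alarm flags.

    Assumption: `alarms` covers faulty samples only, one flag per sample.
    """
    alarms = list(alarms)
    if not alarms:
        raise ValueError("need at least one faulty sample")
    n_alarms = sum(bool(a) for a in alarms)
    return 100.0 * n_alarms / len(alarms)
```

For example, three alarms over four faulty samples give an FDR of 75.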

From the perspective of applications, a superior quality-related fault detection scheme should have the following capabilities:

  • For quality-related faults: high FDR in its quality-related statistical indicators.

  • For quality-unrelated faults: low

Conclusion

In this paper, two approaches, T-PCR and T-KPCR, have been proposed to solve the quality-related fault detection problem for linear and nonlinear systems, respectively. The main highlight of the two methods is that they share the same algorithm structure. The basic idea of the structure is that the original process matrix X is first projected onto the score matrix T by PCA or KPCA. $T_y$ is then extracted from T by least squares regression between T and Y. After that, $X_y$ is reconstructed from X

Acknowledgment

This work was supported by the National Natural Science Foundation of China (Nos. 61503039 and 61503040).
