Semi-supervised generative adversarial network with guaranteed safeness for industrial quality prediction

https://doi.org/10.1016/j.compchemeng.2021.107418

Highlights

  • A semi-supervised Generative adversarial network with Co-trained Generators (GCG) is proposed for quality prediction by utilizing both labeled and unlabeled process data.

  • The safeness of the proposed GCG is further discussed, which guarantees the performance of GCG is not worse than the GAN with supervised generators.

  • The effectiveness of the proposed GCG is verified by a benchmarked industrial case and the real absorption-stabilization system in FCCU from a refinery.

Abstract

In process industries, due to the low sampling rate of the quality variables, there are abundant unlabeled data but limited labeled data. Most data-driven quality prediction models use only the labeled data and ignore the unlabeled data, resulting in overfitting and low generalization performance. Hence, it is necessary to extract useful information from the unlabeled process data. Due to noise in the sensors and in signal transmission, some unlabeled data have low confidence and will mislead the training process. To tackle this problem, we propose a semi-supervised Generative adversarial network with Co-trained Generators (GCG) that utilizes the unlabeled data safely through the co-training of generators. The optimal parameters and weight coefficients of the co-trained generators guarantee "safeness", i.e., the performance of GCG is not worse than that of generators trained with only labeled data. The proposed method is validated by a benchmarked industrial case and the real absorption-stabilization system in a Fluid Catalytic Cracking Unit (FCCU). The results suggest that the GCG method improves the generalization performance of the quality prediction model.

Introduction

In process industries, raw materials are converted into final products through physical or chemical changes in a series of equipment, as in the refining process (Wang et al., 2009), the blast furnace iron-making process (Zhou et al., 2017), and the grinding process (Dai et al., 2015). Some key quality variables (e.g., yields, concentration, element content) are obtained by laboratory tests, online analyzers, soft sensors, or the fusion of all sensor sources (Sansana et al., 2020). Laboratory tests are time-consuming and labor intensive, and online analyzers are expensive and hard to maintain (Kadlec, Gabrys, Strandt, 2009, Shang, Yang, Huang, Lyu, 2014). Hence, the quality variables have high labeling costs, which makes real-time monitoring, optimization, and control difficult (Yuan et al., 2018). Therefore, it is meaningful to predict quality variables online. In most cases, the key quality variables can be estimated by first-principle or data-driven models. First-principle models are usually built on assumptions and physico-chemical knowledge about thermodynamic equilibrium, energy conservation, and material balance. As a result, they are usually simplified and highly dependent on expert knowledge, which is limited and hard to obtain. With the development of advanced sensors and database technology, a large amount of data can be collected from the distributed control system, which provides favorable conditions for data-driven models (Kadlec, Gabrys, Strandt, 2009, Shang, Yang, Huang, Lyu, 2014). Data-driven models require no prior knowledge and depend directly on process data, and they have received great attention in recent years (Zhang, Zou, Li, Xu, 2019, Zhang, Zou, Li, 2020, Steurtewagen, Van den Poel, 2020, Dias, Oliveira, Saraiva, Reis, 2020).

In process industries, the quality variables are usually sampled at a low frequency due to the high labeling cost, while process variables such as temperature, pressure, and flow rate are sampled at a high frequency. Hence, there are plentiful process data as inputs but limited quality data as labels. Traditional quality prediction relies on supervised learning methods, which use only labeled process data and ignore the unlabeled data, resulting in overfitting and poor generalization performance of the quality prediction model. To deal with the limitation of labeled data, Semi-Supervised Learning (SSL) methods have been proposed to extract useful information from unlabeled data. In the self-training scheme (Kang et al., 2016), the labeled data are augmented with the model's own highly confident predictions, and the process is repeated until some termination condition is reached. Self-training methods are sensitive to poor predictions, which cause errors to accumulate. Graph-based semi-supervised learning methods aim to construct a graph to connect similar observations, and the label information is propagated through the graph from labeled to unlabeled nodes by minimizing the energy configuration (Blum et al., 2004). Graph-based methods are sensitive to the graph structure and require the analysis of eigenvectors and eigenvalues of the graph Laplacian, which limits the scale at which they can be applied. Recently, semi-supervised training schemes for neural networks have also attracted wide attention due to the advantages of deep learning in feature engineering.
Studies have shown that unsupervised pre-training and supervised fine-tuning of deep neural networks can guide the learning process towards basins of attraction of minima for better generalization (Erhan, Courville, Bengio, Vincent, 2010, Shang, Yang, Huang, Lyu, 2014, Yuan, Huang, Wang, Yang, Gui, 2018), but the compatibility problem between unsupervised pre-training and supervised fine-tuning cannot be ignored (Rasmus et al., 2015). To solve the compatibility problem, the Ladder Network combines unsupervised and supervised learning to develop an end-to-end semi-supervised deep learning model (Rasmus, Berglund, Honkala, Valpola, Raiko, 2015, Li, Luo, Hu, 2020). Besides, the generator of a trained Generative Adversarial Network (GAN) can generate realistic data, which is regarded as capturing the manifold of the data, and this property can be utilized for semi-supervised learning (Salimans et al., 2016). GAN-based semi-supervised learning methods have shown better performance than the above semi-supervised methods in classification and regression tasks (Dai, Yang, Yang, Cohen, Salakhutdinov, 2017, Kumar, Sattigeri, Fletcher, 2017, Qi, Zhang, Hu, Edraki, Wang, Hua, 2018, Rezagholiradeh, Haidar, 2018, Ji, Li, Wang, Sun, Guo, Guo, Wu, Huang, Luo, 2020). In process industries, a regressor has been introduced to the vanilla GAN for the prediction of crude oil properties (Zheng and Ding, 2018). GAN-based methods have been used for data augmentation and supplement (Han, Zhang, Zhou, Wang, 2019, Wang, Liu, 2020), and a GAN-based semi-supervised learning method has been developed to identify the process risk level (He et al., 2020).
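
As a rough illustration of the self-training scheme described above, the following Python sketch repeatedly adds confident pseudo-labels to the labeled set until no confident prediction remains. It is not the method of Kang et al. (2016): the 1-nearest-neighbour predictor, the distance-based confidence rule, and the names `self_train`, `max_dist`, and `max_rounds` are assumptions chosen only to keep the example small.

```python
def self_train(labeled, unlabeled, max_dist=0.5, max_rounds=10):
    """Toy self-training loop for 1-D regression (illustrative assumption).

    labeled:   list of (x, y) pairs
    unlabeled: list of x values
    A prediction counts as "confident" when the nearest labeled input
    lies within max_dist of the query point.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            # 1-nearest-neighbour prediction from the current labeled set
            nearest_x, nearest_y = min(labeled, key=lambda p: abs(p[0] - x))
            if abs(nearest_x - x) <= max_dist:
                confident.append((x, nearest_y))  # accept the pseudo-label
            else:
                remaining.append(x)
        if not confident:  # termination condition: nothing confident left
            break
        labeled.extend(confident)
        pool = remaining
    return labeled, pool


augmented, leftover = self_train(
    labeled=[(0.0, 1.0), (10.0, 5.0)],
    unlabeled=[0.4, 0.8, 1.2, 5.0],
)
print(augmented)
print(leftover)
```

In this toy run the label of the point at 0.0 is passed on step by step to 0.4, 0.8, and 1.2, each time relying on a pseudo-label from the previous round. This chaining is how one poor prediction can propagate, which is the accumulation of errors mentioned above.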

Due to noise in the sensors and in signal transmission, some unlabeled data have low confidence and can mislead the training process, so that semi-supervised learning may perform even worse than supervised learning. Therefore, it is necessary to study a semi-supervised learning scheme that guarantees "safeness" (Li and Zhou, 2015), meaning that using the additional unlabeled data does not yield worse performance than supervised learning with labeled data alone. However, previous works on GAN-based semi-supervised learning did not consider safeness, which limits the effectiveness of GANs in process industries.

Based on the above discussions, due to the limitation of labeled data and the consideration for safeness, a novel semi-supervised learning method is proposed for taking advantage of unlabeled data, as well as guaranteeing the safeness, i.e., the performance of semi-supervised learning is not worse than purely supervised learning. The main contributions of this paper are as follows:

  • A semi-supervised Generative adversarial network with Co-trained Generators (GCG) is proposed for quality prediction by utilizing both labeled and unlabeled process data.

  • The safeness of the proposed GCG is further discussed, which guarantees the performance of GCG is not worse than the GAN with supervised generators.

  • The effectiveness of the proposed GCG is verified by a benchmarked industrial case and the real absorption-stabilization system in FCCU from a refinery.

This paper is organized as follows. In Section 2, the semi-supervised process data are described, and related works on semi-supervised GAN are reviewed. The proposed semi-supervised Generative adversarial network with Co-trained Generators (GCG) is detailed in Section 3. In Section 4, the benchmarked debutanizer column data and the real absorption-stabilization system data are used to verify the effectiveness of the proposed method. The final section gives the concluding remarks. Notations in this paper are presented in Table 1.

Section snippets

Description of semi-supervised process data

In process industries, the quality variables are usually sampled at a low frequency due to the high labeling cost, while process variables such as temperature, pressure, and flow rate are sampled at a high frequency. So there are plentiful process data as inputs but limited quality data as labels. Traditional quality prediction relies on supervised learning methods, which use only labeled process data and ignore the unlabeled data, which results in overfitting and poor generalization performance of the quality

Semi-supervised generative adversarial network with co-trained generators

Based on the semi-supervised GAN for regression (Ji et al., 2020), generators are introduced in a co-trained form, then discriminators and generators are alternately optimized to find a Nash equilibrium. The proposed GCG framework is shown in Fig. 3, where $x_l$, $x_{unl}$, and $y$ denote the labeled inputs, the unlabeled inputs, and the labels, respectively. $C$ is the combination of $\{C_1, C_2, \ldots, C_c\}$, i.e., $C(x)=\sum_{i=1}^{c}\alpha_i C_i(x)$, where $\sum_{i=1}^{c}\alpha_i=1$ and $c$ is the number of generators. For clarity, let $c=2$ in this paper, so $C_1$ and $C_2$ are co-trained as the predictors. To guarantee the
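
The weighted combination of generators described above can be illustrated with a minimal Python sketch for the case of two predictors. The functions `combine` and `best_alpha`, the grid search over the weight, and the toy predictors are hypothetical and are not the optimization procedure of the paper. The sketch only shows the underlying idea that a weight chosen over a range that includes the single-predictor cases (weight 0 or 1) cannot do worse than either predictor on the data used for the choice.

```python
def combine(predictors, weights, x):
    """Return C(x) = sum_i alpha_i * C_i(x), with the weights summing to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(a * c(x) for a, c in zip(weights, predictors))


def mse(predict, data):
    """Mean squared error of a predictor on a list of (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


def best_alpha(c1, c2, data, steps=100):
    """Grid-search the weight of c1 (illustrative assumption, c = 2).

    The grid contains alpha = 0 and alpha = 1, so the returned error is
    never larger than the error of c1 alone or c2 alone on `data`.
    """
    candidates = [i / steps for i in range(steps + 1)]
    scored = [
        (mse(lambda x, a=a: combine([c1, c2], [a, 1.0 - a], x), data), a)
        for a in candidates
    ]
    error, alpha = min(scored)
    return alpha, error


# Toy predictors: one overestimates and one underestimates y = 2x.
c1 = lambda x: 2 * x + 1
c2 = lambda x: 2 * x - 1
data = [(float(x), 2.0 * x) for x in range(5)]

alpha, error = best_alpha(c1, c2, data)
print(alpha, error, mse(c1, data), mse(c2, data))
```

The guarantee shown here holds only on the data used to select the weight; how the paper extends it to the trained GCG model is not covered by this sketch.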

Case study

In this section, a benchmarked industrial case and the real absorption-stabilization system in FCCU are studied to verify the effectiveness of the proposed method. The case study is implemented in Python 3.6, and the experiments are carried out on a PC with an Intel Core i5-8250U CPU.

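The prediction performance is evaluated by Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and $R^2$, which are described as: $\mathrm{MAE}=\frac{1}{n_{test}}\sum_{i=1}^{n_{test}}|\tilde{y}_i-y_i|$, $\mathrm{RMSE}=\sqrt{\frac{1}{n_{test}}\sum_{i=1}^{n_{test}}(\tilde{y}_i-y_i)^2}$, $R^2=1-\ldots$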
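
The three metrics named above can be computed with a short Python sketch using only the standard library. MAE and RMSE follow their usual definitions over the test samples. For $R^2$ the sketch assumes the standard coefficient of determination (one minus the ratio of the residual sum of squares to the total sum of squares), because the formula is cut off in the text above. The function names and the sample values are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error over the test samples."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error over the test samples."""
    squared = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination (standard definition, assumed here).

    Undefined when all true values are equal, since the total sum of
    squares is then zero.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only, not results from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

Lower MAE and RMSE and an $R^2$ closer to 1 indicate better prediction performance.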

Conclusions

To utilize the unlabeled process data and improve the generalization performance of the quality prediction model, a semi-supervised learning method named GCG is proposed in this paper. The GCG method avoids being misled by unlabeled data with low confidence and guarantees safeness. The optimal parameters and combination coefficients of the co-trained generators guarantee that the performance of GCG is not worse than that of the GAN with supervised generators. The benchmarked debutanizer column data and

CRediT authorship contribution statement

Xu Zhang: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing - original draft, Writing - review & editing. Yuanyuan Zou: Conceptualization, Formal analysis, Investigation, Methodology, Writing - review & editing, Funding acquisition. Shaoyuan Li: Investigation, Methodology, Writing - review & editing, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The authors acknowledge the funding from the National Key Research and Development Program of China (2018YFB1701101 and 2018AAA0101700), and the National Natural Science Foundation of China (NSFC) (61773162 and 61833012).

References (34)

  • X. Wang et al. Data supplement for a soft sensor using a new generative model based on a variational autoencoder and Wasserstein GAN. J Process Control (2020)
  • X. Zhang et al. Enhancing incremental deep learning for FCCU end-point quality prediction. Inf Sci (Ny) (2020)
  • X. Zhang et al. A weighted auto regressive LSTM based approach for chemical processes modeling. Neurocomputing (2019)
  • A. Blum et al. Semi-supervised learning using randomized mincuts. Proceedings of the twenty-first international conference on machine learning (2004)
  • Z. Dai et al. Good semi-supervised learning that requires a bad GAN. Advances in neural information processing systems (2017)
  • D. Erhan et al. Why does unsupervised pre-training help deep learning? Proceedings of the thirteenth international conference on artificial intelligence and statistics (2010)
  • L. Fortuna et al. Soft sensors for monitoring and control of industrial processes (2007)