Stacked sparse autoencoders monitoring model based on fault-related variable selection

  • Methodologies and Application
Soft Computing

Abstract

In the modern chemical industry, the collected process data are high-dimensional and complex. Statistical process monitoring models usually incorporate all measured variables because these models generally perform dimension reduction. However, if the model includes variables that carry no useful information about faults, that is, variables irrelevant to the faults, monitoring performance may degrade. Moreover, typical process monitoring methods use only normal data for offline modeling, without any fault information, so their monitoring performance is unlikely to be optimal. Hence, a novel stacked sparse autoencoder (SSAE) monitoring model based on fault-related variable selection is proposed. Strongly fault-related variables are selected on the premise that the correlations between measured variables change when a fault occurs. Mutual information is used to quantify the correlations between measured variables in both normal and fault data. Euclidean distance is adopted as a similarity index that compares each variable's correlation vector in the normal state with its correlation vector in the fault state. Only variables strongly related to fault effects are retained; the remaining uninformative variables are excluded from model development. SSAEs are then used to construct a monitoring model on the selected variables. Because the proposed method uses historical fault data to select strongly fault-related variables, the model contains useful process information and the features extracted by the SSAE are highly interpretable. A case study on the Tennessee–Eastman process demonstrates the effectiveness of the method.
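
The variable selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the histogram-based mutual information estimator, the bin count, the choice to leave self-information out of each correlation vector, the top-`n_keep` selection rule, and all function names (`mutual_information`, `mi_matrix`, `select_fault_related`) are assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Histogram estimate of mutual information (in nats) between two 1-D samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def mi_matrix(data, bins=10):
    """Pairwise MI matrix; row i is the correlation vector of variable i.

    The diagonal (self-information) is left at zero, which is an assumption
    of this sketch rather than something stated in the abstract.
    """
    m = data.shape[1]
    mi = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            mi[i, j] = mi[j, i] = mutual_information(data[:, i], data[:, j], bins)
    return mi

def select_fault_related(normal, fault, n_keep, bins=10):
    """Rank variables by the Euclidean distance between their normal-state
    and fault-state correlation vectors and keep the n_keep largest."""
    dist = np.linalg.norm(mi_matrix(normal, bins) - mi_matrix(fault, bins), axis=1)
    keep = np.sort(np.argsort(dist)[::-1][:n_keep])
    return keep, dist

# Synthetic illustration: variables 0 and 1 are coupled in normal operation
# and decoupled under the (hypothetical) fault; variables 2 and 3 are noise.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(size=n)
normal = np.column_stack([x0, x0 + 0.1 * rng.normal(size=n),
                          rng.normal(size=n), rng.normal(size=n)])
fault = rng.normal(size=(n, 4))

keep, dist = select_fault_related(normal, fault, n_keep=2)
print("selected variables:", keep)
print("distances:", np.round(dist, 3))
```

Only the selected columns would then be passed on to the SSAE for model construction. A fixed `n_keep` is used here for simplicity; a distance threshold would be an equally plausible selection rule.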

Acknowledgements

The authors are grateful for the support of the National Natural Science Foundation of China (Grant No. 21878081) and the Fundamental Research Funds for the Central Universities of China (Grant No. 222201917006).

Author information

Corresponding author

Correspondence to Xuefeng Yan.

Ethics declarations

Conflict of interest

The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yin, J., Yan, X. Stacked sparse autoencoders monitoring model based on fault-related variable selection. Soft Comput 25, 3531–3543 (2021). https://doi.org/10.1007/s00500-020-05384-8

  • DOI: https://doi.org/10.1007/s00500-020-05384-8