VisPro: a prognostic SqueezeNet and non-stationary Gaussian process approach for remaining useful life prediction with uncertainty quantification

  • Original Article
  • Neural Computing and Applications

Abstract

Rotating machinery is essential to modern life, from power generation to transportation and a host of other industrial applications. Since such equipment generally operates under challenging working conditions which can lead to untimely failures, accurate remaining useful life (RUL) prediction is essential for maintenance planning and prevention of catastrophic failures. In this work, we address current challenges in data-driven RUL prediction for rotating machinery. The challenges revolve around the accuracy and uncertainty quantification of the prediction, and the non-stationarity of the system degradation and RUL estimation given sensor data. We devise a novel computational architecture and RUL prediction model with uncertainty quantification, termed VisPro, which integrates time–frequency analysis, deep learning image recognition, and nonstationary Gaussian process regression. We analyze and benchmark the results obtained with our model against those of other advanced data-driven RUL prediction models using the PHM12 bearing vibration dataset. The computational experiments show that (1) the VisPro predictions are highly accurate and provide significant improvements over existing prediction models (three times more accurate than the second-best model), and (2) the RUL uncertainty bounds are valid and informative. We identify and discuss the architectural and modeling choices made that explain this predictive performance of VisPro.

Notes

  1. It can be seen in Table 4 that, for bearing 2_3, the prediction accuracy of VisPro is worse than that of Ref. [20]. This is likely due to the erratic prediction performance of that reference, which has a high STD. For example, Ref. [20] has a small error for bearing 2_3 but a much larger error for bearing 2_5. VisPro achieves more consistent RUL prediction accuracy, with a small mean and STD of the prediction error across all testing bearings.

References

  1. Zheng Y (2019) Predicting remaining useful life based on Hilbert–Huang entropy with degradation model. J Electr Comput Eng

  2. Jardine AK, Lin D, Banjevic D (2006) A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mech Syst Signal Process 20(7):1483–1510

  3. Dragomir OE, Gouriveau R, Dragomir F, Minca E, Zerhouni N (2009) Review of prognostic problem in condition-based maintenance. In 2009 European Control Conference (ECC) (pp. 1587–1592). IEEE

  4. Lin Y, Li X, Hu Y (2018) Deep diagnostics and prognostics: an integrated hierarchical learning framework in PHM applications. Appl Soft Comput 72:555–564

  5. Adams ML (2000) Rotating machinery vibration: from analysis to troubleshooting (Vol. 131). CRC Press

  6. Wang D, Tsui KL, Miao Q (2017) Prognostics and health management: a review of vibration based bearing and gear health indicators. IEEE Access 6:665–676

  7. Si XS, Wang W, Hu CH, Zhou DH (2011) Remaining useful life estimation–a review on the statistical data driven approaches. Eur J Oper Res 213(1):1–14

  8. Xiongzi C, Jinsong Y, Diyin T, Yingxun W (2011) Remaining useful life prognostic estimation for aircraft subsystems or components: a review. In IEEE 2011 10th International Conference on Electronic Measurement and Instruments (Vol. 2, pp. 94–98). IEEE

  9. Wang H, Peng MJ, Miao Z, Liu YK, Ayodeji A, Hao C (2021) Remaining useful life prediction techniques for electric valves based on convolution auto encoder and long short term memory. ISA Trans 108:333–342

  10. Wu J, Hu K, Cheng Y, Zhu H, Shao X, Wang Y (2020) Data-driven remaining useful life prediction via multiple sensor signals and deep long short-term memory neural network. ISA Trans 97:241–250

  11. Yan M, Wang X, Wang B, Chang M, Muhammad I (2020) Bearing remaining useful life prediction using support vector machine and hybrid degradation tracking model. ISA Trans 98:471–482

  12. Ma M, Sun C, Mao Z, Chen X (2021) Ensemble deep learning with multi-objective optimization for prognosis of rotating machinery. ISA Trans 113:166–174

  13. Xiang S, Qin Y, Zhu C, Wang Y, Chen H (2020) LSTM networks based on attention ordered neurons for gear remaining life prediction. ISA Trans 106:343–354

  14. Sutrisno E, Oh H, Vasan ASS, Pecht M (2012) Estimation of remaining useful life of ball bearings using data driven methodologies. In 2012 IEEE Conference on Prognostics and Health Management (pp. 1–7). IEEE

  15. Hong S, Zhou Z, Zio E, Hong K (2014) Condition assessment for the performance degradation of bearing based on a combinatorial feature extraction method. Digital Signal Process 27:159–166

  16. Lei Y, Li N, Gontarz S, Lin J, Radkowski S, Dybala J (2016) A model-based method for remaining useful life prediction of machinery. IEEE Trans Reliab 65(3):1314–1326

  17. Guo L, Li N, Jia F, Lei Y, Lin J (2017) A recurrent neural network based health indicator for remaining useful life prediction of bearings. Neurocomputing 240:98–109

  18. Yoo Y, Baek JG (2018) A novel image feature for the remaining useful lifetime prediction of bearings based on continuous wavelet transform and convolutional neural network. Appl Sci 8(7):1102

  19. Wang B, Lei Y, Li N, Li N (2018) A hybrid prognostics approach for estimating remaining useful life of rolling element bearings. IEEE Trans Reliab 69(1):401–412

  20. Zhang G, Liang W, She B, Tian F (2021) Rotating machinery remaining useful life prediction scheme using deep-learning-based health indicator and a new RVM. Shock and Vibration

  21. Moghaddass R, Zuo MJ (2014) An integrated framework for online diagnostic and prognostic health monitoring using a multistate deterioration process. Reliab Eng Syst Saf 124:92–104

  22. Singleton RK, Strangas EG, Aviyente S (2014) Extended Kalman filtering for remaining-useful-life estimation of bearings. IEEE Trans Industr Electron 62(3):1781–1790

  23. Van der Aalst WM, Rubin V, Verbeek HMW, van Dongen BF, Kindler E, Günther CW (2010) Process mining: a two-step approach to balance between underfitting and overfitting. Softw Syst Model 9(1):87–111

  24. Koehrsen W (2018) Overfitting vs. underfitting: a complete example. Towards Data Science

  25. Gavrilov AD, Jordache A, Vasdani M, Deng J (2018) Preventing model overfitting and underfitting in convolutional neural networks. Int J Softw Sci Comput Intell (IJSSCI) 10(4):19–28

  26. Xu Z, Saleh JH (2021) Machine learning for reliability engineering and safety applications: review of current status and future opportunities. Reliability Eng Syst Safety, 107530

  27. Peng W, Ye ZS, Chen N (2019) Bayesian deep-learning-based health prognostics toward prognostics uncertainty. IEEE Trans Industr Electron 67(3):2283–2293

  28. Fink O, Zio E, Weidmann U (2014) Predicting component reliability and level of degradation with complex-valued neural networks. Reliab Eng Syst Saf 121:198–206

  29. Lederer A, Conejo AJO, Maier K, Xiao W, Hirche S (2020) Real-time regression with dividing local gaussian processes. arXiv preprint arXiv:2006.09446

  30. Nectoux P, Gouriveau R, Medjaher K, Ramasso E, Chebel-Morello B, Zerhouni N, Varnier C (2012) PRONOSTIA: an experimental platform for bearings accelerated degradation tests. In IEEE International Conference on Prognostics and Health Management, PHM'12. (pp. 1–8). IEEE Catalog Number: CPF12PHM-CDR

  31. Sejdić E, Djurović I, Jiang J (2009) Time–frequency feature representation using energy concentration: an overview of recent advances. Digital Signal Process 19(1):153–183

  32. Djebbari A, Reguig FB (2000) Short-time Fourier transform analysis of the phonocardiogram signal. In ICECS 2000. 7th IEEE International Conference on Electronics, Circuits and Systems (Cat. No. 00EX445) (Vol. 2, pp. 844–847). IEEE

  33. Iandola FN, Han S, Moskewicz MW, Ashraf K, Dally WJ, Keutzer K (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360

  34. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90

  35. Schulz E, Speekenbrink M, Krause A (2018) A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions. J Math Psychol 85:1–16

  36. Remes S, Heinonen M, Kaski S (2017) Non-stationary spectral kernels. In Advances in neural information processing systems (pp. 4642–4651)

  37. Mosallam A, Medjaher K, Zerhouni N (2016) Data-driven prognostic method based on Bayesian approaches for direct remaining useful life prediction. J Intell Manuf 27(5):1037–1048

  38. Paciorek CJ, Schervish MJ (2003) Nonstationary covariance functions for gaussian process regression. In NIPS (pp. 273–280)

  39. Lang T, Plagemann C, Burgard W (2007) Adaptive Non-Stationary Kernel Regression for Terrain Modeling. In Robotics: Science and Systems (Vol. 6)

  40. Garg S, Singh A, Ramos F (2012) Learning non-stationary space-time models for environmental monitoring. In Twenty-Sixth AAAI Conference on Artificial Intelligence

  41. Lei Y (2016) Intelligent fault diagnosis and remaining useful life prediction of rotating machinery. Butterworth-Heinemann, Oxford

Acknowledgements

This work was supported in part by a Space Technology Research Institute grant from NASA’s Space Technology Research Grants Program.

Author information

Corresponding author

Correspondence to Zhaoyi Xu.

Ethics declarations

Conflict of interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.

Appendices

Appendix A: Examining the effectiveness of NSGPR

In this appendix, we examine the effectiveness of the NSGPR in our RUL prediction model by comparing the RUL predictions with and without NSGPR. The RUL predictions without NSGPR are taken from the last-step prediction of the Pro-SQN network on the testing dataset. The RUL prediction without NSGPR for testing Bearing 1_2 is shown in Fig. 11.

Fig. 11 The RUL estimation of the DL network for testing Bearing 1_2

First, the RUL estimation curve oscillates more strongly than the RUL prediction with NSGPR shown in Fig. 9. Second, the RUL prediction is 1360 s, which has a larger error than the prediction with NSGPR. The RUL predictions without NSGPR for all testing bearings are shown in Fig. 12.

Fig. 12 The RUL estimation of the DL network for all bearings in the testing set

First, the RUL predictions in Fig. 12 have no lower and upper bounds; it is the NSGPR step that equips our model with uncertainty quantification. Second, we use the scoring function of Eq. 9 to measure the accuracy of the model predictions. The score of the RUL predictions without NSGPR is 0.59, whereas the score of the VisPro predictions is 0.84, which is 42% higher.
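As a rough check on the 42% figure, the following sketch computes the relative improvement between the two scores. For context it also includes a score of the kind commonly used with the PHM12 dataset. Eq. 9 is not reproduced in this appendix, so the form of `phm12_score` (the IEEE PHM 2012 challenge score on percent errors) is an assumption, not a restatement of the paper's equation, and all function names are illustrative.

```python
import math


def percent_error(actual_rul, predicted_rul):
    """Percent error; positive when the prediction is early (underestimate)."""
    return 100.0 * (actual_rul - predicted_rul) / actual_rul


def accuracy(er):
    """Per-bearing accuracy in (0, 1]; assumed PHM 2012 challenge form.

    Late predictions (er <= 0) are penalized more steeply than early ones.
    """
    if er <= 0:
        return math.exp(-math.log(0.5) * (er / 5.0))
    return math.exp(math.log(0.5) * (er / 20.0))


def phm12_score(errors):
    """Mean accuracy over all test bearings."""
    return sum(accuracy(e) for e in errors) / len(errors)


def relative_improvement(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


# Scores reported in this appendix: 0.84 with NSGPR, 0.59 without.
improvement = relative_improvement(0.84, 0.59)
print(f"relative improvement: {improvement:.1f}%")
```

Under this assumed form, a perfect prediction scores 1, a 20% early prediction scores 0.5, and a 5% late prediction also scores 0.5, which reflects the heavier penalty on late predictions.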

Appendix B: Structure and weight parameters of the Pro-SQN

Here, we detail the output size, memory requirement per image, and number of weights of the Pro-SQN, as shown in Table 5.

Table 5 The output size, hardware memory, and weight size of the Prognostic-SqueezeNet discriminator

We use 32-bit floating-point numbers for the variables in the data processing, so each variable takes \(\frac{32}{8}=4\) bytes. According to Table 5, the number of weights is 1,187 M and the model itself occupies 0.594 Mb of memory. The memory usage is a hard requirement on the hardware, and the weight size indicates the training requirement of the model.
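The bytes-per-variable arithmetic can be written out as a small helper. This is only a sketch of the 32-bit storage rule stated above; the value count in the example is hypothetical and is not taken from Table 5.

```python
BITS_PER_FLOAT32 = 32
BYTES_PER_FLOAT32 = BITS_PER_FLOAT32 // 8  # 32 / 8 = 4 bytes per variable


def float32_storage_bytes(num_values):
    """Bytes needed to store `num_values` 32-bit floating-point numbers."""
    return num_values * BYTES_PER_FLOAT32


def bytes_to_megabytes(num_bytes):
    """Convert bytes to megabytes (1 MB = 10**6 bytes)."""
    return num_bytes / 1e6


# Hypothetical example: 250,000 stored values (not a figure from Table 5).
example_values = 250_000
example_mb = bytes_to_megabytes(float32_storage_bytes(example_values))
print(f"{example_values} float32 values take {example_mb} MB")
```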

Next, we introduce the fire module shown in Fig. 13, since it is used extensively in SqueezeNet. Here, \({s}_{1x1}\), \({e}_{1x1}\), and \({e}_{3x3}\) stand for the number of \(1\times 1\) filters in the squeeze layer, the number of \(1\times 1\) filters in the expand layer, and the number of \(3\times 3\) filters in the expand layer, respectively.

Fig. 13 Organization of convolution filters in the fire module. In this example, \({s}_{1x1}=3\), \({e}_{1x1}=4\), and \({e}_{3x3}=4\). We illustrate the convolution filters but not the activations [33]

In our SqueezeNet, we set \({s}_{1x1}\), \({e}_{1x1}\), and \({e}_{3x3}\) to 1. We switch the activation function of the fire module from the ReLU of the original model to LeakyReLU, which keeps a nonzero gradient for negative inputs and thus helps prevent the vanishing-gradient problem.
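To make the role of the three filter counts concrete, the sketch below counts the trainable parameters of one Fire module of SqueezeNet [33] and defines LeakyReLU element-wise. It uses the example values from the Fig. 13 caption (3, 4, 4). The input channel count of 8, the presence of bias terms, and the negative slope of 0.01 are assumptions made for illustration; none of them is stated in this appendix.

```python
def fire_module_parameters(in_channels, s1x1, e1x1, e3x3, bias=True):
    """Count parameters of a Fire module: a 1x1 squeeze layer followed by
    parallel 1x1 and 3x3 expand layers whose outputs are concatenated."""
    squeeze = in_channels * s1x1 * 1 * 1   # 1x1 squeeze filters
    expand_1x1 = s1x1 * e1x1 * 1 * 1       # 1x1 expand filters
    expand_3x3 = s1x1 * e3x3 * 3 * 3       # 3x3 expand filters
    biases = (s1x1 + e1x1 + e3x3) if bias else 0
    return {
        "weights": squeeze + expand_1x1 + expand_3x3,
        "biases": biases,
        "total": squeeze + expand_1x1 + expand_3x3 + biases,
        "out_channels": e1x1 + e3x3,       # channel-wise concatenation
    }


def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU: identity for x >= 0, small linear slope for x < 0.

    The slope value is an assumed default, not the paper's setting.
    """
    return x if x >= 0 else negative_slope * x


# Filter counts from the Fig. 13 caption; in_channels=8 is hypothetical.
counts = fire_module_parameters(in_channels=8, s1x1=3, e1x1=4, e3x3=4)
print(counts)
print(leaky_relu(-2.0), leaky_relu(3.0))
```

Unlike ReLU, whose gradient is zero for every negative input, the LeakyReLU slope for negative inputs equals `negative_slope`, so units with negative pre-activations still receive a gradient signal.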

Appendix C: Examining the effectiveness of the local length scale kernel

In this appendix, we examine the effectiveness of the local length scale kernel in our RUL prediction model. To demonstrate the effect of the local length scale, we compare our results with predictions obtained using a dot product and squared exponential kernel, which has a single global length scale. The RUL predictions without the local length scale kernel for all testing bearings are shown in Fig. 14 and Table 6.

Fig. 14 The RUL estimation of all bearings in the testing set without the local length scale kernel

Table 6 Comparison of RUL prediction of VisPro with and without local length scale kernel

First, Table 6 shows that the RUL prediction with the local length scale kernel has a significant advantage over the prediction with the squared exponential kernel: the mean of the prediction error, the STD of the error, and the score function improve by 66%, 55%, and 14%, respectively. Second, the RUL prediction error of Bearing 2_4 improves markedly, from 27.34 to 7.91, when the local length scale kernel is used. To investigate this improvement in detail, the RUL predictions for testing Bearing 2_4 with and without the local length scale kernel are compared in Fig. 15.

Fig. 15 The RUL estimation of Bearing 2_4 (a) with local length scale kernel; (b) without local length scale kernel

Comparing the RUL predictions for Bearing 2_4 with the local length scale kernel and without it (that is, with the squared exponential kernel), we note two points. First, the prediction without the local length scale kernel oscillates. Because the local length scale kernel accounts for local smoothness, its prediction oscillates less and is more robust, which improves the prediction accuracy for Bearing 2_4 in the testing dataset. Second, in Fig. 15, the uncertainty bound after the truncation time (6110 s) is wider for the prediction without the local length scale kernel than for the prediction with it. Consequently, using the local length scale kernel in the NSGPR step provides a more precise RUL prediction with a tighter uncertainty bound.
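The difference between a global and a local length scale can be illustrated with a minimal NumPy sketch. It uses the one-dimensional Gibbs form of a nonstationary squared exponential kernel, in the spirit of Refs. [38, 39]. This form is an assumption chosen for illustration; the exact kernel used in VisPro is defined in the main text, not in this appendix. The `shrinking` length scale function is hypothetical.

```python
import numpy as np


def squared_exponential(x1, x2, length_scale):
    """Stationary squared exponential kernel with one global length scale."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-d**2 / (2.0 * length_scale**2))


def local_length_scale_kernel(x1, x2, length_scale_fn):
    """Nonstationary (Gibbs-type) kernel: the length scale depends on the input.

    k(x, x') = sqrt(2 l(x) l(x') / (l(x)^2 + l(x')^2))
               * exp(-(x - x')^2 / (l(x)^2 + l(x')^2))
    """
    l1 = length_scale_fn(x1)[:, None]
    l2 = length_scale_fn(x2)[None, :]
    d = x1[:, None] - x2[None, :]
    denom = l1**2 + l2**2
    return np.sqrt(2.0 * l1 * l2 / denom) * np.exp(-d**2 / denom)


x = np.linspace(0.0, 10.0, 6)


def constant(t):
    """Same length scale everywhere (recovers the stationary kernel)."""
    return np.full_like(t, 2.0)


def shrinking(t):
    """Hypothetical length scale that shortens as time increases."""
    return 2.0 / (1.0 + 0.3 * t)


k_se = squared_exponential(x, x, 2.0)
k_const = local_length_scale_kernel(x, x, constant)
k_local = local_length_scale_kernel(x, x, shrinking)

print(np.allclose(k_const, k_se))   # constant l(x) reduces to the SE kernel
print(np.round(k_local[0, 1], 4), np.round(k_local[4, 5], 4))
```

With a constant length scale the two kernels coincide, so the stationary kernel is a special case. With an input-dependent length scale, neighboring points are correlated more strongly where the length scale is long and less strongly where it is short, which lets the regression be smooth in some regions and responsive in others.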

Appendix D: Uncertainty quantification of 80%, 90%, and 95% confidence interval

In this appendix, we examine the uncertainty quantification results with 80%, 90%, and 95% confidence intervals. The prediction and uncertainty quantification results are shown in Fig. 16.

Fig. 16 The RUL estimation of all bearings in the testing set with 80%, 90%, and 95% confidence intervals

Figure 16 shows, first, that the mean RUL estimate is identical across confidence levels. Second, the uncertainty bound shrinks as the confidence level decreases; for example, the 80% confidence interval is tighter than the 90% and 95% intervals. However, the 80% confidence interval has invalid cases, in which the ground truth lies outside the uncertainty bound. We calculate the average confidence interval size and the number of invalid cases among all testing bearings for the 80%, 90%, and 95% confidence intervals, and summarize them in Table 7.

Table 7 The uncertainty quantification performance with 80%, 90%, and 95% confidence interval

Table 7 shows that the 80% confidence interval has the tightest uncertainty bound, with an average size of 322 s. However, it has 3 invalid cases in which the real RUL falls outside the estimated uncertainty bound, so its uncertainty quantification is not informative for those cases. The 90% confidence interval has a tighter uncertainty bound than the 95% interval and no invalid cases. Considering both the average size and the number of invalid cases, the 90% confidence interval is the best choice for VisPro. We use the 90% confidence interval in Sect. 4 to discuss the RUL prediction and uncertainty quantification results.
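The trade-off between interval size and invalid cases can be sketched as follows, assuming the predictive distribution for each bearing is Gaussian so that a two-sided interval is the mean plus or minus a multiple of the standard deviation. The means, standard deviations, and true RULs below are made-up illustrative values, not the paper's results, and the function names are hypothetical.

```python
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal quantile for a given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def interval(mean, std, confidence):
    """Symmetric confidence interval around a Gaussian predictive mean."""
    half_width = z_value(confidence) * std
    return mean - half_width, mean + half_width


def count_invalid(means, stds, truths, confidence):
    """Number of cases whose true RUL falls outside the interval."""
    invalid = 0
    for mean, std, truth in zip(means, stds, truths):
        lower, upper = interval(mean, std, confidence)
        if not (lower <= truth <= upper):
            invalid += 1
    return invalid


def average_size(stds, confidence):
    """Average interval width across all cases."""
    return sum(2.0 * z_value(confidence) * s for s in stds) / len(stds)


# Made-up predictions (seconds) for three hypothetical bearings.
means = [1000.0, 2000.0, 500.0]
stds = [100.0, 150.0, 50.0]
truths = [1150.0, 1950.0, 590.0]

levels = (0.80, 0.90, 0.95)
invalid = {c: count_invalid(means, stds, truths, c) for c in levels}
sizes = {c: average_size(stds, c) for c in levels}

for c in levels:
    print(f"{int(c * 100)}%: average size {sizes[c]:.0f} s, invalid {invalid[c]}")
```

The predictive mean does not depend on the confidence level; only the half-width changes. Lowering the level tightens every interval but raises the chance that the true RUL falls outside it, which is the pattern summarized in Table 7.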

Cite this article

Xu, Z., Guo, Y. & Saleh, J.H. VisPro: a prognostic SqueezeNet and non-stationary Gaussian process approach for remaining useful life prediction with uncertainty quantification. Neural Comput & Applic 34, 14683–14698 (2022). https://doi.org/10.1007/s00521-022-07316-z
