Gated recurrent unit based recurrent neural network for remaining useful life prediction of nonlinear deterioration process

doi:10.1016/j.ress.2019.01.006

Reliability Engineering & System Safety

Volume 185, May 2019, Pages 372-382

https://doi.org/10.1016/j.ress.2019.01.006

Highlights

• A general solution is presented for RUL prediction of nonlinear deterioration process.
• KPCA is selected for dimensionality reduction and nonlinear feature extraction.
• GRU is presented to replace LSTM, which behaves better both in prediction accuracy and training time.

Abstract

Remaining useful life (RUL) prediction is a key process for prognostics and health management (PHM). However, conventional model-based and data-driven methods for RUL prediction perform poorly on a very complex system with multiple components, multiple states and therefore an extremely large number of parameters. To solve this problem, a general two-step solution is proposed in this paper. In the first step, kernel principal component analysis (KPCA) is applied for nonlinear feature extraction. In the second step, a recurrent neural network called the gated recurrent unit (GRU) is used to predict RUL. The GRU network is capable of describing a very complex system because of its specially designed structure. The effectiveness of the proposed solution for RUL prediction of a nonlinear degradation process is demonstrated by a case study on the commercial modular aero-propulsion system simulation data (C-MAPSS-Data) from NASA. Results also show that the proposed method requires less training time and has better prediction accuracy than other data-driven methods.

Introduction

PHM is a basic requirement for condition-based maintenance in many application domains where the safety, reliability, and availability of systems are considered mission critical [1]. In particular, RUL prediction is one of the main tasks in PHM. Improving the accuracy of RUL prediction not only enhances safety and reliability but also prolongs service time, which in turn decreases the average cost. Therefore, many researchers have studied RUL prediction methods in recent years.

Generally, there are two approaches for RUL prediction: model-based methods and data-driven methods. Model-based methods can be used for a component or a simple system to deduce a more accurate RUL by building a physical failure model, while data-driven methods can estimate RUL for a complex system by constructing a simpler data-based model. To predict the RUL of complex systems, data-driven methods have therefore received more attention recently [2]. Ahmad et al. [3] predicted the RUL of rolling element bearings using dynamic regression models. Hu et al. [4] proposed a prediction method for the RUL of wind turbine bearings based on the Wiener process. Huang et al. [5] presented an adaptive skew-Wiener process model for RUL prediction. Zhang et al. [6] presented a review of Wiener-process-based methods for RUL prediction and degradation data analysis. Le et al. [7] estimated the RUL with a noisy gamma deterioration process. Ling et al. [8] proposed Bayesian and likelihood inferences on remaining useful life in two-phase degradation models under the gamma process. Baptista et al. [9] proposed a method for RUL prediction combining data-driven techniques and the Kalman filter. Son et al. [10] predicted the RUL based on noisy condition monitoring signals using a constrained Kalman filter. Duong et al. [11] presented a method with a heuristic Kalman optimized particle filter for RUL prediction. Liu et al. [12] proposed a novel method using an adaptive hidden semi-Markov model (HSMM) for multi-sensor monitoring equipment health prognosis. Chen et al. [13] presented a hidden Markov model (HMM) with auto-correlated observations for RUL prediction and optimal maintenance policy. Li et al. [14] proposed an optimal Bayesian control policy for gear shaft fault detection using HSMM. Chen et al. [15] proposed a general solution to nonlinear multistate deterioration modeling with a non-homogeneous hidden semi-Markov model (NHSMM) for deterioration level assessment and RUL prediction. Moghaddass et al. [16] presented an integrated framework for online diagnostic and prognostic health monitoring using a multistate deterioration process. Although these methods are widely used, they have their own limitations. The deterioration process of the equipment is usually nonlinear and multistate because of the complicated structure and variable working conditions. The deterioration curve may not follow a typical shape such as an exponential or linear function. Identifying the pattern of nonlinear deterioration is therefore an important challenge in RUL prediction. The Wiener process, the gamma process and the Kalman filter do not perform well when the deterioration process is nonlinear. HMMs perform well on nonlinear deterioration processes, but their training time increases dramatically when multiple system states are involved.

Another important branch of data-driven methods is artificial intelligence (AI). In recent years, AI, particularly deep learning, has achieved outstanding performance in image processing, natural language processing (NLP) and other fields. Researchers have also explored applications of AI methods to RUL prediction. Among deep learning methods, the recurrent neural network (RNN) has attracted special attention because its network structure contains a recurrent hidden layer, which is well suited to time series processing and consequently to RUL prediction. Guo et al. [17] proposed a recurrent neural network based health indicator for RUL prediction of bearings. Liu et al. [18] proposed a method for fault diagnosis of rolling bearings with recurrent neural network-based auto-encoders. However, an RNN cannot link two similar data points if they are separated too far apart in the sequence.

To overcome this weakness of RNN, long short-term memory (LSTM) was proposed, which introduces an input gate, an output gate and a cell state into RNN [19]. LSTM can store long-term memory in the cell state, and it has been verified as one of the most mature and efficient methods on many tasks. Hinchi and Tkiouat [20] proposed a method based on LSTM for RUL prediction of rolling bearings. Yuan et al. [21] proposed a method for RUL prediction of aero engines using an LSTM neural network. Malhotra et al. [22] proposed a method for multi-sensor prognostics using an unsupervised health index based on an LSTM encoder-decoder. However, each memory block in LSTM needs an input gate and an output gate. These gates make training more difficult and increase the training time of the network.

To reduce training time and improve network performance, a simplified but improved LSTM-architecture network, GRU, was proposed [23]. The GRU uses a new type of hidden unit that merges the forget gate and the input gate into a single update gate and also merges the cell state and the hidden state into one state. In brief, the number of gates is decreased from 4 in LSTM to 2 in GRU, namely the update gate and the reset gate.
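To make the gate structure described above concrete, the following is a minimal NumPy sketch of a single GRU step with an update gate and a reset gate acting on one merged hidden state. It is an illustration only, not the authors' implementation: the parameter names (`W_z`, `U_z`, `b_z`, and so on), the random initialisation, and the sign convention for the update gate (one of two conventions in common use) are all assumptions.

```python
import numpy as np

def sigmoid(a):
    """Logistic function used by both gates."""
    return 1.0 / (1.0 + np.exp(-a))

def init_gru_params(input_size, hidden_size, seed=0):
    """Hypothetical helper: small random weights for the update gate (z),
    the reset gate (r) and the candidate state (h)."""
    rng = np.random.default_rng(seed)
    params = {}
    for name in ("z", "r", "h"):
        params["W_" + name] = rng.normal(0.0, 0.1, size=(hidden_size, input_size))
        params["U_" + name] = rng.normal(0.0, 0.1, size=(hidden_size, hidden_size))
        params["b_" + name] = np.zeros(hidden_size)
    return params

def gru_step(x_t, h_prev, p):
    """One GRU time step; there is no separate cell state, only h."""
    z = sigmoid(p["W_z"] @ x_t + p["U_z"] @ h_prev + p["b_z"])  # update gate
    r = sigmoid(p["W_r"] @ x_t + p["U_r"] @ h_prev + p["b_r"])  # reset gate
    # The reset gate scales how much of the previous state feeds the candidate.
    h_cand = np.tanh(p["W_h"] @ x_t + p["U_h"] @ (r * h_prev) + p["b_h"])
    # The update gate interpolates between the old state and the candidate.
    return (1.0 - z) * h_prev + z * h_cand

def gru_last_hidden(sequence, p):
    """Run a (T, input_size) sequence through the cell and return the final state."""
    h = np.zeros(p["b_z"].shape[0])
    for x_t in sequence:
        h = gru_step(x_t, h, p)
    return h

if __name__ == "__main__":
    params = init_gru_params(input_size=3, hidden_size=4)
    window = np.ones((5, 3))  # toy input: 5 time steps, 3 features
    print(gru_last_hidden(window, params))
```

In this sketch the cell holds three weight groups (two gates plus the candidate state), so its parameter count is 3 × (hidden × input + hidden × hidden + hidden). For RUL regression, the final hidden state would typically be passed to a small output layer, which is omitted here.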

A general two-step solution for RUL prediction of nonlinear deterioration process is proposed to deal with the nonlinearity in deterioration modeling. In the solution, (1) KPCA is applied as the first step for nonlinear feature extraction. By reducing the dimension, over-fitting caused by too many model parameters can be effectively avoided. (2) GRU, a simplified variant of LSTM with fewer parameters, is used to predict RUL. In practice, (3) a sequence-to-one method is applied, which increases the number of samples while avoiding the difficulty of variable-length sequence input. (4) A sliding average method is applied to smooth the results, which effectively increases the prediction accuracy.
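The sequence-to-one sampling and sliding average steps named above can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's procedure: the function names are hypothetical, each window is labelled with the RUL at its last time step, and the smoothing is a trailing moving average. The window length and smoothing width used by the authors are not given in this excerpt.

```python
import numpy as np

def make_sequence_to_one_samples(features, rul, window):
    """Cut one run-to-failure trajectory into fixed-length windows.

    features: array of shape (T, d), one row per cycle.
    rul:      array of shape (T,), remaining useful life at each cycle.
    Returns X of shape (T - window + 1, window, d) and y of shape
    (T - window + 1,). Assumption: each window is labelled with the RUL
    at its last time step.
    """
    features = np.asarray(features, dtype=float)
    rul = np.asarray(rul, dtype=float)
    n_samples = features.shape[0] - window + 1
    if n_samples <= 0:
        raise ValueError("trajectory is shorter than the window length")
    X = np.stack([features[i:i + window] for i in range(n_samples)])
    y = rul[window - 1:]
    return X, y

def sliding_average(values, width):
    """Trailing moving average: each output is the mean of the current
    value and up to width - 1 preceding values."""
    values = np.asarray(values, dtype=float)
    smoothed = np.empty_like(values)
    for i in range(len(values)):
        start = max(0, i - width + 1)
        smoothed[i] = values[start:i + 1].mean()
    return smoothed

if __name__ == "__main__":
    # Toy trajectory: 10 cycles, 2 features, RUL counting down from 9 to 0.
    feats = np.arange(20.0).reshape(10, 2)
    rul = np.arange(9, -1, -1)
    X, y = make_sequence_to_one_samples(feats, rul, window=4)
    print(X.shape, y.shape)
    print(sliding_average([1, 3, 5, 7], width=2))
```

Because every trajectory of length T yields T - window + 1 fixed-length samples, the network never sees variable-length input, and a single engine contributes many training samples. The smoothing is applied to the sequence of per-cycle predictions rather than to the inputs.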

The rest of the paper is organized as follows. Section II describes different RNN structures. In Section III, a general two-step solution for RUL prediction is proposed. In Section IV, the C-MAPSS-Data is used to verify the efficiency and accuracy of the proposed method. Finally, conclusions are drawn in Section V.

Section snippets

Recurrent neural network

In this section, we briefly introduce RNN, LSTM and GRU. A standard neural network usually contains three layers: an input layer, a hidden layer and an output layer. The input is denoted by the vector x, the hidden state by the vector h, and the output by the vector y. Matrix U connects the input layer to the hidden layer, and matrix V connects the hidden layer to the output layer. Any two inputs are treated as totally independent, because the nodes within a layer are not connected to each other. When it comes

Proposed model

In this section, a general solution is proposed for RUL prediction of nonlinear deterioration process.
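As a rough illustration of the first step of the solution (KPCA for nonlinear feature extraction), the sketch below implements kernel PCA on training data with plain NumPy. Everything specific here is an assumption made for illustration: the RBF kernel, the `gamma` value, the number of retained components, and the function names are not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Pairwise RBF kernel exp(-gamma * ||a - b||^2) between rows of A and B."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def kpca_fit_transform(X, n_components, gamma):
    """Project the training samples onto their leading kernel principal components.

    X has shape (n_samples, n_features); the result has shape
    (n_samples, n_components).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    # Center the kernel matrix, which centers the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(K_c)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # For training points the projection onto component k is sqrt(lambda_k) * v_k.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sensors = rng.normal(size=(20, 5))  # toy data: 20 cycles, 5 sensor channels
    reduced = kpca_fit_transform(sensors, n_components=2, gamma=0.1)
    print(reduced.shape)
```

The reduced features, rather than the raw sensor channels, would then be windowed and passed to the recurrent network. Projecting unseen test samples additionally requires centering their kernel values against the training kernel, which this sketch omits.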

Data description

C-MAPSS, the commercial modular aero-propulsion system simulation, is a flexible turbofan engine simulation environment with easy access to health, control and engine parameters through a graphical user interface, established by US Army Research Laboratory, Glenn Research Center [25]. The diagram of the engine simulated in C-MAPSS is shown in Fig. 7. C-MAPSS can be used for the development and validation of control and diagnostic algorithms, and it runs faster than real time. The

Conclusion

In this paper, a general two-step solution for RUL prediction of nonlinear deterioration process is proposed. In the solution, KPCA is applied as the first step for nonlinear feature extraction. The second step uses GRU, a simplified variant of LSTM with fewer parameters, to predict RUL. C-MAPSS-Data, a dataset of aero-engines with nonlinear deterioration process, was used to test the proposed method. Results show that GRU performs better than LSTM both in training time and prediction

Acknowledgments

The authors would like to sincerely thank all the anonymous reviewers for the valuable comments that greatly helped to improve the manuscript.

This work was supported financially in part by the National Natural Science Foundation of China under Grant 51875436 and Grant 61633001, and in part by the China Postdoctoral Science Foundation under Grant 2018M631145.

References (25)

M. Dong et al.
Equipment PHM using non-stationary segmental hidden semi-Markov model
Rob Comput Integr Manuf
(2011)
X.-S. Si
Remaining useful life estimation – A review on the statistical data driven approaches
Eur J Oper Res
(2011)
Z. Huang
Remaining useful life prediction for an adaptive skew-Wiener process model
Mech Syst Sig Process
(2017)
K. Le Son et al.
Remaining useful lifetime estimation and noisy gamma deterioration process
Reliab Eng Syst Safety
(2016)
J. Son
Remaining useful life prediction based on noisy condition monitoring signals using constrained Kalman filter
Reliab Eng Syst Safety
(2016)
P.L.T. Duong et al.
Heuristic Kalman optimized particle filter for remaining useful life prediction of lithium-ion battery
Microelectron Reliab
(2018)
Q. Liu
A novel method using adaptive hidden semi-Markov model for multi-sensor monitoring equipment health prognosis
Mech Syst Sig Process
(2015)
X. Li
Optimal Bayesian control policy for gear shaft fault detection using hidden semi-Markov model
Comput Ind Eng
(2018)
R. Moghaddass et al.
An integrated framework for online diagnostic and prognostic health monitoring using a multistate deterioration process
Reliab Eng Syst Safety
(2014)
L. Guo
A recurrent neural network based health indicator for remaining useful life prediction of bearings
Neurocomputing
(2017)

A.Z. Hinchi et al.
Rolling element bearing remaining useful life estimation based on a convolutional long-short-term memory network
Procedia Comput Sci
(2018)
A. Graves
2005 special issue: framewise phoneme classification with bidirectional LSTM and other neural network architectures
(2005)
