
Neurocomputing

Volume 275, 31 January 2018, Pages 167-179

Remaining useful life estimation of engineered systems using vanilla LSTM neural networks

https://doi.org/10.1016/j.neucom.2017.05.063

Highlights

  • Dynamic differential features – inter-frame dynamic changes contain a great amount of model degradation information; therefore, a dynamic difference technique is used to extract new features from the original datasets.

  • Higher accuracy – the vanilla LSTM achieves higher accuracy than the standard RNN and the GRU. As the research objects become more complex, the prediction accuracy does not decrease noticeably.

  • Advanced regularization mechanism and optimization algorithm – dropout is used to improve the generalization ability of the vanilla LSTM, while the Adam algorithm is used to reduce the effect of the learning rate on the final optimization results.

Abstract

Long Short-Term Memory (LSTM) networks are a significant branch of Recurrent Neural Networks (RNN), capable of learning long-term dependencies. In recent years, the vanilla LSTM (a refinement of the original LSTM) has become the state-of-the-art model for a variety of machine learning problems, especially in Natural Language Processing (NLP). In industry, however, this powerful Deep Neural Network (DNN) has not yet attracted wide attention. In research on Prognostics and Health Management (PHM) technology for complex engineered systems, Remaining Useful Life (RUL) estimation is one of the most challenging problems; accurate estimates allow appropriate maintenance actions to be scheduled proactively to avoid catastrophic failures and minimize economic losses. Accordingly, this paper proposes utilizing vanilla LSTM neural networks, which make the most of the long short-term memory ability, to obtain good RUL prediction accuracy in cases of complicated operations, working conditions, model degradations and strong noise. In addition, to improve the characterization of model degradation processes, a dynamic difference technique is proposed to extract inter-frame information. The whole proposition is illustrated and discussed through tests on the health monitoring of aircraft turbofan engines under four different problem settings. The performance of the vanilla LSTM is benchmarked against the standard RNN and the Gated Recurrent Unit (GRU) LSTM. The results show the significant performance improvement achieved by the vanilla LSTM.

Introduction

In industry, the remaining useful life estimation of a system or component usually depends on operating conditions and sensor readings. Obviously, the more historical data are available, the more accurate the predictions will be. At the data competition of the 1st International Conference on Prognostics and Health Management (PHM08), Peel used Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) networks [1] to estimate the RUL of aero-engines, while Heimes used a classical RNN. Heimes's algorithm performed better than Peel's owing to the RNN's hidden units, which implicitly contain information about the history of all past elements in the sequence [2]. By virtue of weight sharing and its feedback structure, a recurrent neural network is able to use all of the historical conditions and sensing data for prediction with low model complexity. Nowadays, recurrent neural networks have become one of the important subfields of deep learning [3]. They have been widely used to generate sequences in domains as diverse as music [4], speech [5], text [6] and motion capture data [7].

Unfortunately, unfolding the recurrent connections of an RNN shows that it is equivalent to a very deep feedforward network. This gives rise to the problem of "long-term dependencies": gradients propagated over many time steps tend to vanish or explode, making it hard for the network to learn to store information for very long [8]. Over the past two decades, researchers made every effort to solve this issue and proposed several variants; the two most famous are Echo State Networks (ESN) [9] and LSTM [10]. On one side, since learning the recurrent and input weights is difficult, Jaeger and Haas [9] proposed fixing those weights such that the recurrent hidden units capture the history of past inputs well, and learning only the output weights. This is the core of the echo state network. Echo state networks have been shown to be an effective RNN variant and have achieved some success in RUL prediction problems, such as the satellite lithium battery RUL estimation developed by Hong [11]. On the other side, the central idea behind the LSTM architecture is a memory cell which can maintain its state over time, together with non-linear gating units which regulate the information flow into and out of the cell. However, the original LSTM (which had no forget gate), proposed by Hochreiter and Schmidhuber in 1997 [10], did not perform well. The most commonly used LSTM architecture nowadays was originally introduced by Graves and Schmidhuber [12] and is commonly referred to as the vanilla LSTM. The vanilla LSTM has a forget gate that allows continual tasks to be learned, and it is trained with full gradients instead of fixing part of the weights as the ESN does. A number of variants, such as the GRU [13], were later derived from the vanilla LSTM; however, Greff et al. carried out a thorough comparison of the popular variants and found that "vanilla LSTM performs reasonably well on various datasets and using any of eight possible modifications does not significantly improve the LSTM performance" [14]. These modifications included the GRU.
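For reference, the forward pass of a vanilla LSTM block can be written as follows. This is a minimal formulation without peephole connections (some presentations, including the variants compared by Greff et al. [14], add peephole weights); x_t denotes the current input, h_{t-1} the previous block output, c_t the cell state, \sigma the logistic sigmoid and \odot element-wise multiplication:

\begin{align*}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(block output)}
\end{align*}

The forget gate f_t is what distinguishes the vanilla LSTM from the original 1997 formulation: it lets the cell state be reset, which is what allows continual tasks to be learned.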

Above all, the aim of this paper is to utilize vanilla LSTM, which usually deals with supervised learning on language modeling, and related state-of-the-art technologies of feature extraction to improve accuracy in RUL prediction problems for complicated industrial objects. The main contributions of this paper are as follows:

  • (1)

    Add dynamic differential features – raw features in the RUL estimation problem are often stationary, but the inter-frame dynamic changes (observed by the sensors under different operating conditions) contain a great amount of model degradation information. Therefore, a dynamic difference technique is used to extract new features from the original health monitoring datasets (a minimal sketch of this preprocessing step is given after this list).

  • (2)

    Higher prediction accuracy – the vanilla LSTM achieves higher prediction accuracy than the standard RNN and the GRU under the same number of hidden neurons in a single layer. As the research objects become increasingly complex, the prediction accuracy obtained by the model does not decrease noticeably.

  • (3)

    Advanced regularization mechanism and optimization algorithm – the dropout mechanism is used to improve the generalization ability of the vanilla LSTM, while an advanced optimization algorithm (Adam) is used to reduce the effect of the learning rate on the final optimization results.
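To make contribution (1) concrete, a minimal sketch of the dynamic difference preprocessing is given below: first-order, cycle-to-cycle differences of every sensor channel are appended to the raw readings as additional features. The array layout and the zero-padding of the first cycle are illustrative assumptions rather than the paper's exact preprocessing.

import numpy as np

def add_dynamic_difference_features(unit_readings: np.ndarray) -> np.ndarray:
    # unit_readings: (num_cycles, num_sensors) readings of one engine unit.
    # Returns (num_cycles, 2 * num_sensors): raw values plus inter-frame differences.
    diffs = np.diff(unit_readings, axis=0)                              # (num_cycles - 1, num_sensors)
    diffs = np.vstack([np.zeros((1, unit_readings.shape[1])), diffs])   # no difference exists for the first cycle
    return np.hstack([unit_readings, diffs])

# Example with 5 cycles of 3 hypothetical sensor channels.
readings = np.random.rand(5, 3)
features = add_dynamic_difference_features(readings)
print(features.shape)  # (5, 6)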

This paper is organized as follows. The application background of neural networks in RUL estimation of engineered systems is given in Section 2, which illustrates the advantages and drawbacks of classical neural networks when dealing with RUL estimation problems. On this basis, Section 3 proposes using the vanilla LSTM to make full and effective use of historical data to assess the RUL; this section also briefly introduces the main schemes of the vanilla LSTM. The performance of the vanilla LSTM is benchmarked on aircraft turbofan engine datasets from NASA in Section 4. Four different issues are considered: a single fault and single operating mode problem, a single fault and multiple operating modes problem, a hybrid fault and single operating mode problem, and a hybrid fault and multiple operating modes problem. Through comparisons with the standard RNN and the GRU LSTM, the excellent performance of the vanilla LSTM in the RUL estimation field is demonstrated. Finally, Section 5 concludes this work and proposes some future directions.
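As a concrete illustration of this benchmarking setup, the sketch below builds vanilla LSTM, standard RNN and GRU regressors with the same number of hidden neurons in a single layer, applies dropout for regularization, and trains them with the Adam optimizer under an MSE loss, mirroring the comparison carried out in Section 4. The hidden size, dropout rate, learning rate and the use of the last hidden state as the regression input are illustrative assumptions, not the paper's tuned configuration.

import torch
import torch.nn as nn

class RULRegressor(nn.Module):
    def __init__(self, cell: str, num_features: int, hidden_size: int = 32):
        super().__init__()
        # Same recurrent interface for the three benchmarked cell types.
        rnn_cls = {"lstm": nn.LSTM, "gru": nn.GRU, "rnn": nn.RNN}[cell]
        self.rnn = rnn_cls(num_features, hidden_size, batch_first=True)
        self.dropout = nn.Dropout(p=0.5)        # regularization on the last hidden state (assumed rate)
        self.head = nn.Linear(hidden_size, 1)   # scalar RUL estimate

    def forward(self, x):                        # x: (batch, time, num_features)
        out, _ = self.rnn(x)
        last = out[:, -1, :]                     # hidden state at the final observed cycle
        return self.head(self.dropout(last)).squeeze(-1)

def train_step(model, optimizer, x, rul_target):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), rul_target)  # MSE, as used to evaluate the trained models
    loss.backward()
    optimizer.step()
    return loss.item()

# Stand-in data only: 8 sequences of 30 cycles with 24 (raw + differential) features.
x = torch.randn(8, 30, 24)
y = torch.rand(8) * 150.0                        # stand-in RUL labels, in cycles
for cell in ("rnn", "gru", "lstm"):
    model = RULRegressor(cell, num_features=24)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam reduces sensitivity to the learning rate
    print(cell, train_step(model, optimizer, x, y))

In the actual experiments, each model would of course be trained for many epochs on the turbofan datasets and evaluated on a held-out cross-validation set rather than on a single batch of random data.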

Section snippets

Neural networks in RUL estimation

As one of the most important members of the field of machine learning, the neural network is considered a mature prognostic algorithm. Multilayer perceptrons, radial basis function networks and other neural networks have been widely used in anomaly detection, damage clustering and fault diagnosis, and they have achieved remarkable success [1], [15], [16], [17].

As for time series data, such as the samples in the RUL prediction problem, researchers have been searching for more reasonable models and

Concept of vanilla LSTM

After refinement and popularization, this variant of the LSTM, the vanilla LSTM, is the one most commonly used in the literature. The schematic of the vanilla LSTM block is shown in Fig. 1.

As shown in Fig. 1, the core idea of the LSTM lies in the information flows represented by the two black horizontal lines. The bottom one indicates the combination of the input at the current time (x_t) and the output of the previous time step (h_{t-1}). In the classical RNN, this integrated information is used to overwrite the cell state directly. As for

Experiments and discussion

The aim of this part is to demonstrate fast modeling and the enhanced performance of the vanilla LSTM in the challenge of RUL estimation, in comparison with the standard RNN and the GRU LSTM. Experiments are carried out on four aircraft turbofan engine simulation datasets, into which faults unknown to the data analysts are injected and which operate under different complex conditions with noise. In the model training phase, the Mean Square Error (MSE) on the cross-validation set is used to evaluate the performance of the trained neural

Conclusion

In this paper, vanilla LSTM neural networks, which usually work effectively in the field of natural language processing, are utilized to solve the bottleneck problem of high-precision RUL estimation for complicated engineered systems. In addition, a dynamic difference technique is proposed to extract new features from raw health monitoring data, with which RNNs can make full use of inter-frame information to uncover the real physical degradation mechanism behind sensor readings under complex and

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant no. 51375030. First of all, I would like to thank the NASA Ames Research Center for providing the turbofan engine degradation simulation data set. Secondly, I would like to express my sincere thanks to my supervisors Mei Yuan and Shaopeng Dong, who have given me so much useful advice on my writing and have tried their best to improve my paper. Last but not least, I would like to thank my junior apprentice Lin Li and


References (26)

  • I. Sutskever et al.

    The recurrent temporal restricted Boltzmann machine

    Advances in Neural Information Processing Systems

    (2009)
  • Y. Bengio et al.

    Learning long-term dependencies with gradient descent is difficult

    IEEE Trans. Neural Netw.

    (1994)
  • H. Jaeger et al.

    Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication

    Science

    (2004)

    Yuting Wu is a Master's candidate jointly educated by the School of Automation Science and Electrical Engineering and the School of Energy and Power Engineering, Beihang University. He received the Bachelor's degree from the School of Information Science and Engineering, Central South University, Changsha, China, in 2014. His research interests include machine learning, data mining and deep learning.

    Mei Yuan is an Associate Professor at the School of Automation Science and Electrical Engineering, Collaborative Innovation Center for Advanced Aero-Engine, Beihang University, Beijing, China. Her current research interests include prognostics and health management, advanced signal processing, embedded systems, and structural health monitoring of complex systems. She is currently a member and the Secretary of the GNC branch of the Chinese Society of Aeronautics and Astronautics, a director and member of the SHM branch of the Chinese Instrument and Control Society, and a senior member of the Chinese Metrology Society.

    Shaopeng Dong is currently a Lecturer and working towards the Ph.D. degree at School of Automation Science and Electrical Engineering, Beihang University, Beijing, China. He received the Bachelor's degree in Automation from China Agriculture University, Beijing, China in 2004. He received the Master's degree in Detection Technology and Automatic Equipment from Beihang University, Beijing, China in 2007.

    His main research interests include prognostics and health management, embedded systems, signal processing, and structural health monitoring of complex systems.

    Li Lin is a Master's candidate in the School of Energy and Power Engineering, Beihang University. She received the Bachelor's degree from the School of Electric Engineering and Automation, Hefei University of Technology, Anhui, China, in 2015. Her research interests include machine learning, automatic test systems and sensor technology.

    Yingqi Liu is currently a Master's candidate in the School of Automation Science and Electrical Engineering at Beihang University. He graduated from the Ecole Centrale de Pekin at Beihang University, Beijing, China, in 2015. His main research interests include prognostics and health management (PHM), deep learning, machine learning and data mining.
