Attention-based deep survival model for time series data

doi:10.1016/j.ress.2021.108033

Reliability Engineering & System Safety

Volume 217, January 2022, 108033

https://doi.org/10.1016/j.ress.2021.108033

Highlights

• Deep survival models excel in modeling reliability function and its covariates.
• A novel deep survival model enhances Seq2Seq structure with attention mechanism.
• Seq2Surv model outperforms existing models in prediction accuracy.

Abstract

In the era of the Internet of Things and Industry 4.0, smart products and manufacturing systems emit signals that track their operating condition in real time. Survival analysis is well suited to modeling such signals to determine the condition of in-service equipment and products and to support critical operational decisions, e.g., maintenance and repair. One appealing aspect of survival analysis is that the model can include subjects that have not yet failed or whose exact failure time is unknown.

Neural network (NN)-based survival models, i.e., deep survival models, show superior performance in modeling the non-linear relationship between the reliability function and covariates. We propose a novel deep survival model, seq2surv, which incorporates the seq2seq structure and an attention mechanism to enhance the ability to analyze a sequence of signals in survival analysis. Similar to the seq2seq model, which shows superior performance in machine translation, the seq2surv model is designed to translate a sequence of signals into a sequence of survival probabilities and to update the reliability predictions along with real-time monitoring. Our results show that the seq2surv model outperforms existing deep survival approaches, with higher prediction accuracy and lower errors in survival function estimation on both simulated and real-world datasets.

Introduction

Automobile manufacturers spend 2.5–3.0% of their sales revenue on fixing vehicle issues. Customer satisfaction and feedback are essential in marketing the product and preventing existing problems from recurring [1]. Manufacturers seek to address the challenge of effectively utilizing diverse databases, including customer feedback, laboratory tests, maintenance, and field tracking, to identify and resolve product defects at the design and manufacturing phases [2]. With recent developments in communication systems and devices, cybersecurity standards, and deployable artificial intelligence, manufacturers have started to understand failures through data-driven prognosis techniques, improving their profit margins by refining product design and preparing spare parts based on failure predictions [3], [4].

Data-driven prognosis analysis can leverage heterogeneous databases to derive actionable insights to enhance the development of innovative products [2] and improve the reliability of existing products [5]. Moreover, bridging between manufacturing data and field reliability makes root cause analysis of defects possible, even for defects that cannot be easily identified by technicians [6]. Manufacturers seek to understand field failures by modeling the distribution of failure times [7] and operational actions [8].

Survival analysis shows its strength in modeling such data to help determine the condition of in-service equipment and products over their entire lifetime. Survival models can be classified into univariate models, bivariate models, and models containing covariates (a.k.a. explanatory variables). A comprehensive summary of these models is presented in [9]. In the univariate approach, the survival variable is typically the age of the product; this approach is widely discussed in the literature [10]. In the bivariate approach, the data are partitioned in a two-dimensional plane with one axis representing product age and the other representing product usage, e.g., mileage in the automotive industry [11]. Some studies directly estimate the bivariate lifetime distribution of products [12]. In the covariates approach, models leverage explanatory variables describing design, production, and market-related information on the actual environment in which the product is used. Many procedures have been developed for collecting and analyzing warranty claim data [13], [14].

Extensive research has been conducted on implementing the covariates approach in automotive case studies. Karim and Suzuki designed a Weibull regression model as a function of reliability-related covariates [14]. Attardi et al. [15] use a mixed-Weibull regression model for the analysis of automotive warranty claims data, with engine type and car model used as covariates. The Cox proportional hazards (CPH) model [16] is a semi-parametric model that estimates the effects of observed covariates on the hazard function. The CPH model assumes that the risk can be computed as a linear combination of covariates and links the risk to the baseline hazard via the exponential function to enforce positivity. Krivtsov et al. use a Cox regression model to understand the failure mechanism of tires [17].
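As a minimal sketch of the CPH structure described above, the hazard for a covariate vector x is commonly written h(t | x) = h0(t) exp(beta · x). The snippet below evaluates that form with the Python standard library only. The function name, the coefficient values, the covariate values, and the fixed baseline hazard are all illustrative assumptions and are not taken from the cited studies.

```python
import math

def cph_hazard(baseline_hazard, beta, x):
    """Cox proportional hazards form: h(t | x) = h0(t) * exp(beta . x)."""
    # Linear combination of covariates (the log-risk score).
    risk_score = sum(b * xi for b, xi in zip(beta, x))
    # The exponential link keeps the hazard positive for any risk score.
    return baseline_hazard * math.exp(risk_score)

# Hypothetical values, for illustration only.
beta = [0.5, -0.2]   # assumed regression coefficients
x_a = [1.0, 0.0]     # covariates of unit A
x_b = [0.0, 0.0]     # covariates of unit B (reference unit)
h0 = 0.01            # assumed baseline hazard at one fixed time t

hazard_a = cph_hazard(h0, beta, x_a)
hazard_b = cph_hazard(h0, beta, x_b)

# The hazard ratio depends only on the covariates, not on h0(t).
hazard_ratio = hazard_a / hazard_b
```

Because the baseline hazard cancels in the ratio, two units differ by the same multiplicative factor at every time point, which is the "proportional hazards" property.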

It is interesting to note that binary classification models, one of the most common machine learning applications, are used in industrial settings where survival methodology is applicable. Binary classifiers can provide predictions for a certain time slice but lose the interpretability and flexibility of modeling the distribution of an event as a continuous function of time. Moreover, in applications with a substantial amount of censoring, the use of binary classifiers tends to be problematic. For example, for steering systems, the percentage of uncensored data, i.e., units whose failure event occurs before the end of observation, is around 1% [18]. While binary classifiers typically ignore censored observations, one of the main objectives of survival analysis is to account for them [19].

Several challenges exist in implementing survival analysis in industrial practice. Classical statistical techniques for Cox regression rely on non-parametric or semi-parametric methods for survival function estimation, primarily because they make working with censored data relatively straightforward [20]. However, non-parametric methods may suffer from the curse of dimensionality when learning individual hazards, especially when the number of covariates is large. Semi-parametric approaches usually depend on the prevailing assumption of constant risks over a lifetime, which is very likely to be unrealistic in many practical scenarios encountered in healthcare, predictive maintenance, econometrics, or operations research [8], [20], [21].

As such, a richer family of deep survival models has been developed to better fit survival data with nonlinear risk functions. Since neural networks (NNs) can learn highly complex and nonlinear functions, researchers have used NN-based approaches to build survival models, known as deep survival models, which offer greater flexibility and accuracy in modeling the relationship between covariates and time-to-event. Deep survival models combine the ability of deep NNs to model complex functions more accurately with a better ability to handle censored data.

Today, products and manufacturing devices are getting smarter. Sensors and smart chips are commonly used in machines and products to monitor use rate, system load, and various environmental variables in real time. Next-generation reliability data are much richer, with time-series features describing system operating and environmental information [22]. With the development of connected vehicle technology, vehicle speed, acceleration, temperature, pressure, and vast amounts of network data are available for traffic safety evaluation [23] and remote vehicle prognostics and health management [24]. Connected manufacturing devices integrate industrial production and store machine-related data as large-scale time series from every aspect of production, transportation, and after-sales [25], [26].

Moreover, products with various usage rates, e.g., due to heterogeneous customer behavior, exhibit distinct failure processes. Excavators with a high usage rate are likely to experience more failures than those with a moderate or low usage rate [27]. Another study reveals that the reliability of the power system of electric vehicles depends on travel patterns and driver behavior [28]. An increasing amount of time series data can be collected from sensors mounted in vehicles and manufacturing systems. These vast data provide opportunities to understand field failures but also challenge existing deep survival models to prognosticate failure effectively [22], for example, because of the high complexity and nonlinearity of machine condition data [29].

Time series data often have periodic temporal features due to seasonality or complex patterns underlying the measured activities, as well as noticeable noise from communication and measurement [30]. Accurate prognosis relies on effective feature extraction from the whole series to capture valuable information and discard irrelevant noise [31]. Unfortunately, existing work, whether statistical analysis or NN models, does not attain this goal well. To the best of our knowledge, no prognosis model is structured to take advantage of such time series data and grasp the emerging opportunity arising from the breakthroughs in information and communication technology.

In this study, we systematically review the existing deep survival models in multidisciplinary studies, spanning from disease management to automotive analysis, and bridge the likelihood functions in survival analysis to the loss functions in machine learning models. Machine learning models that have been used in the parameterization of survival models are summarized and compared according to their strengths and weaknesses, especially for the era of the Internet of Things and Industry 4.0. We build on the previous study [32] and propose a novel deep survival model, seq2surv, that can effectively analyze time-series data to address the emerging opportunities and challenges in Industry 4.0, for example, connected vehicles and smart manufacturing systems. The seq2surv model substantially improves the accuracy of predicting the survivability of each individual compared with existing deep survival models, indicating vast potential for improving product designs and manufacturing processes. Our paper is structured as follows: after some preliminaries in Section 2, we review the fundamentals of survival analysis as well as the existing deep survival models in Section 3. In Section 4, we address the challenges in analyzing time-series features and making predictions. In Section 5, we describe our seq2surv model, which predicts survival curves from correlated time series data. In Section 6, we implement the proposed model on simulated and real-world time series datasets to show its effectiveness. In Section 7, we summarize our work and propose future research directions.

Section snippets

Preliminaries

In this study, we aim to model the distribution of the failure time, denoted by $T^{*}$. In most cases, not all failure times are observable. We denote the right-censoring time as $C^{*}$, whereby any failure after $C^{*}$ is censored. We assume that a failure has happened once it is seen and recorded. The survival time $T$ and the failure indicator $D$ are defined respectively as $T = \min\{T^{*}, C^{*}\}$ and $D = \mathbb{1}\{T^{*} \leq C^{*}\}$.
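The two definitions above can be illustrated with a short sketch: each unit has a latent failure time $T^{*}$ and a censoring time $C^{*}$, and only the pair $(T, D)$ is recorded. The function name and the numeric values below are hypothetical and serve only to illustrate the notation.

```python
def observe(failure_time, censoring_time):
    """Return the recorded pair (T, D) for one unit.

    T = min(T*, C*) is the observed survival time.
    D = 1 if T* <= C* (failure observed), otherwise 0 (right-censored).
    """
    t = min(failure_time, censoring_time)
    d = 1 if failure_time <= censoring_time else 0
    return t, d

# Hypothetical latent failure times T* and censoring times C* (e.g., in months).
latent_failures = [12.0, 40.0, 7.5]
censoring_times = [36.0, 36.0, 36.0]

observations = [observe(f, c) for f, c in zip(latent_failures, censoring_times)]
# The second unit fails after the end of observation, so it is recorded
# as censored at 36.0 with D = 0.
```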

We denote $f (t)$ as the probability density function of the failure time, and $F (t)$ as the cumulative distribution

Literature review

Given a parametric assumption for the distribution of survival times, a variety of survival models and parameter estimation methods have been built in the framework of (deep) survival analysis. In this section, we summarize the existing (deep) survival models by sorting them into time-invariant and time-dependent survival models.
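To make the idea of parametric estimation with censored data concrete, the sketch below computes a right-censored negative log-likelihood, in which an observed failure contributes the log-density log f(T) and a censored unit contributes the log-survival log(1 - F(T)). The Weibull distribution, the parameter names shape and scale, and the sample data are assumptions chosen for illustration; the form also assumes non-informative censoring and is not claimed to be the loss used by any specific model reviewed here.

```python
import math

def weibull_neg_log_likelihood(times, events, shape, scale):
    """Right-censored negative log-likelihood under a Weibull assumption.

    times  : observed times T (all > 0)
    events : failure indicators D (1 = failure observed, 0 = censored)
    """
    nll = 0.0
    for t, d in zip(times, events):
        cum_hazard = (t / scale) ** shape   # H(t) = (t / scale)^shape
        log_survival = -cum_hazard          # log S(t) = -H(t)
        if d == 1:
            # log f(t) = log h(t) + log S(t), with Weibull hazard
            # h(t) = (shape / scale) * (t / scale)^(shape - 1)
            log_hazard = math.log(shape / scale) + (shape - 1) * math.log(t / scale)
            nll -= log_hazard + log_survival
        else:
            nll -= log_survival
    return nll

# Hypothetical data: one observed failure at t = 1 and one unit censored at t = 2.
times = [1.0, 2.0]
events = [1, 0]

# With shape = 1 the Weibull reduces to an exponential with rate 1 / scale.
nll = weibull_neg_log_likelihood(times, events, shape=1.0, scale=2.0)
```

Minimizing such a quantity over the parameters (or over the weights of a network that outputs them) is one common way a likelihood function can serve as a training loss.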

Problem statement

Given the time-series features (covariates) of each individual, our goal is to forecast another time-series output representing the reliability performance at each future time period. Extensive research has been conducted on time-series modeling. Although well-known models, including the auto-regressive moving average (ARMA) model, kernel methods, and ensemble methods, have shown their effectiveness in many real-world applications, most of these approaches employ a predefined nonlinear form,

Seq2surv model

In this section, we propose a seq2surv model with an attention mechanism to learn complex patterns in the time-series signals and to generate multi-step survival predictions with the ability to filter out irrelevant noise by encoder–decoder structure. The model is designed to bridge sequences of time-series features to a sequence of the survival probability estimates for the entire lifetime. Specifically, the architecture of seq2seq model is leveraged to parameterize the survival function with

Results

In this section, we compare our seq2surv model with the aforementioned deep survival models on the simulated and NASA turbofan engine datasets. Evaluation metrics, including SEP, the concordance index, and the Brier score, are adopted for model comparison (detailed information can be found in Appendix B).
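For readers unfamiliar with the concordance index, the following is a minimal sketch of the commonly used Harrell-style definition: among comparable pairs, it is the fraction in which the unit that fails earlier was assigned the higher predicted risk. The exact definition used in the paper is given in its Appendix B, which is not reproduced here, so this version, the function name, and the toy data are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index for right-censored data.

    A pair (i, j) is comparable when unit i has an observed failure and
    times[i] < times[j]. The pair is concordant when the earlier-failing
    unit i has the higher risk score; tied scores count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored unit cannot be the earlier-failing member
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical observations: the second unit is censored at t = 10.
times = [5.0, 10.0, 15.0, 20.0]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.4, 0.05, 0.1]

c_index = concordance_index(times, events, risk_scores)
```

A value of 1.0 corresponds to a perfect ranking and 0.5 to a random one. In this toy example one of the four comparable pairs is ranked in the wrong order.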

Conclusions and future work

This paper proposes a novel system prognostic tool, seq2surv, to analyze the relationship between lifetime reliability performance and complex real-time signals, which are widely collected in Industry 4.0 and the Internet of Things. Seq2surv model is designed for the reliability analysis of time-series signals, which commonly exist in smart manufacturing systems and autonomous/connected vehicles. Seq2surv leverages the merits from the seq2seq model (commonly used for machine translation) and

CRediT authorship contribution statement

Xingyu Li: Conceptualization, Methodology, Software, Validation, Writing – original draft, Writing – review & editing. Vasiliy Krivtsov: Conceptualization, Formal analysis, Writing – original draft, Project administration. Karunesh Arora: Conceptualization, Methodology, Software, Writing – original draft, Visualization.

References (59)

Zhao X. et al. Utilizing experimental degradation data for warranty cost optimization under imperfect repair. Reliab Eng Syst Saf (2018)
Alkahtani M. et al. A decision support system based on ontology and data mining to improve design using warranty data. Comput Ind Eng (2019)
Li X. et al. AI-based competition of autonomous vehicle fleets with application to fleet modularity. European J Oper Res (2020)
Xu F. et al. Life prediction of lithium-ion batteries based on stacked denoising autoencoders. Reliab Eng Syst Saf (2021)
Kang S. et al. Mining the relationship between production and customer service data for failure analysis of industrial products. Comput Ind Eng (2017)
Oh Y. et al. Field data analyses with additional after-warranty failure data. Reliab Eng Syst Saf (2001)
Li X. et al. Degradation-aware decision making in reconfigurable manufacturing systems. CIRP Annal (2019)
Huang Y.-S. et al. Cost analysis of two-dimensional warranty for products with periodic preventive maintenance. Reliab Eng Syst Saf (2015)
Attardi L. et al. A mixed-Weibull regression model for the analysis of automotive warranty data. Reliab Eng Syst Saf (2005)
Krivtsov V.V. et al. Regression approach to tire reliability analysis. Reliab Eng Syst Saf (2002)
Tao F. et al. Data-driven smart manufacturing. J Manuf Syst (2018)
Yang D. et al. Warranty claims forecasting based on a general imperfect repair model considering usage rate. Reliab Eng Syst Saf (2016)
Li X. et al. Deep learning-based remaining useful life estimation of bearings using multi-scale feature extraction. Reliab Eng Syst Saf (2019)
Xiang A. et al. Comparison of the performance of neural network methods and cox regression for censored survival data. Comput Statist Data Anal (2000)
Li X. et al. Remaining useful life estimation in prognostics using deep convolution neural networks. Reliab Eng Syst Saf (2018)
Zhang Y. et al. Cnn-based survival model for pancreatic ductal adenocarcinoma in medical imaging. BMC Med Imag (2020)
Wang J. et al. Software reliability prediction using a deep learning model based on the RNN encoder–decoder. Reliab Eng Syst Saf (2018)
Zou Z. et al. Task space-based dynamic trajectory planning for digging process of a hydraulic excavator with the integration of soil–bucket interaction. Proc Inst Mech Eng K: J Multi-Body Dyn (2019)
Krivtsov V.V. Field data analysis & statistical warranty forecasting. IEEE Catalog No CFP11RAM-CDR (2011)
Modarres M. et al. Reliability engineering and risk analysis: A practical Guide (2016)
Lawless J.F. et al. Analysis of reliability and warranty claims in products with age and usage scales. Technometrics (2009)
Krivtsov V. et al. Nonparametric estimation of marginal failure distributions from dually censored automotive data
Karim M. et al. Analysis of warranty data with covariates. Proc Inst Mech Eng O: J Risk Reliab (2007)
Cox D.R. The analysis of multivariate binary data. Appl Stat (1972)
Vinta S. Analysis of data to predict warranty cost for various regions
Kvamme H. et al. Time-to-event prediction with neural networks and Cox regression (2019)
Nagpal C. et al. Deep survival machines: Fully parametric survival regression and representation learning for censored data with competing risks. IEEE J Biomed Health Inf (2021)
Katzman J.L. et al. Deep survival: A deep cox proportional hazards network. Stat (2016)
Meeker W.Q. et al. Reliability meets big data: opportunities and challenges. Qual Eng (2014)

