Noise-dependent ranking of prognostics algorithms based on discrepancy without true damage information

https://doi.org/10.1016/j.ress.2017.09.021

Highlights

  • The performance of prognostics algorithms depends on the realization of random noise.

  • Mean squared discrepancy is a good metric for assessing the accuracy of prognostics algorithms.

  • The evaluation index is a good metric for assessing the confidence intervals of the prediction.

  • Neither metric requires the true damage information.

  • The algorithm selected as best is also likely the best for predicting future damage.

Abstract

In this paper, an interesting observation on the noise-dependent performance of prognostics algorithms is presented, and a method for evaluating the accuracy of prognostics algorithms without knowing the true degradation model is proposed. This paper compares the four most widely used model-based prognostics algorithms, i.e., the Bayesian method, the particle filter, the extended Kalman filter, and nonlinear least squares, to illustrate the effect of random noise in data on prediction performance. The mean squared error (MSE), which measures the difference between the true damage size and the predicted one, is used to rank the four algorithms for each dataset. We found that the randomness in the noise leads to very different rankings of the algorithms for different datasets, even though all datasets come from the same damage model. In particular, even the algorithm with the best average performance can produce poor results for some datasets. In the absence of true damage information, we propose another metric, the mean squared discrepancy (MSD), which measures the difference between the prediction and the data. A correlation study between MSE and MSD indicates that MSD can be used to estimate the ranking of the four prognostics algorithms without the true damage information. Moreover, the algorithm selected as best by MSD has a high probability of also having the smallest prediction error when used for predicting beyond the last measurement. MSD can thus be particularly useful for selecting the best algorithm for predicting into the near future for a given set of measurements.

Introduction

Model-based prognostics approaches can provide better performance than data-driven approaches when a degradation model is available [1], [2]. The most widely used model-based prognostics methods in the literature include the extended Kalman filter [3], [4], the particle filter [5], [6], the Bayesian method [7], [8], and nonlinear least squares [9]. Some review articles [10], [11] introduce and evaluate these algorithms (or a subset of them) through a general descriptive explanation of their respective advantages and disadvantages. Other publications quantitatively compare different model-based prognostics methods through specific examples [1], [9], [12]. However, we have not found studies of the effect of randomness in the measurement data on the ranking of algorithms, even though measurement noise is one of the most important uncertainty sources; this effect is the objective of the present paper.

One of the major challenges in prognostics and health management (PHM) is dealing with prediction uncertainty. Long-term prediction of the remaining useful life (RUL), or of the probability of failure within a certain time horizon, widens the bounds of prediction uncertainty due to various sources, such as measurement noise, future load and usage uncertainty, and model assumptions and inaccuracies [13]. An important issue in making meaningful predictions is to treat these uncertainties properly, as they directly affect the prognostics results and thus the associated decision-making process [14]. Uncertainty can lead to significant deviation of prognostics results from the actual situation. For example, in the application of fatigue crack growth, the Paris model is often used. Of the two Paris model parameters, the exponent m of aluminum alloys is known to lie in the range of 3.6 to 4.2, a variability of only 16%; yet the corresponding life cycles can differ by 500% [15]. Sankararaman discussed a series of issues regarding uncertainty in PHM, including the causes of uncertainty, how to interpret it, how to facilitate its effective treatment, and how to quantify it accurately [16].
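To see how this small variability in m amplifies into a large difference in life, one can integrate Paris' law, da/dN = C(ΔK)^m with ΔK = Δσ√(πa), between an initial and a final crack size. The sketch below does this numerically; the values of C, Δσ, and the crack sizes are illustrative assumptions, not the paper's.

```python
import numpy as np
from scipy.integrate import quad

# Paris' law: da/dN = C * (dK)^m, with dK = dsigma * sqrt(pi * a).
# Life in cycles: N = integral over a of da / (C * (dsigma * sqrt(pi*a))^m).
C = 1.5e-10          # Paris coefficient (illustrative assumption)
dsigma = 75.0        # stress range in MPa (illustrative assumption)
a0, af = 0.01, 0.05  # initial and final crack sizes in m (illustrative)

def life_cycles(m):
    """Cycles to grow the crack from a0 to af for a given exponent m."""
    integrand = lambda a: 1.0 / (C * (dsigma * np.sqrt(np.pi * a)) ** m)
    N, _ = quad(integrand, a0, af)
    return N

# The 16% spread in m (3.6 vs. 4.2) yields lives differing by several
# hundred percent.
for m in (3.6, 4.2):
    print(f"m = {m}: life = {life_cycles(m):.3e} cycles")
```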

Measurement noise is one of the most significant uncertainty sources and must be represented and managed properly. In model-based prognostics techniques, the estimation of model parameters depends on the measurement data. Using a different set of data results in different estimates of the model parameters; that is, the uncertainty in the measurement data propagates into uncertainty in the model parameters, which significantly affects the performance of the prognostics. In addition, even small errors in the initial state caused by measurement noise can accumulate and grow over time and, consequently, severely distort the predicted probability distribution over a long time period. It is therefore necessary to account for measurement uncertainty from the initial stages of system-level design, and representing, propagating, quantifying, and managing measurement uncertainty well is very important. In this research, we compare the most commonly used model-based prognostics techniques with one another for their suitability on various datasets.

An observation from this study is that the performance of the algorithms depends on the particular realization of the random noise. Therefore, ranking the generic performance of algorithms may not be very meaningful; it would be more desirable to use the best algorithm for the given noisy data. Selecting the best algorithm is challenging when the true damage information is not available. Most degradation metrics in the literature are based on knowledge of the true degradation information [17], and are therefore useful for evaluating the generic performance of an algorithm. A useful conclusion from this study is that the mean squared discrepancy is a good surrogate for the mean squared error: the former can be used instead of the latter in the absence of true degradation information. In addition, we observe that the best algorithm selected based on past measurement data is highly likely to be among the best for future prediction. Therefore, the proposed method of selecting the best algorithm for a specific noisy dataset can be practical for future prognostics.

Existing metrics such as the prognostic horizon, α-λ accuracy, (cumulative) relative accuracy, and convergence have been proposed for offline assessment of the performance of a prognostic model in terms of accuracy, precision, or robustness. Typically, computing these metrics requires the "true" degradation information, e.g., the true end of life (EOL), the true remaining useful life (RUL), or the availability of historical run-to-failure data. In practice, however, the entire degradation process from the beginning of operation until equipment failure may not yet have been observed, which makes it challenging to assess the performance of prognostic methods and difficult for users to choose the method appropriate for their problems. The main objective of this paper is to propose a method for online assessment of the performance of model-based prognostics algorithms based on past degradation measurement data. Hu et al. [18] proposed a metric that enables assessing the performance of a model-based prognostic approach from past measurement data in three cases characterized by different levels of knowledge about the degradation process; in [18], the particle filter (PF) is chosen as the prognostics method to test the proposed metric.

In this paper, we focus on four of the most commonly used model-based algorithms, the Bayesian method, the particle filter, nonlinear least squares, and the extended Kalman filter, and verify their performance on a simple degradation model with multiple simulated measurement datasets. The multiple random datasets are generated using the same noise level. This is reasonable because, for a particular engineering application (e.g., sensors embedded into aircraft structures such as fuselage panels, wings, and bulkheads to monitor fatigue cracks), once the sensor is installed, the level of measurement uncertainty is fixed: it stems mainly from the sensor's limitations, which are an intrinsic attribute of the sensor.
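As an illustration of this setup, the sketch below generates Nd = 100 noisy crack-size datasets from a single Paris-law trajectory with a fixed noise level. All parameter values (C, m, Δσ, the measurement schedule, and the noise standard deviation) are assumptions for illustration; the paper's actual values appear in its numerical study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative model parameters and measurement schedule (assumptions).
C_true, m_true = 1.5e-10, 3.8   # Paris parameters
dsigma = 75.0                   # stress range, MPa
a0 = 0.01                       # initial crack size, m
dN = 100                        # load cycles between measurements
n_steps, Nd = 25, 100           # measurements per dataset, number of datasets
noise_std = 1e-3                # same noise level for every dataset, m

def true_crack_history():
    """Propagate the crack with Euler steps of the Paris model."""
    a = np.empty(n_steps)
    a[0] = a0
    for k in range(1, n_steps):
        dK = dsigma * np.sqrt(np.pi * a[k - 1])
        a[k] = a[k - 1] + C_true * dK ** m_true * dN
    return a

a_true = true_crack_history()
# Nd datasets: one true trajectory, Nd different Gaussian noise realizations.
datasets = a_true + rng.normal(0.0, noise_std, size=(Nd, n_steps))
```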

The conventional metric, the mean squared error (MSE), which measures the difference between the predicted and the true crack sizes, is first used to rank the four algorithms in terms of accuracy, assuming the true crack-growth information is available. We examine how much the ranking changes from one dataset to another due to randomness in the noise, under the assumption that the difference in performance across datasets is caused by the specific realization of the noise, which may be friendly to one algorithm and unfriendly to another. We then propose a new metric based on the measurement data, called the mean squared discrepancy (MSD), which measures the difference between the predicted crack sizes and the measured data, as a performance indicator in the absence of the true crack size. This metric is used to rank the four methods on the different datasets. Our numerical tests show that, even with a simple degradation model and multiple measurement datasets that differ only in the realization of random noise at the same noise level, the performance of an algorithm varies from one dataset to another; no method performs consistently well and is always the best across all datasets. The ranking based on MSD mostly preserves the ranking based on MSE, even though the former does not require the true model while the latter does. This indicates that MSD can be used to rank the algorithms when the true crack size is not available.
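In the notation assumed here (the paper defines the metrics formally in Section 3), both metrics can be computed from a vector of predicted crack sizes, the true sizes, and the noisy measurements:

```python
import numpy as np

def mse(a_pred, a_true):
    """Mean squared error: requires the true crack sizes, which are
    unknown in practice."""
    return np.mean((np.asarray(a_pred) - np.asarray(a_true)) ** 2)

def msd(a_pred, a_meas):
    """Mean squared discrepancy: compares predictions with the noisy
    measurements themselves, so no true damage information is needed."""
    return np.mean((np.asarray(a_pred) - np.asarray(a_meas)) ** 2)
```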

The paper is organized as follows. Section 2 briefly reviews the basic concepts and some key issues of the four algorithms, and presents the degradation model used for testing them; two forms of the model, recursive and non-recursive, are given to suit the characteristics of the different algorithms. Section 3 states the strategy for the performance comparison and the metrics for performance evaluation. Section 4 presents a numerical study in which the four algorithms are tested on 100 datasets with randomly generated noise, and the proposed metric is used to rank their prognostic performance. Section 5 summarizes and concludes the paper.

Section snippets

Model-based prognostics and the four most commonly used algorithms

In a model-based prognostics method, it is assumed that the damage state propagates over time as an unobserved process governed by a physical damage model, and that damage-related quantities are measured at successive time points. The physical damage model describing the degradation process is assumed to be known; however, the model parameters are generally unknown and need to be identified, along with the damage state, from noisy measurements. This process usually resorts to estimation
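In generic notation (assumed here for illustration), this description corresponds to a state-space formulation, with Paris' law as a typical physical damage model:

```latex
% Unobserved damage state a_k propagated by a physical model f with
% parameters \theta; noisy measurements y_k of the damage state.
\begin{align}
  a_k &= f(a_{k-1}, \theta), \\
  y_k &= a_k + v_k, \qquad v_k \sim \mathcal{N}(0, \sigma^2).
\end{align}
% A typical choice of f: Paris' law for fatigue crack growth, with
% stress intensity range \Delta K = \Delta\sigma \sqrt{\pi a}:
\begin{equation}
  \frac{da}{dN} = C\,(\Delta K)^m .
\end{equation}
```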

Strategy for comparison and metrics for performance

When multiple predictions are available from different algorithms, it is important for users to evaluate their performance. Some works have compared the performance of several model-based prognostics algorithms using one specific dataset and concluded that one method outperforms the others, aiming to guide users in selecting the best method. However, comparing multiple algorithms with only one dataset may lead to a wrong conclusion, since the conclusion may not hold when different
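A sketch of this comparison strategy (the array layout and algorithm labels are assumptions for illustration): for each dataset, rank the four algorithms by their MSD and count how often each algorithm is best across all datasets.

```python
import numpy as np

# msd_matrix[i, j]: MSD of algorithm j on dataset i, e.g. shape (100, 4).
algorithms = ["Bayesian", "ParticleFilter", "EKF", "NLS"]

def rank_per_dataset(msd_matrix):
    """Rank the algorithms (1 = best, i.e. smallest MSD) per dataset."""
    order = np.argsort(msd_matrix, axis=1)
    ranks = np.empty_like(order)
    rows = np.arange(msd_matrix.shape[0])[:, None]
    ranks[rows, order] = np.arange(msd_matrix.shape[1])[None, :] + 1
    return ranks

def best_counts(msd_matrix):
    """How often each algorithm is ranked best across the datasets."""
    best = np.argmin(msd_matrix, axis=1)
    return {alg: int(np.sum(best == j)) for j, alg in enumerate(algorithms)}
```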

Numerical case study

In this section, we investigate the four algorithms by assessing their prognostics performance on Nd = 100 randomly generated measurement datasets. All datasets are generated using the same crack growth model but differ because of the randomly generated noise. The strategy for the performance comparison and the metrics for performance evaluation presented in Section 3 are employed. For each dataset, we test the prognostics behavior of each algorithm and rank them in terms of
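The correlation study between the MSE- and MSD-based rankings can be sketched as follows (matrix shapes and function names are assumptions): compute both metrics per dataset and measure the agreement of the resulting rankings with a Spearman rank correlation.

```python
import numpy as np
from scipy.stats import spearmanr

def ranking_agreement(mse_matrix, msd_matrix):
    """Average Spearman rank correlation, over datasets, between the
    MSE-based and MSD-based rankings of the algorithms.

    Both inputs have shape (n_datasets, n_algorithms), e.g. (100, 4)."""
    rhos = [spearmanr(mse_row, msd_row).correlation
            for mse_row, msd_row in zip(mse_matrix, msd_matrix)]
    return float(np.mean(rhos))
```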

Conclusions

In this paper, the four most commonly used algorithms, the Bayesian method, the particle filter, nonlinear least squares, and the extended Kalman filter, are applied to a simple crack growth model with simulated random measurement noise. We investigate their behavior statistically using 100 randomly generated measurement datasets with the same noise level. The mean squared error (MSE) is used as a metric to rank the four algorithms in terms of accuracy for each dataset. It

References (45)

  • P. Wang et al.

    A generic probabilistic framework for structural health prognostics and uncertainty management

    Mech Syst Signal Process

    (2012)
  • S. Sankararaman

    Significance, interpretation, and quantification of uncertainty in prognostics and remaining useful life prediction

    Mech Syst Signal Process

    (2015)
  • E. Zio et al.

    Particle filtering prognostic estimation of the remaining useful life of nonlinear components

    Reliab Eng Syst Saf

    (2011)
  • Y. Hu et al.

    A particle filtering and kernel smoothing-based approach for new design component prognostics

    Reliab Eng Syst Saf

    (2015)
  • D. An et al.

    Prognostics 101: a tutorial for particle filter-based prognostics algorithm using Matlab

    Reliab Eng Syst Saf

    (2013)
  • M. Jouin et al.

    Prognostics of PEM fuel cell in a particle filtering framework

    Int J Hydrogen Energy

    (2014)
  • H. Dong et al.

    Lithium-ion battery state of health monitoring and remaining useful life prediction based on support vector regression-particle filter

    J Power Sources

    (2014)
  • G. Chowdhary et al.

    Aerodynamics parameter estimation from flight data applying extended and unscented Kalman filter

    Aerosp Sci Technol

    (2010)
  • S. Bisht et al.

    An adaptive unscented Kalman filter for tracking sudden stiffness changes

    Mech Syst Signal Process

    (2014)
  • M.B. Cortie et al.

    On the correlation between the C and m in the Paris equation for fatigue crack propagation

    Eng Fract Mech

    (1988)
  • J.P. Benson et al.

    The relationship between the parameters C and m of Paris' law for fatigue crack growth in a low-alloy steel

    Scripta Metall

    (1978)
  • Ö.G. Bilir

    The relationship between the parameters C and n of Paris' law for fatigue crack growth in a SAE 1010 steel

    Eng Fract Mech

    (1990)