Adaptive self-attention LSTM for RUL prediction of lithium-ion batteries

doi:10.1016/j.ins.2023.01.100

Information Sciences

Volume 635, July 2023, Pages 398-413

https://doi.org/10.1016/j.ins.2023.01.100

Abstract

To achieve an accurate remaining useful life (RUL) prediction for lithium-ion batteries (LIBs), this study proposes an adaptive self-attention long short-term memory (SA-LSTM) prediction model. The innovations of the designed prediction model include the following. (1) It features an optimized local tangent space alignment algorithm, which allows the extraction of an indirect health indicator (HI) that can precisely describe battery degeneration from charge data. The extracted HI exhibits a high correlation with the standard capacity, thus facilitating RUL estimation. (2) By introducing a masked multi-head self-attention module into the time-series prediction model based on LSTM, critical information in the sequences is captured and the prediction performance is improved. (3) An online self-tuning mechanism for the weights and biases of neural networks is designed to correct cumulative estimation errors in long-term predictions and reduce the effects of local fluctuations and regeneration. The proposed prediction model enables the HI values in future cycles to be iteratively estimated using the one-step-ahead method, and the RUL can be forecast once the predicted signal falls below a failure threshold. Experimental results indicate the effectiveness and superiority of the proposed prediction method.

Introduction

Lithium-ion batteries (LIBs), as an alternative energy source to fossil fuels, demonstrate high potential for addressing energy depletion and environmental crises. Therefore, they have been used extensively in electric vehicles, aerospace, consumer electronics, and other fields [1]. However, the performance of LIBs degrades over time, which can result in battery failure or severe accidents. In this regard, an accurate estimation of the remaining useful life (RUL) can significantly facilitate battery performance monitoring, failure warning, and battery replacement, thus helping to avoid unexpected breakdowns and safety incidents.

However, RUL prognosis remains challenging in practical scenarios. Battery degradation is an unknown and comprehensive nonlinear dynamic process affected by the complex interplay between internal electrochemical reactions and external operating conditions [2], [3]. The nonlinear dynamic process results in intricate degradation phenomena during battery operation, such as accelerated degradation (AD) from the middle to the end of battery life [4] and the local capacity regeneration phenomenon (CRP) [5], which occurs during the resting phase of the battery. Consequently, an appropriate model must be established to accurately describe the degradation patterns and dynamics of LIBs, which is challenging. Additionally, the battery typically has a lifespan of hundreds or even thousands of cycles, which satisfies the practical requirements of long lifetime and high storage [6]. The long lifespan poses a challenge for modeling the long-term dependencies of the battery degradation process. Furthermore, battery capacity and resistance, which are the most commonly used direct indicators for RUL prediction, cannot be measured online but can only be measured in a laboratory using specific measuring equipment or operating conditions [7], [8].

To overcome these challenges, this study developed a data-driven RUL prognosis algorithm for LIBs by applying adaptive deep learning. This novel prediction method considers the characterization of LIB degradation and the construction of a long-term prediction model such that an accurate RUL prediction for LIBs can be achieved.

RUL prediction methods can be classified into two primary categories [1]: model-based and data-driven methods.

1) Model-based method.

For the model-based method, a degradation model of a battery must be established based on prior knowledge or physical laws; these models can be further classified into the failure mechanism model (FMM) [9], equivalent circuit model (ECM) [10], filtering model (FM) [11], and stochastic process model (SPM) [12]. Both the FMM and ECM rely on battery aging analysis to investigate the chemical or physical properties and operating principles of the battery. However, these approaches require complicated modeling processes and struggle to achieve high accuracy and generalization. The aim of the FM and SPM is to mine the recurrence relations of the internal battery status or the change patterns of the battery monitoring data [13]. However, these methods exhibit the disadvantages of low dynamic accuracy, limited adaptability, and insufficient prior knowledge owing to the interaction between internal mechanisms and external operations.

2) Data-driven method.

The data-driven method, also known as the model-free method, directly infers the degradation state and predicts the RUL from battery monitoring data. Thus, this method does not require electrochemical analysis or prior knowledge, which distinguishes it from model-based methods [14]. Recently, various intelligent algorithms, such as support vector machines [15], autoregressive modeling [16], and neural networks (NN) [17], have been widely applied for RUL prediction because of their high flexibility and adaptability in approximating nonlinear systems. In particular, advanced deep learning NNs [18], [19] perform remarkably well in nonlinear and high-dimensional data modeling; this demonstrates their potential in data-driven RUL prediction. As an effective method for managing time series, a long short-term memory (LSTM) NN is explicitly designed to control the flow of information for long-term dependencies and mitigate vanishing or exploding gradients [20], [21]. Furthermore, the elaborate gate mechanisms and loop cell states in an LSTM NN ensure that the predicted state is affected less by backpropagation. This allows critical information to be retained longer than with typical recurrent NN (RNN)-based methods. Researchers highly recommend using LSTM and its variants for estimating both the long-term degradation state and the RUL of LIBs [22], [23], [24]. Nevertheless, the prediction performance of vanilla LSTM-based methods has several limitations, which are presented as follows.

First, LSTM fails to capture the dynamic degradation properties (i.e., the AD and CRP) in battery time series [25]. Generally, before the LSTM NN performs a prediction, a learning process is required to determine the network parameters using the historical data of LIBs. Because the degradation dynamics of LIBs are unknown and uncertain, a parameter-fixed LSTM trained only with historical data demonstrates unsatisfactory adaptability and generalization in the prediction stage [26]. Furthermore, the cumulative errors increase with the number of prediction cycles under the one-step or multistep forward iterative mode of LSTM-based sequence prediction [27]. In particular, when encountering accelerated degradation or local regeneration, the predicted value may deviate quickly from its actual value, thus resulting in a rapid growth of cumulative errors and the possible failure of iterative prediction. To avoid these issues, researchers [5], [22], [28], [29] have attempted to separate the local regeneration of time series from a global trend by employing signal decomposition methods. Nevertheless, these signal-separated prediction methods require the construction of corresponding submodels for all decomposed subsignals, which significantly increases the computational cost and uncertainty of RUL prediction.
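The iterative one-step-ahead mode discussed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `rollout_rul` and `linear_trend`, the HI values, the window length, and the failure threshold are all hypothetical, and a simple trend extrapolator stands in for a trained network. The point is only to show how each prediction is fed back as an input, which is why an early error carries into every later step.

```python
from typing import Callable, List, Optional


def rollout_rul(
    history: List[float],
    predict_next: Callable[[List[float]], float],
    threshold: float,
    window: int = 3,
    max_cycles: int = 1000,
) -> Optional[int]:
    """Iterate a one-step-ahead predictor until the HI drops below threshold.

    Returns the number of predicted cycles until the crossing (the RUL),
    or None if no crossing occurs within max_cycles.
    """
    series = list(history)
    for step in range(1, max_cycles + 1):
        # The prediction is appended and reused as input for the next step,
        # so an error made here propagates into all later predictions.
        nxt = predict_next(series[-window:])
        series.append(nxt)
        if nxt < threshold:
            return step
    return None


def linear_trend(window_vals: List[float]) -> float:
    """Toy stand-in for a trained network: extrapolate the mean slope."""
    slope = (window_vals[-1] - window_vals[0]) / (len(window_vals) - 1)
    return window_vals[-1] + slope


# Made-up HI history and failure threshold, for illustration only.
hi_history = [1.00, 0.98, 0.96, 0.94]
rul = rollout_rul(hi_history, linear_trend, threshold=0.81)
print(rul)  # 7 predicted cycles until the HI falls below 0.81
```

Because the predictor only ever sees its own outputs after the last measured cycle, a fixed-parameter model has no way to recover once it drifts, which is the motivation for the adaptive schemes discussed in this paper.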

Second, although LSTM can alleviate the long-term dependency problem, it does so by emphasizing the dependence of each step on temporally nearby steps, which is insufficient for LIBs. This is because the contributing features of battery degradation are not in chronological order, and some arbitrary early steps may contribute to the final RUL prediction [30]. An efficient model must prioritize the more important features or time steps by assigning them larger weights [31]. The attention mechanism was explicitly proposed for assigning weights to prompt a model to focus on more important features, regardless of their distance in the sequence [32]. In this regard, the self-attention (SA) module, which is the key module of the transformer architecture, has been demonstrated to be a state-of-the-art method [33], [34]. It describes the global dependencies between inputs and outputs with high parallelization and computational efficiency. The effectiveness of SA in improving prediction has been proven in the RUL prediction of mechanical components when it was combined with an RNN [3], [35]. Nevertheless, the prediction performance requires further improvement owing to the widespread long-term dependencies throughout the life of LIBs.
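To make the weighting idea concrete, the following is a minimal single-head sketch of masked (causal) scaled dot-product self-attention in plain Python. It is a simplification under stated assumptions, not the module used in this study: the query, key, and value projections are taken to be the identity (a real multi-head module learns them), and the function name and input values are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax; -inf scores receive zero weight."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def masked_self_attention(x: Matrix) -> Tuple[Matrix, Matrix]:
    """Single-head causal self-attention with identity Q/K/V projections.

    Returns (outputs, attention_weights). Step i may attend only to
    steps j <= i, so no future information leaks into the output.
    """
    n, d = len(x), len(x[0])
    weights: Matrix = []
    for i in range(n):
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            if j <= i
            else float("-inf")  # mask out future time steps
            for j in range(n)
        ]
        weights.append(softmax(scores))
    out = [
        [sum(weights[i][j] * x[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]
    return out, weights


# Three time steps with two made-up features each.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = masked_self_attention(x)
```

Each row of `weights` sums to one and is zero above the diagonal. The weight a step receives depends on similarity rather than distance, so an early step can dominate the output if it is relevant.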

Third, health indicator (HI) extraction is another critical element of data-driven RUL prediction methods. The essence of the data-driven method is the regression of relevant performance indicators. Thus, an HI that can accurately characterize battery degradation is necessary. Recently, researchers have focused on indirect HIs, which can be easily extracted from measurable battery data, such as incremental capacity analysis curves [14], [36] and discharge voltage difference [37]. However, the extraction of these HIs is performed manually, is time consuming, and requires extensive domain knowledge. By contrast, several intelligent algorithms, such as NNs, evolutionary algorithms (EAs), and manifold learning (ML) methods, can extract HIs automatically and efficiently [38]. However, both the NNs (such as the autoencoder (AE) and its extensions) [39] and EAs lack feasible guidelines for parameter tuning and exhibit additional computational complexity. By contrast, ML methods, which offer the advantages of low computational burden and high execution efficiency, are regarded as effective approaches for nonlinear dimension reduction. The main purpose of ML is to enable the construction of nonlinear low-dimensional manifolds using sampled data points in high-dimensional spaces. Typical ML methods include multidimensional scaling (MDS) [40], local linear embedding (LLE) [41], and local tangent space alignment (LTSA) [42]. Notably, LTSA, which is an improved version of LLE for solving the singularity of weight coefficients, offers a higher embedding accuracy, stronger noise resistance, and less overhead compared with other ML methods. However, it requires manual intervention when selecting the local neighborhood size, which is impractical in real-time applications [43].
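One simple way to check whether a candidate indirect HI tracks degradation is to compute its Pearson correlation with the measured capacity over the same cycles. The sketch below does this in plain Python. The capacity and HI values are made up for illustration, the function name is hypothetical, and the check is independent of how the HI was extracted (LTSA or otherwise).

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Made-up values: measured capacity (Ah) and a candidate HI per cycle.
capacity = [2.00, 1.95, 1.91, 1.86, 1.80, 1.73]
candidate_hi = [1.00, 0.97, 0.95, 0.92, 0.89, 0.85]
r = pearson(capacity, candidate_hi)
print(f"Pearson r = {r:.4f}")
```

A magnitude of `r` close to one suggests the HI can stand in for capacity when capacity cannot be measured online. A sign flip only means the HI moves in the opposite direction.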

In summary, constructing a dependable network to estimate the HI and RUL of LIBs in the presence of strong long-term dependencies and dynamic degradation properties (including the AD and CRP) as well as constructing highly reliable indirect HIs remain challenging.

Key contributions

To address the aforementioned issues, this study developed an adaptive deep learning method for the RUL prediction of LIBs. In the proposed approach, an improved LTSA with an optimal neighbor domain is first employed to automatically extract the HI from battery monitoring data to be used as an input for the prediction model. Subsequently, an adaptive self-attention long short-term memory (SA-LSTM) NN with a self-tuning mechanism is constructed to conduct long-term predictions for future HI

Structure of proposed method

To provide an accurate and robust RUL prediction for LIBs, a systematic data-driven RUL prediction framework based on an adaptive SA-LSTM NN was proposed; in this framework, the indirect HI yielded by the optimized LTSA is used as an input. The prediction framework, as shown in Fig. 1, comprises two stages: modeling and adaptive RUL prediction.

As shown in the modeling stage in Fig. 1, offline data of LIBs are necessary to determine the key parameters of the LTSA and establish the SA-LSTM model.
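The idea of tuning weights and biases online during the prediction stage can be sketched with a much simpler model than the SA-LSTM. The example below is an assumption-laden stand-in, not the authors' mechanism: a linear one-step predictor takes one gradient step on the squared error each time a new HI measurement arrives. The class name, learning rate, window length, and the synthetic HI series are all hypothetical.

```python
from typing import List


class OnlineLinearPredictor:
    """One-step predictor y_hat = w . window + b, tuned online by gradient steps."""

    def __init__(self, window: int, lr: float = 0.05) -> None:
        # Start from "repeat the last value": weight 1 on the newest sample.
        self.w = [0.0] * (window - 1) + [1.0]
        self.b = 0.0
        self.lr = lr

    def predict(self, window_vals: List[float]) -> float:
        return sum(w * v for w, v in zip(self.w, window_vals)) + self.b

    def update(self, window_vals: List[float], target: float) -> float:
        """Take one gradient step on the squared error.

        Returns the prediction error measured before the update.
        """
        err = self.predict(window_vals) - target
        self.w = [w - self.lr * err * v for w, v in zip(self.w, window_vals)]
        self.b -= self.lr * err
        return err


# Synthetic, linearly decaying HI used only to exercise the update rule.
hi = [1.0 - 0.01 * t for t in range(60)]
model = OnlineLinearPredictor(window=3)
errors = []
for t in range(3, len(hi)):
    # A new measurement hi[t] arrives; correct the model before moving on.
    errors.append(model.update(hi[t - 3:t], hi[t]))
```

On this synthetic series the error before each update shrinks as measurements accumulate, which is the behavior a self-tuning mechanism aims for. A real battery series with regeneration spikes would call for a more careful choice of step size.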

Experiment and analysis

The effectiveness and advantages of the proposed algorithm were verified based on comparisons with general feature extraction methods and prediction algorithms. Simulations were programmed in Python 3.8 and run on a graphics processing unit (GPU) server with an Intel Xeon Silver 4214 processor (16.5 MB cache, up to 3.60 GHz) and an NVIDIA RTX 2080Ti graphics card (11 GB).

Conclusion

The accurate RUL prediction of LIBs remains challenging owing to long-term dependencies and abrupt fluctuations in the degradation process. Hence, this study proposed an adaptive SA-LSTM prediction model. The main contributions of this study are as follows.

1) An optimized LTSA algorithm based on the MC algorithm was proposed to extract a representative HI that can precisely describe the performance degeneration of LIBs based on only measurement data.

2) A novel prediction model was established by

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to thank the reviewers and editor for their comments and suggestions, which improved the paper significantly.

References (47)

S. Behera et al., Multiscale deep bidirectional gated recurrent neural networks based prognostic method for complex non-linear degradation systems, Inf. Sci. (2021)
Y.X. Yang, A machine-learning prediction method of lithium-ion battery life based on charge process for different applications, Appl. Energy (2021)
J.C. Shi et al., Battery health management using physics-informed machine learning: online degradation modeling and remaining useful life prediction, Mech. Syst. Sig. Process. (2022)
Y. Zhou et al., Remaining useful life prediction with probability distribution for lithium-ion batteries based on edge and cloud collaborative computation, J. Storage Mater. (2021)
H. Feng et al., A health indicator extraction based on surface temperature for lithium-ion batteries remaining useful life prediction, J. Storage Mater. (2021)
L. Sánchez et al., A design methodology for semi-physical fuzzy models applied to the dynamic characterization of LiFePO4 batteries, Appl. Soft Comput. (2014)
L. Yang et al., Supervisory long-term prediction of state of available power for lithium-ion batteries in electric vehicles, Appl. Energy (2020)
W.J. Pan et al., A health indicator extraction and optimization for capacity estimation of li-ion battery using incremental capacity curves, J. Storage Mater. (2021)
P. Tao et al., Predicting time series by data-driven spatiotemporal information transformation, Inf. Sci. (2023)
S. Fan et al., Multi-attention deep neural network fusing character and word embedding for clinical and biomedical concept extraction, Inf. Sci. (2022)
X. Shi et al., Multi-models and dual-sampling periods quality prediction with time-dimensional K-means and state transition-LSTM network, Inf. Sci. (2021)
G. Cheng et al., Remaining useful life and state of health prediction for lithium batteries based on empirical mode decomposition and a long and short memory neural network, Energy (2021)
M. Wei et al., Remaining useful life prediction of lithium-ion batteries based on Monte Carlo Dropout and gated recurrent unit, Energy Rep. (2021)
X. Li et al., An online dual filters RUL prediction method of lithium-ion battery based on unscented particle filter and least squares support vector machine, Measurement (2021)
Y.F. Ji et al., An RUL prediction approach for lithium-ion battery based on SADE-MESN, Appl. Soft Comput. (2021)
J. Yu, State of health prediction of lithium-ion batteries: Multiscale logic regression and gaussian process regression ensemble, Reliab. Eng. Syst. Saf. (2018)
P. Ding et al., Useful life prediction based on wavelet packet decomposition and two-dimensional convolutional neural network for lithium-ion batteries, Renew. Sustain. Energy Rev. (2021)
W. Li et al., Bidirectional LSTM with self-attention mechanism and multi-channel features for sentiment classification, Neurocomputing (2020)
D. Kwak et al., Self-attention based deep direct recurrent reinforcement learning with hybrid loss for trading signal generation, Inf. Sci. (2023)
X. Yu et al., Novel hybrid multi-head self-attention and multifractal algorithm for non-stationary time series prediction, Inf. Sci. (2022)
J.Z. Kong et al., Voltage-temperature health feature extraction to improve prognostics and health management of lithium-ion batteries, Energy (2021)
R. Espinosa et al., Multi-surrogate assisted multi-objective evolutionary algorithms for feature selection in regression and classification problems with time series data, Inf. Sci. (2023)
Y. Wang et al., Deep learning for fault-relevant feature extraction and fault classification with stacked supervised auto-encoder, J. Process Control (2020)

