A deep transfer learning method based on stacked autoencoder for cross-domain fault diagnosis

https://doi.org/10.1016/j.amc.2021.126318

Highlights

  • A cross-domain fault diagnosis method based on a transferred stacked autoencoder is proposed.

  • Some typical transfer strategies are discussed for better understanding.

  • Extensive experiments verified the effectiveness and superiority of the proposed method.

Abstract

In actual industrial processes, the distributions of historical training data and online testing data often differ because of operating mode switches and changes in climate conditions. In such cases, traditional data-driven fault diagnosis methods, which assume that historical training data and online testing data follow the same distribution, suffer degraded performance. Ensuring that a fault diagnosis method remains reliable under this distribution shift is therefore necessary yet challenging. In this paper, a cross-domain fault diagnosis method based on a transferred stacked autoencoder is proposed. Specifically, a stacked autoencoder is first used to extract features from a large amount of source domain data, and the features are classified to establish the source domain model. Then, a small amount of target domain data is introduced to fine-tune the source domain model and achieve domain adaptation. The effectiveness and superiority of the proposed deep transfer method were demonstrated through a wind turbine system experiment and a pump truck experiment. In addition, this paper discusses the number of layers of the stacked autoencoder and the model transfer strategies in detail to help practitioners apply the proposed method.

Introduction

With the development of information and communication technology, industrial systems have grown in scale and industrial processes have become more complex. Even a slight failure may cause the entire industrial system to collapse, increasing unplanned downtime and causing economic losses and even casualties. Therefore, accurately diagnosing faults in industrial plants is very important.

Data-driven fault diagnosis methods are popular among researchers because they require neither a first-principles mechanism model nor empirical knowledge of the industrial system: only historical sensor data are used for modeling. They are also simple, direct and perform well, and have recently been widely studied and applied in both the academic and industrial communities. Statistical analysis is a commonly used data-driven approach. It diagnoses faults based on the invariance of statistics extracted from process data, such as the mean and variance. As multivariate statistical methods (MSM) [1], principal component analysis (PCA) and partial least squares (PLS) extract the main information of the data through dimensionality reduction for fault diagnosis [2], [3], [4], [5], [6], [7], [8], [9]. More recently, machine learning methods, which can extract features from raw data, have drawn much attention in the fault diagnosis community. Among them, dictionary learning is a commonly used and effective method that learns a concise but complete dictionary for the fault diagnosis task [10], [11], [12]. Other machine learning methods have also been introduced to fault diagnosis, for example support vector machine (SVM) [13], [14], [15], [16], logistic regression (LR), decision tree (DT) [17], random forest (RF) [18], [19] and deep neural network (DNN) [20], [21], [22]. These methods transform fault diagnosis problems into machine learning problems and use historical data to train models that perform the diagnosis.

Although the above-mentioned data-driven methods have addressed the fault diagnosis problem to some extent, they mainly rely on the assumption that the training data and testing data follow the same distribution. In actual industrial processes, however, this assumption often cannot be satisfied because of operation mode switches, changes in climate conditions and other reasons. Moreover, acquiring labeled data is relatively expensive at industrial operation sites, so labeled data collected after the working conditions change are usually scarce, which further increases the difficulty of fault diagnosis in systems such as wind turbines. For convenience, we refer to the data collected before the change of working conditions as source domain data and to the data collected after the change as target domain data. In this situation, fault diagnosis of the target plant runs into problems. If a large amount of source domain data (red triangles) and a small amount of target domain data (orange triangles) are used together to train the model, the trained model will mainly fit the source domain data (inside the black circle) and most of the target domain data will be ignored as outliers, as shown in Fig. 1(a). If only a small amount of target domain data is used for training, the model will overfit and cannot be used for fault diagnosis, as shown in Fig. 1(b). Therefore, it is necessary to find a way to use a large amount of source domain data and a small amount of target domain data simultaneously to achieve efficient fault diagnosis.

Transfer learning, which transfers knowledge from the source domain to the target domain, is an effective way to solve cross-domain problems. Through transfer learning, we can obtain the latent knowledge shared by the source domain and the target domain, and thereby extract domain-invariant features that improve performance on the target task. In recent years, several transfer learning methods have been proposed to solve cross-domain problems. Long et al. added Maximum Mean Discrepancy (MMD) constraints to the sparse representation to learn a common dictionary between the source and target domains, thereby reducing the difference between domains and realizing domain adaptation [23]. Ni et al. achieved domain adaptation by smoothly interpolating a series of intermediate domains between the source domain and the target domain to establish inter-domain connections [24]. Shekhar et al. proposed a transfer learning method that reduces the difference between domains by projecting data from the original space to a common latent space, and learns a common dictionary to perform the target task [25], [26]. Huang et al. introduced a transfer dictionary learning method to address process monitoring and fault detection for multimode processes [27].
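To make the notion of "difference between domains" concrete, the following is a minimal sketch of a biased empirical estimate of the squared MMD between two sample sets. It is not taken from any of the cited works: the RBF kernel, the bandwidth of 1.0, the synthetic Gaussian data and all function names are assumptions chosen only for illustration.

```python
import math
import random

def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate: mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))

def sample(rng, n, mean):
    """Hypothetical 2-D process data drawn around `mean` with unit variance."""
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(n)]

rng = random.Random(0)
source = sample(rng, 50, 0.0)        # data before the operating mode changes
same_mode = sample(rng, 50, 0.0)     # new data from the same mode
shifted_mode = sample(rng, 50, 2.0)  # data after a (synthetic) mode switch

mmd_same = mmd_squared(source, same_mode)
mmd_shifted = mmd_squared(source, shifted_mode)
```

In this toy setting `mmd_shifted` should come out clearly larger than `mmd_same`, which is the kind of gap an MMD constraint tries to shrink in the learned representation.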

Generally, the above-mentioned transfer learning methods are based on shallow machine learning methods. In recent years, deep learning has increasingly attracted the attention of academia and industry due to its powerful feature extraction ability. Typically, a deep learning method automatically extracts feature representations in an unsupervised or semi-supervised manner, which greatly reduces the difficulty of feature engineering. In addition, because deep learning is hierarchical, a deep model extracts features in its shallow layers and makes decisions in its deep layers, which makes model-based transfer learning possible. Deep transfer learning is transfer learning based on deep network models. At present, some deep transfer methods have been proposed to deal with cross-domain fault diagnosis problems [28]. Han et al. introduced joint distribution adaptation into a CNN to achieve cross-domain fault diagnosis [29]. Guo et al. proposed a deep convolutional transfer learning method to achieve cross-domain bearing fault diagnosis [30]. Chai et al. proposed the fine-grained adversarial network-based domain adaptation (FANDA) method to address the cross-domain industrial fault diagnosis problem [31].

Although the above transfer learning methods handle inconsistent cross-domain distributions well, much remains to be explored in the field of cross-domain fault diagnosis. Inspired by the knowledge transfer ability of deep transfer learning, this paper proposes a deep transfer learning method to solve the cross-domain fault diagnosis problem. First, a stacked autoencoder is used to extract features from a large amount of source domain data, and the features are classified to establish the source domain network. Then, the source domain network is transferred to the target domain through a model transfer method. Finally, when online testing data arrive, their latent features are extracted by the transferred model and then classified, and industrial fault diagnosis is completed using the mapping between categories and process states. In summary, the main contributions of this paper are as follows:

  • This paper uses an approach based on model transfer for cross-domain fault diagnosis. By reusing the feature extraction structure of the source domain, knowledge is transferred from the source domain to the target domain to solve the problem of cross-domain fault diagnosis.

  • This paper discusses various transfer strategies to select the optimal strategy to improve the accuracy of fault diagnosis.

  • Compared with traditional methods, the proposed method improves the accuracy of fault diagnosis, and the results are interpreted through t-SNE visualization.
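The model transfer idea described above (train on abundant source data, then adapt with a few target samples) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: a single tanh feature layer stands in for the stacked autoencoder, the transfer strategy shown (freeze the feature layer and retrain only the classifier) is just one of the possible strategies, and the synthetic two-class data, layer sizes, learning rate and all names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(model, x):
    """Hidden features (tanh layer) and fault probability (logistic output)."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
         for row, b in zip(model["W1"], model["b1"])]
    p = sigmoid(sum(w * v for w, v in zip(model["w2"], h)) + model["b2"])
    return h, p

def train(model, data, epochs, lr, freeze_hidden=False):
    """Plain SGD on cross-entropy; optionally keep the feature layer fixed."""
    for _ in range(epochs):
        for x, y in data:
            h, p = forward(model, x)
            err = p - y  # gradient of the cross-entropy w.r.t. the output logit
            # Hidden-layer gradients use the output weights before their update.
            grad_h = [err * w * (1.0 - hj * hj) for w, hj in zip(model["w2"], h)]
            for j, hj in enumerate(h):
                model["w2"][j] -= lr * err * hj
            model["b2"] -= lr * err
            if not freeze_hidden:
                for j, g in enumerate(grad_h):
                    for i, xi in enumerate(x):
                        model["W1"][j][i] -= lr * g * xi
                    model["b1"][j] -= lr * g

def accuracy(model, data):
    return sum((forward(model, x)[1] > 0.5) == (y == 1) for x, y in data) / len(data)

def make_domain(rng, n, shift):
    """Two Gaussian classes (0 = normal, 1 = fault); `shift` moves the whole domain."""
    data = []
    for k in range(n):
        y = k % 2
        centre = (2.0 * y - 1.0) + shift
        data.append(([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)], y))
    return data

rng = random.Random(0)
source = make_domain(rng, 400, shift=0.0)        # large labeled source set
target_train = make_domain(rng, 20, shift=1.5)   # few labeled target samples
target_test = make_domain(rng, 200, shift=1.5)

model = {"W1": [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(4)],
         "b1": [0.0] * 4,
         "w2": [rng.uniform(-0.5, 0.5) for _ in range(4)],
         "b2": 0.0}

train(model, source, epochs=20, lr=0.1)          # step 1: source domain model
acc_before = accuracy(model, target_test)

frozen_copy = [row[:] for row in model["W1"]]    # feature layer to be reused
train(model, target_train, epochs=50, lr=0.1, freeze_hidden=True)  # step 2: transfer
acc_after = accuracy(model, target_test)
```

Passing `freeze_hidden=False` in the second call would instead fine-tune every layer on the target data. Whether `acc_after` exceeds `acc_before`, and which strategy works better, depends on the data and settings, which is why comparing transfer strategies is worthwhile.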

The rest of this paper is organized as follows. Section 2 introduces the basics of the autoencoder and the motivation for transfer learning in fault diagnosis. Section 3 describes the proposed deep transfer method in detail. In Section 4, a wind turbine experiment and a pump truck experiment are set up to verify the effectiveness of the proposed method. Finally, Section 5 concludes the paper.

Autoencoder

An autoencoder [32] is a feedforward neural network consisting of an input layer, a hidden layer and an output layer. The input layer and hidden layer constitute the encoding network, and the hidden layer and output layer constitute the decoding network.
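A minimal sketch of such a three-layer autoencoder is given below, showing only the forward pass through the encoding and decoding networks and a reconstruction error. The sigmoid activation, the mean squared error, the layer sizes and the random initial weights are assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(weights, bias, vec):
    """Compute weights @ vec + bias for a list-of-lists weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (an assumed initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def encode(x, encoder):
    """Encoding network: input layer -> hidden layer."""
    weights, bias = encoder
    return [sigmoid(z) for z in affine(weights, bias, x)]

def decode(h, decoder):
    """Decoding network: hidden layer -> output layer."""
    weights, bias = decoder
    return [sigmoid(z) for z in affine(weights, bias, h)]

def reconstruction_error(x, x_hat):
    """Mean squared error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

rng = random.Random(0)
n_input, n_hidden = 6, 3
encoder = init_layer(n_hidden, n_input, rng)
decoder = init_layer(n_input, n_hidden, rng)

x = [0.2, 0.9, 0.4, 0.1, 0.7, 0.5]   # one hypothetical sensor sample scaled to [0, 1]
h = encode(x, encoder)               # hidden-layer representation
x_hat = decode(h, decoder)           # reconstruction of the input
loss = reconstruction_error(x, x_hat)
```

Training would adjust the encoder and decoder weights to reduce `loss`. A stacked autoencoder repeats this construction, feeding the hidden representation of one autoencoder to the next as its input.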

The encoding network maps the input data x from the original space to the latent space to obtain a meaningful latent representation h, from which the decoding network can restore the input data x. The calculation of latent representation h and

Methodology

In this section, we introduce the proposed deep transfer learning fault diagnosis method in detail. Compared with traditional data-driven fault diagnosis methods, this method does not rely on the assumption that training data and testing data follow the same distribution, and it can handle the cross-domain fault diagnosis problem. The overall fault diagnosis process can be divided into three steps. First, the deep neural network model is used to extract the characteristics of the source domain

Illustrative examples

This paper conducts fault diagnosis on industrial wind turbine data and pump truck data. Traditional data-driven methods (Source Domain Model M1, Target Domain Model M2 and SVM) are used for comparison to verify the effectiveness and superiority of the proposed method. For the wind turbine experiment, we discuss the number of network layers and the transfer strategies of the model, and explain the experimental results using the t-SNE nonlinear dimensionality reduction method.

Conclusion

For a typical industrial plant, differences in working conditions cause the distributions of the source data and the target data to differ. Traditional fault diagnosis methods then face two problems. First, the source domain model and the target domain data do not match. Second, there is little labeled data in the target domain, so an effective model cannot be established. In order to solve the above two problems, this paper proposed a cross-domain fault diagnosis

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61860206014, 61803232, 61751312), in part by the Innovation-Driven Plan in Central South University, China (2019CX020), in part by the Natural Science Foundation of Hunan Province (Grant No. 2019JJ50777) and in part by the 111 Project, China (B17048).

References (32)
