Elsevier

Neurocomputing

Volume 320, 3 December 2018, Pages 129-140
A novel supervised sparse feature extraction method and its application on rotating machine fault diagnosis

https://doi.org/10.1016/j.neucom.2018.09.027

Abstract

Intelligent fault diagnosis methods are promising for handling mechanical big data owing to their efficiency in extracting discriminative features automatically. Sparse filtering (SF) is a simple and effective unsupervised feature extraction method that optimizes feature sparsity. However, the sparsity realized by SF is irregular, and the features are not necessarily discriminative for subsequent classification. Hence, a simple and fast supervised feature extraction algorithm called supervised regularized sparse filtering (SRSF) is proposed, which explores a new way to optimize for sparsity. Supervised feature extraction is realized by fusing a novel parameterized sparse label matrix (PSLM) into the feature matrix to regularize the sparsity. A new objective function is developed alongside it, and the two work together to speed up network convergence. In addition, SRSF can identify the frequencies specific to each health condition from the learned weight matrix, which connects the proposed method with traditional signal processing techniques. Furthermore, based on SRSF, a three-stage fault diagnosis network is developed. Experiments on a bearing case and a gearbox case are conducted separately to verify its effectiveness, and comparisons with the state of the art confirm its superiority.

Introduction

With increasing demands for machine operation safety, fault diagnosis and health management have become more and more significant [1]. As a representative category of fault diagnosis methods, data-driven fault diagnosis methods [2] make better use of the big data collected from monitored machines than model-driven methods [3]. Moreover, for large, complicated systems in which it is difficult to locate signal symptoms or establish explicit system models, their remarkable nonlinearity-extracting capability is desirable [4]. Hence, effective data-driven fault diagnosis has recently become a research topic in the diagnosis and health management of rotary machinery systems.

Generally, most data-driven methods share three sequential steps [5], [6], [7]: (1) data acquisition; (2) feature extraction and selection; and (3) feature classification. In the first step, vibration signals are often acquired from the monitored machines and used as source data because they contain the essential information about the equipment [8]. Machine learning has been widely used in fault diagnosis for its powerful nonlinearity-extracting and labor-saving capabilities [9], [10]. In traditional fault diagnosis methods, machine learning is usually adopted in the third step, feature classification, through methods such as Naive Bayes (NB) [11], Support Vector Machines (SVM) [12], logistic regression [13] and softmax regression [14]. Nevertheless, their effectiveness depends mainly on the handcrafted features that must be designed in advance in the second step. Widely used handcrafted feature extraction methods include time-domain analysis [15], frequency transform [16], high-resolution time-frequency analysis [17], wavelet transform [18] and envelope demodulation algorithms [19]. These signal processing-based methods can give a certain interpretation of vibration signals and mine the intrinsic information embedded in them. However, the extracted features, with their given nonlinearity, determine the upper bound on the performance of the subsequent machine learning methods [20]. Moreover, although these signal processing-based methods can extract the principal features of signals, those features are not always discriminative, a property that is critical for feature classification. Therefore, dimension reduction strategies are usually required in the second step to select features for classification.

In the second step of an intelligent fault diagnosis method, machine learning is extended into feature extraction to mine features automatically, which greatly reduces manual effort and, to a certain degree, fuses feature extraction with automatic feature selection. These machine learning methods include autoencoders (AE) [21], restricted Boltzmann machines (RBM) [22], convolutional neural networks (CNN) [23] and sparse filtering (SF) [5], [24]. Recently, deep learning has emerged as a promising area of machine learning and has been successfully applied in many fields [25]. Its training usually involves two essential steps: (1) greedy layer-wise pre-training and (2) global fine-tuning [26], [27]. The first step is commonly realized through unsupervised feature extraction units such as AE, RBM and SF [21], [22], [28]. These unsupervised feature extraction methods can be roughly divided into two categories according to whether or not they reconstruct the raw signals [29]. AE minimizes the reconstruction error by encouraging the hidden layer to learn the principal components of the inputs. For discriminative feature extraction, some methods that aim to extract sparse features have been developed, such as [30] and [31]. Methods designed to extract both discriminative and principal features have also been proposed, such as [32]. SF aims to learn discriminative features rather than the principal components of the inputs by optimizing three kinds of sparsity of the feature matrix [5], [24], in which each column is the feature vector of one input and each row holds the same feature across all inputs. SF is simple and fast to compute, and has been successfully applied to tasks such as pattern recognition and image classification [24], [28]. Lei et al. [5] introduced it into machine fault diagnosis and constructed a diagnosis network based on it; experiments on the well-known bearing dataset validated its effectiveness. The sparse filtering used in that work was the standard one.

Yang et al. [33] used SF to extract features from raw vibration signals and then classified them using the Improved Particle Swarm SVM (IPSO-SVM). Qian et al. [34] proposed L1 norm-regularized SF (L1SF) to constrain the sparsity of the weight matrix and constructed a fault diagnosis method based on it. Although SF performs well in optimizing for sparsity, the feature matrix sparsity and the weight matrix obtained by the methods described above are always irregular. For fault diagnosis, this means that the extracted discriminative features can cluster by load or rotating speed instead of by health condition. Worse, although the weight matrix can extract sparsely activated features, the activated features are arranged irregularly in the feature vector of each sample. The absence of label information in feature extraction accounts, to a certain degree, for the defects mentioned above.
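
As a point of reference for the discussion above, the following is a minimal sketch of the standard sparse filtering cost as it is commonly described in the literature [24]. It is not the SRSF objective proposed in this paper. The function name, the `eps` constant and the random data are assumptions made for illustration only.

```python
import numpy as np

def sparse_filtering_objective(W, X, eps=1e-8):
    """Standard sparse filtering cost (illustrative; not the SRSF objective).

    W: (n_features, n_inputs) weight matrix.
    X: (n_inputs, n_samples) data matrix, one sample per column.
    """
    # Soft absolute value keeps the activation differentiable at zero.
    F = np.sqrt((W @ X) ** 2 + eps)
    # Normalize each row (one feature across all samples) to unit L2 norm.
    F_row = F / np.linalg.norm(F, axis=1, keepdims=True)
    # Normalize each column (one sample across all features) to unit L2 norm.
    F_col = F_row / np.linalg.norm(F_row, axis=0, keepdims=True)
    # The cost is the L1 norm of the normalized feature matrix.
    return F_col.sum(), F_col

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))   # 50 hypothetical samples of length 20
W = rng.standard_normal((8, 20))    # 8 hypothetical features
cost, F_hat = sparse_filtering_objective(W, X)
```

In this layout each column of `F_hat` is the feature vector of one input and each row holds one feature across all inputs, matching the description of the feature matrix above. Minimizing `cost` over `W` is normally left to a gradient-based optimizer, which is omitted here.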

To address the shortcomings mentioned above, we develop a novel supervised feature extraction method called supervised regularized sparse filtering (SRSF), which optimizes and regularizes the sparsity of the feature matrix by constructing a parameterized sparse label matrix (PSLM) and adding it to the feature matrix. The label matrix adds almost no computational cost, and the results show that SRSF optimizes for sparsity more regularly and extracts better features than SF. The main contributions are described as follows.

  • (1) To introduce label information into feature extraction, we develop the parameterized sparse label matrix (PSLM). It controls the designated activated subregions of the feature matrix through a parameter.

  • (2) Based on PSLM, the supervised regularized sparse filtering (SRSF) is developed, which optimizes the sparsity of the feature matrix. Compared with SF, it achieves a sparser and more regular feature matrix via PSLM and converges faster. Furthermore, SRSF can extract and list the frequencies specific to each health condition through the optimized weight matrix, which builds a link between SRSF and the widely used signal processing techniques.

  • (3) Based on SRSF, a three-stage intelligent fault diagnosis network is constructed. In the first stage, samples are processed into segments in a convolutional way. In the second stage, features are extracted by SRSF. In the last stage, softmax regression is used to diagnose the health conditions.

The rest of this paper is organized as follows. Section 2 details the proposed SRSF and constructs a fault diagnosis network based on it. In Section 3, the proposed method is evaluated in experiments on a bearing case and a gearbox case. In Section 4, the properties of the weight matrix learned by SRSF are identified and analyzed. Finally, Section 5 draws the conclusions.

The new supervised feature extraction method

Generally, unsupervised feature learning algorithms can be classified into two main categories [18]: those that model the input distribution explicitly and those that do not. ICA [29], AE and RBM reconstruct the signal structure and learn the input distribution explicitly. Although it is desirable to learn a good approximation of the input distribution, algorithms such as sparse filtering (SF) demonstrate that this is not so important if the goal is to obtain discriminative features. The fundamental idea of SF is

Data description

The dataset provided by Case Western Reserve University [37] is adopted to verify the validity of the constructed network. The signals were all acquired from the drive end of the motor through a tri-axial accelerometer under four health conditions: (1) normal condition (NO); (2) inner race fault (IF); (3) outer race fault (OF) and (4) roller fault (RF). Vibration signals at three severity levels (0.18, 0.36 and 0.54 mm) were collected separately for the OF, IF and RF health conditions.

Discussion

Artificial neural networks (ANNs) are often regarded as black boxes, so it is important to investigate the connection between signal processing-based feature extraction methods and ANN-based methods. Some interesting properties of the weight matrices learned by the proposed method are observed and investigated in the following. The label matrix used in the following takes the non-overlapping shape.
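
One simple way to relate a learned weight matrix to the frequency domain is to treat each row as a filter and inspect its spectrum. The sketch below does this for synthetic rows; it is an illustrative assumption, not necessarily the analysis carried out in this paper. The function name, the sampling rate `fs` and the synthetic weights are hypothetical.

```python
import numpy as np

def dominant_frequencies(W, fs):
    """Return the peak frequency (Hz) of each row of W, treated as a filter."""
    n = W.shape[1]
    spectrum = np.abs(np.fft.rfft(W, axis=1))
    spectrum[:, 0] = 0.0                      # ignore the DC component
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs[spectrum.argmax(axis=1)]

fs = 12_000.0                                  # hypothetical sampling rate (Hz)
t = np.arange(240) / fs
# Two synthetic "weight rows": pure tones at 500 Hz and 1500 Hz.
W = np.stack([np.sin(2 * np.pi * 500 * t), np.sin(2 * np.pi * 1500 * t)])
peaks = dominant_frequencies(W, fs)
```

Here `peaks` recovers the two tone frequencies because both fall exactly on FFT bins (the resolution is `fs / 240 = 50` Hz). Rows of a real trained weight matrix would generally show broader spectra, so the peak alone is only a coarse summary.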

As interpreted in Ref. [5], [24], the rows of the weight matrix

Conclusions

To extract regular and discriminative features, a supervised feature extraction method called supervised regularized sparse filtering (SRSF) is proposed, which optimizes feature sparsity according to the label information. To introduce the label information, the parameterized sparse label matrix (PSLM) is developed. Based on SRSF, a three-stage rotating machine fault diagnosis network is constructed. Through the case studies on two datasets, it is verified that SRSF can

Acknowledgments

This research was supported by the National Natural Science Foundation of China (51675262), the Project of National Key Research and Development Plan of China (2016YFD0700800) and the Advance research field fund project of China (6140210020102).

Weiwei Qian received the B.S. degrees from Jiangsu University of Science and Technology (JUST), Zhenjiang, China, in 2012 and 2016. He is now a Ph.D. candidate with the College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics (NUAA), Nanjing, China. His current research interests include rotating machinery fault diagnosis, mechanical signal and information processing, and machine learning.

References (44)

  • K.B. Raja et al., Smartphone based visible iris recognition using deep sparse filtering, Pattern Recognit. Lett. (2015)
  • C. Zhao et al., A sparse dissimilarity analysis algorithm for incipient fault isolation with no priori fault information, Control Eng. Pract. (2017)
  • X. Zhang et al., Multi-fault diagnosis for rolling element bearings based on ensemble empirical mode decomposition and optimized support vector machines, Mech. Syst. Signal Process. (2013)
  • L. Saidi et al., Application of higher order spectral features and support vector machines for bearing faults classification, ISA Trans. (2015)
  • Z. Xu et al., A selective fuzzy ARTMAP ensemble and its application to the fault diagnosis of rolling element bearing, Neurocomputing (2016)
  • N. Baydar et al., A comparative study of acoustic and vibration signals in detection of gear failures using Wigner-Ville distributions, Mech. Syst. Signal Process. (2001)
  • X.W. Dai et al., From model, signal to knowledge: A data-driven perspective of fault detection and diagnosis, IEEE Trans. Ind. Inf. (2013)
  • Q. Liu et al., Data-driven and model-driven tension estimation and fault diagnosis of cold rolling continuous annealing processes, J. Soc. Mater. Sci. Jpn. (2011)
  • Z.W. Gao et al., A survey of fault diagnosis and fault-tolerant techniques—Part II: Fault diagnosis with knowledge-based and hybrid/active approaches, IEEE Trans. Ind. Electron. (2015)
  • Y. Lei, F. Jia, J. Lin, et al., An Intelligent fault diagnosis method using unsupervised feature learning towards...
  • M. Kang et al., Reliable Fault diagnosis for low-speed bearings using individually trained support vector machines with kernel discriminative feature analysis, IEEE Trans. Power Electron. (2015)
  • H. Shao et al., Intelligent fault diagnosis of rolling bearing using deep wavelet auto-encoder with extreme learning machine, Knowl. Based Syst. (2018)
Shunming Li received the Ph.D. degree in mechanics from Xi'an Jiaotong University, China, in 1988. He is a Professor at Nanjing University of Aeronautics and Astronautics (NUAA), Nanjing, China. His current research interests include noise and vibration analysis and control, signal processing, machine fault diagnosis, sensing and measurement technology, and intelligent vehicles.

Jinrui Wang received the B.S. and M.S. degrees from Shandong University of Science and Technology (SDUST), Tsingdao, China, in 2013 and 2015. He is now a Ph.D. candidate with the College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics (NUAA), Nanjing, China. His current research interests include rotating machinery fault diagnosis and mechanical signal and information processing.

Qijun Wu received the Ph.D. degree in mechanics from Nanjing University of Science and Technology, China, in 2011. He is now an engineer at the China Ship Development and Design Center, Wuhan, China.
