Industrial data classification using stochastic configuration networks with self-attention learning features

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

Industrial data contain substantial noise that deep learning models struggle to suppress. Current industrial data classification models suffer from incomplete features, inadequate self-adaptability, insufficient classifier approximation capacity, and weak robustness. To address these problems, this paper proposes an intelligent classification method based on self-attention learning features and stochastic configuration networks (SCNs). The method imitates the human cognitive mode of regulating feedback to achieve ensemble learning. First, at the feature extraction stage, a fused deep neural network model based on self-attention is constructed. It adopts a self-attention long short-term memory (LSTM) network and a self-attention residual network with adaptive hierarchies, which extract, after noise suppression, the global temporal fault features and the local spatial fault features of the industrial time-series dataset, respectively. Second, at the classifier design stage, the fused complete feature vectors are fed to SCNs, which possess universal approximation capability, to establish general classification criteria. Then, based on generalized error and entropy theory, performance indexes are established for real-time evaluation of the credibility of uncertain classification results, and an adaptive adjustment mechanism for the hierarchy of the self-attention fusion networks is built to realize self-optimization of the multi-hierarchy complete features and their classification criteria. Finally, a fuzzy integral is used to integrate the classification results of self-attention fusion network models with different hierarchies, improving the robustness of the classification model. Compared with other classification models, the proposed model performs better on a rolling bearing fault dataset.


Data availibility

The datasets analysed during the current study are available in the following public domain resource: [https://github.com/yyxyz/CaseWesternReserveUniversityData]

Abbreviations

\({\varvec{x}}_t\) :

Input of the time-series sample at the moment t

\({\varvec{c}}_t\) :

Cellular memory state at the moment t in LSTM cell structure

\({\varvec{h}}_t\) :

Hidden state output at the moment t in LSTM cell structure

\({\varvec{i}}_t\) :

Input gate at the moment t in LSTM cell structure

\({\varvec{f}}_t\) :

Forgetting gate at the moment t in LSTM cell structure

\({\varvec{o}}_t\) :

Output gate at the moment t in LSTM cell structure

\(\tilde{{\varvec{c}}}_t\) :

Intermediate candidate vector of tanh layer in LSTM cell structure

\({\varvec{x}}^{q=1}\) :

Time-series dataset input to the self-attention LSTM network (\(q=1\))

\({\varvec{h}}^{q=1}\) :

Hidden layer output of the self-attention LSTM network (\(q=1\))

\(\varvec{\varphi }_1^{q=1}\) :

Global contextual feature of the self-attention LSTM network (\(q=1\))

\(\varvec{\varphi }_2^{q=1}\) :

Intermediate feature by Sigmoid function of the self-attention LSTM network (\(q=1\))

\(\varvec{\tau }^{q=1}\) :

Time-scale threshold for \({\varvec{h}}^{q=1}\) of the self-attention LSTM network (\(q=1\))

\({\varvec{h}}_{\text{filter}}^{q=1}\) :

Filtered feature vector output for \({\varvec{h}}^{q=1}\)

\({\varvec{H}}^{q=1}\) :

Output of the self-attention LSTM network (\(q=1\))

\({\varvec{f}}^{q=1}\) :

Feature map output after two convolutions of the self-attention residual network (\(q=1\))

\({\varvec{f}}_1^{q=1}\) :

Global contextual feature of the self-attention residual network (\(q=1\))

\({\varvec{f}}_2^{q=1}\) :

Intermediate feature by Sigmoid function of the self-attention residual network (\(q=1\))

\(\varvec{\varepsilon }^{q=1}\) :

Channel threshold for \({\varvec{f}}^{q=1}\) of the self-attention residual network (\(q=1\))

\({\varvec{f}}_{\text{filter}}^{q=1}\) :

Filtered feature vector output for \({\varvec{f}}^{q=1}\)

\({\varvec{Y}}^{q=1}\) :

Output of self-attention residual network (\(q=1\))

\(\varvec{\lambda }_{L-1}\) :

Output of the \((L-1){\text{th}}\) hidden node in SCNs

\({\varvec{Z}}\) :

Fused feature vector of fusion deep network with self-attention for N samples

k :

Dimension of the fused feature vector \({\varvec{Z}}_j,j\in [1,N]\)

\(L_{\text{max}}\) :

Maximum hidden node number of SCNs

\({\varvec{w}}_j\) :

Input weight of the \(j{\text{th}}\) hidden node of SCNs

\({\varvec{b}}_j\) :

Bias of the \(j{\text{th}}\) hidden node of SCNs

p :

Number of fault categories

\(\varvec{\beta }_j\) :

Output weight matrix of the \(j{\text{th}}\) hidden node of SCNs

\(g_j(\cdot )\) :

Activation function of the \(j{\text{th}}\) hidden node of SCNs

\({\varvec{e}}_{L-1}({\varvec{Z}})\) :

Residual error of SCNs with \(L-1\) hidden nodes for \({\varvec{Z}}\)

\({\varvec{g}}_L({\varvec{Z}})\) :

Activation output of the \(L{\text{th}}\) hidden node of SCNs for \({\varvec{Z}}\)

\({\varvec{G}}_L\) :

Output matrix of the \(L{\text{th}}\) hidden layer of SCNs

\(\varvec{\xi }_{L,a}\) :

Inequality constraint variables for hidden parameters of SCNs

\(\varvec{\beta }^*\) :

Output weight matrix for L hidden nodes based on the least square method

\({\varvec{G}}_L^{\dag }\) :

Moore–Penrose generalized inverse of matrix \({\varvec{G}}_L\)

U :

Training time-series dataset of rolling bearing

\(M_q\) :

Self-attention fusion network model with hierarchy \(q\)

\({\varvec{Z}}_j^i\) :

Fusion feature of the \(j{\text{th}}\) sample via \(M_q\)

\(\tilde{{\varvec{Z}}}_j^i\) :

Fusion latent semantic feature for \({\varvec{Z}}_j^i\)

X :

Any sample in U

\(\tilde{{\varvec{C}}}\) :

Fusion latent semantic feature of X

\(E_i\) :

Fusion latent semantic error entropy of X and \(U_i\)

\({\varvec{S}}\) :

Covariance matrix of \(\left[ \tilde{{\varvec{C}}}; \tilde{{\varvec{Z}}}^{i}\right] ^{\mathrm {T}}\)

E :

Fusion latent semantic error entropy of X and U

m :

Feedback number

\(q_0\) :

Initial network hierarchy of the self-attention fusion deep network

\(q_{\text{max}}\) :

Maximum of the adaptive adjustment of the network hierarchy

thres:

Error threshold of SCNs

\(\mu _{\text{max}}\) :

Iteration maximum of network training

num:

Sample number of U

\(\gamma \) :

Sample credibility threshold

V :

Trusted sample dataset

A :

Fusion network model set

T :

Intermediate training dataset

v :

Fuzzy measure of fusion deep network model \(A_i\)

\(X'\) :

Testing time-series dataset of rolling bearing

\(\sigma \) :

Fuzzy integral of testing sample
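To make the LSTM notation in the nomenclature above concrete, the cell update that ties together \({\varvec{x}}_t\), \({\varvec{i}}_t\), \({\varvec{f}}_t\), \({\varvec{o}}_t\), \(\tilde{{\varvec{c}}}_t\), \({\varvec{c}}_t\) and \({\varvec{h}}_t\) can be sketched in NumPy. This is a generic standard-LSTM step under assumed parameter shapes, not the authors' network; the weight names `Wi`, `Ui`, etc. are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, params):
    """One standard LSTM step using the symbols listed above:
    gates i_t, f_t, o_t; candidate c~_t; memory c_t; hidden h_t."""
    Wi, Ui, bi, Wf, Uf, bf, Wo, Uo, bo, Wc, Uc, bc = params
    i_t = sigmoid(Wi @ x_t + Ui @ h_prev + bi)       # input gate
    f_t = sigmoid(Wf @ x_t + Uf @ h_prev + bf)       # forgetting gate
    o_t = sigmoid(Wo @ x_t + Uo @ h_prev + bo)       # output gate
    c_tilde = np.tanh(Wc @ x_t + Uc @ h_prev + bc)   # candidate vector of tanh layer
    c_t = f_t * c_prev + i_t * c_tilde               # cellular memory state
    h_t = o_t * np.tanh(c_t)                         # hidden state output
    return h_t, c_t
```

Iterating this cell over a time-series sample yields the hidden sequence \({\varvec{h}}^{q=1}\) that the self-attention layer then filters.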


Acknowledgements

This work is supported in part by grants from the National Natural Science Foundation of China (62173120, 52077049, 51877060), National Key R & D Program of China under Grant No. (2018AAA0100304), Anhui Provincial Natural Science Foundation (2008085UD04, 2108085UD07, 2108085UD11), and 111 Project No. (BP0719039).

Author information


Corresponding author

Correspondence to Dianhui Wang.

Ethics declarations

Conflict of interest statement

The authors declare that they have no conflict of interest with respect to this work, and no commercial or associative interest that represents a conflict of interest in connection with the work submitted.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Li, W., Deng, Y., Ding, M. et al. Industrial data classification using stochastic configuration networks with self-attention learning features. Neural Comput & Applic 34, 22047–22069 (2022). https://doi.org/10.1007/s00521-022-07657-9

