Abstract
With the rapid development of Internet of Things and artificial intelligence technologies, the massive data generated by mechanical equipment has driven fault diagnosis technology into the “big data” era. Analyzing these data to diagnose and predict faults has become crucial for ensuring the smooth and safe operation of mechanical equipment. In recent years, traditional convolutional neural networks (CNNs) and Transformer-based models have been widely used in industrial fault diagnosis, but they often suffer from high model complexity and stringent hardware requirements. To address these issues, this paper proposes a new lightweight fault diagnosis framework, Sim-ConvFormer, which integrates SimAM and External Attention. SimAM (A Simple, Parameter-Free Attention Module for Convolutional Neural Networks) enhances the model’s sensitivity to fine-grained signal variations, capturing locally significant features at different scales. Unlike self-attention, External Attention enhances generalization by computing the affinity between input features and two external memory modules shared across the dataset, thereby capturing global contextual information. Combining the two mechanisms preserves their individual advantages and enhances the model’s robustness and accuracy in handling various faults. Experiments on three different mechanical systems demonstrate Sim-ConvFormer’s superior fault diagnosis performance compared with existing Transformer-based and CNN-based methods, particularly in terms of model lightness and diagnostic robustness. These results indicate that Sim-ConvFormer is an effective fault diagnosis framework suitable for deployment in resource-constrained industrial environments.
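The two attention mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It follows the published formulations of SimAM (an energy-based, parameter-free weighting) and External Attention (two memory matrices with double normalization), adapted here to 1-D signals. The function names, tensor shapes, the regularizer `lam`, the number of memory slots, and the random memories are assumptions made for illustration; in practice the memories would be learned, and the abstract does not specify how Sim-ConvFormer arranges these blocks, so this should not be read as the authors' implementation.

```python
import numpy as np


def simam(x, lam=1e-4):
    """Parameter-free SimAM weighting for a feature map of shape (channels, length).

    `lam` is a small regularizer (illustrative default).
    """
    n = x.shape[-1] - 1
    # Squared deviation of every position from its channel mean.
    d = (x - x.mean(axis=-1, keepdims=True)) ** 2
    # Per-channel variance estimate.
    v = d.sum(axis=-1, keepdims=True) / n
    # Inverse energy: positions that stand out from their channel score higher.
    e_inv = d / (4.0 * (v + lam)) + 0.5
    # Sigmoid gate applied element-wise to the input.
    return x * (1.0 / (1.0 + np.exp(-e_inv)))


def external_attention(f, m_k, m_v, eps=1e-9):
    """External attention with two memories.

    f:   (tokens, dim) input features
    m_k: (slots, dim) key memory, shared across all samples
    m_v: (slots, dim) value memory, shared across all samples
    """
    attn = f @ m_k.T                                  # (tokens, slots) affinities
    # Double normalization: softmax over tokens, then L1 norm over slots.
    attn = np.exp(attn - attn.max(axis=0, keepdims=True))
    attn = attn / attn.sum(axis=0, keepdims=True)
    attn = attn / (attn.sum(axis=1, keepdims=True) + eps)
    return attn @ m_v                                 # (tokens, dim)


# Toy example: 8 channels, 64 time steps, 16 memory slots (all hypothetical sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 64))
m_k = rng.standard_normal((16, 8))   # random stand-in for a learned memory
m_v = rng.standard_normal((16, 8))   # random stand-in for a learned memory

y = simam(x)                          # local, parameter-free re-weighting
z = external_attention(y.T, m_k, m_v) # global context, one token per time step
```

Because the memories have a fixed number of slots, the cost of external attention grows linearly with the number of tokens rather than quadratically as in self-attention, and SimAM adds no trainable parameters, which is consistent with the lightweight goal stated above.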
Data availability
No datasets were generated or analyzed during the current study.
Acknowledgments
This project is supported by the Shaanxi Province Key R&D Program Project (2024GX-YBXM-507), the Special Scientific Research Project of Shaanxi Provincial Education Department (22JK0508), and the Innovation and Practical Ability Cultivation Program for Postgraduates of Xi’an Shiyou University (YCS23214255).
Author information
Contributions
All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Gao, J., Guo, Y. & Gao, G. Sim-ConvFormer: a lightweight fault diagnosis framework incorporating SimAM and external attention. J Supercomput 81, 603 (2025). https://doi.org/10.1007/s11227-025-07075-3
DOI: https://doi.org/10.1007/s11227-025-07075-3