A novel multi-scale and sparsity auto-encoder for classification

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

The motivation for multi-scale feature representation comes from the basic observation that multi-scale processing is closely related to the physiological characteristics of human vision. In addition, as the number of hidden-layer neurons and the amount of data grow, redundant information increases, and the resulting computational load makes a model more complex. This paper proposes a novel learning method, the multi-scale feature consistency regularization and L21-norm minimization sparse auto-encoder (LR21-MSAE). The multi-scale feature consistency regularization captures latent representations and visual details while retaining multi-scale information, so that LR21-MSAE obtains valid information for better classification accuracy. With the L21-norm minimization constraint, LR21-MSAE adaptively eliminates potential noise and redundant neurons by driving some rows and columns of the weight matrix to zero. This reduces the complexity of the learning model and promotes the learning of sparse features, yielding a compact network. Moreover, using the Wasserstein distance in the sparse auto-encoder to measure the difference between two distributions allows for a more stable training process and faster convergence. We evaluate LR21-MSAE on the publicly available datasets MNIST, Fashion-MNIST, CIFAR-10, USPS, ISOLET, Pendigits, and Ecoli. The experimental results demonstrate the advantages of LR21-MSAE over state-of-the-art feature extraction methods.
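To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common definition of the L21-norm as the sum of the Euclidean norms of the rows of a weight matrix, and it computes the 1-Wasserstein distance only for the simple case of two one-dimensional samples of equal size. The function names `l21_norm`, `zero_rows`, and `wasserstein_1d` are hypothetical and chosen for illustration; how the paper combines these terms in its loss is not stated in the abstract.

```python
import math


def l21_norm(weights):
    """L21-norm under the assumed definition: sum of row-wise Euclidean norms."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in weights)


def zero_rows(weights, tol=1e-12):
    """Indices of rows whose Euclidean norm is (numerically) zero.

    Such rows correspond to connections that a row-sparse penalty has
    switched off, so they could be pruned from the network.
    """
    return [
        i
        for i, row in enumerate(weights)
        if math.sqrt(sum(w * w for w in row)) <= tol
    ]


def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-size 1-D empirical samples.

    For equal-size samples this is the mean absolute difference between
    the sorted values (an assumption-limited special case).
    """
    if len(a) != len(b):
        raise ValueError("this sketch only handles samples of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Hypothetical 3x2 weight matrix whose middle row is entirely zero.
W = [[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]]
print(l21_norm(W))   # row norms are 5, 0 and 10
print(zero_rows(W))  # only the middle row is zero
print(wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

Because the penalty adds up whole-row norms rather than individual entries, minimizing it tends to shrink entire rows together, which is what allows redundant neurons to be removed as a group.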

Acknowledgements

This work was partly supported by the National Natural Science Foundation of China (project numbers 61673194, 61672263, 61672265, 62076110, and 61673193), the Natural Science Foundation of Jiangsu Province (project number BK20181341), and the national first-class discipline program of Light Industry Technology and Engineering (project number LITE2018-25).

Author information

Corresponding author

Correspondence to Jun Sun.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, H., Sun, J., Gu, X. et al. A novel multi-scale and sparsity auto-encoder for classification. Int. J. Mach. Learn. & Cyber. 13, 3909–3925 (2022). https://doi.org/10.1007/s13042-022-01632-5

  • DOI: https://doi.org/10.1007/s13042-022-01632-5
