Two novel ELM-based stacking deep models focused on image recognition

Song, Gang; Dai, Qun; Han, Xiaomeng; Guo, Lin

doi:10.1007/s10489-019-01584-4

Published: 15 January 2020

Applied Intelligence, Volume 50, pages 1345–1366 (2020)

Abstract

Extreme learning machine (ELM) and its variants are widely used in object recognition and other complex classification tasks. Traditional deep learning architectures such as the Convolutional Neural Network (CNN) can extract high-level features, which are key to making correct decisions. However, training these architectures requires solving a difficult non-convex optimization problem, which is time-consuming. In this paper, we propose two hierarchical models, Random Recursive Constrained ELM (R²CELM) and Random Recursive Local-Receptive-Fields-Based ELM (R²ELM-LRF), which are constructed by stacking CELM and ELM-LRF learners, respectively. Inspired by the philosophy of stacked generalization, both models also incorporate random projection and kernelization as constitutive elements. R²CELM and R²ELM-LRF not only inherit the merits of ELM but also exploit the strengths of CELM and ELM-LRF, respectively, in image recognition. The essence of CELM is to constrain the weight vectors from the input layer to the hidden layer to be consistent with the directions from one class to another, while ELM-LRF exploits the local structure of images through many local receptive fields. In experiments on six benchmark image recognition datasets, R²CELM and R²ELM-LRF achieve higher testing accuracy than their base learners and other state-of-the-art algorithms. Moreover, the two proposed deep ELM models need less training time than traditional Deep Neural Network (DNN) based models.
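The ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the full text is not available here, so several details are assumptions: the hidden weights are unit-length difference vectors between randomly chosen samples of different classes, each bias places the hyperplane at the midpoint of the pair, the output weights come from the standard ridge-regularized ELM least-squares solution, and the stacking step adds a random projection of each layer's output to the original input. Kernelization is omitted. All function names and the parameters `n_hidden`, `C`, and `alpha` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def celm_hidden_layer(X, y, n_hidden, rng):
    """Hidden weights taken from differences between samples of different classes."""
    n, d = X.shape
    W = np.empty((n_hidden, d))
    b = np.empty(n_hidden)
    k = 0
    while k < n_hidden:
        i, j = rng.integers(0, n, size=2)
        diff = X[i] - X[j]
        norm = np.linalg.norm(diff)
        if y[i] == y[j] or norm < 1e-12:
            continue  # need two distinct samples from different classes
        W[k] = diff / norm                    # direction from one class to another
        b[k] = -W[k] @ (X[i] + X[j]) / 2.0    # assumption: hyperplane through the midpoint
        k += 1
    return W, b


def fit_celm(X, y, n_hidden, C, rng):
    """One CELM-style learner: fixed hidden layer, closed-form output weights."""
    W, b = celm_hidden_layer(X, y, n_hidden, rng)
    H = sigmoid(X @ W.T + b)
    T = np.eye(int(y.max()) + 1)[y]           # one-hot targets
    # Ridge-regularized least squares for the output weights (no gradient descent).
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta


def predict_celm(model, X):
    W, b, beta = model
    return sigmoid(X @ W.T + b) @ beta


def fit_stack(X, y, n_layers=3, n_hidden=20, C=100.0, alpha=0.1, seed=0):
    """Stack learners; each layer sees the input shifted by projected earlier outputs."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    layers, shift, Xl = [], np.zeros_like(X), X
    for _ in range(n_layers):
        model = fit_celm(Xl, y, n_hidden, C, rng)
        P = rng.standard_normal((n_classes, X.shape[1]))  # random projection
        layers.append((model, P))
        shift = shift + alpha * predict_celm(model, Xl) @ P
        Xl = X + shift
    return {"alpha": alpha, "layers": layers}


def predict_stack(stack, X):
    shift, Xl, out = np.zeros_like(X), X, None
    for model, P in stack["layers"]:
        out = predict_celm(model, Xl)
        shift = shift + stack["alpha"] * out @ P
        Xl = X + shift
    return out  # class scores from the last layer


# Toy data: two well-separated Gaussian blobs (illustrative only).
data_rng = np.random.default_rng(0)
centers = np.array([[-2.0, -2.0], [2.0, 2.0]])


def make_blobs(n):
    y = data_rng.integers(0, 2, size=n)
    return centers[y] + 0.5 * data_rng.standard_normal((n, 2)), y


X_train, y_train = make_blobs(200)
X_test, y_test = make_blobs(100)
stack = fit_stack(X_train, y_train)
scores = predict_stack(stack, X_test)
acc = float(np.mean(scores.argmax(axis=1) == y_test))
print(f"test accuracy: {acc:.3f}")
```

Because each layer is fitted with a single linear solve, training cost grows with the number of hidden nodes and layers rather than with the number of gradient iterations, which is consistent with the training-time advantage the abstract reports.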

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 61473150.

Author information

Authors and Affiliations

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
Gang Song, Qun Dai, Xiaomeng Han & Lin Guo

Corresponding author

Correspondence to Qun Dai.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Song, G., Dai, Q., Han, X. et al. Two novel ELM-based stacking deep models focused on image recognition. Appl Intell 50, 1345–1366 (2020). https://doi.org/10.1007/s10489-019-01584-4

Issue Date: May 2020
