Abstract
Convolutional Neural Networks (CNNs) are established as a powerful class of models for image classification and related tasks. However, the fully connected layers of a CNN are not robust enough as a classifier for discriminating deep convolutional features, owing to the local minima problem of back-propagation. Kernel Extreme Learning Machines (KELMs), known as strong classifiers, not only converge extremely fast but also deliver good generalization performance. In this paper, we propose a novel image classification framework that integrates a CNN with a KELM. A densely connected network (DenseNet) is employed as the feature extractor, while a radial basis function (RBF) kernel ELM, instead of a linear fully connected layer, is adopted as the classifier to discriminate the categories of the extracted features and thereby improve image classification performance. Experiments conducted on four publicly available datasets demonstrate the promising performance of the proposed framework against state-of-the-art methods.
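To illustrate the classifier stage described above, the following is a minimal sketch of an RBF-kernel ELM trained in closed form. It is not the authors' implementation. It assumes the commonly stated KELM solution, in which the output weights solve (K + I/C) beta = T for a kernel matrix K and one-vs-rest targets T in {-1, +1}. The class name `KernelELM`, the parameters `C` and `gamma`, and the synthetic feature vectors (standing in for features extracted by DenseNet) are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix with entries exp(-gamma * ||a_i - b_j||^2)."""
    sq_dists = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


class KernelELM:
    """Hypothetical minimal kernel ELM classifier (illustrative sketch)."""

    def __init__(self, C=100.0, gamma=0.05):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One-vs-rest targets: +1 for the true class, -1 otherwise.
        T = np.where(y[:, None] == self.classes_[None, :], 1.0, -1.0)
        K = rbf_kernel(X, X, self.gamma)
        # Closed-form solve; no iterative back-propagation is involved.
        self.beta_ = np.linalg.solve(K + np.eye(len(X)) / self.C, T)
        self.X_train_ = X
        return self

    def predict(self, X):
        scores = rbf_kernel(X, self.X_train_, self.gamma) @ self.beta_
        return self.classes_[scores.argmax(axis=1)]


# Synthetic stand-in for deep features: two well-separated clusters.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(-3.0, 1.0, size=(20, 8)),
    rng.normal(+3.0, 1.0, size=(20, 8)),
])
labels = np.array([0] * 20 + [1] * 20)

model = KernelELM().fit(features, labels)
train_accuracy = (model.predict(features) == labels).mean()
```

In a pipeline of the kind the abstract describes, `features` would instead hold the activations taken from the network before its fully connected layer, and `C` and `gamma` would be chosen by validation.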
Acknowledgements
This work was supported by the National Key R&D Program of China (2017YFB1401000) and the National Natural Science Foundation of China (61501457, 61602517).
Cite this article
Zhu, X., Li, Z., Zhang, XY. et al. Deep convolutional representations and kernel extreme learning machines for image classification. Multimed Tools Appl 78, 29271–29290 (2019). https://doi.org/10.1007/s11042-018-6781-z
DOI: https://doi.org/10.1007/s11042-018-6781-z