
Deep convolutional representations and kernel extreme learning machines for image classification

Multimedia Tools and Applications

Abstract

Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image classification and related tasks. However, the fully connected layers of a CNN are not robust enough to serve as the classifier that discriminates deep convolutional features, owing to the local-minima problem of back-propagation. Kernel Extreme Learning Machines (KELMs), known as outstanding classifiers, not only converge extremely fast but also offer excellent generalization performance. In this paper, we propose a novel image classification framework in which a CNN and a KELM are tightly integrated. A densely connected convolutional network (DenseNet) is employed as the feature extractor, while a radial basis function (RBF) kernel ELM, rather than a linear fully connected layer, is adopted as the classifier to discriminate the categories of the extracted features and thus improve classification performance. Experiments conducted on four publicly available datasets demonstrate the promising performance of the proposed framework against state-of-the-art methods.
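The abstract describes a two-stage pipeline: a fixed DenseNet backbone produces deep convolutional features, and an RBF-kernel ELM trained in closed form replaces the linear fully connected layer as the classifier. The sketch below illustrates that idea only; it is not the authors' code, and the backbone choice (torchvision's DenseNet-121), the pooling step, and the hyper-parameters C and gamma are assumptions made for illustration.

```python
import numpy as np
import torch
import torch.nn.functional as F
from torchvision import models

# Assumed backbone: torchvision DenseNet-121 used as a frozen feature extractor.
backbone = models.densenet121(pretrained=True)
backbone.eval()

@torch.no_grad()
def extract_features(images):
    """images: float tensor (N, 3, 224, 224), ImageNet-normalized."""
    fmap = backbone.features(images)         # (N, 1024, 7, 7)
    fmap = F.relu(fmap)
    pooled = F.adaptive_avg_pool2d(fmap, 1)  # global average pooling -> (N, 1024, 1, 1)
    return pooled.flatten(1).cpu().numpy()   # (N, 1024) feature vectors

def rbf_kernel(A, B, gamma=1e-3):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d2)

class KernelELM:
    """Kernel ELM: beta = (I/C + K)^(-1) T, prediction f(x) = K(x, X) beta."""
    def __init__(self, C=100.0, gamma=1e-3):
        self.C, self.gamma = C, gamma

    def fit(self, X, y, n_classes):
        self.X = X
        T = np.eye(n_classes)[y]                     # one-hot target matrix (N, n_classes)
        K = rbf_kernel(X, X, self.gamma)
        self.beta = np.linalg.solve(np.eye(len(X)) / self.C + K, T)
        return self

    def predict(self, Xq):
        return np.argmax(rbf_kernel(Xq, self.X, self.gamma) @ self.beta, axis=1)
```

Training the KELM reduces to solving a single regularized linear system, which is what gives the classifier its fast, deterministic training compared with back-propagating through a fully connected layer.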




Acknowledgements

This work was supported by the National Key R&D Program of China (2017YFB1401000) and the National Natural Science Foundation of China (61501457, 61602517).

Author information

Correspondence to Xiao-Yu Zhang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhu, X., Li, Z., Zhang, XY. et al. Deep convolutional representations and kernel extreme learning machines for image classification. Multimed Tools Appl 78, 29271–29290 (2019). https://doi.org/10.1007/s11042-018-6781-z

