RGB-D object recognition based on the joint deep random kernel convolution and ELM

Yin, Yunhua; Li, Huifang

doi:10.1007/s12652-018-1067-x

Original Research
Published: 28 September 2018

Volume 11, pages 4337–4346, (2020)

Journal of Ambient Intelligence and Humanized Computing

Yunhua Yin¹ & Huifang Li¹


Abstract

RGB-D object recognition is a challenging and important task in computer vision. Convolutional neural networks (CNNs) are currently a popular choice for feature extraction, but they are usually applied to the RGB and depth modalities separately, which fails to fully exploit the complementary information between modalities. Conventional CNN training also relies on many iterations of gradient descent, which converges slowly and can get stuck in local minima. To address these problems, we propose a Joint Deep Random Kernel Convolution and ELM (JDRKC-ELM) method for object recognition, which integrates the feature-extraction power of CNNs with the fast training of the ELM auto-encoder (ELM-AE). JDRKC-ELM learns feature representations directly from raw RGB-D data. In this structure, a Random Kernel Convolutional Neural Network (RKCNN) extracts lower-level features from the RGB and depth modalities separately. A feature fusion layer then combines the features from the two modalities and feeds the fused features to a Double-Layer ELM-AE (DLELM-AE), which produces higher-level features. Finally, the resulting feature representations are passed to a standard ELM for object classification. We evaluate JDRKC-ELM on the RGB-D Object Dataset. The results show that the proposed method achieves high recognition accuracy and good generalization performance in comparison with deep learning methods and other ELM methods.
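
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the full text is not available here, so the layer sizes, the `tanh` activation, the average pooling, the ridge parameter `C`, the single-channel toy images, and all function names (`conv_pool`, `elm_ae_features`, `elm_fit`, `elm_predict`) are assumptions chosen only to show the data flow (random convolution per modality, fusion, two ELM-AE layers, ELM classifier). The orthogonalization of random weights that ELM-AE commonly uses is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_pool(images, kernels, pool=2):
    """Convolve with fixed (untrained) random kernels, then average-pool."""
    n = images.shape[0]
    f, k, _ = kernels.shape
    # All k x k patches: shape (n, h-k+1, w-k+1, k, k)
    patches = np.lib.stride_tricks.sliding_window_view(images, (k, k), axis=(1, 2))
    maps = np.einsum("nijab,fab->nfij", patches, kernels)
    # Crop so each map divides evenly into pool x pool blocks
    oh = (maps.shape[2] // pool) * pool
    ow = (maps.shape[3] // pool) * pool
    blocks = maps[:, :, :oh, :ow].reshape(n, f, oh // pool, pool, ow // pool, pool)
    return blocks.mean(axis=(3, 5)).reshape(n, -1)

def ridge_solve(H, T, C):
    """Closed-form output weights: (H^T H + I/C)^-1 H^T T. No gradient descent."""
    return np.linalg.solve(H.T @ H + np.eye(H.shape[1]) / C, H.T @ T)

def elm_ae_features(X, n_hidden, C=1e3):
    """One ELM-AE layer: learn beta that reconstructs X, then project X with beta^T."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    beta = ridge_solve(np.tanh(X @ W + b), X, C)   # shape (n_hidden, n_features)
    return np.tanh(X @ beta.T)

def elm_fit(X, y, n_hidden, n_classes, C=1e3):
    """Standard ELM classifier: random hidden layer, solved output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    beta = ridge_solve(np.tanh(X @ W + b), np.eye(n_classes)[y], C)
    return W, b, beta

def elm_predict(X, model):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Toy stand-in for RGB-D data: 40 single-channel 8x8 "RGB" and "depth" images, 2 classes.
y = np.repeat([0, 1], 20)
rgb = rng.standard_normal((40, 8, 8)) + y[:, None, None]
depth = rng.standard_normal((40, 8, 8)) - y[:, None, None]

# Separate random kernels per modality (4 kernels of size 3x3 each).
rgb_feat = conv_pool(rgb, rng.standard_normal((4, 3, 3)))
depth_feat = conv_pool(depth, rng.standard_normal((4, 3, 3)))

fused = np.concatenate([rgb_feat, depth_feat], axis=1)  # feature fusion layer
h1 = elm_ae_features(fused, 32)                         # first ELM-AE layer
h2 = elm_ae_features(h1, 16)                            # second ELM-AE layer
model = elm_fit(h2, y, n_hidden=20, n_classes=2)        # final ELM classifier
pred = elm_predict(h2, model)
print("training accuracy on toy data:", (pred == y).mean())
```

Every weight that is learned in this sketch comes from a single regularized least-squares solve, which is the property the abstract contrasts with iterative gradient-descent training.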

Acknowledgements

This work is funded by the National Natural Science Foundation of China (Grant No. 61402368). The authors thank all the reviewers for their very helpful comments, which improved the paper.

Author information

Authors and Affiliations

School of Electronics and Information, Northwestern Polytechnical University, Xi’an, 710129, China
Yunhua Yin & Huifang Li

Corresponding author

Correspondence to Yunhua Yin.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yin, Y., Li, H. RGB-D object recognition based on the joint deep random kernel convolution and ELM. J Ambient Intell Human Comput 11, 4337–4346 (2020). https://doi.org/10.1007/s12652-018-1067-x

Received: 13 July 2018
Accepted: 22 September 2018
Published: 28 September 2018
Issue Date: November 2020
DOI: https://doi.org/10.1007/s12652-018-1067-x
