Multi-view CSPMPR-ELM feature learning and classifying for RGB-D object recognition

Cluster Computing

Abstract

To exploit the feature information in RGB-D images, popular current algorithms mainly use a convolutional neural network (CNN) for both feature extraction and classification. Such methods can achieve impressive results, but usually at the cost of a very large and complex network. Moreover, because the fully connected layers of a CNN form a classical neural network classifier trained by gradient descent, its generalization ability is limited and sub-optimal. To address these problems, this paper introduces a multi-view CNN-SPMP-RNN-ELM (MCSPMPR-ELM) model for RGB-D object recognition, which combines the power of MCSPMPR with the fast training of the extreme learning machine (ELM). The model uses the MCSPMPR algorithm to extract discriminative features from raw RGB images and depth images separately. The extracted features are then fed to a nonlinear ELM classifier, which yields better generalization performance and faster learning. Finally, co-training is employed to learn from unlabeled data in a semi-supervised manner, using the two distinct feature sets. Experimental results on widely used RGB-D object datasets show that the method achieves competitive performance compared with other state-of-the-art algorithms specifically designed for RGB-D data.
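As an illustration of why an ELM classifier trains quickly, the following is a minimal sketch of a generic single-hidden-layer ELM, not the authors' implementation. The hidden weights are random and fixed, and only the output weights are computed, in closed form, by regularized least squares. The class name `ELMClassifier`, the tanh activation, the hidden-layer size, the regularization value, and the synthetic data standing in for extracted features are all assumptions made for this example; the abstract does not specify them.

```python
import numpy as np


class ELMClassifier:
    """Single-hidden-layer ELM: random fixed hidden layer, closed-form output weights."""

    def __init__(self, n_hidden=64, reg=1e-2, seed=0):
        self.n_hidden = n_hidden  # number of random hidden units (assumed value)
        self.reg = reg            # ridge regularization strength (assumed value)
        self.seed = seed

    def _hidden(self, X):
        # Nonlinear random feature map; tanh is an assumed activation.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.classes_ = np.unique(y)
        # Hidden weights are drawn once and never trained.
        self.W = rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Output weights from regularized least squares:
        # beta = (H^T H + reg * I)^{-1} H^T T
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


# Synthetic stand-in for extracted feature vectors: two well-separated classes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(+2.0, 1.0, (100, 8)), rng.normal(-2.0, 1.0, (100, 8))])
y = np.array([0] * 100 + [1] * 100)
idx = rng.permutation(200)
train, test = idx[:150], idx[150:]

clf = ELMClassifier().fit(X[train], y[train])
accuracy = float(np.mean(clf.predict(X[test]) == y[test]))
print(f"held-out accuracy: {accuracy:.2f}")
```

Training here reduces to one linear solve rather than iterative gradient descent, which is the property the abstract refers to as fast learning. In a two-view setting, one such classifier per feature set (RGB and depth) could plausibly serve as the pair of learners in a co-training loop, though that arrangement is an assumption and is not taken from the paper.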

Acknowledgements

This work is funded by the National Natural Science Foundation of China (Grant No. 61402368) and the Science and Technology on Transient Impact Laboratory Foundation (Grant No. 61426060103162606007). The authors thank the anonymous reviewers for their helpful comments, which improved the paper.

Author information

Corresponding author

Correspondence to Yunhua Yin.

About this article

Cite this article

Yin, Y., Li, H. Multi-view CSPMPR-ELM feature learning and classifying for RGB-D object recognition. Cluster Comput 22 (Suppl 4), 8181–8191 (2019). https://doi.org/10.1007/s10586-018-1695-0

  • DOI: https://doi.org/10.1007/s10586-018-1695-0
