Robotic grasping recognition using multi-modal deep extreme learning machine

Multidimensional Systems and Signal Processing

Abstract

Recognizing which parts of an object are graspable is important for an intelligent robot performing complicated tasks. Good grasping performance depends on efficiently learning rich representations from multi-modal RGB-D images. To address this problem, we propose a multi-modal deep extreme learning machine (ELM) structure. In this structure, an unsupervised hierarchical ELM extracts features from the RGB and depth modalities separately. A shared layer then combines the RGB and depth features. Finally, an ELM serves as the supervised classifier that makes the final decision. Experimental validation on the Cornell grasping dataset shows that the proposed multi-modality fusion method achieves better grasp recognition performance.
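
The three stages named in the abstract (per-modality unsupervised feature extraction, a shared layer over the combined features, and a supervised ELM classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single ELM autoencoder per modality, the tanh activation, the ridge regularization value, all layer sizes, the helper names, and the synthetic stand-in data are assumptions chosen only to make the data flow concrete.

```python
import numpy as np

def ridge_solve(H, T, reg):
    # Closed-form regularized least squares: argmin ||H B - T||^2 + reg ||B||^2
    return np.linalg.solve(H.T @ H + reg * np.eye(H.shape[1]), H.T @ T)

def elm_autoencoder(X, n_hidden, rng, reg=1e-2):
    # Unsupervised ELM layer: random hidden weights, output weights trained
    # to reconstruct the input. Returns beta with shape (n_hidden, n_features).
    d = X.shape[1]
    W = rng.standard_normal((d, n_hidden)) / np.sqrt(d)
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    return ridge_solve(H, X, reg)

def encode(X, beta):
    # Project inputs through the transposed reconstruction weights.
    return np.tanh(X @ beta.T)

def elm_fit(X, y, n_hidden, n_classes, rng, reg=1e-2):
    # Supervised ELM: random hidden layer, ridge regression onto one-hot targets.
    d = X.shape[1]
    W = rng.standard_normal((d, n_hidden)) / np.sqrt(d)
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    T = np.eye(n_classes)[y]
    return W, b, ridge_solve(H, T, reg)

def elm_predict(X, model):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

rng = np.random.default_rng(0)

# Synthetic stand-ins for RGB and depth descriptors (hypothetical sizes).
n = 200
y = rng.integers(0, 2, size=n)  # 1 = graspable, 0 = not graspable
rgb = rng.standard_normal((n, 30)) + y[:, None]
depth = rng.standard_normal((n, 20)) + y[:, None]

# Stage 1: unsupervised feature extraction, one branch per modality.
f_rgb = encode(rgb, elm_autoencoder(rgb, 16, rng))
f_depth = encode(depth, elm_autoencoder(depth, 16, rng))

# Stage 2: shared layer learned on the concatenated modality features.
joint = np.hstack([f_rgb, f_depth])
shared = encode(joint, elm_autoencoder(joint, 24, rng))

# Stage 3: supervised ELM classifier on the shared representation.
model = elm_fit(shared, y, 64, 2, rng)
pred = elm_predict(shared, model)
print("training accuracy:", float(np.mean(pred == y)))
```

Every trainable weight matrix here comes from a single closed-form ridge solve rather than from iterative backpropagation, which is the property that makes ELM-style training fast. A hierarchical variant would stack several autoencoder layers per modality before the shared layer.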

Acknowledgments

This work was supported in part by the National Key Project for Basic Research of China under Grant 2013CB329403; in part by the National High-tech Research and Development Plan under Grant 2015AA042306; in part by the National Natural Science Foundation of China under Grants 61210013 and 61450011; and in part by the Tsinghua University Initiative Scientific Research Program under Grant 20131089295.

Author information

Correspondence to Huaping Liu or Gaowei Yan.

About this article

Cite this article

Wei, J., Liu, H., Yan, G. et al. Robotic grasping recognition using multi-modal deep extreme learning machine. Multidim Syst Sign Process 28, 817–833 (2017). https://doi.org/10.1007/s11045-016-0389-0

  • DOI: https://doi.org/10.1007/s11045-016-0389-0
