An Empirical Analysis of Deep Feature Learning for RGB-D Object Recognition

Caglayan, Ali; Can, Ahmet Burak

doi:10.1007/978-3-319-59876-5_35

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10317)

Included in the following conference series:

International Conference Image Analysis and Recognition

Abstract

Conventional deep feature learning methods use the same model parameters for both the RGB and depth domains in RGB-D object recognition. Since the characteristics of RGB and depth data differ, the suitability of such approaches is questionable. In this paper, we empirically investigate the effects of different model parameters on the RGB and depth domains using the Washington RGB-D Object Dataset. We explore the effects of different filter learning approaches, rectifier functions, pooling methods, and classifiers for RGB and depth data separately. We find that RGB and depth data are each best served by their own individually chosen model parameters.
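
As a rough illustration of what choosing model parameters per modality could look like, the sketch below applies a selectable rectifier and pooling method to a small feature map. The abstract does not list the specific rectifiers or pooling methods that were compared, so the options here (ReLU, leaky ReLU, absolute value; max and average pooling), the helper names, and the per-modality settings in `CONFIG` are all assumptions made for illustration. They are not the paper's reported configuration or results.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]

# Candidate rectifier functions (assumed options, not taken from the abstract).
def relu(x: float) -> float:
    return max(0.0, x)

def leaky_relu(x: float, slope: float = 0.01) -> float:
    return x if x > 0 else slope * x

def abs_rect(x: float) -> float:
    return abs(x)

def mean(values: List[float]) -> float:
    return sum(values) / len(values)

RECTIFIERS: Dict[str, Callable[[float], float]] = {
    "relu": relu,
    "leaky_relu": leaky_relu,
    "abs": abs_rect,
}
POOLERS: Dict[str, Callable[[List[float]], float]] = {
    "max": max,
    "average": mean,
}

def pool(fmap: Matrix, size: int, reduce: Callable[[List[float]], float]) -> Matrix:
    """Non-overlapping size x size pooling; incomplete border windows are dropped."""
    rows, cols = len(fmap) // size, len(fmap[0]) // size
    out: Matrix = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [fmap[r * size + i][c * size + j]
                      for i in range(size) for j in range(size)]
            out_row.append(reduce(window))
        out.append(out_row)
    return out

def extract(fmap: Matrix, rectifier: str, pooler: str, size: int = 2) -> Matrix:
    """Apply the named rectifier element-wise, then the named pooling method."""
    rectified = [[RECTIFIERS[rectifier](v) for v in row] for row in fmap]
    return pool(rectified, size, POOLERS[pooler])

# Hypothetical per-modality settings: placeholders, not findings from the paper.
CONFIG = {
    "rgb": {"rectifier": "relu", "pooler": "max"},
    "depth": {"rectifier": "abs", "pooler": "average"},
}

if __name__ == "__main__":
    filter_response = [[1.0, -2.0], [3.0, -4.0]]
    for modality, params in CONFIG.items():
        print(modality, extract(filter_response, **params))
```

Keeping the settings in a per-modality table makes it straightforward to evaluate each combination on RGB and depth data independently, which is the kind of comparison the abstract describes.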

Notes

1. In fact, there are 207,920 images in total, but 258 of them do not have an object mask.

Author information

Authors and Affiliations

Department of Computer Engineering, Hacettepe University, Ankara, Turkey
Ali Caglayan & Ahmet Burak Can

Corresponding author

Correspondence to Ali Caglayan.

Editor information

Editors and Affiliations

University of Waterloo, Waterloo, Ontario, Canada
Fakhri Karray
University of Porto, Porto, Portugal
Aurélio Campilho
Polytechnique Montréal, Montréal, Québec, Canada
Farida Cheriet

About this paper

Cite this paper

Caglayan, A., Can, A.B. (2017). An Empirical Analysis of Deep Feature Learning for RGB-D Object Recognition. In: Karray, F., Campilho, A., Cheriet, F. (eds) Image Analysis and Recognition. ICIAR 2017. Lecture Notes in Computer Science, vol 10317. Springer, Cham. https://doi.org/10.1007/978-3-319-59876-5_35

DOI: https://doi.org/10.1007/978-3-319-59876-5_35
Published: 02 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59875-8
Online ISBN: 978-3-319-59876-5
eBook Packages: Computer Science, Computer Science (R0)