Abstract
In computer vision, object recognition has gained a lot of attention due to its numerous practical uses. For real-world applications, better recognition requires handling conditions such as object images captured from multiple viewpoints, changes in illumination, and objects located at different distances from the camera. In this work, a new CVPR34K RGB-D dataset is proposed, consisting of RGB-D images acquired at different distances from the camera. A distance-invariant RGB-D object recognition system is introduced using Depth Estimation, Scale data with Unit Depth, and a Multimodal convolutional neural network with SVM (DSMS). The proposed DSMS system is divided into three parts. First, Depth Estimation detects the distance of the acquired RGB-D object image from the camera. The second stage consists of several preprocessing operations that normalize the input RGB-D data with respect to a reference distance. The final stage learns features from the normalized RGB and depth images and performs RGB-D object recognition. The experimental results show that the DSMS method achieves performance comparable to state-of-the-art methods on the RGB-D object dataset. The effectiveness of our method is clearly observed when the distance of the RGB-D object image changes in the proposed CVPR34K dataset.
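The first two stages can be illustrated with a minimal sketch. The abstract does not give the exact operations, so everything below is an assumption made for illustration: the distance is taken as the median of the valid depth pixels, the image is rescaled by the ratio of that distance to a reference distance (under a pinhole camera model, apparent size is inversely proportional to distance), and depth values are divided so that the object sits at the reference depth. The function names, the nearest-neighbour resize, and the default `ref_distance=1.0` are hypothetical and are not taken from the paper.

```python
import numpy as np


def estimate_distance(depth):
    """Assumed estimator: median of the valid (non-zero) depth pixels."""
    valid = depth > 0
    if not valid.any():
        raise ValueError("no valid depth pixels")
    return float(np.median(depth[valid]))


def resize_nearest(img, scale):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    h, w = img.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Map each output pixel back to its source row and column.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[rows[:, None], cols]


def normalize_to_reference(rgb, depth, ref_distance=1.0):
    """Rescale an RGB-D pair as if the object were at ref_distance.

    Illustrative assumption, not the paper's stated procedure.
    """
    d = estimate_distance(depth)
    scale = d / ref_distance  # a farther object looks smaller, so enlarge it
    rgb_n = resize_nearest(rgb, scale)
    depth_n = resize_nearest(depth, scale) * (ref_distance / d)
    return rgb_n, depth_n, d


# Usage: an object captured at twice the reference distance.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
depth = np.full((4, 6), 2.0)
rgb_n, depth_n, d = normalize_to_reference(rgb, depth, ref_distance=1.0)
```

In a full pipeline the normalized RGB and depth images would then be passed to the feature-learning and classification stage. A production version would more likely use a smoother interpolation method than nearest-neighbour.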
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Patekar, R., Nandedkar, A. (2020). Distance Invariant RGB-D Object Recognition Using DSMS System. In: Bhattacharjee, A., Borgohain, S., Soni, B., Verma, G., Gao, XZ. (eds) Machine Learning, Image Processing, Network Security and Data Sciences. MIND 2020. Communications in Computer and Information Science, vol 1240. Springer, Singapore. https://doi.org/10.1007/978-981-15-6315-7_11
DOI: https://doi.org/10.1007/978-981-15-6315-7_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-6314-0
Online ISBN: 978-981-15-6315-7
eBook Packages: Computer Science, Computer Science (R0)