3D Object Categorization in Cluttered Scene Using Deep Belief Network Architectures

Nature-Inspired Computation in Data Mining and Machine Learning

Part of the book series: Studies in Computational Intelligence (SCI, volume 855)

Abstract

3D object classification in cluttered scenes is a critical area of computer vision and robotics research, enabling autonomous robots to act in their surroundings. In this chapter, we extend our previous work [51] by classifying 3D object categories in real-world scenes. We extract geometric features from 3D point clouds using a global 3D descriptor, the Viewpoint Feature Histogram (VFH), and then learn the extracted features with Deep Belief Networks (DBNs). We then compare Discriminative and Generative DBN architectures (DDBN/GDBN) for object categorization. Experiments on the Washington RGB-D dataset demonstrate the robustness of the discriminative architecture, which outperforms state-of-the-art methods. We also evaluate our approach on real-world objects segmented from cluttered indoor scenes.
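The feature-learning step described above can be illustrated with a minimal sketch of greedy layer-wise DBN pretraining. This is not the authors' implementation. It assumes Bernoulli-Bernoulli restricted Boltzmann machines trained with one step of contrastive divergence (CD-1), covers only unsupervised pretraining (no generative or discriminative fine-tuning), and uses random 32-bin histograms as a hypothetical stand-in for real VFH descriptors (PCL's VFH signature has 308 bins). The names `RBM`, `pretrain_dbn`, and `normalize`, and all layer sizes and hyperparameters, are illustrative choices.

```python
import math
import random


def sigmoid(x):
    # Logistic function; the argument is clamped to avoid overflow in exp().
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Bernoulli-Bernoulli restricted Boltzmann machine trained with CD-1."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        # Small random weights; W[i][j] links visible unit i to hidden unit j.
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j]
                                        for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j]
                                        for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def cd1_update(self, v0, lr):
        # Positive phase: hidden probabilities given the data vector.
        h0 = self.hidden_probs(v0)
        # Sample binary hidden states, then reconstruct the visible layer.
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        # Negative phase: hidden probabilities given the reconstruction.
        h1 = self.hidden_probs(v1)
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(self.n_visible):
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(self.n_hidden):
            self.c[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, layer_sizes, epochs=5, lr=0.1, seed=0):
    """Greedy layer-wise pretraining: each RBM is trained on the layer below."""
    rng = random.Random(seed)
    rbms, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v, lr)
        rbms.append(rbm)
        # The hidden activations become the training data for the next RBM.
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return rbms, layer_input


def normalize(hist):
    # Scale a non-negative histogram into [0, 1] so it can feed Bernoulli units.
    top = max(hist) or 1.0
    return [x / top for x in hist]


# Hypothetical stand-in for VFH descriptors: 20 random 32-bin histograms.
_rng = random.Random(1)
histograms = [normalize([_rng.random() for _ in range(32)]) for _ in range(20)]
rbms, features = pretrain_dbn(histograms, layer_sizes=[16, 8])
```

Here `features` holds one 8-dimensional top-layer representation per input histogram. In a full pipeline, a classifier or a fine-tuning stage would operate on these representations, and real-valued descriptors might call for Gaussian visible units instead of the Bernoulli units assumed here.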

References

  1. Alexandre, L.A.: 3D object recognition using convolutional neural networks with transfer learning between input channels. In: Intelligent Autonomous Systems 13, pp. 889–898. Springer, Berlin (2016)

  2. Azevedo, F.A.C., Carvalho, L.R.B., Grinberg, L.T., Farfel, J.M., Ferretti, R.E., Leite, R.E.P., Lent, R., Herculano-Houzel, S., et al.: Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. J. Compar. Neurol. 513(5), 532–541 (2009)

  3. Basu, J.K., Bhattacharyya, D., Kim, T.: Use of artificial neural network in pattern recognition. Int. J. Softw. Eng. Appl. 4(2) (2010)

  4. Bengio, Y., Chapados, N., Delalleau, O., Larochelle, H., Saint-Mleux, X., Hudon, C., Louradour, J.: Detonation classification from acoustic signature with the restricted Boltzmann machine. Comput. Intell. 28(2), 261–288 (2012)

  5. Bo, L., Ren, X., Fox, D.: Depth kernel descriptors for object recognition. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 821–826. IEEE (2011)

  6. Bobkov, B., Chen, S., Jian, R., Iqbal, Z., Steinbach, E.: Noise-resistant deep learning for object classification in 3D point clouds using a point pair descriptor. IEEE Robot. Autom. Lett. (2018)

  7. Carreira-Perpinan, M.A., Hinton, G.E.: On contrastive divergence learning. In: Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, pp. 33–40

  8. Deng, L.: A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Trans. Signal Inform. Process. 3, e2 (2014)

  9. Eitel, A., Springenberg, J.T., Spinello, L., Riedmiller, M., Burgard, W.: Multimodal deep learning for robust RGB-D object recognition. In: Intelligent Robots and Systems (IROS), pp. 681–687. IEEE (2015)

  10. Fischer, A., Igel, C.: Training restricted Boltzmann machines: an introduction. Patt. Recogn. 47(1), 25–39 (2014)

  11. Gomez-Donoso, F., Garcia-Garcia, A., Garcia-Rodriguez, J., Orts-Escolano, S., Cazorla, M.: Lonchanet: a sliced-based CNN architecture for real-time 3D object recognition. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 412–418. IEEE (2017)

  12. Hegde, V., Zadeh, R.: Fusionnet: 3D object classification using multiple data representations. arXiv preprint arXiv:1607.05695 (2016)

  13. Hinton, G.E.: A practical guide to training restricted Boltzmann machines. In: Neural Networks: Tricks of the Trade, pp. 599–619. Springer, Berlin (2012)

  14. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)

  15. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

  16. Janoch, A., Karayev, S., Jia, Y., Barron, J.T., Fritz, M., Saenko, K., Darrell, T.: A category-level 3D object dataset: putting the kinect to work. In: Consumer Depth Cameras for Computer Vision, pp. 141–165. Springer, Berlin

  17. Keronen, S., Cho, K., Raiko, T., Ilin, A., Palomäki, K.: Gaussian-Bernoulli restricted Boltzmann machines and automatic feature extraction for noise robust missing data mask estimation. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6729–6733. IEEE (2013)

  18. Keyvanrad, M.A., Homayounpour, M.M.: Deep belief network training improvement using elite samples minimizing free energy. arXiv preprint arXiv:1411.4046 (2014)

  19. Lai, K., Bo, L., Ren, X., Fox, D.: A large-scale hierarchical multi-view RGB-D object dataset. In: 2011 IEEE International Conference on Robotics and Automation (ICRA), pp. 1817–1824. IEEE (2011)

  20. Larochelle, H., Bengio, Y.: Classification using discriminative restricted Boltzmann machines. In: Proceedings of the 25th International Conference on Machine Learning, pp. 536–543. ACM (2008)

  21. LeCun, Y., Huang, F.J., Bottou, L.: Learning methods for generic object recognition with invariance to pose and lighting. In: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), vol. 2, pp. II–97. IEEE (2004)

  22. Liu, Y., Zhou, S., Chen, Q.: Discriminative deep belief networks for visual data classification. Patt. Recogn. 44(10), 2287–2296 (2011)

  23. Loghmani, M.R., Planamente, M., Caputo, B., Vincze, M.: Recurrent convolutional fusion for RGB-D object recognition. arXiv preprint arXiv:1806.01673 (2018)

  24. Madai-Tahy, L., Otte, S., Hanten, R., Zell, A.: Revisiting deep convolutional neural networks for RGB-D based object recognition. In: International Conference on Artificial Neural Networks, pp. 29–37. Springer, Berlin (2016)

  25. Madry, M., Ek, C.H., Detry, R., Hang, K., Kragic, D.: Improving generalization for 3D object categorization with global structure histograms. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1379–1386. IEEE (2012)

  26. Maturana, D., Scherer, S.: Voxnet: a 3D convolutional neural network for real-time object recognition. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 922–928. IEEE (2015)

  27. McCann, S., Lowe, D.G.: Local Naive Bayes nearest neighbor for image classification. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3650–3656. IEEE (2012)

  28. Mian, A., Bennamoun, M., Owens, R.: On the repeatability and quality of keypoints for local feature-based 3D object retrieval from cluttered scenes. Int. J. Comput. Vis. 89(2–3), 348–361 (2010)

  29. Ouadiay, F.Z., Zrira, N., Bouyakhf, E.H., Majid Himmi, M.: 3D object categorization and recognition based on deep belief networks and point clouds. In: Proceedings of the 13th International Conference on Informatics in Control, Automation and Robotics, pp. 311–318 (2016)

  30. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: deep learning on point sets for 3D classification and segmentation. Proc. Comput. Vis. Patt. Recogn. (CVPR) 1(2), 4 (2017)

  31. Rumelhart, D.E., McClelland, J.L.: Parallel distributed processing: Explorations in the microstructure of cognition (1986)

  32. Rusu, R.B., Blodow, N., Beetz, M.: Fast point feature histograms (FPFH) for 3D registration. In: IEEE International Conference on Robotics and Automation (ICRA’09), pp. 3212–3217. IEEE (2009)

  33. Rusu, R.B., Blodow, N., Marton, Z.C., Beetz, M.: Close-range scene segmentation and reconstruction of 3D point cloud maps for mobile manipulation in domestic environments. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009), pp. 1–6. IEEE (2009)

  34. Rusu, R.B., Bradski, G., Thibaux, R., Hsu, J.: Fast 3D recognition and pose using the viewpoint feature histogram. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2155–2162. IEEE (2010)

  35. Salakhutdinov, R.: Learning deep generative models. Annual Rev. Statistics Appl. 2, 361–385 (2015)

  36. Savarese, S., Fei-Fei, L.: 3D generic object categorization, localization and pose estimation. In: IEEE 11th International Conference on Computer Vision (ICCV 2007), pp. 1–8. IEEE (2007)

  37. Schwarz, M., Schulz, H., Behnke, S.: RGB-D object recognition and pose estimation based on pre-trained convolutional neural network features. In: 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 1329–1335. IEEE (2015)

  38. Serre, T., Kreiman, G., Kouh, M., Cadieu, C., Knoblich, U., Poggio, T.: A quantitative theory of immediate visual recognition. Progress Brain Res. 165, 33–56 (2007)

  39. Shin, J., Triebel, R., Siegwart, R.: Unsupervised 3D object discovery and categorization for mobile robots. In: Robotics Research, pp. 61–76. Springer, Berlin (2017)

  40. Socher, R., Huval, B., Bath, B., Manning, C.D., Ng, A.Y.: Convolutional-recursive deep learning for 3D object classification. In: Advances in Neural Information Processing Systems, pp. 665–673 (2012)

  41. Sun, S., An, N., Zhao, X., Tan, M.: A PCA-CCA network for RGB-D object recognition. Int. J. Adv. Robotic Syst. 15(1), 1729881417752820 (2018)

  42. Tang, S., Wang, X., Lv, X., Han, T.X., Keller, J., He, Z., Skubic, M., Lao, S.: Histogram of oriented normal vectors for object recognition with a depth sensor. In: Asian Conference on Computer Vision, pp. 525–538. Springer, Berlin (2012)

  43. Tieleman, T.: Training restricted Boltzmann machines using approximations to the likelihood gradient. In: Proceedings of the 25th International Conference on Machine Learning, pp. 1064–1071. ACM (2008)

  44. Toldo, R., Castellani, U., Fusiello, A.: A bag of words approach for 3D object categorization. In: Computer Vision/Computer Graphics Collaboration Techniques, pp. 116–127. Springer, Berlin (2009)

  45. Torralba, A., Murphy, K.P., Freeman, W.T., Rubin, M.A.: Context-based vision system for place and object recognition. In: Ninth IEEE International Conference on Computer Vision, pp. 273–280. IEEE (2003)

  46. Yamashita, T., Tanaka, M., Yoshida, E., Yamauchi, Y., Fujiyoshi, H.: To be Bernoulli or to be Gaussian, for a restricted Boltzmann machine. In: 2014 22nd International Conference on Pattern Recognition (ICPR), pp. 1520–1525. IEEE (2014)

  47. Zaki, H.F.M., Shafait, F., Mian, A.: Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 1685–1692. IEEE (2016)

  48. Zhang, H., Berg, A.C., Maire, M., Malik, J.: SVM-KNN: discriminative nearest neighbor classification for visual category recognition. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 2, pp. 2126–2136. IEEE (2006)

  49. Zhi, S., Liu, Y., Li, X., Guo, Y.: Lightnet: a lightweight 3D convolutional neural network for real-time 3D object recognition. In: Eurographics Workshop on 3D Object Retrieval (2017)

  50. Zhou, S., Chen, Q., Wang, X.: Discriminative deep belief networks for image classification. In: 2010 IEEE International Conference on Image Processing, pp. 1561–1564. IEEE (2010)

  51. Zrira, N., Hannat, M., Bouyakhf, E.-H., Khan, H.A.: Generative vs. discriminative deep belief network for 3D object categorization. In: VISIGRAPP (5: VISAPP), pp. 98–107 (2017)

  52. Zrira, N., Khan, H.A., Bouyakhf, E.-H.: Discriminative deep belief network for indoor environment classification using global visual features. Cogn. Comput. 10(3), 437–453 (2018)

Author information

Corresponding author

Correspondence to Nabila Zrira.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Zrira, N., Hannat, M., Bouyakhf, E.H. (2020). 3D Object Categorization in Cluttered Scene Using Deep Belief Network Architectures. In: Yang, XS., He, XS. (eds) Nature-Inspired Computation in Data Mining and Machine Learning. Studies in Computational Intelligence, vol 855. Springer, Cham. https://doi.org/10.1007/978-3-030-28553-1_8
