3D spatial pyramid: descriptors generation from point clouds for indoor scene classification

Romero-González, Cristina; Martínez-Gómez, Jesus; García-Varea, Ismael; Rodríguez-Ruiz, Luis

doi:10.1007/s00138-015-0744-4

Original Paper
Published: 06 January 2016

Volume 27, pages 263–273, (2016)

Machine Vision and Applications

Cristina Romero-González¹, Jesus Martínez-Gómez¹,², Ismael García-Varea¹ & Luis Rodríguez-Ruiz¹

Abstract

Traditionally, the indoor scene classification problem has been approached from a 2D image recognition point of view. In most visual scene classification systems, a descriptor is generated for the input image to obtain a suitable representation that includes features related to color, shape, or spatial information. Techniques based on a spatial pyramid have proven adequate for this step. In recent years, 3D sensors have become widely available, which makes it possible to add new sources of information to this framework. In this work, we rely on RGB-D data to extend the spatial pyramid approach, with the aim of building descriptors that yield a representation more robust to changing lighting conditions. The proposed descriptors are evaluated on the RobotVision@ImageCLEF-2013 benchmark dataset, where they remarkably outperform state-of-the-art 3D local and global descriptors.
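To make the idea of a spatial pyramid over a point cloud concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each 3D point has already been assigned a visual-word index (for example, by quantizing some local descriptor), that the cloud's bounding box is split into 2^l cells per axis at pyramid level l, and that the final descriptor is the normalised concatenation of the per-cell word histograms. The function name `spatial_pyramid_3d` and all of its parameters are hypothetical.

```python
import numpy as np


def spatial_pyramid_3d(points, words, vocab_size, levels=2):
    """Concatenate per-cell visual-word histograms over a 3D grid pyramid.

    points: (N, 3) array of XYZ coordinates.
    words: (N,) array of visual-word indices in [0, vocab_size).
    levels: deepest pyramid level; level l uses 2**l cells per axis.
    All names and the subdivision scheme are illustrative assumptions.
    """
    points = np.asarray(points, dtype=float)
    words = np.asarray(words, dtype=int)

    # Normalise coordinates to the unit cube spanned by the bounding box.
    lo = points.min(axis=0)
    extent = points.max(axis=0) - lo
    extent[extent == 0] = 1.0  # avoid division by zero on flat clouds
    unit = (points - lo) / extent

    parts = []
    for level in range(levels + 1):
        n = 2 ** level
        # Integer cell coordinates; points on the upper face go to the last cell.
        cell = np.minimum((unit * n).astype(int), n - 1)
        cell_id = (cell[:, 0] * n + cell[:, 1]) * n + cell[:, 2]

        # One histogram of visual words per cell at this level.
        hist = np.zeros((n ** 3, vocab_size))
        np.add.at(hist, (cell_id, words), 1.0)
        parts.append(hist.ravel())

    desc = np.concatenate(parts)
    total = desc.sum()
    return desc / total if total > 0 else desc
```

With `levels=2` and a vocabulary of size V, the descriptor has (1 + 8 + 64) · V entries. Classic 2D spatial pyramid matching also weights each level differently; that weighting is omitted here for brevity.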

Notes

http://www.imageclef.org/2013/robot.

Acknowledgments

This work has been partially funded by FEDER funds and the Spanish Government (MICINN) through project TIN2013-46638-C3-3-P and by Consejería de Educación, Cultura y Deportes of the JCCM regional government through project PPII-2014-015-P. Cristina Romero-González is also funded by the MECD grant FPU12/04387, and Jesus Martínez-Gómez is also funded by the JCCM grant POST2014/8171.

Author information

Authors and Affiliations

Departamento de Sistemas Informáticos, Universidad de Castilla-La Mancha, Campus Univ. s/n, Albacete, Spain
Cristina Romero-González, Jesus Martínez-Gómez, Ismael García-Varea & Luis Rodríguez-Ruiz
Departamento de Ciencia de la Computación e Inteligencia Artificial, Universidad de Alicante, Alicante, Spain
Jesus Martínez-Gómez

Corresponding author

Correspondence to Cristina Romero-González.

About this article

Cite this article

Romero-González, C., Martínez-Gómez, J., García-Varea, I. et al. 3D spatial pyramid: descriptors generation from point clouds for indoor scene classification. Machine Vision and Applications 27, 263–273 (2016). https://doi.org/10.1007/s00138-015-0744-4

Received: 27 February 2015
Revised: 02 October 2015
Accepted: 30 November 2015
Published: 06 January 2016
Issue Date: February 2016
DOI: https://doi.org/10.1007/s00138-015-0744-4
