3D Semantic Maps for Scene Segmentation

Romero-González, Cristina; Martínez-Gómez, Jesus; García-Varea, Ismael

doi:10.1007/978-3-319-70833-1_49

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 693)

Included in the following conference series:

Iberian Robotics conference

Abstract

The semantic segmentation problem has been widely studied in the computer vision community. However, state-of-the-art solutions based on deep learning are only available for 2D images. The lack of large annotated datasets makes it more difficult to train models on 3D images. In this work, we propose using the already available 2D deep-learning-based solutions to semantically segment the 3D environment for robotic applications. Concretely, deep learning applications provide the semantic labeling, while the geometric information from RGB-D cameras, together with the robot pose, provides the 3D position.
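
The combination described in the last sentence of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera model, depth values in meters, and a 4x4 camera-to-world pose matrix. The function name `backproject_labels` and all of its parameters are hypothetical, chosen only for illustration.

```python
# Sketch: lift per-pixel 2D semantic labels into a labeled 3D point cloud.
# Assumptions (not from the paper): pinhole camera model, depth in meters,
# and a 4x4 camera-to-world pose matrix given as nested lists.

def backproject_labels(depth, labels, fx, fy, cx, cy, pose):
    """Return a list of (x, y, z, label) points in the world frame.

    depth  -- 2D list of depth values in meters (0 means no measurement)
    labels -- 2D list of class ids, same shape as depth
    fx, fy, cx, cy -- pinhole intrinsics in pixels
    pose   -- 4x4 homogeneous camera-to-world transform
    """
    points = []
    for v, (depth_row, label_row) in enumerate(zip(depth, labels)):
        for u, (z, label) in enumerate(zip(depth_row, label_row)):
            if z <= 0:
                continue  # skip pixels without a valid depth reading
            # Pixel (u, v) with depth z -> point in the camera frame.
            xc = (u - cx) * z / fx
            yc = (v - cy) * z / fy
            # Camera frame -> world frame using the robot/camera pose.
            xw, yw, zw = (
                row[0] * xc + row[1] * yc + row[2] * z + row[3]
                for row in pose[:3]
            )
            points.append((xw, yw, zw, label))
    return points


if __name__ == "__main__":
    # Toy 2x2 frame: one pixel has no depth, the others carry class ids 3 and 7.
    depth = [[1.0, 0.0], [2.0, 2.0]]
    labels = [[3, 3], [7, 7]]
    # Identity rotation, camera translated 1 m along the world x axis.
    pose = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    for point in backproject_labels(depth, labels, 1.0, 1.0, 0.0, 0.0, pose):
        print(point)
```

In this sketch the 2D network only has to produce the `labels` image; the depth image and the pose supply all of the geometry, which is why no annotated 3D training data is needed.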

Acknowledgments

This work has been partially funded by FEDER funds and the Spanish Government (MICINN) through project TIN2015-65686-C5-3-R. We also want to acknowledge the Red de Agentes Físicos TIN2015-71693-REDT.

Author information

Authors and Affiliations

University of Castilla-La Mancha, Campus Univ. s/n, 02071, Albacete, Spain
Cristina Romero-González, Jesus Martínez-Gómez & Ismael García-Varea

Corresponding author

Correspondence to Cristina Romero-González.

Editor information

Editors and Affiliations

Escuela Técnica Superior de Ingeniería, Universidad de Sevilla, Sevilla, Spain
Anibal Ollero
Institut de Robòtica I Informàtica Industrial (CSIC-UPC), Universitat Politècnica de Catalunya, Barcelona, Spain
Alberto Sanfeliu
Departamento de Informática e Ingeniería de Sistemas, Escuela de Ingeniería y Arquitectura, Instituto de Investigación en Ingeniería de Aragón, Zaragoza, Spain
Luis Montano
Institute of Electronics and Telematics Engineering of Aveiro (IEETA), Universidade de Aveiro, Aveiro, Portugal
Nuno Lau
IDMEC, Instituto Superior Técnico de Lisboa, Universidade de Lisboa, Lisbon, Portugal
Carlos Cardeira

Cite this paper

Romero-González, C., Martínez-Gómez, J., García-Varea, I. (2018). 3D Semantic Maps for Scene Segmentation. In: Ollero, A., Sanfeliu, A., Montano, L., Lau, N., Cardeira, C. (eds) ROBOT 2017: Third Iberian Robotics Conference. ROBOT 2017. Advances in Intelligent Systems and Computing, vol 693. Springer, Cham. https://doi.org/10.1007/978-3-319-70833-1_49

DOI: https://doi.org/10.1007/978-3-319-70833-1_49
Published: 12 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70832-4
Online ISBN: 978-3-319-70833-1
eBook Packages: Engineering, Engineering (R0)