Abstract
In order to operate in human environments, a robot’s semantic perception has to overcome open-world challenges such as novel objects and domain gaps. Autonomous deployment to such environments therefore requires robots to update their knowledge and learn without supervision. We investigate how a robot can autonomously discover novel semantic classes and improve accuracy on known classes when exploring an unknown environment. To this end, we develop a general framework for mapping and clustering that we then use to generate a self-supervised learning signal to update a semantic segmentation model. In particular, we show how clustering parameters can be optimized during deployment and that fusion of multiple observation modalities improves novel object discovery compared to prior work. Models, data, and implementations can be found at github.com/hermannsblum/scim.
This paper was financially supported by the HILTI Group.
Notes
1. ‘Unsupervised’ refers to novel classes. All tested methods are supervised on the known classes.
Electronic supplementary material
Supplementary material 1 (mp4, 9014 KB)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Blum, H., Müller, M.G., Gawel, A., Siegwart, R., Cadena, C. (2023). SCIM: Simultaneous Clustering, Inference, and Mapping for Open-World Semantic Scene Understanding. In: Billard, A., Asfour, T., Khatib, O. (eds) Robotics Research. ISRR 2022. Springer Proceedings in Advanced Robotics, vol 27. Springer, Cham. https://doi.org/10.1007/978-3-031-25555-7_9
DOI: https://doi.org/10.1007/978-3-031-25555-7_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25554-0
Online ISBN: 978-3-031-25555-7
eBook Packages: Intelligent Technologies and Robotics (R0)