Abstract
This paper presents a novel object segmentation approach for highly complex indoor scenes. The approach first partitions the scene into distinct regions whose boundaries conform closely to the physical object boundaries in the scene. A perceptual grouping algorithm based on local cues (e.g., 3D proximity, co-planarity, and shape convexity) then merges these regions into object hypotheses. Extensive experimental evaluations demonstrate that the resulting object segmentations are superior to those of state-of-the-art methods.
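To make the three local cues named above concrete, the following is a minimal sketch of how they could be computed for a pair of neighbouring regions. It is not the paper's grouping algorithm: the function name `pairwise_cues`, the summary of each region by a centroid and a unit normal, the `coplanar_tol_deg` parameter, and the convexity test (a widely used heuristic based on the sign of \((n_1-n_2)\cdot(c_1-c_2)\)) are all assumptions made for illustration.

```python
import numpy as np

def pairwise_cues(c1, n1, c2, n2, coplanar_tol_deg=10.0):
    """Compute illustrative local cues between two surface regions.

    c1, c2: 3D centroids of the regions (hypothetical summary of each region).
    n1, n2: unit surface normals of the regions.
    coplanar_tol_deg: assumed angular tolerance for calling normals parallel.
    """
    c1, n1, c2, n2 = (np.asarray(v, dtype=float) for v in (c1, n1, c2, n2))

    # 3D proximity: Euclidean distance between the region centroids.
    distance = float(np.linalg.norm(c1 - c2))

    # Angle between the normals, clipped to guard against rounding error.
    cos_angle = float(np.clip(np.dot(n1, n2), -1.0, 1.0))
    angle_deg = float(np.degrees(np.arccos(cos_angle)))

    # Co-planarity (simplified): normals are nearly parallel. A fuller test
    # would also check that both centroids lie on a common plane.
    coplanar = angle_deg <= coplanar_tol_deg

    # Convexity heuristic: the junction is convex when the normals open away
    # from each other along the line joining the centroids.
    convex = bool(np.dot(n1 - n2, c1 - c2) > 0.0)

    return {"distance": distance, "angle_deg": angle_deg,
            "coplanar": coplanar, "convex": convex}
```

For example, the top and side faces of a box meet at a convex edge, whereas a floor and a wall meet at a concave one; a grouping step could favour merging regions joined by convex or co-planar connections.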
Notes
In our experiments, we set \(\gamma _1=0.9\) and \(\gamma _2=0.85\).
In our experiments, we set \(\beta _1=0.95\) and \(\beta _2=0.4\).
Summarized in Algorithm 2.
We also implemented a variant in which a boundary point is assigned to a random region within the rectangular search area rather than to the region that minimizes the distance. Apart from a faster runtime, this variant did not yield significant improvements in the overall segmentation accuracy.
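The minimum-distance assignment described in this note can be sketched as follows. This is only an illustration under stated assumptions: the note does not define the distance here, so the sketch assumes the Euclidean distance in 3D between the boundary point and each labelled point in the search window. The function name `assign_boundary_points`, the `half_window` parameter, and the use of `-1` to mark boundary pixels are hypothetical.

```python
import numpy as np

def assign_boundary_points(labels, points, half_window=2, boundary_label=-1):
    """Assign each boundary pixel to the closest labelled region in a window.

    labels: (H, W) integer array; boundary pixels carry boundary_label.
    points: (H, W, 3) array holding the 3D coordinates of every pixel.
    half_window: assumed half-size of the rectangular search area, in pixels.
    """
    out = labels.copy()
    h, w = labels.shape
    for r, c in zip(*np.nonzero(labels == boundary_label)):
        # Clip the rectangular search area to the image bounds.
        r0, r1 = max(r - half_window, 0), min(r + half_window + 1, h)
        c0, c1 = max(c - half_window, 0), min(c + half_window + 1, w)
        window_labels = labels[r0:r1, c0:c1]
        valid = window_labels != boundary_label
        if not valid.any():
            continue  # no labelled pixel in the window; leave unassigned

        # Assumed distance: Euclidean distance in 3D to each window pixel.
        dists = np.linalg.norm(points[r0:r1, c0:c1] - points[r, c], axis=-1)
        dists = np.where(valid, dists, np.inf)  # ignore other boundary pixels
        out[r, c] = window_labels.flat[np.argmin(dists)]
    return out
```

The random variant mentioned in the note would replace the `argmin` step with a uniform draw from the labels that are valid inside the window.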
We empirically found that \(\delta _{\phi }=10^{\circ }\) and \(\delta _{c}=10^{\circ }\) produced the best results.
See video in the supplementary material.
Acknowledgments
This work was supported by Australian Research Council Grants: DP150100294, DP110102166, DE120102960.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 1 (mp4 37626 KB)
Cite this article
Asif, U., Bennamoun, M. & Sohel, F. Unsupervised segmentation of unknown objects in complex environments. Auton Robot 40, 805–829 (2016). https://doi.org/10.1007/s10514-015-9495-3
DOI: https://doi.org/10.1007/s10514-015-9495-3