Abstract
With ever more household objects built on planned obsolescence and consumed by a fast-growing population, hazardous waste recycling has become a critical challenge. Given the large variability of household waste, current recycling platforms mostly rely on human operators to analyze the scene, which is typically composed of many object instances piled up in bulk. Robotizing the unitary extraction of objects is a key step towards speeding up this tedious process. While supervised deep learning has proven very effective for such object-level scene understanding, e.g., generic object detection and segmentation in everyday scenes, it requires large sets of per-pixel labeled images that are hardly available in many application contexts, including industrial robotics. We therefore propose a step towards a practical interactive application for generating an object-oriented robotic grasp, requiring as inputs only one depth map of the scene and one user click on the next object to extract. More precisely, we address the intermediate problem of segmenting one object instance in top views of piles of bulk objects, given a pixel location, namely a seed, provided interactively by a human operator. We propose a twofold framework for generating edge-driven instance segments. First, we repurpose a state-of-the-art fully convolutional object contour detector for seed-based instance segmentation by introducing the notion of edge-mask duality, together with a novel patch-free and contour-oriented loss function. Second, we train the model using only synthetic scenes instead of manually labeled data. Our experimental results show that training an encoder-decoder network with edge-mask duality, as we suggest, outperforms a state-of-the-art patch-based network in the present application context.
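To make the notion of edge-mask duality concrete, the sketch below (not the authors' code; the function name, the thresholding step, and the use of scipy.ndimage are our own assumptions) illustrates how, at inference time, an instance mask can be recovered from a predicted contour map and a single user click: the connected component of non-contour pixels containing the seed is the mask dual to the closed contour delimiting the clicked object.

```python
# Minimal sketch of edge-mask duality at inference time (illustrative only):
# a predicted contour map plus a user seed yields an instance mask by taking
# the connected component of non-edge pixels that contains the seed.
import numpy as np
from scipy import ndimage


def mask_from_contours(edge_prob, seed_yx, edge_thresh=0.5):
    """Recover the instance mask dual to a closed predicted contour.

    edge_prob   : (H, W) float array of per-pixel contour probabilities.
    seed_yx     : (row, col) user click inside the object to extract.
    edge_thresh : probability above which a pixel is treated as contour.
    """
    non_edge = edge_prob < edge_thresh       # free space between contours
    labels, _ = ndimage.label(non_edge)      # 4-connected components
    seed_label = labels[seed_yx]             # component containing the click
    if seed_label == 0:                      # click landed on a contour pixel
        return np.zeros_like(non_edge)
    return labels == seed_label              # the seeded instance mask
```

Intuitively, this duality is what allows a patch-free, contour-oriented loss to stand in for direct mask supervision: learning closed object contours implicitly constrains the instance masks they delimit.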
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Grard, M., Brégier, R., Sella, F., Dellandréa, E., Chen, L. (2019). Object Segmentation in Depth Maps with One User Click and a Synthetically Trained Fully Convolutional Network. In: Ficuciello, F., Ruggiero, F., Finzi, A. (eds) Human Friendly Robotics. Springer Proceedings in Advanced Robotics, vol 7. Springer, Cham. https://doi.org/10.1007/978-3-319-89327-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-89326-6
Online ISBN: 978-3-319-89327-3
eBook Packages: Intelligent Technologies and Robotics (R0)