Abstract
We present an attention-based approach for detecting unknown objects in a 3D environment. The ability to address individual objects in the environment without prior knowledge of their properties or identity is an important requirement of the Situated Vision theory. Based on saliency maps, our attention system determines the regions where objects are likely to be found; these are the proto-objects, whose extent is refined by a 2D segmentation step. At the same time, a 3D scene model is built from the measurements of a depth camera. The detected objects are projected into the 3D scene, resulting in 3D object models that are updated incrementally. We demonstrate the validity of our approach on an RGB-D sequence recorded in an office environment.
Notes
This work is part of the DFG DACH project FR 2598/5-1, "Situated Vision to Perceive Object Shape and Affordances", in cooperation with TU Wien, RWTH Aachen and IDIAP.
In [7], the TSDF is raycast from a given camera pose to generate a depth map prediction. Applying the same method to our extended TSDF lets us generate 2D inhibition-of-return (IOR) maps or object label maps for every new camera pose.
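The raycasting idea in this note can be illustrated with a minimal sketch. It is not the implementation used in [7] or in this work: it assumes the extended TSDF is stored as two aligned NumPy voxel grids indexed `[x, y, z]` (a hypothetical `tsdf` array of signed distances and a hypothetical `labels` array of per-voxel object IDs), a pinhole camera with intrinsics `K`, nearest-voxel sampling, and a fixed marching step. The function name `raycast_labels` and all parameters are chosen here for illustration only.

```python
import numpy as np

def raycast_labels(tsdf, labels, voxel_size, K, cam_to_world, shape,
                   step=None, max_depth=5.0):
    """March one ray per pixel through a TSDF volume (illustrative sketch).

    Assumptions: tsdf and labels are aligned grids indexed [x, y, z], the
    volume origin is the world origin, and positive TSDF values lie in
    front of the surface. Returns (depth, label_map); pixels whose ray
    finds no surface keep depth 0 and label 0.
    """
    h, w = shape
    step = step or 0.5 * voxel_size
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    dims = np.array(tsdf.shape)

    depth = np.zeros((h, w))
    label_map = np.zeros((h, w), dtype=labels.dtype)

    for v in range(h):
        for u in range(w):
            # Direction with unit z in the camera frame, so the ray
            # parameter z equals the depth along the optical axis.
            d_world = R @ np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
            prev_val, prev_z = None, None
            z = step
            while z < max_depth:
                idx = np.floor((t + z * d_world) / voxel_size).astype(int)
                if np.any(idx < 0) or np.any(idx >= dims):
                    prev_val = None  # sample lies outside the volume
                else:
                    val = tsdf[tuple(idx)]
                    if prev_val is not None and prev_val > 0 and val <= 0:
                        # Sign change: interpolate the zero crossing.
                        z_hit = prev_z + (z - prev_z) * prev_val / (prev_val - val)
                        hit = np.floor((t + z_hit * d_world) / voxel_size).astype(int)
                        hit = np.clip(hit, 0, dims - 1)
                        depth[v, u] = z_hit
                        label_map[v, u] = labels[tuple(hit)]
                        break
                    prev_val, prev_z = val, z
                z += step
    return depth, label_map
```

Reading the label at the same surface point that yields the predicted depth is what makes the 2D map consistent with the camera pose: each new pose only requires another pass over the volume, not a new segmentation. A practical system would run this per pixel in parallel and use trilinear interpolation, both of which are omitted here for brevity.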
References
Frintrop S, Rome E, Christensen HI (2010) Computational visual attention systems and their cognitive foundations: a survey. ACM Trans Appl Percept 7(1)
Givens CR, Shortt RM (1984) A class of Wasserstein metrics for probability distributions. Mich Math J 31:231–240
Klein DA, Frintrop S (2012) Salient pattern detection using W2 on multivariate normal distributions. In: Proc of DAGM-OAGM. Springer, Berlin
Kootstra G, Kragic D (2011) Fast and bottom-up object detection, segmentation, and evaluation using Gestalt principles. In: IEEE int’l conf on robotics and automation
Martín-García G, Frintrop S (2013) A computational framework for attentional 3D object detection. In: Proceedings of the annual meeting of the cognitive science society
Meger D, Muja M, Helmer S, Gupta A, Gamroth C, Hoffman T, Baumann M, Southey T, Fazli P, Wohlkinger W, Viswanathan P, Little JJ, Lowe DG, Orwell J (2010) Curious George: an integrated visual search platform. In: Canadian conference on computer and robot vision
Newcombe RA, Izadi S, Hilliges O, Molyneaux D, Kim D, Davison AJ, Kohli P, Shotton J, Hodges S, Fitzgibbon A (2011) KinectFusion: real-time dense surface mapping and tracking. In: Proc of IEEE int’l symposium on mixed and augmented reality (ISMAR ’11)
Pylyshyn ZW (2001) Visual indexes, preconceptual objects, and situated vision. Cognition 80(1–2):127–158
Rensink RA (2000) The dynamic representation of scenes. Vis Cogn 7:17–42
Rensink RA (2000) Seeing, sensing and scrutinizing. Vis Res 40:1469–1487
Rother C, Kolmogorov V, Blake A (2004) GrabCut: interactive foreground extraction using iterated graph cuts. ACM Trans Graph 23:309–314
Rusu RB, Cousins S (2011) 3D is here: point cloud library (PCL). In: IEEE international conference on robotics and automation (ICRA)
Schlemmer M (2009) Getting past passive vision—on the use of an ontology for situated perception in robots. PhD thesis, Faculty of Electrical Engineering and Information Technology, Vienna University of Technology
Tipper SP, Weaver B, Jerreat LM, Burak AL (1994) Object-based and environment-based inhibition of return of visual attention. J Exp Psychol 20(3):478
Walther D, Koch C (2006) Modeling attention to salient proto-objects. Neural Netw 19(9):1395–1407
Wolfe JM, Horowitz TS (2004) What attributes guide the deployment of visual attention and how do they do it? Nat Rev Neurosci 5:1–7
Cite this article
Martín García, G., Frintrop, S. & Cremers, A.B. Attention-Based Detection of Unknown Objects in a Situated Vision Framework. Künstl Intell 27, 267–272 (2013). https://doi.org/10.1007/s13218-013-0256-1
DOI: https://doi.org/10.1007/s13218-013-0256-1