
Situiertes Sehen für bessere Erkennung von Objekten und Objektklassen

Situated vision for improved detection of objects and object classes


Summary

A main task for domestic robots is to recognize object categories. Image-based approaches learn from large databases but have no access to the contextual knowledge that is available to a robot navigating the rooms of a home. We therefore set out to exploit the knowledge available to the robot to constrain the task of object classification. Based on the estimation of the free ground space in front of the robot, which is essential for safe navigation in a home setting, we show that self-localization, the detection of support surfaces, and the classification of objects can be greatly simplified. We further show that object classes can be acquired efficiently from 3D models on the Web if they are learned from automatically generated view data. We modelled 200 object classes (available from www.3d-net.org) and provide sample scene data for testing. Using this situated approach we detect, for example, chairs with a detection rate of 93 per cent.
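To make the estimation of free ground space concrete, here is a minimal sketch, assuming a point cloud in a robot-centred, metre-scale frame with the z-axis roughly vertical; it is not the authors' implementation. It fits the dominant floor plane with RANSAC and keeps the points above it as candidates for support surfaces and objects. The function names, thresholds, and the synthetic test cloud are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): RANSAC floor-plane estimation on a
# 3D point cloud, then segmentation of the points above the floor as
# candidates for support surfaces and objects.
import numpy as np

def fit_plane_ransac(points, n_iters=200, inlier_thresh=0.02, seed=None):
    """Fit a plane n.x + d = 0 to an (N, 3) array with simple RANSAC."""
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, None
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                      # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal.dot(p0)
        inliers = np.abs(points @ normal + d) < inlier_thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_model, best_inliers = (normal, d), inliers
    return best_model, best_inliers

def segment_above_plane(points, plane, min_height=0.05, max_height=1.5):
    """Keep points between min_height and max_height above the plane
    (assumes the robot frame's z-axis points roughly upward)."""
    normal, d = plane
    if normal[2] < 0:                        # orient the plane normal upward
        normal, d = -normal, -d
    height = points @ normal + d
    return points[(height > min_height) & (height < max_height)]

if __name__ == "__main__":
    # Synthetic example: a flat floor patch plus a small box standing on it.
    rng = np.random.default_rng(0)
    floor = np.column_stack([rng.uniform(-1, 1, 2000),
                             rng.uniform(0, 2, 2000),
                             rng.normal(0.0, 0.005, 2000)])
    box = np.column_stack([rng.uniform(0.2, 0.4, 300),
                           rng.uniform(0.8, 1.0, 300),
                           rng.uniform(0.05, 0.35, 300)])
    cloud = np.vstack([floor, box])
    plane, floor_inliers = fit_plane_ransac(cloud, seed=1)
    objects = segment_above_plane(cloud, plane)
    print(f"floor inliers: {floor_inliers.sum()}, points above floor: {len(objects)}")
```

A point-cloud library such as PCL offers equivalent plane segmentation; the sketch only illustrates the ordering the summary describes: estimate the free floor first, then interpret everything else relative to it.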

Zusammenfassung

A main task for robots is to recognize objects and object classes in order to find and handle them. Image-based approaches learn from large databases but have no access to contextual knowledge, for example about how robots navigate rooms. We therefore propose the approach of situated vision, which uses contextual knowledge about the task and the application to improve object recognition. Based on the estimation of the free floor space in front of the robot, which is necessary for safe navigation, we show that localization, the detection of surfaces, and the categorization of objects are thereby simplified. We further show that object classes can be learned efficiently from 3D Web data if the learning uses virtual 2.5D views that approximate how the robot's sensors perceive the real world. With this approach, 200 object classes (available at www.3d-net.org) were modelled and recognized in scenes, e.g. chairs with a detection rate of 93 per cent.
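To illustrate the second part of the abstract, learning object classes from 3D Web models via virtual 2.5D views, the sketch below is again only an assumption-laden outline, not the published pipeline: it renders coarse orthographic depth views of point-sampled 3D models from several yaw angles, stores each view as a flattened depth-grid descriptor, and classifies an observed view by nearest-neighbour matching. The grid size, the number of views, and the dictionary-of-models input format are invented for the example.

```python
# Minimal sketch (not the paper's pipeline): synthetic 2.5D views of 3D point
# models, coarse depth-grid descriptors, and nearest-neighbour classification.
import numpy as np

def depth_view(points, yaw, grid=16):
    """Orthographic 2.5D view of an (N, 3) point model rotated by `yaw` about z.
    Returns a grid x grid depth image holding the nearest surface per cell."""
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    p = points @ R.T
    # x and z become the image axes, y becomes the viewing depth.
    xi = np.clip(((p[:, 0] - p[:, 0].min()) / (np.ptp(p[:, 0]) + 1e-9)
                  * (grid - 1)).astype(int), 0, grid - 1)
    zi = np.clip(((p[:, 2] - p[:, 2].min()) / (np.ptp(p[:, 2]) + 1e-9)
                  * (grid - 1)).astype(int), 0, grid - 1)
    img = np.full((grid, grid), np.inf)
    np.minimum.at(img, (zi, xi), p[:, 1])    # keep the closest point per cell
    img[np.isinf(img)] = p[:, 1].max()       # empty cells -> background depth
    img -= img.min()
    return img / (img.max() + 1e-9)          # normalised view descriptor

def build_view_database(models, n_views=12):
    """models: dict mapping class name -> (N, 3) point array (e.g. sampled from
    a downloaded mesh). Returns stacked descriptors and their class labels."""
    descs, labels = [], []
    for name, pts in models.items():
        for yaw in np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False):
            descs.append(depth_view(pts, yaw).ravel())
            labels.append(name)
    return np.array(descs), labels

def classify(observed_view, descs, labels):
    """Nearest-neighbour match of an observed 2.5D view against the database."""
    dists = np.linalg.norm(descs - observed_view.ravel(), axis=1)
    return labels[int(np.argmin(dists))]

# Hypothetical usage, assuming chair_points and observed_points exist:
#   db, labels = build_view_database({"chair": chair_points})
#   print(classify(depth_view(observed_points, 0.0), db, labels))
```

The published system presumably uses richer shape descriptors and denser viewpoint sampling; the sketch only mirrors the idea of matching the robot's sensor view against automatically generated views of Web models.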




Additional information

M. Zillich: The research leading to these results has received funding from the European Community for projects robots@home (IST-6-043450), CogX (IST-7-215181) and HOBBIT (IST-7-288246).


About this article

Cite this article

Vincze, M., Wohlkinger, W., Aldoma, A. et al. Situiertes Sehen für bessere Erkennung von Objekten und Objektklassen. Elektrotech. Inftech. 129, 42–52 (2012). https://doi.org/10.1007/s00502-012-0072-6

