A Multimodal System for Object Learning

Lömker, Frank; Sagerer, Gerhard

doi:10.1007/3-540-45783-6_59

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2449)

Included in the following conference series:

Joint Pattern Recognition Symposium

Abstract

A multimodal system for acquiring new objects, updating already known ones, and searching for them is presented. The system learns objects and associates them with speech received from a speech recogniser in a natural and convenient fashion. The learning and retrieval process takes into account information from three sources: multiple attributes calculated from an image recorded by a standard video camera, deictic gestures, and a dialog-based conversation. Histogram intersection and subgraph matching on segmented color regions are used as attributes.
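
The abstract names histogram intersection as one of the attributes without giving details. As a minimal sketch of the measure in its commonly used form (the sum of bin-wise minima of two normalized histograms), the following may help. The function names, the single 8-bit channel, and the bin count are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def normalized_histogram(values: Sequence[int], bins: int,
                         max_value: int = 256) -> List[float]:
    """Histogram of integer channel values in [0, max_value), scaled to sum to 1.

    Hypothetical helper: the paper's actual color space and binning are not
    stated in the abstract.
    """
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * bins
    for v in values:
        if not 0 <= v < max_value:
            raise ValueError(f"value {v} outside [0, {max_value})")
        # Map the value to its bin by integer scaling.
        counts[v * bins // max_value] += 1
    total = len(values)
    return [c / total for c in counts]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Sum of bin-wise minima; 1.0 for identical normalized histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy data standing in for the pixel values of two segmented regions.
view = normalized_histogram([0, 10, 20, 200], bins=4)
model = normalized_histogram([5, 100, 130, 250], bins=4)
score = histogram_intersection(view, model)
```

A score near 1 indicates that the two regions have similar color distributions; how such a score would be combined with the subgraph-matching attribute is not described in the abstract.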

This work is supported within the Graduate Program “Task Oriented Communication” by the German Research Foundation (DFG).

Author information

Authors and Affiliations

Applied Computer Science, Technical Faculty, Bielefeld University, P.O. Box 10 01 31, 33501, Bielefeld
Frank Lömker & Gerhard Sagerer

Editor information

Editors and Affiliations

Computer Vision Laboratory, ETH Zürich, Gloriastrasse 35, 8092, Zürich, Switzerland
Luc Van Gool

About this paper

Cite this paper

Lömker, F., Sagerer, G. (2002). A Multimodal System for Object Learning. In: Van Gool, L. (eds) Pattern Recognition. DAGM 2002. Lecture Notes in Computer Science, vol 2449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45783-6_59

DOI: https://doi.org/10.1007/3-540-45783-6_59
Published: 10 October 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44209-7
Online ISBN: 978-3-540-45783-1
eBook Packages: Springer Book Archive
