A Multimodal System for Object Learning

  • Conference paper
Pattern Recognition (DAGM 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2449)

Abstract

A multimodal system for acquiring new objects, updating already known ones, and searching for them is presented. The system learns objects and associates them with speech received from a speech recogniser in a natural and convenient fashion. Learning and retrieval take into account multiple attributes calculated from an image recorded by a standard video camera, deictic gestures, and information from a dialog-based conversation. Histogram intersection and subgraph matching on segmented color regions serve as the attributes.
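The abstract names histogram intersection without defining it. As a rough illustration only, the sketch below computes the normalized color histogram intersection in the sense of Swain and Ballard's color indexing: the sum of bin-wise minima divided by the total count of the model histogram. The function names, the RGB binning, and the default of 4 bins per channel are assumptions made for this example; the page does not state which color space, bin layout, or normalization the system actually uses.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Count RGB pixels (each channel 0..255) into a flat 3-D histogram.

    The binning scheme is an assumption for illustration, not taken
    from the paper.
    """
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins_per_channel-1.
        ri = r * bins_per_channel // 256
        gi = g * bins_per_channel // 256
        bi = b * bins_per_channel // 256
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    return hist


def histogram_intersection(image_hist, model_hist):
    """Return sum(min(image, model)) / sum(model), a value in [0, 1]."""
    if len(image_hist) != len(model_hist):
        raise ValueError("histograms must have the same number of bins")
    model_total = sum(model_hist)
    if model_total == 0:
        return 0.0
    overlap = sum(min(i, m) for i, m in zip(image_hist, model_hist))
    return overlap / model_total


# Hypothetical data: a stored object model and a newly observed region.
model_pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)]
image_pixels = [(250, 10, 5)] * 2 + [(0, 255, 0)] * 2

model_hist = color_histogram(model_pixels)
image_hist = color_histogram(image_pixels)
score = histogram_intersection(image_hist, model_hist)
```

A score near 1 means the observed region contains nearly all of the color mass of the stored model, and a score near 0 means the two share almost no colors. In a system like the one described, such a score would be only one of several attributes combined with gesture and dialog information.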

This work is supported within the Graduate Program “Task Oriented Communication” by the German Research Foundation (DFG).


Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lömker, F., Sagerer, G. (2002). A Multimodal System for Object Learning. In: Van Gool, L. (eds) Pattern Recognition. DAGM 2002. Lecture Notes in Computer Science, vol 2449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45783-6_59

  • DOI: https://doi.org/10.1007/3-540-45783-6_59

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44209-7

  • Online ISBN: 978-3-540-45783-1

  • eBook Packages: Springer Book Archive
