An Intelligent Multimodal Interface

Wahlster, Wolfgang

doi:10.1007/978-3-322-95233-2_29

Part of the book series: TEUBNER-TEXTE zur Informatik (TTZI, volume 1)

Abstract

In face-to-face conversation, humans frequently use deictic gestures in parallel with verbal descriptions to identify referents. Such multimodal communication is of great importance for intelligent interfaces, as it simplifies and speeds up reference to objects in a visualized application domain. Natural pointing behavior is very flexible, but it can also be ambiguous or vague, so that without careful analysis of the discourse context of a gesture there is a high risk of reference failure. The subject of this paper is how the user and discourse models of an intelligent interface influence the comprehension and production of natural language with coordinated pointing and, conversely, how multimodal communication influences the user and discourse models. We briefly describe the deixis analyzer of our XTRA system, which handles a variety of tactile gestures, including different granularities, inexact pointing gestures, and pars-pro-toto deixis. We show how gestures can be used to shift focus and how focus can be used to disambiguate gestures. Finally, we discuss the impact of the user model on the presentation planning component's decision whether to use a pointing gesture, a verbal description, or both for referent identification.
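The interplay between focus and inexact pointing described in the abstract can be illustrated with a small sketch. This is not the XTRA deixis analyzer, whose internals are not given here; the `Region` type, the `resolve_pointing` function, and the `radius` and `focus_bonus` parameters are hypothetical names and values chosen only for illustration. The sketch assumes that each displayed object has a center point, that an inexact gesture makes every object within some radius a candidate, and that objects currently in focus are preferred.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Region:
    """A displayed object (hypothetical): a name and the centre of its area."""
    name: str
    x: float
    y: float


def resolve_pointing(point, regions, focus, radius=2.0, focus_bonus=1.0):
    """Pick a referent for an inexact pointing gesture.

    Every region whose centre lies within `radius` of the touched point is a
    candidate. A candidate's score is its distance to the point, reduced by
    `focus_bonus` if the region is currently in focus; the lowest score wins.
    Returns (referent_name, new_focus). The chosen referent becomes the new
    focus, so a gesture can also shift focus. If nothing is close enough,
    the result is (None, focus) with the focus left unchanged.
    """
    px, py = point
    candidates = []
    for region in regions:
        distance = hypot(region.x - px, region.y - py)
        if distance <= radius:
            bonus = focus_bonus if region.name in focus else 0.0
            candidates.append((distance - bonus, region.name))
    if not candidates:
        return None, focus
    referent = min(candidates)[1]
    return referent, {referent}


# Two neighbouring regions; the touch at x=0.9 is slightly closer to field_a.
regions = [Region("field_a", 0.0, 0.0), Region("field_b", 2.0, 0.0)]

# With no focus, the nearest region is chosen.
no_focus = resolve_pointing((0.9, 0.0), regions, focus=set())

# With field_b in focus, the same vague gesture is read as field_b.
with_focus = resolve_pointing((0.9, 0.0), regions, focus={"field_b"})

# A clear gesture near field_a overrides the focus and shifts it.
shifted = resolve_pointing((0.2, 0.0), regions, focus={"field_b"})
```

In this sketch the focus acts only as a tie-breaking preference: a gesture that lands clearly on one object still selects it and moves the focus there, while a vague gesture between two objects is resolved in favor of the one already under discussion.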

This is a condensed and revised version of my paper ‘User and Discourse Models for Multimodal Communication’, which appears in ‘Sullivan, J.W., Tyler, S.W. (eds.) Architectures for Intelligent Interfaces: Elements and Prototypes. Reading: Addison-Wesley 1991.’ The research was partially supported by the German Science Foundation (DFG) in its Special Collaborative Programme on AI and Knowledge-Based Systems (SFB 314).

Author information

Authors and Affiliations

Fachbereich 14 Informatik, Universität Saarbrücken, 6600 Saarbrücken, Germany
Wolfgang Wahlster

Editor information

Editors and Affiliations

Universität Saarbrücken, Germany
Johannes Buchmann, Harald Ganzinger & Wolfgang J. Paul

Cite this chapter

Wahlster, W. (1992). An Intelligent Multimodal Interface. In: Buchmann, J., Ganzinger, H., Paul, W.J. (eds) Informatik. TEUBNER-TEXTE zur Informatik, vol 1. Vieweg+Teubner Verlag, Wiesbaden. https://doi.org/10.1007/978-3-322-95233-2_29

DOI: https://doi.org/10.1007/978-3-322-95233-2_29
Publisher Name: Vieweg+Teubner Verlag, Wiesbaden
Print ISBN: 978-3-8154-2033-1
Online ISBN: 978-3-322-95233-2
eBook Packages: Springer Book Archive