Abstract
In face-to-face conversation, humans frequently use deictic gestures in parallel with verbal descriptions to identify referents. This multimodal mode of communication is of great importance for intelligent interfaces, as it simplifies and speeds up reference to objects in a visualized application domain. Natural pointing behavior is very flexible but can also be ambiguous or vague, so without a careful analysis of the discourse context of a gesture there is a high risk of reference failure. This paper examines how the user and discourse models of an intelligent interface influence the comprehension and production of natural language with coordinated pointing, and conversely how multimodal communication influences those models. We briefly describe the deixis analyzer of our XTRA system, which handles a variety of tactile gestures, including different granularities, inexact pointing gestures and pars-pro-toto deixis. We show how gestures can be used to shift focus and how focus can be used to disambiguate gestures. Finally, we discuss how the user model affects the presentation planning component's decision whether to use a pointing gesture, a verbal description, or both for referent identification.
This is a condensed and revised version of my paper ‘User and Discourse Models for Multimodal Communication’, which appears in ‘Sullivan, J.W., Tyler, S.W. (eds.) Architectures for Intelligent Interfaces: Elements and Prototypes. Reading: Addison-Wesley 1991.’ The research was partially supported by the German Science Foundation (DFG) in its Special Collaborative Programme on AI and Knowledge-Based Systems (SFB 314).
Copyright information
© 1992 B. G. Teubner Verlagsgesellschaft, Leipzig
About this chapter
Cite this chapter
Wahlster, W. (1992). An Intelligent Multimodal Interface. In: Buchmann, J., Ganzinger, H., Paul, W.J. (eds) Informatik. TEUBNER-TEXTE zur Informatik, vol 1. Vieweg+Teubner Verlag, Wiesbaden. https://doi.org/10.1007/978-3-322-95233-2_29
DOI: https://doi.org/10.1007/978-3-322-95233-2_29
Publisher Name: Vieweg+Teubner Verlag, Wiesbaden
Print ISBN: 978-3-8154-2033-1
Online ISBN: 978-3-322-95233-2