
Visual Salience and Reference Resolution in Simulated 3-D Environments

Published in: Artificial Intelligence Review (2004)

Abstract

In this paper we present a novel false colouring-based visual saliency algorithm and illustrate how it is used in the situated language interpreter (SLI) system to ground a reference resolution framework for natural language interfaces to 3-D simulated environments. The visual saliency algorithm allows us to dynamically maintain a model of the evolving visual context. The visual saliency scores associated with the elements in the context model can be used to resolve underspecified references.
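The abstract's core idea can be sketched in a few lines: render the scene in "false colours" so that every object has a unique flat colour, count each object's visible pixels (weighting central pixels more heavily, since attention favours the centre of view), normalise the counts into saliency scores, and resolve an underspecified reference by preferring the most salient candidate. The weighting scheme, function names, and data layout below are illustrative assumptions, not the authors' exact formulation — a minimal sketch of the technique rather than the SLI implementation.

```python
# Sketch of false-colouring-based visual saliency, assuming:
#  - id_buffer is the false-colour render read back as a 2-D grid of
#    object ids (0 = background),
#  - saliency is an object's centre-weighted share of visible pixels.
from collections import defaultdict

def saliency_scores(id_buffer, background=0):
    """Return {object_id: score in [0, 1]} from a false-colour buffer,
    weighting each pixel by its closeness to the image centre."""
    h, w = len(id_buffer), len(id_buffer[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    max_d = (cy**2 + cx**2) ** 0.5 or 1.0   # distance to a corner
    raw = defaultdict(float)
    for y, row in enumerate(id_buffer):
        for x, obj in enumerate(row):
            if obj == background:
                continue
            d = ((y - cy) ** 2 + (x - cx) ** 2) ** 0.5
            raw[obj] += 1.0 - d / max_d      # central pixels count more
    total = sum(raw.values()) or 1.0
    return {obj: v / total for obj, v in raw.items()}

def resolve_reference(candidates, scores):
    """Resolve an underspecified reference: among the objects matching
    the linguistic description, pick the most visually salient one."""
    return max(candidates, key=lambda obj: scores.get(obj, 0.0))

# Toy 5x7 buffer: object 1 sits near the centre, object 2 at the edge.
buf = [
    [0, 0, 0, 0, 0, 2, 2],
    [0, 0, 1, 1, 0, 2, 2],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
scores = saliency_scores(buf)
print(resolve_reference([1, 2], scores))   # → 1 (the central object)
```

Re-rendering and re-scoring the buffer each frame is what lets the context model evolve with the view: as the camera moves, the scores shift, and the same underspecified phrase can resolve to a different object.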




Cite this article

Kelleher, J., van Genabith, J. Visual Salience and Reference Resolution in Simulated 3-D Environments. Artificial Intelligence Review 21, 253–267 (2004). https://doi.org/10.1023/B:AIRE.0000036258.60851.83
