Memory and expectations in learning, language, and visual understanding

Schank, Roger C.; Fano, Andrew

doi:10.1007/BF00849039

Published: October 1995

Artificial Intelligence Review, Volume 9, pages 261–271 (1995)

Abstract

Research in vision and language has traditionally remained separate, in part because the classic task of generating a representation of a given image or sentence has led to an emphasis on the low-level structural aspects of these media. In this paper we argue that image and language understanding should be approached with the intent of facilitating the performance of a task. Under this view, research in image and language understanding must confront common issues that arise as a task is pursued. Language and images are both inputs that can be used to maintain a model of a task. We argue that a model may be maintained by incorporating changes in the scene that can be characterized at a high level of abstraction yet manifest themselves at relatively low levels of analysis. Existing task-relevant models and the associated domain knowledge are used to expect specific changes and to disambiguate their interpretation, thereby allowing these changes to modify the existing model. From this perspective, understanding input is largely independent of the modality of the input.

Author information

Authors and Affiliations

The Institute for the Learning Sciences, Northwestern University, Evanston, IL 60201, USA
Roger C. Schank & Andrew Fano

About this article

Cite this article

Schank, R.C., Fano, A. Memory and expectations in learning, language, and visual understanding. Artif Intell Rev 9, 261–271 (1995). https://doi.org/10.1007/BF00849039

Issue Date: October 1995
DOI: https://doi.org/10.1007/BF00849039
