Interpreting a dynamic and uncertain world: High-level vision

Artificial Intelligence Review

Abstract

When interpreting a dynamic and uncertain world, it is important to have a high-level vision component that can guide the reasoning of the whole vision system. This guidance is provided by an attentional mechanism that exploits knowledge of the specific problem being solved. Here we survey work relevant to the development of such an attentional mechanism, using surveillance as an application domain to tie together issues of spatial representation, events, behaviour, control and planning. The paper culminates in a brief description of HIVIS-WATCHER, a program that draws on all of these areas.

Author information

Address for correspondence: School of Cognitive and Computing Sciences, University of Sussex, Falmer, Brighton BN1 9QH, UK. Email: richardh@cogs.susx.ac.uk.

About this article

Howarth, R. Interpreting a dynamic and uncertain world: High-level vision. Artif Intell Rev 9, 37–63 (1995). https://doi.org/10.1007/BF00857653
