Natural Language Description of Image Sequences as a Form of Knowledge Representation

Nagel, H.-H.

doi:10.1007/3-540-48238-5_4

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1701)

Included in the following conference series:

Annual Conference on Artificial Intelligence

Abstract

An image sequence evaluation process combines information from different information sources. One of these sources is a camera which records a scene and provides the acquired information as a digitized image sequence. A different source provides knowledge regarding signal processing and geometry, exploited in order to map the image sequence signal to a system-internal representation of visible bodies and their movement in the depicted scene. Still another type of source provides abstract conceptual knowledge linking the system-internal geometric representation to tasks and goals of agents which act within the depicted scene or may influence it from the outside.

Rather than providing this third type of information for inference engines by ‘handcrafted’ rules or sets of axioms, it is postulated that this type of knowledge should be derived by algorithmic analysis of a suitably formulated natural language text: natural language text is considered as a genuine representation of abstract knowledge for an image sequence evaluation process. This hypothesis is studied for the example of a system which transforms video sequences of road scenes into natural language text describing the recorded actual traffic.

Author information

Authors and Affiliations

Institut für Algorithmen und Kognitive Systeme, Fakultät für Informatik der Universität Karlsruhe (TH), D-76128 Karlsruhe, Germany
H.-H. Nagel
Fraunhofer-Institut für Informations- und Datenverarbeitung (IITB), Fraunhoferstr. 1, D-76131 Karlsruhe, Germany
H.-H. Nagel

Editor information

Editors and Affiliations

Institut für Informatik III, Universität Bonn, Römerstraße 164, D-53117, Bonn, Germany
Wolfram Burgard & Armin B. Cremers
GMD-Forschungszentrum Informationstechnik GmbH, Schloß Birlinghoven, D-53754, Sankt Augustin, Germany
Thomas Christaller

About this paper

Cite this paper

Nagel, H.-H. (1999). Natural Language Description of Image Sequences as a Form of Knowledge Representation. In: Burgard, W., Cremers, A.B., Christaller, T. (eds) KI-99: Advances in Artificial Intelligence. KI 1999. Lecture Notes in Computer Science, vol 1701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48238-5_4

DOI: https://doi.org/10.1007/3-540-48238-5_4
Published: 03 June 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66495-6
Online ISBN: 978-3-540-48238-3
eBook Packages: Springer Book Archive
