Toward cooperative multimedia interaction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1374)

Abstract

The proliferation of information and services on our global information highways demands mechanisms to support more effective and efficient interaction. This article claims that efficient and effective interaction requires both cooperative and multimedia communication, illustrating this through several applications developed by our group that aim to enhance interaction with complex systems or information sources. After defining the terms cooperative and multimedia and arguing for their centrality in interfaces, we overview key processes in the automated generation of multimedia. We then illustrate these with implemented examples from several domains including information retrieval, direction providing, mission planning and computer maintenance. After describing a visionary system for information access, we conclude by describing our current efforts toward providing content-based browsing and search of video.

Editor information

Harry Bunt, Robbert-Jan Beun, Tijn Borghuis

Copyright information

© 1998 Springer-Verlag

About this paper

Cite this paper

Maybury, M.T. (1998). Toward cooperative multimedia interaction. In: Bunt, H., Beun, RJ., Borghuis, T. (eds) Multimodal Human-Computer Communication. CMC 1995. Lecture Notes in Computer Science, vol 1374. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052311

  • DOI: https://doi.org/10.1007/BFb0052311

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64380-7

  • Online ISBN: 978-3-540-69764-0

  • eBook Packages: Springer Book Archive
