Toward cooperative multimedia interaction

Maybury, Mark T.

doi:10.1007/BFb0052311

Conference paper
First Online: 01 January 2006

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1374)

Abstract

The proliferation of information and services on our global information highways demands mechanisms to support more effective and efficient interaction. This article claims that efficient and effective interaction requires both cooperative and multimedia communication, illustrating this through several applications developed by our group that aim to enhance interaction with complex systems or information sources. After defining the terms cooperative and multimedia and arguing for their centrality in interfaces, we overview key processes in the automated generation of multimedia. We then illustrate these with implemented examples from several domains including information retrieval, direction providing, mission planning and computer maintenance. After describing a visionary system for information access, we conclude by describing our current efforts toward providing content-based browsing and search of video.

Author information

Authors and Affiliations

Artificial Intelligence Center, The MITRE Corporation, 202 Burlington Road, Bedford, MA 01730
Mark T. Maybury

Editor information

Harry Bunt, Robbert-Jan Beun, Tijn Borghuis

About this paper

Cite this paper

Maybury, M.T. (1998). Toward cooperative multimedia interaction. In: Bunt, H., Beun, RJ., Borghuis, T. (eds) Multimodal Human-Computer Communication. CMC 1995. Lecture Notes in Computer Science, vol 1374. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052311

DOI: https://doi.org/10.1007/BFb0052311
Published: 17 May 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64380-7
Online ISBN: 978-3-540-69764-0
eBook Packages: Springer Book Archive
