DOI: 10.1145/958432.958443

Toward a theory of organized multimodal integration patterns during human-computer interaction

Published: 05 November 2003

ABSTRACT

As a new generation of multimodal systems begins to emerge, one dominant theme will be the integration and synchronization requirements for combining modalities into robust whole systems. In the present research, quantitative modeling is presented on the organization of users' speech and pen multimodal integration patterns. In particular, the potential malleability of users' multimodal integration patterns is explored, as well as variation in these patterns during system error handling and tasks varying in difficulty. Using a new dual-wizard simulation method, data was collected from twelve adults as they interacted with a map-based task using multimodal speech and pen input. Analyses based on over 1600 multimodal constructions revealed that users' dominant multimodal integration pattern was resistant to change, even when strong selective reinforcement was delivered to encourage switching from a sequential to simultaneous integration pattern, or vice versa. Instead, both sequential and simultaneous integrators showed evidence of entrenching further in their dominant integration patterns (i.e., increasing either their inter-modal lag or signal overlap) over the course of an interactive session, during system error handling, and when completing increasingly difficult tasks. In fact, during error handling these changes in the co-timing of multimodal signals became the main feature of hyper-clear multimodal language, with elongation of individual signals either attenuated or absent. Whereas Behavioral/Structuralist theory cannot account for these data, it is argued that Gestalt theory provides a valuable framework and insights into multimodal interaction. Implications of these findings are discussed for the development of a coherent theory of multimodal integration during human-computer interaction, and for the design of a new class of adaptive multimodal interfaces.
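
The abstract's distinction between sequential and simultaneous integrators rests on the co-timing of the speech and pen signals in each multimodal construction: overlapping signals indicate a simultaneous pattern, while a positive inter-modal lag indicates a sequential one. The sketch below is not the authors' method, only a minimal illustration of how such labels might be computed from signal onset/offset times; the field names and the 0.6 dominance cutoff are assumptions made for the example.

```python
# Minimal sketch (assumed field names and thresholds, not the paper's implementation)
# of classifying speech/pen co-timing as "simultaneous" vs. "sequential" and
# identifying a user's dominant integration pattern.

from dataclasses import dataclass
from typing import List

@dataclass
class Construction:
    speech_start: float  # all times in seconds
    speech_end: float
    pen_start: float
    pen_end: float

def integration_pattern(c: Construction) -> str:
    """Label one multimodal construction by whether its signals overlap in time."""
    overlap = min(c.speech_end, c.pen_end) - max(c.speech_start, c.pen_start)
    return "simultaneous" if overlap > 0 else "sequential"

def intermodal_lag(c: Construction) -> float:
    """Lag between the end of the first signal and the start of the second;
    negative values indicate temporal overlap."""
    return max(c.speech_start, c.pen_start) - min(c.speech_end, c.pen_end)

def dominant_pattern(constructions: List[Construction], threshold: float = 0.6) -> str:
    """Call a user a sequential or simultaneous integrator when one pattern
    accounts for at least `threshold` of their constructions (assumed cutoff)."""
    if not constructions:
        return "undetermined"
    labels = [integration_pattern(c) for c in constructions]
    for pattern in ("simultaneous", "sequential"):
        if labels.count(pattern) / len(labels) >= threshold:
            return f"{pattern} integrator"
    return "non-dominant"
```

Under this kind of scheme, the entrenchment reported in the abstract would appear as growing overlap durations for simultaneous integrators and growing inter-modal lags for sequential integrators over a session.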

Published in

ICMI '03: Proceedings of the 5th International Conference on Multimodal Interfaces
November 2003, 318 pages
ISBN: 1581136218
DOI: 10.1145/958432

                      Copyright © 2003 ACM


                      Publisher

                      Association for Computing Machinery

                      New York, NY, United States

                      Publication History

                      • Published: 5 November 2003


                      Acceptance Rates

ICMI '03 paper acceptance rate: 45 of 130 submissions (35%). Overall acceptance rate: 453 of 1,080 submissions (42%).
