ABSTRACT
Progress in computer vision and speech recognition technologies has recently enabled multimodal interfaces that use speech and gestures. These technologies offer promising alternatives to existing interfaces because they emulate the natural way in which humans communicate. However, no systematic work has been reported that formally evaluates the new speech/gesture interfaces. This paper is concerned with the formal experimental evaluation of new human-computer interactions enabled by speech and hand gestures. The paper describes an experiment, conducted with 23 subjects, that evaluates selection strategies for interaction with large screen displays. The multimodal interface designed for this experiment does not require the user to be in physical contact with any device: video cameras and long-range microphones serve as the system's input. Three selection strategies are evaluated, and results for different target sizes and positions are reported in terms of accuracy, selection time, and user preference. Design implications for vision/speech-based interfaces are inferred from these results. The study also raises new questions and topics for future research.
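The abstract does not name the three selection strategies, but the setup it describes (camera-tracked hand pointing combined with spoken commands, with accuracy and selection time logged per trial) implies a fusion loop of roughly the following shape. The Python sketch below is purely illustrative and is not the authors' implementation: the point-and-speak strategy, the circular targets, and every class and event name are assumptions, and the camera and microphone are replaced by canned event streams so the logic runs standalone.

```python
"""Illustrative sketch of one point-and-speak selection trial.

Hypothetical stand-in for the paper's system: real video-based hand
tracking and long-range speech recognition are simulated with fixed
(timestamp, ...) event lists.
"""
import math
from dataclasses import dataclass


@dataclass
class Target:
    name: str
    x: float       # screen position (pixels)
    y: float
    radius: float  # circular target size (pixels)


def hit_target(targets, px, py):
    """Return the target under the pointer, or None on a miss."""
    for t in targets:
        if math.hypot(px - t.x, py - t.y) <= t.radius:
            return t
    return None


def run_trial(targets, goal, pointer_stream, speech_stream):
    """Fuse the two modalities: the hand pointer aims, the spoken
    'select' command commits. Returns (hit, selection_time_s)."""
    start = None
    px = py = None
    speech = iter(speech_stream)
    next_cmd = next(speech, None)
    for t, x, y in pointer_stream:            # (time_s, x, y) samples
        if start is None:
            start = t                         # trial clock starts at first sample
        px, py = x, y
        # Consume speech events up to the current tracking timestamp.
        while next_cmd is not None and next_cmd[0] <= t:
            cmd_time, word = next_cmd
            if word == "select":              # speech commits the selection
                chosen = hit_target(targets, px, py)
                return chosen is goal, cmd_time - start
            next_cmd = next(speech, None)
    return False, None                        # trial ended without a selection


if __name__ == "__main__":
    targets = [Target("A", 200, 300, 40), Target("B", 900, 300, 40)]
    # Canned input: the hand drifts toward target B, then the user says "select".
    pointer = [(0.0, 100, 100), (0.4, 500, 250), (0.8, 890, 305), (1.2, 902, 298)]
    speech = [(1.1, "select")]
    hit, rt = run_trial(targets, targets[1], pointer, speech)
    print(f"hit={hit} selection_time={rt:.2f}s")
```

Aggregating the (hit, selection time) pairs over repeated trials at varying target sizes and positions would yield the accuracy and selection-time measures the abstract reports.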