ABSTRACT
Speech has great potential as an input mechanism for ubiquitous computing. However, the current requirements for accurate speech recognition, such as a quiet environment and a well-positioned, high-quality microphone, are unreasonable to expect in a realistic setting. In a physical environment, there is often contextual information that can be sensed and used to augment the speech signal. We investigated improving speech recognition rates for an electronic personal trainer by using knowledge about what exercise equipment was in use as context. We performed an experiment in which participants spoke in an instrumented apartment environment, and we compared the recognition rates of a larger grammar with those of a smaller grammar determined by the context.
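The core idea, narrowing the recognizer's active grammar to commands relevant to the equipment currently in use, can be sketched as follows. This is a minimal illustration, not the paper's implementation; all names (`EQUIPMENT_COMMANDS`, `select_grammar`, the command phrases) are hypothetical.

```python
# Hypothetical sketch: context-driven grammar selection for a speech recognizer.
# Sensors report which equipment is in use; the recognizer is then restricted
# to a smaller grammar, which should reduce confusable alternatives.

# Commands relevant only when a given piece of equipment is sensed in use.
EQUIPMENT_COMMANDS = {
    "treadmill": {"start treadmill workout", "stop treadmill workout"},
    "free_weights": {"log bench press set", "log bicep curl set"},
}

# Commands that remain available regardless of context.
GLOBAL_COMMANDS = {"show workout summary", "pause music", "resume music"}

def select_grammar(equipment_in_use):
    """Return the reduced grammar: global commands plus those for active equipment."""
    grammar = set(GLOBAL_COMMANDS)
    for item in equipment_in_use:
        grammar |= EQUIPMENT_COMMANDS.get(item, set())
    return grammar

# With only the treadmill in use, free-weight commands are excluded
# from the recognizer's search space.
print(sorted(select_grammar({"treadmill"})))
```

The contrast in the experiment is between recognizing against the union of all commands (the larger grammar) and recognizing against only the context-selected subset (the smaller grammar).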
Index Terms
- Disambiguating speech commands using physical context