ABSTRACT
Speech has great potential as an input mechanism for ubiquitous computing. However, the current requirements for accurate speech recognition, such as a quiet environment and a well-positioned, high-quality microphone, are unreasonable to expect in a realistic setting. In a physical environment, there is often contextual information that can be sensed and used to augment the speech signal. We investigated improving speech recognition rates for an electronic personal trainer by using knowledge about what exercise equipment was in use as context. We performed an experiment in which participants spoke in an instrumented apartment environment, and we compared the recognition rates of a larger grammar with those of a smaller grammar determined by the context.
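The core idea, narrowing the recognizer's active grammar to commands relevant to the equipment currently in use, can be sketched as follows. This is a minimal illustration, not the paper's implementation; all names (`EQUIPMENT_COMMANDS`, `select_grammar`, the command phrases) are hypothetical.

```python
# Hypothetical sketch: context-driven grammar selection for a speech recognizer.
# Sensors report which equipment is in use; the recognizer is then restricted
# to a smaller grammar, which should reduce confusable alternatives.

# Commands relevant only when a given piece of equipment is sensed in use.
EQUIPMENT_COMMANDS = {
    "treadmill": {"start treadmill workout", "stop treadmill workout"},
    "free_weights": {"log bench press set", "log bicep curl set"},
}

# Commands that remain available regardless of context.
GLOBAL_COMMANDS = {"show workout summary", "pause music", "resume music"}

def select_grammar(equipment_in_use):
    """Return the reduced grammar: global commands plus those for active equipment."""
    grammar = set(GLOBAL_COMMANDS)
    for item in equipment_in_use:
        grammar |= EQUIPMENT_COMMANDS.get(item, set())
    return grammar

# With only the treadmill in use, free-weight commands are excluded
# from the recognizer's search space.
print(sorted(select_grammar({"treadmill"})))
```

The contrast in the experiment is between recognizing against the union of all commands (the larger grammar) and recognizing against only the context-selected subset (the smaller grammar).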
Index Terms
- Disambiguating speech commands using physical context