Challenges in adopting speech recognition

Authors:
Li Deng, Microsoft Research, Redmond, WA
Xuedong Huang, Microsoft .NET Speech Technologies Group, Redmond, WA

Communications of the ACM, Volume 47, Issue 1, January 2004, pp. 69–75. https://doi.org/10.1145/962081.962108

Published: 01 January 2004

Abstract

Although progress has been impressive, there are still several hurdles that speech recognition technology must clear before ubiquitous adoption can be realized. R&D in spontaneous and free-flowing speech style is critical to its success.

References

DARPA's EARS Conference (Boston, MA, May 21–22, 2003).
DARPA's EARS Kickoff Meeting (Vienna, VA, May 9–10, 2002).
Datamonitor. Voice Automation: Past, Present, and Future. White Paper (July 2003).
Deng, L., and O'Shaughnessy, D. Speech Processing: A Dynamic and Optimization-Oriented Approach. Marcel Dekker, NY, 2003.
Deng, L., Wang, K., Acero, A., Hon, H., Droppo, J., Boulis, C., Wang, Y., Jacoby, D., Mahajan, M., Chelba, C., and Huang, X.D. Distributed speech processing in MiPad's multimodal user interface. IEEE Transactions on Speech and Audio 10 (2002), 605–619.
Furui, S. Recent progress in spontaneous speech recognition and understanding. In Proceedings of the IEEE Workshop on Multimedia Signal Processing (Dec. 2002).
Hirsch, H., and Pearce, D. The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions. ISCA ITRW Workshop on Automatic Speech Recognition (Paris, 2000).
Huang, X.D., Acero, A., and Hon, H. Spoken Language Processing: A Guide to Theory, Algorithms, and System Development. Prentice Hall, NY, 2001.
Neti, C., Iyengar, G., Potamianos, G., Senior, A., and Maison, B. Perceptual interfaces for information interaction: Joint processing of audio and visual information for human-computer interaction. In the ICSLP Proceedings 1 (Beijing, 2000), 11–14.
Oviatt, S. Breaking the robustness barrier: Recent progress on the design of robust multimodal systems. Advances in Computers. M. Zelkowitz, Ed. Academic Press, 2002, 305–341.
Zhang, Y. et al. Air- and bone-conductive integrated microphones for robust speech detection and enhancement. In Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (St. Thomas, U.S. Virgin Islands, Dec. 2003).

Published in

Communications of the ACM Volume 47, Issue 1
Multimodal interfaces that flex, adapt, and persist
January 2004
104 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/962081

Copyright © 2004 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States