Speech Recognition

Berton, André; Kaltenmeier, Alfred; Haiber, Udo; Schreiner, Olaf

doi:10.1007/3-540-36678-4_6

Part of the book series: Cognitive Technologies (COGTECH)

Summary

Human-machine interaction in SmartKom is a complex task, characterized by natural, spontaneous language, speaker independence, large vocabularies, and background noise. Speech recognition is an integral part of the multimodal dialogue system: it transforms the acoustic input signal into an orthographic transcription of the speaker's utterance. This contribution discusses how to enhance and customize the speech recognizer for the SmartKom applications. Significant improvements were achieved by adapting the speech recognizer to the environment, to the speaker, and to the task. Confidence measures were investigated as a means to reject unreliable user input and to detect user input containing unknown words, i.e., words that are not in the recognizer's vocabulary. Finally, we present new ideas for future work.
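
As an illustration only, the role of confidence measures described in the summary might be sketched as follows. This is not the SmartKom recognizer's method: the `WordHypothesis` type, the function names, the score range, and both thresholds are assumptions introduced for this sketch, and mean word confidence is just one simple way to score an utterance.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordHypothesis:
    word: str          # orthographic form output by the recognizer
    confidence: float  # hypothetical score in [0, 1]; higher means more reliable


def accept_utterance(hyps: List[WordHypothesis], reject_below: float = 0.5) -> bool:
    """Accept the utterance only if its mean word confidence reaches the threshold.

    The threshold value is an assumption; in practice it would be tuned on held-out data.
    """
    if not hyps:
        return False  # nothing recognized, so treat the input as unreliable
    mean_conf = sum(h.confidence for h in hyps) / len(hyps)
    return mean_conf >= reject_below


def suspect_words(hyps: List[WordHypothesis], suspect_below: float = 0.3) -> List[str]:
    """Return words whose confidence is low enough that they may cover an unknown word.

    A recognizer can only output words from its vocabulary, so an unknown word
    typically surfaces as a poorly matching in-vocabulary word.
    """
    return [h.word for h in hyps if h.confidence < suspect_below]


hyps = [
    WordHypothesis("show", 0.9),
    WordHypothesis("me", 0.8),
    WordHypothesis("cinemas", 0.2),
]
accepted = accept_utterance(hyps)
flagged = suspect_words(hyps)
```

In this sketch the utterance as a whole is accepted, while the low-scoring word is flagged so that a dialogue manager could, for example, ask the user to repeat or rephrase that part.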

Author information

Authors and Affiliations

DaimlerChrysler AG, Research and Technology, Ulm, Germany
André Berton, Alfred Kaltenmeier, Udo Haiber & Olaf Schreiner

Editor information

Editors and Affiliations

German Research Center for AI, DFKI GmbH, Stuhlsatzenhausweg 3, 66123, Saarbrücken, Germany
Wolfgang Wahlster

About this chapter

Cite this chapter

Berton, A., Kaltenmeier, A., Haiber, U., Schreiner, O. (2006). Speech Recognition. In: Wahlster, W. (ed.) SmartKom: Foundations of Multimodal Dialogue Systems. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36678-4_6

DOI: https://doi.org/10.1007/3-540-36678-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23732-7
Online ISBN: 978-3-540-36678-2
eBook Packages: Computer Science, Computer Science (R0)
