Abstract
In this work, we summarize our experience with detecting unexpected words in automatic speech recognition (ASR). We introduce two approaches based on a paradigm of incongruence detection between generic and specific recognition systems. Arguing that detecting incongruence is necessary but not sufficient once possible follow-up actions are taken into account, we motivate our preference for one approach over the other. Nevertheless, we show that a fusion of the two outperforms either single system. Finally, we propose possible actions to take after unexpected words have been detected, and conclude with general remarks on what we found to be important when dealing with unexpected words.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kombrink, S., Hannemann, M., Burget, L. (2012). Out-of-Vocabulary Word Detection and Beyond. In: Weinshall, D., Anemüller, J., van Gool, L. (eds) Detection and Identification of Rare Audiovisual Cues. Studies in Computational Intelligence, vol 384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24034-8_4
DOI: https://doi.org/10.1007/978-3-642-24034-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24033-1
Online ISBN: 978-3-642-24034-8
eBook Packages: Engineering, Engineering (R0)