Significant ASR Error Detection for Conversational Voice Assistants | IEEE Conference Publication | IEEE Xplore


Abstract:

Modern Automatic Speech Recognition (ASR) systems are evaluated with respect to Word Error Rate (WER). While WER is a useful metric for training and evaluating speech models, it does not fully reflect the semantic difference between predicted and ground-truth transcriptions. In conversational voice assistants, understanding the semantic meaning of the user request well enough is often more important than recognizing every word correctly. In this work, we propose a system that can determine, to a high degree of accuracy, whether the semantics of a predicted transcript and a reference transcript are significantly different. This knowledge is used to identify ASR errors that can result in downstream failure in conversational voice assistants. Reliably identifying these errors can inform design choices for ASR systems that target improvement on the most harmful errors.
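
As a rough illustration of why WER alone can hide harmful errors, the sketch below computes word-level WER with a standard edit-distance recurrence and applies it to two hypothetical transcripts. The function name `word_error_rate` and the example utterances are assumptions made for illustration; they are not taken from the paper, and this is not the detection system the abstract proposes.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical utterances (illustrative only, not from the paper).
reference = "set an alarm for seven am"
minor_error = "set the alarm for seven am"     # intent is preserved
harmful_error = "set an alarm for eleven am"   # alarm time is wrong

wer_minor = word_error_rate(reference, minor_error)
wer_harmful = word_error_rate(reference, harmful_error)
print(wer_minor, wer_harmful)
```

Both hypotheses differ from the reference by a single substituted word, so they receive the same WER, yet only the second would plausibly cause a downstream failure in a voice assistant.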
Date of Conference: 14-19 April 2024
Date Added to IEEE Xplore: 18 March 2024
Conference Location: Seoul, Korea, Republic of
