Improving automatic speech recognition in noise by energy normalization and signal resynthesis | IEEE Conference Publication | IEEE Xplore

Abstract:

This paper presents the contribution of an energy normalization technique to automatic speech recognition in babble noise, where the machine assumes that speech and noise have the same energy level and therefore the same loudness. The loudness of the target speech and of the noise is likewise an important factor when humans recognize speech in everyday conditions. Humans recognize louder speech better than quieter speech, even when both reach the listener at the same signal-to-noise ratio (SNR). This phenomenon was tested on machines, whose recognition performance varies roughly from 75% to 90% across a wide range of SNRs. By contrast, human recognition performance is more SNR-dependent: it varies from 30% to 95%. With energy normalization, the machines have a lower recognition rate on average than humans in less noisy conditions (positive SNR), but tend to outperform humans in highly noisy conditions (negative SNRs such as -4 dB and -6 dB). The study also confirms that formant processing has no significant effect on recognizing speech in noise. This implies that formant-based vocal tract length normalization is unable to improve the performance of machines in noise.
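The abstract does not describe the normalization procedure itself, so the following is only an illustrative sketch of the general idea: bringing speech and noise to the same energy level and then mixing them at a chosen SNR. It assumes RMS-based scaling and uses synthetic signals; the function names (`rms`, `normalize_energy`, `snr_db`, `mix_at_snr`) and the target level of 0.1 are hypothetical and are not taken from the paper.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def normalize_energy(signal, target_rms=0.1):
    """Scale a signal so that its RMS level equals target_rms (assumed value)."""
    level = rms(signal)
    if level == 0.0:
        return list(signal)  # silent input: nothing to scale
    gain = target_rms / level
    return [gain * x for x in signal]


def snr_db(speech, noise):
    """SNR in dB, computed from the RMS levels of speech and noise."""
    return 20.0 * math.log10(rms(speech) / rms(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale the noise to reach target_snr_db relative to the speech, then add."""
    gain = rms(speech) / (rms(noise) * 10.0 ** (target_snr_db / 20.0))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


# Synthetic stand-ins: a 200 Hz tone for "speech", Gaussian noise for "babble".
random.seed(0)
rate = 8000
speech = [0.5 * math.sin(2 * math.pi * 200 * t / rate) for t in range(rate)]
noise = [random.gauss(0.0, 0.05) for _ in range(rate)]

# After normalization both signals share the same RMS level (0 dB SNR).
speech_n = normalize_energy(speech)
noise_n = normalize_energy(noise)

# Mix at one of the negative SNRs mentioned in the abstract.
mixture, scaled_noise = mix_at_snr(speech_n, noise_n, -6.0)
```

In this sketch, equal energy corresponds to 0 dB SNR, and other SNR conditions are obtained by rescaling only the noise. A real system would operate on recorded speech and babble noise rather than on a tone and Gaussian samples.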
Date of Conference: 25-27 August 2011
Date Added to IEEE Xplore: 20 October 2011
Conference Location: Cluj-Napoca, Romania
