Two-Stage Enhancement of Noisy and Reverberant Microphone Array Speech for Automatic Speech Recognition Systems Trained with Only Clean Speech | IEEE Conference Publication | IEEE Xplore

Abstract:

We propose a two-stage approach to enhancing far-field microphone array speech collected in reverberant conditions and corrupted by interfering speakers and noise. We aim to produce top-quality enhanced speech for use by a black-box automatic speech recognition (ASR) system already trained on clean speech. We explore different deep neural network (DNN) architectures; the best configuration comprises two stages. First, in pre-enhancement, we use features with temporal context from a subset of microphones to perform enhancement. Second, in integration, we concatenate the enhanced and noisy features from all microphones to estimate the anechoic speech of a reference channel as the overall output. We test on eight speakers from the Wall Street Journal corpus, each with 5 minutes of speech for DNN training, at a signal-to-interference-plus-noise ratio of 5-15 dB, a distance of 1-5 m, and a reverberation time of 0.2-0.3 s. Our best 8-channel, speaker-dependent enhancement system attains a perceptual evaluation of speech quality (PESQ) score of 2.95, up from 2.43 for our single-channel system. When the enhanced speech is passed to speaker-independent ASR on a 230K-word recognition task, we achieve a word error rate of 6.56%, down from 17.89% for enhanced speech from the single-channel system and from 48.47% for unprocessed noisy speech from the reference channel.
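
The data flow of the two stages can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' implementation: the feature dimension, the context width, the choice of microphone subset, and the `toy_dnn` stand-in (random, untrained weights) are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def splice_context(feats, left=3, right=3):
    """Stack each frame with its neighbours: (T, D) -> (T, (left+1+right)*D)."""
    T, _ = feats.shape
    # Repeat the edge frames so every frame has a full context window.
    padded = np.pad(feats, ((left, right), (0, 0)), mode="edge")
    return np.concatenate(
        [padded[i:i + T] for i in range(left + 1 + right)], axis=1
    )

def toy_dnn(in_dim, out_dim, rng, hidden=64):
    """Stand-in for a trained DNN (assumption): one random tanh layer."""
    w1 = rng.standard_normal((in_dim, hidden)) * 0.01
    w2 = rng.standard_normal((hidden, out_dim)) * 0.01
    return lambda x: np.tanh(x @ w1) @ w2

def two_stage_enhance(noisy, subset, rng, left=3, right=3):
    """noisy: (M, T, D) features for M microphones; returns (T, D) estimate."""
    M, _, D = noisy.shape
    # Stage 1 (pre-enhancement): temporal-context features from a subset
    # of microphones are mapped to one enhanced feature stream.
    pre_in = np.concatenate(
        [splice_context(noisy[m], left, right) for m in subset], axis=1
    )
    enhanced = toy_dnn(pre_in.shape[1], D, rng)(pre_in)
    # Stage 2 (integration): enhanced features are concatenated with the
    # noisy features of all microphones to estimate the reference channel.
    integ_in = np.concatenate([enhanced] + [noisy[m] for m in range(M)], axis=1)
    output = toy_dnn(integ_in.shape[1], D, rng)(integ_in)
    return output, pre_in.shape, integ_in.shape

rng = np.random.default_rng(0)
noisy = rng.standard_normal((8, 50, 40))  # 8 mics, 50 frames, 40-dim features
output, pre_shape, integ_shape = two_stage_enhance(noisy, [0, 1], rng)
```

With these assumed sizes, the pre-enhancement input has 2 x 7 x 40 = 560 dimensions per frame and the integration input has 40 + 8 x 40 = 360. In a real system both networks would be trained to regress toward clean, anechoic target features.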
Date of Conference: 26-29 November 2018
Date Added to IEEE Xplore: 06 May 2019
Conference Location: Taipei, Taiwan
