Simultaneous Denoising and Dereverberation for Low-Latency Applications Using Frame-by-Frame Online Unified Convolutional Beamformer

Nakatani, Tomohiro; Kinoshita, Keisuke

doi:10.21437/Interspeech.2019-1286

This article presents frame-by-frame online processing algorithms for a Weighted Power minimization Distortionless response convolutional beamformer (WPD). The WPD unifies two widely used multichannel methods, weighted prediction error (WPE) dereverberation and minimum power distortionless response (MPDR) beamforming for denoising, into a single convolutional beamformer, and achieves simultaneous dereverberation and denoising based on maximum likelihood estimation. We derive two online algorithms: one based on frame-by-frame recursive updating of the spatio-temporal covariance matrix of the captured signal, and the other based on recursive least squares estimation of the convolutional beamformer. In both algorithms, the desired signal’s relative transfer function (RTF) is also estimated online, using neural-network-based online mask estimation. Experiments using the REVERB challenge dataset show the effectiveness of both algorithms in terms of objective speech enhancement measures and automatic speech recognition (ASR) performance.
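As an illustration of the first algorithm family described above, the following is a minimal NumPy sketch of a frame-by-frame covariance update followed by a distortionless filter solve, for a single frequency bin. It is not the authors' implementation. The function names, the forgetting factor `alpha`, the prediction `delay`, the number of `taps`, and the diagonal loading `diag_load` are all assumptions made for this example, and the RTF and the per-frame power of the desired signal are taken as given inputs rather than estimated online with a mask network as in the paper.

```python
import numpy as np


def stack_frames(X, t, taps, delay):
    """Stack the current frame with `taps` past frames, starting `delay` frames back.

    X has shape (T, M): T STFT frames of M microphones for one frequency bin.
    Frames before the start of the signal are treated as zeros.
    """
    M = X.shape[1]
    parts = [X[t]]
    for k in range(delay, delay + taps):
        idx = t - k
        parts.append(X[idx] if idx >= 0 else np.zeros(M, dtype=X.dtype))
    return np.concatenate(parts)


def online_wpd(X, rtf, power, taps=2, delay=1, alpha=0.99, diag_load=1e-3):
    """Hypothetical frame-by-frame convolutional beamformer for one frequency bin.

    rtf   : (M,) relative transfer function of the desired signal (assumed known).
    power : (T,) per-frame power estimate of the desired signal (assumed known).
    Returns the enhanced frames and the filter from the last frame.
    """
    T, M = X.shape
    D = M * (taps + 1)
    # Weighted spatio-temporal covariance, initialised with diagonal loading.
    R = diag_load * np.eye(D, dtype=complex)
    # Zero-padded steering vector: the constraint acts on the current frame only.
    v_bar = np.zeros(D, dtype=complex)
    v_bar[:M] = rtf
    out = np.zeros(T, dtype=complex)
    w = v_bar.copy()
    for t in range(T):
        x_bar = stack_frames(X, t, taps, delay)
        # Recursive update with exponential forgetting (an assumed update rule).
        weight = 1.0 / max(float(power[t]), 1e-8)
        R = alpha * R + weight * np.outer(x_bar, x_bar.conj())
        # Minimise weighted output power subject to w^H v_bar = 1.
        num = np.linalg.solve(R, v_bar)
        w = num / (v_bar.conj() @ num)
        out[t] = w.conj() @ x_bar
    return out, w
```

Solving a linear system at every frame keeps the sketch short. A recursive least squares variant, as in the second algorithm of the abstract, would instead update the inverse covariance directly and avoid the per-frame solve.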

Cite as: Nakatani, T., Kinoshita, K. (2019) Simultaneous Denoising and Dereverberation for Low-Latency Applications Using Frame-by-Frame Online Unified Convolutional Beamformer. Proc. Interspeech 2019, 111-115, doi: 10.21437/Interspeech.2019-1286

@inproceedings{nakatani19_interspeech,
  author={Tomohiro Nakatani and Keisuke Kinoshita},
  title={{Simultaneous Denoising and Dereverberation for Low-Latency Applications Using Frame-by-Frame Online Unified Convolutional Beamformer}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={111--115},
  doi={10.21437/Interspeech.2019-1286}
}