Single-Channel Speech Enhancement Using Double Spectrum

Blass, Martin; Mowlaee, Pejman; Kleijn, W. Bastiaan

doi:10.21437/Interspeech.2016-234

Single-channel speech enhancement is often formulated in the Short-Time Fourier Transform (STFT) domain. As an alternative, several previous studies have reported advantages of speech processing using pitch-synchronous analysis and of filtering in the modulation transform domain. We propose to use the Double Spectrum (DS), obtained by applying a pitch-synchronous transform followed by a modulation transform. The linearity and sparseness of the DS domain are beneficial for single-channel speech enhancement. The effectiveness of the proposed DS-based speech enhancement is demonstrated by comparing it with STFT-based and modulation-based benchmarks. In contrast to the benchmark methods, the proposed method neither exploits statistical information nor uses temporal smoothing. On average, the proposed method improves PESQ by 0.3 for babble noise.
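
The two-stage construction described in the abstract can be illustrated with a rough sketch. It is not the authors' implementation: it assumes a constant, known pitch period (no pitch tracking or time warping) and substitutes an orthonormal DCT for both the pitch-synchronous and the modulation stage. The function names `dct_matrix`, `double_spectrum`, and `inverse_double_spectrum` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (stand-in transform, an assumption)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def double_spectrum(x, period, cycles_per_block):
    """Two-stage transform: within each pitch cycle, then across cycles."""
    block = period * cycles_per_block
    n_blocks = len(x) // block
    # Shape (blocks, cycles, samples per cycle); trailing samples are dropped.
    frames = x[: n_blocks * block].reshape(n_blocks, cycles_per_block, period)
    c_p = dct_matrix(period)
    c_m = dct_matrix(cycles_per_block)
    stage1 = frames @ c_p.T                         # transform along each cycle
    return np.einsum("qc,bcp->bqp", c_m, stage1)    # transform across cycles

def inverse_double_spectrum(ds):
    """Invert both orthonormal stages and flatten back to a 1-D signal."""
    _, cycles_per_block, period = ds.shape
    c_p = dct_matrix(period)
    c_m = dct_matrix(cycles_per_block)
    stage1 = np.einsum("qc,bqp->bcp", c_m, ds)      # applies c_m transposed
    return (stage1 @ c_p).reshape(-1)

# Toy data: a perfectly periodic "voiced" signal plus white noise.
period, cycles = 40, 8
rng = np.random.default_rng(0)
voiced = np.tile(rng.standard_normal(period), cycles * 4)
noise = 0.1 * rng.standard_normal(voiced.size)

ds_voiced = double_spectrum(voiced, period, cycles)
ds_noise = double_spectrum(noise, period, cycles)
ds_noisy = double_spectrum(voiced + noise, period, cycles)

# Fraction of the periodic signal's energy in the lowest modulation band.
energy_q0 = np.sum(ds_voiced[:, 0, :] ** 2) / np.sum(ds_voiced ** 2)
```

In this toy setting the transform is linear, so `ds_noisy` equals `ds_voiced + ds_noise`, and a perfectly periodic input concentrates its energy in the lowest modulation band (`energy_q0` is close to 1), which is one simple way to picture the linearity and sparseness the abstract refers to. Real speech has a time-varying pitch, so this sketch should be read only as intuition.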

Cite as: Blass, M., Mowlaee, P., Kleijn, W.B. (2016) Single-Channel Speech Enhancement Using Double Spectrum. Proc. Interspeech 2016, 1740-1744, doi: 10.21437/Interspeech.2016-234

@inproceedings{blass16_interspeech,
  author={Martin Blass and Pejman Mowlaee and W. Bastiaan Kleijn},
  title={{Single-Channel Speech Enhancement Using Double Spectrum}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={1740--1744},
  doi={10.21437/Interspeech.2016-234}
}