A Phase-Based Time-Frequency Masking for Multi-Channel Speech Enhancement in Domestic Environments

Brutti, Alessio; Tsiami, Antigoni; Katsamanis, Athanasios; Maragos, Petros

doi:10.21437/Interspeech.2016-150

This paper introduces a novel time-frequency masking approach for speech enhancement, based on the consistency of the phase of the cross-spectrum observed at multiple microphones. The proposed approach is derived from solutions commonly adopted in spatial source separation and can be used as a post-filter in traditional multi-channel speech enhancement schemes. Because it does not rely on a model of diffuse-noise coherence, the proposed method complements traditional post-filter implementations by targeting non-diffuse (coherent) sources. It is particularly effective in domestic scenarios where microphones in a given room capture coherent interfering sources active in adjacent rooms.

An experimental analysis on the DIRHA-GRID corpus shows that the proposed method considerably improves the signal-to-interference ratio and can be used on top of state-of-the-art multi-channel speech enhancement methods.
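
To make the general idea of phase-based masking concrete, the following is a minimal two-microphone sketch, not the method proposed in the paper. It assumes the time difference of arrival (TDOA) of the target is known, and it keeps a time-frequency bin only when the phase of the cross-spectrum is close to the phase expected for that TDOA. The function name `phase_mask`, the binary decision, and the tolerance `tol_rad` are illustrative assumptions; the abstract does not specify the mask's actual form.

```python
import cmath
import math


def phase_mask(X1, X2, freqs_hz, tdoa_s, tol_rad=0.5):
    """Binary mask for one STFT frame of a two-microphone recording.

    X1, X2   : complex STFT coefficients of the two channels (one per bin)
    freqs_hz : centre frequency of each bin, in Hz
    tdoa_s   : assumed delay of channel 2 relative to channel 1 for the
               target source, in seconds (hypothetical input, assumed known)
    tol_rad  : allowed phase deviation in radians (illustrative threshold)
    """
    mask = []
    for x1, x2, f in zip(X1, X2, freqs_hz):
        # Cross-spectrum at this bin; its phase encodes the inter-channel delay.
        cross = x1 * x2.conjugate()
        # Phase the cross-spectrum would have if only the target were active.
        expected = cmath.exp(2j * math.pi * f * tdoa_s)
        # Deviation from the expected phase, wrapped to (-pi, pi].
        deviation = cmath.phase(cross * expected.conjugate())
        mask.append(1.0 if abs(deviation) <= tol_rad else 0.0)
    return mask


def delayed(X, freqs_hz, delay_s):
    """Apply a pure delay to a frame in the frequency domain (test helper)."""
    return [x * cmath.exp(-2j * math.pi * f * delay_s)
            for x, f in zip(X, freqs_hz)]


freqs = [100.0, 200.0, 300.0]
source = [1 + 1j, 2 - 1j, -0.5 + 0.3j]

# Source arriving with the assumed target delay: bins are kept.
target_mask = phase_mask(source, delayed(source, freqs, 1e-4), freqs, 1e-4)

# Coherent interferer arriving with a different delay: bins are rejected.
interferer_mask = phase_mask(source, delayed(source, freqs, 1e-3), freqs, 1e-4)
```

In this toy setting a coherent interferer is rejected because its inter-channel phase is inconsistent with the target direction, which is the kind of source that a diffuse-noise coherence model does not capture. A practical system would apply the mask to the output of a beamformer and would typically use a soft rather than binary mask.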

Cite as: Brutti, A., Tsiami, A., Katsamanis, A., Maragos, P. (2016) A Phase-Based Time-Frequency Masking for Multi-Channel Speech Enhancement in Domestic Environments. Proc. Interspeech 2016, 2875-2879, doi: 10.21437/Interspeech.2016-150

@inproceedings{brutti16_interspeech,
  author={Alessio Brutti and Antigoni Tsiami and Athanasios Katsamanis and Petros Maragos},
  title={{A Phase-Based Time-Frequency Masking for Multi-Channel Speech Enhancement in Domestic Environments}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={2875--2879},
  doi={10.21437/Interspeech.2016-150}
}