Abstract:
This paper investigates the time-frequency (TF) sparsity of speech together with the unique inter-sensor characteristics of the acoustic vector sensor (AVS) to formulate an effective speech enhancement approach under the minimum mean square error (MMSE) criterion, in combination with a fixed beamformer (FBF). The proposed approach exploits the inter-sensor data ratio (ISDR) of the AVS and the TF sparsity of speech to derive a mask that extracts and enhances a target speech signal recorded in the presence of a spatially separated interfering speech signal and background noise. Experimental results show that the proposed AVS-ISDRSS algorithm effectively suppresses the spatial interference and additive background noise while improving the perceptual quality of the target speech. In addition, the AVS-ISDRSS algorithm does not require voice activity detection (VAD) to estimate the speech, which greatly reduces the computational complexity.
Date of Conference: 20-23 August 2014
Date Added to IEEE Xplore: 18 September 2014
Electronic ISBN: 978-1-4799-4612-9