Abstract:
Acoustic signal processing in the circular harmonic domain (CHD) is an appealing method for speaker localization, since it inherently supports wideband acoustic sources and provides frequency-invariant beampatterns. However, the performance of existing circular harmonic direction-of-arrival (DOA) estimation approaches can be degraded by a variety of factors, including background noise and reverberation in the acoustic environment, the small aperture size of the circular array, and the presence of multiple active sources. This paper addresses these issues by proposing a novel multi-speaker CHD localization method for small-sized microphone arrays using deep convolutional neural networks (CNNs). The core idea is to construct circular harmonic features by joining selected time-frequency (TF) bins of higher power and applying a randomization process that mimics the sparsity property of speech signals. We then formulate multi-speaker DOA estimation as a multi-label classification task and propose to use a CNN with binary cross-entropy as the loss function. Experimental results show that our method performs significantly better than the baseline methods, on both simulated and real data, in terms of the accuracy of DOA estimation.
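The multi-label formulation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the angular resolution, the decision threshold, or any implementation details, so the 5-degree class grid, the 0.5 threshold, and the function names `encode_doas`, `binary_cross_entropy`, and `decode_doas` are all assumptions made for illustration. The CNN and the circular harmonic features are not shown; the sketch only covers how several simultaneous DOAs might be encoded as one multi-hot target and scored with binary cross-entropy.

```python
import math


def encode_doas(doas_deg, resolution_deg=5):
    """Build a multi-hot target with one class per azimuth sector.

    resolution_deg is an assumed grid spacing, not a value from the paper.
    """
    n_classes = 360 // resolution_deg
    target = [0.0] * n_classes
    for doa in doas_deg:
        # Map each source angle to its nearest class, wrapping around 360.
        idx = int(round((doa % 360) / resolution_deg)) % n_classes
        target[idx] = 1.0
    return target


def binary_cross_entropy(probs, target, eps=1e-7):
    """Mean binary cross-entropy over all angle classes."""
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(target)


def decode_doas(probs, resolution_deg=5, threshold=0.5):
    """Return every angle whose class probability reaches the threshold.

    The threshold is an assumed decision rule, not one stated in the paper.
    """
    return [i * resolution_deg for i, p in enumerate(probs) if p >= threshold]


# Two simultaneously active speakers at 30 and 200 degrees.
target = encode_doas([30, 200])
uninformed = [0.5] * len(target)
loss = binary_cross_entropy(uninformed, target)
```

Because each angle class is scored independently, the same output layer can represent one, two, or more active speakers without fixing the source count in advance, which is the usual motivation for a multi-label rather than a single-label setup.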
Published in: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 14-19 April 2024
Date Added to IEEE Xplore: 18 March 2024