Simultaneous Speech Detection With Spatial Features for Speaker Diarization

Abstract:

Simultaneous speech poses a challenging problem for conventional speaker diarization systems. In meeting data, a substantial amount of missed speech error is due to speaker overlaps, since usually only one speaker label per segment is assigned. Furthermore, simultaneous speech included in training data can lead to corrupt speaker models and thus worse segmentation performance. In this paper, we propose the use of three spatial cross-correlation-based features together with spectral information for speaker overlap detection on distant microphones. Different microphone-pair data are fused by means of principal component analysis. We have obtained an improvement of the speaker diarization system over the baseline by discarding overlap segments from model training and assigning two speaker labels to them according to likelihoods in Viterbi decoding. In experiments conducted on the AMI Meeting corpus, we achieve a relative DER reduction of 11.2% and 17.0% for single- and multi-site data, respectively. The improvement of clustering with techniques such as beamforming and TDOA-feature stream also leads to a higher effectiveness of the overlap labeling algorithm. Preliminary experiments with NIST RT data show DER improvement on the RT'09 meeting recordings as well.
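The abstract mentions fusing data from different microphone pairs by principal component analysis. The sketch below is an illustration only, not the authors' implementation: it assumes each microphone pair contributes one feature value per frame, and it projects the per-frame vector of pair features onto its leading principal directions. The function name `pca_fuse`, the synthetic data, and the choice of a single retained component are all hypothetical.

```python
import numpy as np


def pca_fuse(pair_features, n_components=1):
    """Project per-frame microphone-pair features onto principal components.

    pair_features: array of shape (n_frames, n_pairs), one column per
    microphone pair (a hypothetical layout, not taken from the paper).
    Returns an array of shape (n_frames, n_components).
    """
    X = np.asarray(pair_features, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Synthetic example: three pairs observe the same underlying cue plus noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 1))
features = shared @ np.ones((1, 3)) + 0.05 * rng.normal(size=(200, 3))
fused = pca_fuse(features, n_components=1)
```

In this toy setup the fused stream tracks the shared cue (up to sign) and averages out the pair-specific noise, which is the general motivation for a PCA-style fusion step.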

Published in: IEEE Transactions on Audio, Speech, and Language Processing ( Volume: 20, Issue: 2, February 2012)

Page(s): 436 - 446

Date of Publication: 23 January 2012

DOI: 10.1109/TASL.2011.2160167
