Low-latency speaker diarization based on Bayesian information criterion with multiple phoneme classes

Abstract:

Low-latency speaker diarization is desirable for online speaker adaptation in real-time speech recognition. In spontaneous conversations in particular, speakers tend to alternate continuously, with no silence between utterances. We therefore propose a speaker diarization method that detects speaker-change points and identifies the speaker with a fixed low latency, using a Bayesian information criterion (BIC) computed on acoustic features classified into multiple phoneme classes. To improve diarization accuracy under the low-latency condition, the speaker decision is made continuously at each phoneme boundary. In an experiment on conversational broadcast news programs, our method reduced the speaker diarization error rate by 20.0% relative to the conventional BIC with a single phoneme class. Applying online speaker adaptation in a speech-recognition experiment reduced the word error rate at speaker-change points by 7.8% relative.
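
The abstract does not give the exact formulation, so the following is only a minimal sketch of how a BIC-based speaker-change test is commonly written, assuming one full-covariance Gaussian per segment. The function names, the `lam` penalty weight, the `min_frames` threshold, and the choice to sum per-class scores over phoneme classes are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def _logdet_cov(x):
    # Log-determinant of the maximum-likelihood covariance of the frames in x
    # (shape: frames x dims), lightly regularised for numerical stability.
    cov = np.cov(x, rowvar=False, bias=True) + 1e-6 * np.eye(x.shape[1])
    _, logdet = np.linalg.slogdet(cov)
    return logdet

def delta_bic(x1, x2, lam=1.0):
    # Standard delta-BIC between "one Gaussian for x1+x2" and "one Gaussian each".
    # A positive value favours the two-speaker (change point) hypothesis.
    n1, n2 = len(x1), len(x2)
    n = n1 + n2
    d = x1.shape[1]
    pooled = np.vstack([x1, x2])
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    gain = 0.5 * (n * _logdet_cov(pooled)
                  - n1 * _logdet_cov(x1)
                  - n2 * _logdet_cov(x2))
    return gain - lam * penalty

def multi_class_delta_bic(frames, labels, split, lam=1.0, min_frames=20):
    # ASSUMPTION: frames carry a phoneme-class label, and per-class delta-BIC
    # scores are simply summed. The paper may combine classes differently.
    total = 0.0
    for c in np.unique(labels):
        left = frames[:split][labels[:split] == c]
        right = frames[split:][labels[split:] == c]
        # Skip classes with too few frames on either side of the candidate point.
        if len(left) >= min_frames and len(right) >= min_frames:
            total += delta_bic(left, right, lam)
    return total
```

Here `frames` would be an array of acoustic feature vectors, `labels` the phoneme class of each frame, and `split` the candidate change point, for example a phoneme boundary. Comparing like with like inside each class is one plausible reading of why multiple phoneme classes help when only a short window of speech is available.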

Published in: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Date of Conference: 25-30 March 2012

Date Added to IEEE Xplore: 30 August 2012

DOI: 10.1109/ICASSP.2012.6288842

Conference Location: Kyoto, Japan
