I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription

Abstract:

State-of-the-art speaker recognition systems are based on the i-vector representation of speech segments. In this paper we show how this representation can be used to perform blind speaker adaptation of a hybrid DNN-HMM speech recognition system, and we report excellent results on a French language audio transcription task. The implementation is very simple. An audio file is first diarized and each speaker cluster is represented by an i-vector. Acoustic feature vectors are augmented by the corresponding i-vectors before being presented to the DNN. (The same i-vector is used for all acoustic feature vectors aligned with a given speaker.) This supplementary information improves the DNN's ability to discriminate between phonetic events in a speaker-independent way without requiring any modification to the DNN training algorithms. We report results on the ETAPE 2011 transcription task and show that i-vector-based speaker adaptation is effective irrespective of whether cross-entropy or sequence training is used. For cross-entropy training, the word error rate (WER) falls from 22.16% to 20.67%, whereas for sequence training it falls from 19.93% to 18.40%.
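The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the speaker labels (`spk_A`, `spk_B`), the toy feature values, and the 3-dimensional i-vectors are all assumptions made for illustration, and diarization and i-vector extraction are assumed to have been done already. A second helper computes the relative WER reduction implied by the figures in the abstract.

```python
from typing import Dict, List, Sequence


def augment_with_ivectors(
    frames: Sequence[Sequence[float]],
    frame_speakers: Sequence[str],
    ivectors: Dict[str, Sequence[float]],
) -> List[List[float]]:
    """Append each frame's speaker-cluster i-vector to its acoustic features.

    frames:         one acoustic feature vector per frame
    frame_speakers: the speaker-cluster label assigned to each frame
                    (assumed to come from a diarization step)
    ivectors:       one i-vector per speaker cluster (hypothetical values here)
    """
    if len(frames) != len(frame_speakers):
        raise ValueError("need exactly one speaker label per frame")
    augmented = []
    for feats, spk in zip(frames, frame_speakers):
        # The same i-vector is reused for every frame of a given speaker.
        augmented.append(list(feats) + list(ivectors[spk]))
    return augmented


def relative_wer_reduction(baseline: float, adapted: float) -> float:
    """Relative WER reduction, in percent of the baseline WER."""
    return 100.0 * (baseline - adapted) / baseline


# Toy data: three frames of 2-dimensional features, two speaker clusters.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
frame_speakers = ["spk_A", "spk_A", "spk_B"]
ivectors = {"spk_A": [1.0, -1.0, 0.5], "spk_B": [0.0, 2.0, -0.5]}

# Each DNN input now has 2 acoustic + 3 i-vector dimensions.
dnn_inputs = augment_with_ivectors(frames, frame_speakers, ivectors)

# WER figures taken from the abstract.
ce_gain = relative_wer_reduction(22.16, 20.67)
seq_gain = relative_wer_reduction(19.93, 18.40)
```

Because the speaker information enters only through the input layer, the network's training procedure is unchanged; only the input dimensionality grows. With the abstract's figures, the helper gives relative WER reductions of roughly 6.7% for cross-entropy training and 7.7% for sequence training.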

Published in: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Date of Conference: 04-09 May 2014

Date Added to IEEE Xplore: 14 July 2014

Electronic ISBN: 978-1-4799-2893-4

DOI: 10.1109/ICASSP.2014.6854823

Conference Location: Florence, Italy
