Abstract:
In our previous work, we introduced an attention-based speaker adaptation method, which has proven to be an efficient online speaker adaptation method for real-time speech recognition. In this paper, we present a more complete framework for this method, named memory-aware networks, which consists of the main network, the memory module, the attention module, and the connection module. A gate mechanism and a multiple-connections strategy are presented to connect the memory with the main network in order to take full advantage of the memory. An auxiliary speaker classification task is added to improve the accuracy of the attention module. The fixed-size ordinally forgetting encoding method is used together with average pooling to gather both short-term and long-term information. Furthermore, instead of using only traditional speaker embeddings such as i-vectors or d-vectors as the memory, we design a new form of memory called residual vectors, which can represent different pronunciation habits. Experiments on both the Switchboard and AISHELL-2 tasks show that our method performs online speaker adaptation very well with no additional adaptation data and with only a 3% relative increase in decoding computation complexity. Under the cross-entropy criterion, our method achieves a relative word error rate reduction of 9.4% and 8.3% compared with the speaker-independent model on the Switchboard and AISHELL-2 tasks, respectively, and approximately 7.0% compared with the traditional d-vector-based speaker adaptation method.
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing ( Volume: 28)