Learning utterance-level normalisation using Variational Autoencoders for robust automatic speech recognition


Abstract:

This paper presents a Variational Autoencoder (VAE) based framework for modelling utterances. In this model, a mapping from an utterance to a distribution over the latent space, the VAE-utterance feature, is defined, in addition to a frame-level mapping, the VAE-frame feature. Using the Aurora-4 dataset, we train these models and analyse how well they capture speaker and utterance variability, and use combinations of LDA, i-vector, and VAE-frame and VAE-utterance features for speech recognition training. We find that VAE-frame + VAE-utterance features alone work equally well, and by using an LDA + VAE-frame + VAE-utterance feature combination we obtain a word error rate (WER) of 9.59%, a gain over the 9.72% baseline which uses an LDA + i-vector combination.
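The framework the abstract describes has two mappings into distributions over a shared latent space: a frame-level one (the VAE-frame feature) and an utterance-level one (the VAE-utterance feature). A rough sketch of that idea is below; the linear encoder, mean-pooling across frames, dimensions, and random weights are illustrative assumptions for the sketch, not the authors' actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 40-dim acoustic frames, 16-dim latent space.
FRAME_DIM, LATENT_DIM = 40, 16

# Randomly initialised weights stand in for trained VAE encoder parameters.
W_mu = rng.standard_normal((FRAME_DIM, LATENT_DIM)) * 0.1
W_logvar = rng.standard_normal((FRAME_DIM, LATENT_DIM)) * 0.1

def encode_frames(frames):
    """Frame-level mapping: each frame -> a Gaussian over the latent space
    (a VAE-frame feature would typically be the per-frame posterior mean)."""
    mu = frames @ W_mu
    logvar = frames @ W_logvar
    return mu, logvar

def encode_utterance(frames):
    """Utterance-level mapping: pool per-frame statistics into a single
    Gaussian over the latent space (the VAE-utterance feature)."""
    mu, logvar = encode_frames(frames)
    return mu.mean(axis=0), logvar.mean(axis=0)

def sample(mu, logvar):
    """Reparameterisation trick: z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

utterance = rng.standard_normal((120, FRAME_DIM))  # one fake 120-frame utterance
u_mu, u_logvar = encode_utterance(utterance)
z = sample(u_mu, u_logvar)
print(u_mu.shape, z.shape)  # one fixed-size utterance-level vector per utterance
```

The point of the utterance-level mapping is that, like an i-vector, it summarises a variable-length utterance as a single fixed-dimensional feature that can be appended to frame-level inputs for acoustic model training.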
Date of Conference: 13-16 December 2016
Date Added to IEEE Xplore: 09 February 2017
Conference Location: San Diego, CA, USA
