Novel alignment method for DNN TTS training using HMM synthesis models | IEEE Conference Publication | IEEE Xplore


Abstract:

Training neural networks (NNs) for text-to-speech (TTS) synthesis requires phonetic segmentation. Manual segmentation is the most accurate, but creating manual alignments is costly and time-consuming, so automatic procedures are preferable. This paper presents a simple alignment method based on the models trained during hidden Markov model (HMM)-based TTS system training. This approach is shown to slightly outperform standard alignment procedures based on monophone models. Both objective measurements and listening tests show that NNs trained with alignments obtained by the proposed method can produce higher-quality speech than NNs trained with monophone alignments.
Date of Conference: 14-16 September 2017
Date Added to IEEE Xplore: 26 October 2017
ISBN Information:
Electronic ISSN: 1949-0488
Conference Location: Subotica, Serbia
