
SampleRNN-Based Neural Vocoder for Statistical Parametric Speech Synthesis


Abstract:

This paper presents a SampleRNN-based neural vocoder for statistical parametric speech synthesis. The method utilizes a conditional SampleRNN model composed of a hierarchical structure of GRU layers and feed-forward layers to capture long-span dependencies between acoustic features and waveform sequences. Compared with conventional vocoders based on the source-filter model, the proposed vocoder is trained without assumptions derived from prior knowledge of speech production and is able to provide better modeling and recovery of phase information. Objective and subjective evaluations are conducted on two corpora. Experimental results suggest that the proposed vocoder can achieve higher synthetic-speech quality than the STRAIGHT vocoder and a WaveNet-based neural vocoder with similar run-time efficiency, regardless of whether natural or predicted acoustic features are used as inputs.
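The hierarchical conditioning the abstract describes can be illustrated with a minimal two-tier sketch: a frame-level GRU consumes the acoustic condition vector together with the previous frame of samples, and a feed-forward sample-level tier maps the GRU state plus a few recent samples to a softmax over quantized waveform values. All dimensions, weight initializations, and the greedy sample pick below are hypothetical choices for illustration, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, W, U, b):
    """One GRU update; gates stacked as [reset; update; candidate]."""
    H = h.shape[0]
    gx = W @ x + b
    r = sigmoid(gx[:H] + U[:H] @ h)
    z = sigmoid(gx[H:2 * H] + U[H:2 * H] @ h)
    n = np.tanh(gx[2 * H:] + r * (U[2 * H:] @ h))
    return (1.0 - z) * n + z * h

# Hypothetical sizes (not taken from the paper):
FRAME = 16   # waveform samples per frame
COND = 4     # acoustic-feature dimension per frame
H = 32       # frame-level GRU hidden size
Q = 256      # quantization levels (e.g. mu-law bins)
K = 4        # autoregressive sample context for the sample-level tier

# Frame-level tier: GRU over [acoustic features; previous frame of samples].
D = COND + FRAME
W = rng.normal(0, 0.1, (3 * H, D))
U = rng.normal(0, 0.1, (3 * H, H))
b = np.zeros(3 * H)
# Sample-level tier: feed-forward net mapping [GRU state; recent samples] -> logits.
W1 = rng.normal(0, 0.1, (H, H + K))
W2 = rng.normal(0, 0.1, (Q, H))

def synthesize_frame(cond, prev_frame, prev_samples, h):
    """Produce one frame of Q-way sample distributions, conditioned on acoustics."""
    h = gru_step(np.concatenate([cond, prev_frame]), h, W, U, b)
    probs, ctx = [], list(prev_samples)
    for _ in range(FRAME):
        hid = np.tanh(W1 @ np.concatenate([h, ctx[-K:]]))
        logits = W2 @ hid
        p = np.exp(logits - logits.max())
        p /= p.sum()
        ctx.append(float(np.argmax(p)) / Q)  # greedy pick, for this sketch only
        probs.append(p)
    return np.array(probs), h

cond = rng.normal(size=COND)
probs, h = synthesize_frame(cond, np.zeros(FRAME), np.zeros(K), np.zeros(H))
print(probs.shape)  # (16, 256): one Q-way distribution per sample in the frame
```

In the actual model the sample-level output would be sampled (not taken greedily) and the tiers trained jointly by maximum likelihood on waveform data; this sketch only shows how frame-rate acoustic features condition sample-rate generation.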
Date of Conference: 15-20 April 2018
Date Added to IEEE Xplore: 13 September 2018
Electronic ISSN: 2379-190X
Conference Location: Calgary, AB, Canada

