A pitch-synchronous speech analysis and synthesis method for DNN-SPSS system | IEEE Conference Publication | IEEE Xplore

Abstract:

This paper proposes a pitch-synchronous deep neural network (DNN)-based statistical parametric speech synthesis (SPSS) system. Speech parameters are extracted from pitch-synchronous frames defined by the locations of glottal closure instants (GCIs), which significantly reduces coupling effects between the vocal tract and excitation signals. As a result, the distribution of spectral parameters within the same phonetic context becomes more uniform, which improves model trainability, especially for a small-scale DNN framework. Although the effectiveness of the pitch-synchronous approach has been proven in other applications, it is not trivial to integrate the method into typical DNN-based SPSS systems that have regularized structures, i.e., a fixed frame rate and a fixed feature dimension. In this paper, we design a new DNN-based SPSS system that pitch-synchronously trains and generates speech parameters. Objective and subjective test results verify the superiority of the proposed system compared to the conventional approach.
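
To illustrate what GCI-defined frames look like, here is a minimal sketch, not the paper's implementation. It assumes GCI sample indices are already available from some detector, and it assumes a two-period Hann-windowed frame centered on each GCI; the abstract does not state the window or frame span, and the function name `pitch_synchronous_frames` is hypothetical.

```python
import numpy as np

def pitch_synchronous_frames(signal, gci):
    """Cut one frame per interior GCI, spanning previous GCI to next GCI.

    signal: 1-D NumPy array of speech samples.
    gci:    increasing sample indices of glottal closure instants
            (assumed to be given; GCI detection is out of scope here).
    """
    frames = []
    # Slide over consecutive GCI triples; the frame covers two pitch
    # periods and is centered (approximately) on the middle GCI.
    for prev, nxt in zip(gci[:-2], gci[2:]):
        window = np.hanning(nxt - prev)  # assumed window choice
        frames.append(signal[prev:nxt] * window)
    return frames

# Synthetic example: made-up GCI positions with a drifting pitch period.
signal = np.random.default_rng(0).standard_normal(400)
gci = [0, 80, 170, 250, 345]
frames = pitch_synchronous_frames(signal, gci)
frame_lengths = [len(f) for f in frames]
```

Because the frame length follows the local pitch period, the frames differ in length and are not spaced at a constant rate. That is the mismatch with fixed-frame-rate, fixed-dimension DNN-based SPSS structures that the abstract describes.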
Date of Conference: 16-18 October 2016
Date Added to IEEE Xplore: 02 March 2017
ISBN Information:
Electronic ISSN: 2165-3577
Conference Location: Beijing, China
