ISCA Archive Interspeech 2021

Sequence-to-Sequence Learning for Deep Gaussian Process Based Speech Synthesis Using Self-Attention GP Layer

Taiki Nakamura, Tomoki Koriyama, Hiroshi Saruwatari

This paper presents a speech synthesis method based on deep Gaussian processes (DGPs) and sequence-to-sequence (Seq2Seq) learning toward high-quality end-to-end speech synthesis. Feed-forward and recurrent DGP models are known to produce more natural synthetic speech than deep neural networks (DNNs) because of Bayesian learning and kernel regression. However, such DGP models consist of a pipeline of independently trained acoustic and duration models, and they require a high level of expertise in text processing. The proposed model is based on Seq2Seq learning, which enables unified training of the acoustic and duration models. The encoder and decoder layers are represented by Gaussian process regressions (GPRs), and their parameters are trained as a Bayesian model. We also propose a self-attention mechanism with Gaussian processes to effectively model character-level input in the encoder. Subjective evaluation results show that the proposed Seq2Seq-SA-DGP can synthesize more natural speech than DNNs with self-attention and recurrent structures. Moreover, Seq2Seq-SA-DGP reduces the smoothing problem of recurrent structures and is effective when a simple input is given to an end-to-end system.
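
To make the idea of a "self-attention GP layer" concrete, the following is a minimal, illustrative NumPy sketch: scaled dot-product self-attention over the input sequence, followed by a GP-regression step whose predictive mean is computed against a set of inducing points. This is not the paper's exact formulation; the RBF kernel choice, the inducing-point parameterization, and all names, shapes, and hyperparameters below are assumptions made purely for illustration.

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (T, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

def rbf_kernel(A, B, lengthscale=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_layer(H, Z, U, lengthscale=1.0, jitter=1e-6):
    """GP-regression layer: predictive mean at inputs H, given inducing
    inputs Z of shape (M, d) and inducing outputs U of shape (M, d_out).
    (In an actual DGP these would be learned variationally; here they
    are fixed placeholders.)"""
    Kzz = rbf_kernel(Z, Z, lengthscale) + jitter * np.eye(len(Z))
    Khz = rbf_kernel(H, Z, lengthscale)
    return Khz @ np.linalg.solve(Kzz, U)

rng = np.random.default_rng(0)
T, d, M, d_out = 8, 16, 4, 16          # sequence length, dims, inducing points
X = rng.normal(size=(T, d))            # e.g. character-level encoder inputs
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
Z = rng.normal(size=(M, d))            # hypothetical inducing inputs
U = rng.normal(size=(M, d_out))        # hypothetical inducing outputs

H = self_attention(X, Wq, Wk, Wv)      # contextualize the input sequence
Y = gp_layer(H, Z, U)                  # GP predictive mean as the layer output
print(Y.shape)                         # (8, 16)

The sketch only propagates the GP predictive mean; a Bayesian treatment as described in the abstract would also carry predictive variances and train the inducing variables by variational inference.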


doi: 10.21437/Interspeech.2021-896

Cite as: Nakamura, T., Koriyama, T., Saruwatari, H. (2021) Sequence-to-Sequence Learning for Deep Gaussian Process Based Speech Synthesis Using Self-Attention GP Layer. Proc. Interspeech 2021, 121-125, doi: 10.21437/Interspeech.2021-896

@inproceedings{nakamura21_interspeech,
  author={Taiki Nakamura and Tomoki Koriyama and Hiroshi Saruwatari},
  title={{Sequence-to-Sequence Learning for Deep Gaussian Process Based Speech Synthesis Using Self-Attention GP Layer}},
  year={2021},
  booktitle={Proc. Interspeech 2021},
  pages={121--125},
  doi={10.21437/Interspeech.2021-896}
}