Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-Embedded Non-Negative Matrix Factorization

Aihara, Ryo; Takiguchi, Tetsuya; Ariki, Yasuo

doi:10.21437/Interspeech.2016-227

This paper proposes a discriminative learning method for Non-negative Matrix Factorization (NMF)-based Voice Conversion (VC). NMF-based VC has attracted attention because it produces more natural-sounding speech than conventional Gaussian Mixture Model (GMM)-based VC. Conventional NMF-based VC uses parallel exemplars directly as the dictionary, so no dictionary learning takes place. To enhance the conversion quality of NMF-based VC, we propose Discriminative Graph-embedded Non-negative Matrix Factorization (DGNMF). DGNMF discriminatively estimates parallel dictionaries of the source and target speakers based on the phoneme labels of the training data. Experimental results show that the proposed method not only improves conversion quality but also reduces computation time.
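
For orientation, the conventional parallel-dictionary conversion step that the abstract refers to can be sketched as follows. This is a minimal illustration under stated assumptions, not the DGNMF method proposed in the paper: the function names `estimate_activations` and `convert` are hypothetical, the dictionaries are random toy matrices rather than speech exemplars, and the activations are estimated with the standard multiplicative update for the generalized Kullback-Leibler divergence, which the abstract does not specify.

```python
import numpy as np

def estimate_activations(X, D, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H so that X ~= D @ H, with D held fixed.

    Uses the standard multiplicative update for generalized KL divergence
    (an assumption; the abstract does not name the cost function).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((D.shape[1], X.shape[1])) + eps
    col_sums = D.sum(axis=0)[:, None] + eps  # D^T 1, one value per dictionary atom
    for _ in range(n_iter):
        H *= (D.T @ (X / (D @ H + eps))) / col_sums
    return H

def convert(X_src, D_src, D_tgt, n_iter=200):
    """Convert source features by reusing source activations with the target dictionary."""
    H = estimate_activations(X_src, D_src, n_iter=n_iter)
    return D_tgt @ H

# Toy demonstration with random non-negative data (not real speech features).
rng = np.random.default_rng(1)
D_src = rng.random((8, 5))           # source dictionary: 8 feature bins, 5 atoms
D_tgt = rng.random((8, 5))           # parallel target dictionary, same atom order
X_src = D_src @ rng.random((5, 10))  # 10 source frames built from the dictionary
Y = convert(X_src, D_src, D_tgt)     # converted frames, shape (8, 10)
```

The sketch relies on the two dictionaries being parallel: atom `k` of `D_src` and atom `k` of `D_tgt` are assumed to represent the same content, so activations estimated against the source dictionary can be applied to the target dictionary unchanged.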

Cite as: Aihara, R., Takiguchi, T., Ariki, Y. (2016) Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-Embedded Non-Negative Matrix Factorization. Proc. Interspeech 2016, 292-296, doi: 10.21437/Interspeech.2016-227

@inproceedings{aihara16_interspeech,
  author={Ryo Aihara and Tetsuya Takiguchi and Yasuo Ariki},
  title={{Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-Embedded Non-Negative Matrix Factorization}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={292--296},
  doi={10.21437/Interspeech.2016-227}
}