ISCA Archive Interspeech 2021

AISHELL-3: A Multi-Speaker Mandarin TTS Corpus

Yao Shi, Hui Bu, Xin Xu, Shaoji Zhang, Ming Li

In this paper, we present AISHELL-3, a large-scale multi-speaker Mandarin speech corpus that can be used to train multi-speaker Text-To-Speech (TTS) systems. The corpus contains roughly 85 hours of emotion-neutral recordings from 218 native Mandarin Chinese speakers. Auxiliary speaker attributes such as gender, age group and native accent are explicitly annotated and provided in the corpus. Moreover, transcripts at both the Chinese character level and the pinyin level are provided along with the recordings. We also present data processing strategies and techniques that match the characteristics of the corpus, and we conduct experiments on multiple speech-synthesis systems to assess the quality of the generated speech samples, showing promising results. The corpus is available online under the Apache v2.0 license.


doi: 10.21437/Interspeech.2021-755

Cite as: Shi, Y., Bu, H., Xu, X., Zhang, S., Li, M. (2021) AISHELL-3: A Multi-Speaker Mandarin TTS Corpus. Proc. Interspeech 2021, 2756-2760, doi: 10.21437/Interspeech.2021-755

@inproceedings{shi21c_interspeech,
  author={Yao Shi and Hui Bu and Xin Xu and Shaoji Zhang and Ming Li},
  title={{AISHELL-3: A Multi-Speaker Mandarin TTS Corpus}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={2756--2760},
  doi={10.21437/Interspeech.2021-755}
}