Abstract:
This paper describes LeanSpeech, the Microsoft Text-to-Speech (TTS) system for the LIMMITS (Lightweight, Multi-speaker, Multi-lingual Indic TTS) Challenge 2023, which is part of ICASSP 2023 and aims to encourage the advancement of TTS in Indian languages. We propose a lightweight encoder-decoder acoustic model (AM) composed of 1-D convolution and LSTM blocks, which is trained with knowledge distillation from a multi-speaker, multi-lingual teacher model, DelightfulTTS [1]. The speech corpus is reprocessed and used in both AM training and vocoder fine-tuning. In Track-2 of the challenge, our system achieves a MOS of 4.56 and an SMOS of 3.98, which indicates the efficiency of the proposed model and training strategy.
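The abstract does not state the distillation objective, so the following is only an illustrative sketch of one common formulation, not the authors' method. It assumes the student is trained on a weighted sum of two L1 terms over mel-spectrogram frames: one against the teacher's predicted spectrogram and one against the spectrogram of the recorded speech. The names `l1_loss`, `distillation_loss`, and the weight `alpha` are hypothetical.

```python
from typing import List

# A mel-spectrogram as a list of frames, each frame a list of mel-bin values.
Mel = List[List[float]]


def l1_loss(pred: Mel, target: Mel) -> float:
    """Mean absolute error over all frames and mel bins."""
    if len(pred) != len(target):
        raise ValueError("frame counts differ")
    total = 0.0
    count = 0
    for pred_frame, target_frame in zip(pred, target):
        if len(pred_frame) != len(target_frame):
            raise ValueError("mel-bin counts differ")
        for p, t in zip(pred_frame, target_frame):
            total += abs(p - t)
            count += 1
    if count == 0:
        raise ValueError("empty spectrogram")
    return total / count


def distillation_loss(student: Mel, teacher: Mel, recorded: Mel,
                      alpha: float = 0.5) -> float:
    """Weighted sum of a teacher-matching term and a ground-truth term.

    alpha is an assumed hyperparameter: 1.0 trains only against the
    teacher output, 0.0 only against the recorded speech.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return (alpha * l1_loss(student, teacher)
            + (1.0 - alpha) * l1_loss(student, recorded))
```

In practice such a loss would be computed on tensors inside a training framework; the pure-Python version here only makes the arithmetic explicit.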
Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 04-10 June 2023
Date Added to IEEE Xplore: 05 May 2023