Harmonic-plus-Noise Network with Linear Prediction and Perceptual Weighting Filters for Filipino Speech Synthesis

Abstract:

Vocoders in Text-To-Speech (TTS) systems convert acoustic feature representations, such as the Mel spectrogram, into a sound waveform. Recent vocoders such as WaveRNN [1], Parallel WaveGAN [2], HiFi-GAN [3], and diffusion models [4], [5] have mostly introduced neural architectures that outperform traditional approaches such as those using the Griffin-Lim algorithm (GLA) [6]. In this paper, a multi-band Parallel WaveGAN (PWG) architecture, the Harmonic-plus-Noise (H+N) vocoder, is trained, implemented, and combined with two types of filters, a) a Linear Prediction (LP) filter and b) a Perceptual Weighting (PW) filter, to improve TTS quality in the Filipino language. Based on the results, HN-PWG achieved the highest total MOS at 4.59 ± 0.10, closely followed by HN-PWG-PW at 4.58 ± 0.07, with no statistically significant difference between the two. All the implemented H+N systems outperformed the Tacotron2-based Filipino TTS using a WaveGlow vocoder in terms of MOS.
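
As a rough illustration of how a figure such as "4.59 ± 0.10" can arise, the sketch below computes a mean opinion score with a confidence-interval half-width. It assumes the ± value is a 95% interval under a normal approximation, which the abstract does not state; the function names and the listener ratings are hypothetical and are not taken from the paper. The overlap check is only an informal heuristic, not the significance test the authors used.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (mean, half_width) for a list of 1-5 opinion scores.

    Assumption: a normal-approximation interval with z = 1.96 (about 95%).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.fmean(ratings)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(ratings)
    half_width = z * sd / math.sqrt(len(ratings))
    return mean, half_width


def intervals_overlap(a, b):
    """Informal check: do two (mean, half_width) intervals overlap?"""
    (mean_a, hw_a), (mean_b, hw_b) = a, b
    return abs(mean_a - mean_b) <= hw_a + hw_b


if __name__ == "__main__":
    # Hypothetical listener ratings, invented purely for illustration.
    system_a = [5, 4, 5, 5, 4, 5, 4, 5]
    system_b = [4, 5, 4, 4, 5, 5, 4, 5]
    for name, ratings in (("A", system_a), ("B", system_b)):
        mean, hw = mos_with_ci(ratings)
        print(f"system {name}: MOS = {mean:.2f} ± {hw:.2f}")

    # The two scores reported in the abstract, as (mean, half_width) pairs.
    print(intervals_overlap((4.59, 0.10), (4.58, 0.07)))
```

Overlapping intervals are consistent with, but do not by themselves establish, the absence of a statistically significant difference.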

Published in: 2023 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)

Date of Conference: 25-27 October 2023

Date Added to IEEE Xplore: 15 November 2023

DOI: 10.1109/SpeD59241.2023.10314926

Conference Location: Bucharest, Romania
