Modeling Irregular Voice in End-to-End Speech Synthesis via Speaker Adaptation

Abstract:

End-to-end text-to-speech (TTS) synthesizers may fail to produce speech similar to the target speaker when the adaptation data is limited and/or chosen randomly. Creaky voice may occur frequently, depending on the speaker and the context. This paper uses speaker adaptation to model creaky voice in speech synthesis. We adapted FastSpeech 2 to four target speakers, selecting the adaptation data based on the occurrence of creaky phonation: 1) sentences with frequent creaky voice, 2) randomly chosen sentences, and 3) sentences with little creaky voice. In an objective evaluation, the model adapted with data selection (1) successfully modeled creaky voice, producing speech with more creakiness than the other data selections. In a subjective test, the samples synthesized from the frequent-creak adaptation data were (averaged over the four speakers) slightly less preferred than speech synthesized from the adaptation sentences with little creaky voice. Irregular voice models might contribute to building emotional and personalized speech synthesis.
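
The three data selections described in the abstract can be illustrated with a minimal sketch. It assumes each candidate sentence already carries a per-sentence creakiness score; the `Sentence` type, the `creaky_ratio` field, the strategy names, and the `select_adaptation_data` function are hypothetical names introduced here for illustration, and the abstract does not state how creaky phonation was detected or how many sentences were selected.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sentence:
    sentence_id: str
    # Assumption: fraction of voiced frames labeled creaky by some detector.
    creaky_ratio: float


def select_adaptation_data(sentences, k, strategy, seed=0):
    """Pick k adaptation sentences under one of three selection strategies."""
    pool = list(sentences)
    if k > len(pool):
        raise ValueError("k exceeds the number of available sentences")
    if strategy == "frequent_creak":
        # Selection (1): sentences with the most creaky voice first.
        return sorted(pool, key=lambda s: s.creaky_ratio, reverse=True)[:k]
    if strategy == "random":
        # Selection (2): uniform random choice, seeded for repeatability.
        return random.Random(seed).sample(pool, k)
    if strategy == "few_creak":
        # Selection (3): sentences with the least creaky voice first.
        return sorted(pool, key=lambda s: s.creaky_ratio)[:k]
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    # Toy scores, made up for the example only.
    corpus = [
        Sentence("a", 0.30),
        Sentence("b", 0.05),
        Sentence("c", 0.12),
        Sentence("d", 0.00),
    ]
    for name in ("frequent_creak", "random", "few_creak"):
        chosen = select_adaptation_data(corpus, 2, name)
        print(name, [s.sentence_id for s in chosen])
```

Each selected subset would then serve as the adaptation set for one fine-tuning run per speaker; that step depends on the specific FastSpeech 2 implementation and is not sketched here.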

Published in: 2023 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)

Date of Conference: 25-27 October 2023

Date Added to IEEE Xplore: 15 November 2023

DOI: 10.1109/SpeD59241.2023.10314920

Conference Location: Bucharest, Romania
