Authors:
Gražina Korvel
1
;
Olga Kurasova
1
and
Bożena Kostek
2
Affiliations:
1
Institute of Data Science and Digital Technologies, Vilnius University, Akademijos str. 4, LT-04812, Vilnius and Lithuania
;
2
Audio Acoustics Laboratory, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, G. Narutowicza 11/12, 80-233 Gdansk and Poland
Keyword(s):
Speech Analysis and Synthesis, Lombard Effect, SISO (Single-Input and Single-Output) System, Sinusoidal Model.
Related
Ontology
Subjects/Areas/Topics:
Multidimensional Signal Processing
;
Multimedia
;
Multimedia Signal Processing
;
Multimodal Signal Processing
;
Perceptual/Human Audiovisual System Modeling
;
Telecommunications
Abstract:
The speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing the speech signal into harmonics and modeling them as the output of a SISO system whose transfer function poles are multiple, and inputs vary in time. An analysis of the Lombard effect of the synthesized signal is performed on the noise residual. The synthesized signal residual is described by vectors of acoustic parameters related to the Lombard effect. For testing the performance of the created models in various noise conditions two classifiers are employed, namely kNN and Naive Bayes. For comparison of results, we created models of sinusoids based on frequ
ency tracks. The results show that a model based on the residual sinewave sum demonstrates the possibility of retaining the Lombard effect. Finally, future work directions are outlined in conclusions.
(More)