Abstract
This paper presents a voice source analysis method based on the spectral characteristics of the LF model and on how they are reflected in the output speech signal. The source features are estimated as the set of LF parameters whose spectrum is most similar in the frequency domain, in terms of glottal formant and spectral tilt, to the STRAIGHT spectrum of the speech signal under analysis. In addition, the concept of an analyzable frame is introduced to ensure the feasibility and improve the reliability of the proposed method. Evaluation with synthetic speech shows that the method can estimate the LF parameters with satisfactory precision. Furthermore, an experiment with emotional speech shows that the proposed method is effective in describing how voice quality varies among speech with different emotions.
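The matching idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: a toy two-parameter spectrum (a resonance for the glottal formant plus a first-order low-pass for spectral tilt) stands in for the LF-model spectrum, a synthetic target stands in for a STRAIGHT spectrum, and the analyzable-frame selection is omitted. All function names, the feature definitions, the cost weights, and the search grids are assumptions made for illustration only.

```python
import math

def source_spectrum_db(freqs_hz, fg_hz, fc_hz, bg_hz=100.0):
    """Toy stand-in for a voice source magnitude spectrum, in dB.

    A differentiated second-order resonance centred near fg_hz plays the
    role of the glottal formant; a first-order low-pass with cutoff fc_hz
    adds spectral tilt. This is NOT the LF model itself.
    """
    spec = []
    for f in freqs_hz:
        re = 1.0 - (f / fg_hz) ** 2
        im = f * bg_hz / fg_hz ** 2
        resonance = (f / fg_hz) / math.hypot(re, im)
        lowpass = 1.0 / math.sqrt(1.0 + (f / fc_hz) ** 2)
        spec.append(20.0 * math.log10(resonance * lowpass))
    return spec

def spectral_features(freqs_hz, spec_db, high_band_hz=3000.0):
    """Return (peak frequency in Hz, tilt in dB) for a dB spectrum.

    Tilt is defined here (an assumption) as the mean level at and above
    high_band_hz minus the level at the spectral peak.
    """
    peak = max(range(len(spec_db)), key=spec_db.__getitem__)
    high = [s for f, s in zip(freqs_hz, spec_db) if f >= high_band_hz]
    tilt_db = sum(high) / len(high) - spec_db[peak]
    return freqs_hz[peak], tilt_db

def estimate_parameters(freqs_hz, target_db, fg_grid, fc_grid):
    """Grid-search the (fg, fc) pair whose features best match the target."""
    target_fg, target_tilt = spectral_features(freqs_hz, target_db)

    def cost(params):
        fg, tilt = spectral_features(
            freqs_hz, source_spectrum_db(freqs_hz, *params))
        # Arbitrary weights: 100 Hz of peak error counts like 10 dB of tilt.
        return abs(fg - target_fg) / 100.0 + abs(tilt - target_tilt) / 10.0

    return min(((fg, fc) for fg in fg_grid for fc in fc_grid), key=cost)

if __name__ == "__main__":
    freqs = [10.0 * k for k in range(1, 501)]          # 10 Hz to 5 kHz
    target = source_spectrum_db(freqs, 200.0, 2000.0)  # synthetic target
    fg_grid = [float(v) for v in range(80, 401, 10)]
    fc_grid = [500.0, 1000.0, 2000.0, 4000.0]
    print(estimate_parameters(freqs, target, fg_grid, fc_grid))
```

An exhaustive grid search is used only because it keeps the sketch short and deterministic; a real implementation would search over LF parameters and would need a spectrum estimated from recorded speech.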
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ling, ZH., Hu, Y., Wang, RH. (2005). A Novel Source Analysis Method by Matching Spectral Characters of LF Model with STRAIGHT Spectrum. In: Tao, J., Tan, T., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2005. Lecture Notes in Computer Science, vol 3784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573548_57
DOI: https://doi.org/10.1007/11573548_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29621-8
Online ISBN: 978-3-540-32273-3
eBook Packages: Computer Science, Computer Science (R0)