Abstract
An on-the-fly Mandarin singing voice synthesis system, called SINVOIS (singing voice synthesis), is proposed in this paper. SINVOIS takes continuous speech of a song's lyrics as input and immediately generates the singing voice from the song's music score information, which is embedded in a MIDI file. The system comprises two sub-systems: a synthesis unit generator and a pitch-shifting module. The synthesis unit generator applies the Viterbi decoding algorithm to the continuous speech to produce the synthesis units for the singing voice. The pitch-shifting module uses the PSOLA method to shift pitch, and it also modifies the energy, duration, and spectrum of the synthesis units. The synthesized singing voice sounds reasonably good: in a subjective listening test, the synthesized singing voices obtained a MOS (mean opinion score) of 3.1.
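As a rough illustration of how a MIDI score can drive a pitch-shifting module, the sketch below converts a MIDI note number to a target fundamental frequency and derives the ratio by which a spoken unit's pitch would need to be scaled. It assumes twelve-tone equal temperament with A4 = 440 Hz (MIDI note 69), which the abstract does not specify. The function names are hypothetical and are not taken from SINVOIS.

```python
import math

# Assumption: equal temperament, A4 (MIDI note 69) tuned to 440 Hz.
A4_MIDI = 69
A4_HZ = 440.0


def midi_to_hz(note: int) -> float:
    """Return the frequency in Hz of a MIDI note number."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


def pitch_scale_factor(source_f0_hz: float, target_note: int) -> float:
    """Ratio by which the source pitch must be scaled to reach the note.

    A factor above 1 raises the pitch; below 1 lowers it. A PSOLA-style
    pitch shifter would use such a ratio to respace pitch marks.
    """
    if source_f0_hz <= 0:
        raise ValueError("source_f0_hz must be positive")
    return midi_to_hz(target_note) / source_f0_hz


def semitone_shift(source_f0_hz: float, target_note: int) -> float:
    """Same shift expressed in semitones (12 semitones per octave)."""
    return 12.0 * math.log2(pitch_scale_factor(source_f0_hz, target_note))


if __name__ == "__main__":
    # A syllable spoken at about 220 Hz, to be sung at A4.
    print(pitch_scale_factor(220.0, 69))
    print(semitone_shift(220.0, 69))
```

Expressing the shift as a ratio rather than an absolute frequency keeps the calculation independent of the speaker's natural pitch, which varies from syllable to syllable in continuous speech.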
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lin, CY., Jang, JS.R., Hwang, SH. (2002). An On-the-Fly Mandarin Singing Voice Synthesis System. In: Chen, YC., Chang, LW., Hsu, CT. (eds) Advances in Multimedia Information Processing — PCM 2002. PCM 2002. Lecture Notes in Computer Science, vol 2532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36228-2_78
DOI: https://doi.org/10.1007/3-540-36228-2_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00262-8
Online ISBN: 978-3-540-36228-9
eBook Packages: Springer Book Archive