Abstract
This paper describes voice assignment techniques for synchronized scenario speech output in an instant casting movie system, which lets anyone become a movie star using his or her own voice and face. Two prototype systems were implemented, and both worked well for a range of participants, from children to the elderly.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kawamoto, Si., Yotsukura, T., Nakamura, S., Morishima, S. (2011). Personalized Voice Assignment Techniques for Synchronized Scenario Speech Output in Entertainment Systems. In: Shumaker, R. (eds) Virtual and Mixed Reality - Systems and Applications. VMR 2011. Lecture Notes in Computer Science, vol 6774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22024-1_20
DOI: https://doi.org/10.1007/978-3-642-22024-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22023-4
Online ISBN: 978-3-642-22024-1
eBook Packages: Computer Science, Computer Science (R0)