Abstract
Voice recognition technology has been actively applied to information devices in Korea such as smartphones and car navigation systems. Because the underlying speech recognition research is largely based on results for other languages such as English and Japanese, it can run into difficulties when recognizing Korean. Applying the Korean vocal sound system deserves at least one examination as a way to improve the recognition of Korean speech, since each Korean phoneme always has the same phonetic value. The scope of this study, however, is limited to the recognition of single vowels for digital content production, in particular lip sync animation: producing lip sync generally requires tedious manual work by animators, and obtaining high-quality lip animation seriously affects production cost and development time. This research studies a real-time automatic lip sync algorithm for virtual characters, used as animation keys in digital content, that takes the Korean vocal sound system into account. The proposed algorithm helps produce natural, acceptable lip animation at lower production cost and with a shorter development period. A real-time vowel recognition system for digital content production that focuses on formant frequencies is proposed. The recognition process consists of speech signal input, filtering, a Fast Fourier Transform, and identification. The algorithm is based on the formant frequencies F1 and F2, and its output is applied to automatic, natural animation of a character's mouth shape for small and medium-sized animation productions or e-learning content productions. The results show that the proposed speaker-dependent single vowel recognition system can distinguish Korean single vowels in the dialogue of a dubbing artist in real time. The average recognition rate was 97.3% in a laboratory environment. This suggests that more acceptable lip sync could be produced automatically without any animator involved.
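The four stages named in the abstract (signal input, filtering, Fast Fourier Transform, identification) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the filter, the frame size, or the identification rule, so the pre-emphasis coefficient, the Hamming window, the 512-sample frame at 16 kHz, the simple spectral peak picking used as a stand-in for formant estimation, and the nearest-neighbour match are all assumptions. The function names and the `REFERENCE_FORMANTS` table are hypothetical, and its values are illustrative placeholders, not measurements from the paper; a speaker-dependent system would presumably calibrate them per dubbing artist.

```python
import cmath
import math

# Hypothetical per-speaker calibration table: vowel -> (F1, F2) in Hz.
# Placeholder values for illustration only, not taken from the paper.
REFERENCE_FORMANTS = {
    "a": (800.0, 1300.0),
    "i": (300.0, 2300.0),
    "u": (350.0, 800.0),
}


def pre_emphasis(x, alpha=0.97):
    """Filtering stage (assumed): y[n] = x[n] - alpha * x[n-1]."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def estimate_f1_f2(frame, sample_rate, lo_hz=200.0, hi_hz=3500.0):
    """Crude F1/F2 estimate: the two strongest spectral peaks in a band."""
    n = len(frame)
    emphasized = pre_emphasis(frame)
    # Hamming window to reduce spectral leakage before the FFT.
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(emphasized)]
    mag = [abs(c) for c in fft(windowed)[: n // 2]]
    hz_per_bin = sample_rate / n
    lo = max(int(lo_hz / hz_per_bin), 1)
    hi = min(int(hi_hz / hz_per_bin), n // 2 - 1)
    # Local maxima of the magnitude spectrum inside the search band.
    peaks = [k for k in range(lo, hi)
             if mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]]
    strongest = sorted(peaks, key=lambda k: mag[k], reverse=True)[:2]
    if len(strongest) < 2:
        return None  # not enough spectral structure in this frame
    f1, f2 = sorted(k * hz_per_bin for k in strongest)
    return f1, f2


def classify_vowel(f1, f2, references):
    """Identification stage (assumed): nearest reference point in (F1, F2)."""
    return min(references,
               key=lambda v: math.hypot(f1 - references[v][0],
                                        f2 - references[v][1]))


if __name__ == "__main__":
    # Synthetic stand-in for a voiced frame: two sinusoids, not real speech.
    sr = 16000
    frame = [math.sin(2 * math.pi * 750 * t / sr)
             + math.sin(2 * math.pi * 1250 * t / sr) for t in range(512)]
    f1, f2 = estimate_f1_f2(frame, sr)
    print(f1, f2, classify_vowel(f1, f2, REFERENCE_FORMANTS))
```

Real speech carries pitch harmonics, so picking raw FFT peaks is only a rough proxy for formants; a smoothed spectral envelope would likely be needed in practice. In a lip sync setting, the returned vowel label would then select the character's mouth shape for that frame.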
Copyright information
© 2014 Springer Science+Business Media Dordrecht
About this paper
Cite this paper
Whang, SM., Song, BH., Yun, HK. (2014). Speaker Adaptive Real-Time Korean Single Vowel Recognition for an Animation Producing. In: Park, J., Zomaya, A., Jeong, HY., Obaidat, M. (eds) Frontier and Innovation in Future Computing and Communications. Lecture Notes in Electrical Engineering, vol 301. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-8798-7_73
DOI: https://doi.org/10.1007/978-94-017-8798-7_73
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-017-8797-0
Online ISBN: 978-94-017-8798-7
eBook Packages: Engineering, Engineering (R0)