Speaker Adaptive Real-Time Korean Single Vowel Recognition for an Animation Producing

Whang, Sun-Min; Song, Bok-Hee; Yun, Han-Kyung

doi:10.1007/978-94-017-8798-7_73

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 301)

Abstract

Voice recognition technology has matured and is actively applied in Korea to information devices such as smartphones and car navigation systems. Because the underlying speech recognition research is largely based on results for other languages, such as English and Japanese, it can run into difficulties when recognizing Korean. Applying the Korean vocal sound system deserves at least some examination as a way to improve recognition of Korean speech, since Korean phonemes always have the same phonetic value. The scope of this study, however, is limited to the recognition of single vowels for digital content production, particularly lip sync animation. Lip sync generally requires tedious manual work by animators, and achieving high-quality lip animation seriously affects production cost and development time. This research studies a real-time automatic lip sync algorithm for virtual characters, used as the animation key in digital content, that takes the Korean vocal sound system into account. The proposed algorithm helps produce natural, acceptable lip animation at lower production cost and with a shorter development period. A real-time vowel recognition system for digital content production that focuses on formant frequencies is proposed. The recognition process consists of speech signal input, filtering, Fast Fourier Transform, and identification. The algorithm is based on the formant frequencies F1 and F2, and its output is applied to automatic, natural animation of a character's mouth shape for small and medium-sized animation studios or e-learning content producers. The results show that the proposed speaker-dependent single vowel recognition system can distinguish Korean single vowels in a dubbing artist's dialogue in real time. The average recognition rate was 97.3% in the laboratory environment. This suggests that more acceptable lip sync could be produced automatically without any animator involved.
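
The pipeline named in the abstract (speech input, filtering, Fast Fourier Transform, identification from F1 and F2) can be illustrated with a minimal sketch. This is not the authors' implementation: the pre-emphasis filter, the Hamming window, the fixed F1 and F2 search bands, the nearest-reference classifier, and every name and value in `REFERENCE_FORMANTS` are assumptions chosen for illustration. The reference values are placeholders, not measurements from the chapter, and picking the strongest FFT bin per band is only a crude stand-in for real formant tracking.

```python
import cmath
import math

# Hypothetical (F1, F2) targets in Hz for one calibrated speaker.
# Illustrative placeholders only, NOT measurements from the chapter.
REFERENCE_FORMANTS = {
    "a": (800.0, 1300.0),
    "e": (500.0, 1900.0),
    "i": (300.0, 2300.0),
}


def pre_emphasis(samples, coeff=0.97):
    """Filtering step (assumed): first-order high-pass to reduce spectral tilt."""
    return [samples[0]] + [
        samples[i] - coeff * samples[i - 1] for i in range(1, len(samples))
    ]


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even, odd = fft(samples[0::2]), fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def band_peak(magnitudes, sample_rate, n, low_hz, high_hz):
    """Frequency in Hz of the strongest FFT bin within [low_hz, high_hz]."""
    lo = math.ceil(low_hz * n / sample_rate)
    hi = min(int(high_hz * n / sample_rate), len(magnitudes) - 1)
    best = max(range(lo, hi + 1), key=lambda k: magnitudes[k])
    return best * sample_rate / n


def estimate_formants(frame, sample_rate):
    """Return rough (F1, F2) estimates for one frame of samples."""
    n = len(frame)
    filtered = pre_emphasis(frame)
    # Hamming window to limit spectral leakage before the FFT.
    windowed = [
        s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(filtered)
    ]
    magnitudes = [abs(c) for c in fft(windowed)[: n // 2]]
    # Assumed search bands; real F1 and F2 ranges overlap for some vowels.
    f1 = band_peak(magnitudes, sample_rate, n, 200.0, 1000.0)
    f2 = band_peak(magnitudes, sample_rate, n, 1000.0, 3000.0)
    return f1, f2


def classify_vowel(f1, f2, references=REFERENCE_FORMANTS):
    """Identification step (assumed): nearest reference in the F1-F2 plane."""
    return min(
        references,
        key=lambda v: (references[v][0] - f1) ** 2 + (references[v][1] - f2) ** 2,
    )


def synth_frame(f1, f2, sample_rate, n):
    """Test signal: two sinusoids standing in for the two formant peaks."""
    return [
        math.sin(2 * math.pi * f1 * t / sample_rate)
        + math.sin(2 * math.pi * f2 * t / sample_rate)
        for t in range(n)
    ]


if __name__ == "__main__":
    frame = synth_frame(800.0, 1300.0, 8000, 512)
    f1, f2 = estimate_formants(frame, 8000)
    print(f1, f2, classify_vowel(f1, f2))
```

In a lip sync setting, the label returned by `classify_vowel` would select a mouth shape for the current frame. A speaker-dependent system would presumably replace the placeholder table with values measured from the dubbing artist in a calibration pass, which is where the per-speaker tuning would enter.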

Author information

Authors and Affiliations

School of Computer Science, Korea University of Technology and Education, Cheonan, South Korea
Sun-Min Whang & Han-Kyung Yun
Department of Industrial Design Engineering, Korea University of Technology and Education, Cheonan, South Korea
Bok-Hee Song

Corresponding author

Correspondence to Sun-Min Whang.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Seoul University of Science and Technology (SeoulTech), Seoul, South Korea
James J. (Jong Hyuk) Park
School of Information Technologies, University of Sydney, Sydney, New South Wales, Australia
Albert Zomaya
Humanitas College, Kyung Hee University, Seoul, South Korea
Hwa-Young Jeong
Computer Science & Software Engineering, Monmouth University, W. Long Branch, New Jersey, USA
Mohammad Obaidat

About this paper

Cite this paper

Whang, SM., Song, BH., Yun, HK. (2014). Speaker Adaptive Real-Time Korean Single Vowel Recognition for an Animation Producing. In: Park, J., Zomaya, A., Jeong, HY., Obaidat, M. (eds) Frontier and Innovation in Future Computing and Communications. Lecture Notes in Electrical Engineering, vol 301. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-8798-7_73

DOI: https://doi.org/10.1007/978-94-017-8798-7_73
Published: 19 April 2014
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-017-8797-0
Online ISBN: 978-94-017-8798-7
eBook Packages: Engineering (R0)
