ABSTRACT
Thai is the official language of Thailand. It currently has about 70 million speakers, located in Thailand and in parts of southern China (Yunnan, Guizhou, and Guangxi). Thai is a tonal language, and it is challenging for speech processing technology because Thai spoken-language databases are limited and specialized speech corpora are lacking, such as children's speech, elderly speech, and regional accents. This research develops a Thai elderly speech corpus named Satja, meaning "truth of speech". The corpus consists of voice commands recorded from 50 speakers (24 male and 26 female) aged 60-85 years, covering six regions of Thailand. In addition, the elderly speech database was compared with non-elderly speech. We trained models with CMUSphinx and tested them with Sphinx4. We found that elderly speech was recognized more accurately by the model trained on elderly speech than by the model trained on non-elderly speech.
Index Terms
- Satja: Thai Elderly Speech Corpus for Speech Recognition