DOI: 10.1145/3291280.3291793
research-article

Satja: Thai Elderly Speech Corpus for Speech Recognition

Published: 10 December 2018

ABSTRACT

Thai is the official language of Thailand. At present, it has about 70 million speakers, located in Thailand and in the southern parts of China (Yunnan, Guizhou, and Guangxi). Thai is a tonal language and is challenging for speech processing technology, because Thai spoken-language databases are limited and specific speech corpora are lacking, such as corpora of children's speech, elderly speech, and regional accents. This research develops a Thai elderly speech corpus named Satja, meaning "truth of speech". The corpus consists of voice commands. There are 50 speakers, 24 male and 26 female, aged 60-85 years and covering six regions of Thailand. In addition, the elderly speech database was compared with non-elderly speech. For model training we used CMUSphinx, and we tested with Sphinx4. We found that elderly speech was recognized more accurately by the model trained on elderly speech than by the model trained on non-elderly speech.
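The abstract compares recognition accuracy between the two models but does not say which metric was used. A common choice for this kind of comparison is word error rate (WER), and the sketch below shows how it could be computed. This is an illustration only, not the authors' evaluation code: the function name `word_error_rate` is hypothetical, and it assumes transcripts have already been segmented into space-separated tokens (Thai text is written without spaces between words, so a segmentation step would be needed first).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumes both strings are already split into tokens by whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical English command used purely for illustration.
    print(word_error_rate("turn on the light", "turn on light"))
```

Under this metric, the abstract's finding would correspond to a lower WER for elderly test speech decoded with the elderly-trained model than with the non-elderly-trained one.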

      • Published in

        IAIT '18: Proceedings of the 10th International Conference on Advances in Information Technology
        December 2018
        145 pages
        ISBN: 9781450365680
        DOI: 10.1145/3291280

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        IAIT '18 Paper Acceptance Rate: 20 of 47 submissions, 43%. Overall Acceptance Rate: 20 of 47 submissions, 43%.