Satja: Thai Elderly Speech Corpus for Speech Recognition

Authors:
Suphunnee Prajongjai, Tuul Triyason, Pornchai Mongkolnam

School of Information Technology, King Mongkut's University of Technology Thonburi, Thailand

IAIT '18: Proceedings of the 10th International Conference on Advances in Information Technology, December 2018, Article No. 16, Pages 1–7. https://doi.org/10.1145/3291280.3291793

Published: 10 December 2018

ABSTRACT

Thai is the official language of Thailand. At present, it has about 70 million speakers, located in Thailand and in the southern parts of China (Yunnan, Guizhou, and Guangxi). Thai is a tonal language and is challenging for speech processing technology, because Thai spoken-language databases are limited and specific speech corpora, such as children's speech, elderly speech, and regional accents, are lacking. This research develops a Thai elderly speech corpus named Satja, meaning "truth of speech". The corpus consists of voice commands recorded from 50 speakers (24 male and 26 female) aged 60-85 years, covering six regions of Thailand. In addition, the elderly voice database was compared with a non-elderly voice database. For model training we used CMUSphinx, and we tested with Sphinx4. We found that elderly speech was recognized more accurately by the model trained on elderly speech than by the model trained on non-elderly speech.
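The abstract compares recognition accuracy between two models but does not say which metric was used. A common choice for this kind of comparison is word error rate (WER), and the following is a minimal sketch of that metric, not the authors' evaluation code. The function name, the placeholder English commands, and the assumption that transcripts are already split into space-separated words are all illustrative; Thai text is written without spaces, so real transcripts would need word segmentation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes both transcripts are already tokenized into
    space-separated words (hypothetical preprocessing step).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Placeholder voice command and two made-up recognizer outputs.
reference = "turn on the light"
output_a = "turn on the light"  # hypothetical output of one model
output_b = "turn of light"      # hypothetical output of another model

wer_a = word_error_rate(reference, output_a)
wer_b = word_error_rate(reference, output_b)
print(f"model A WER: {wer_a:.2f}, model B WER: {wer_b:.2f}")
```

A lower WER means fewer substitutions, deletions, and insertions relative to the reference, so under this metric the model with the lower value on the same test utterances would be the more accurate one.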

Index Terms

Satja: Thai Elderly Speech Corpus for Speech Recognition
1. Computing methodologies
  1. Modeling and simulation
    1. Simulation support systems
      1. Simulation tools
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interactive systems and tools
      1. User interface toolkits

Published in

IAIT '18: Proceedings of the 10th International Conference on Advances in Information Technology
December 2018
145 pages
ISBN: 9781450365680
DOI: 10.1145/3291280

Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Elderly
Speech corpus development
Speech recognition system
Thai language
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
IAIT '18 paper acceptance rate: 20 of 47 submissions, 43%. Overall acceptance rate: 20 of 47 submissions, 43%.