Authors:
Wuttichai Vijitkunsawat; Teeradaj Racharak; Chau Nguyen and Nguyen Minh
Affiliation:
Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan
Keyword(s):
Thai Sign Language, Sign Language Recognition, Benchmark Dataset.
Abstract:
Video-based sign language recognition aims to support deaf people in communicating with others by recognising signs from video input. Unfortunately, most existing sign language datasets cover only a small vocabulary, especially in low-resource languages such as Thai. Recent research in the Thai community has mostly focused on building recognisers from static input with limited datasets, making it difficult to train machine learning models for practical applications. To overcome this limitation, this paper introduces a new video database for automatic recognition of Thai sign language digits. Our dataset contains about 63 videos for each of the nine digits, performed by 21 signers. Preliminary baseline results for this new dataset are presented through extensive experiments. Specifically, we implement four deep-learning-based architectures: CNN-Mode, CNN-LSTM, VGG-Mode, and VGG-LSTM, and compare their performance under two scenarios: (1) the whole body pose with backgrounds, and (2) hand-cropped images only as pre-processing. The results show that VGG-LSTM with pre-processing achieves the best accuracy on both our in-sample and out-of-sample test datasets.
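To make the CNN-LSTM baseline concrete, the sketch below shows one common way such a model can be structured for short sign-video clips: a small CNN extracts per-frame features, an LSTM aggregates them over time, and a linear layer predicts the digit class. All layer sizes, the clip length, and the class names here are illustrative assumptions, not the exact configuration used in the paper.

```python
# Minimal, hypothetical CNN-LSTM video classifier (layer sizes are assumptions).
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, num_classes=9, hidden_size=128):
        super().__init__()
        # Per-frame CNN feature extractor applied to every frame independently.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),          # -> (batch*time, 32, 1, 1)
        )
        # LSTM aggregates the per-frame features across the clip.
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (batch, time, channels, height, width)
        b, t, c, h, w = x.shape
        feats = self.cnn(x.view(b * t, c, h, w)).view(b, t, -1)
        out, _ = self.lstm(feats)             # (batch, time, hidden)
        return self.fc(out[:, -1])            # classify from the last time step

# Example: a batch of 2 clips, 16 frames each, 112x112 RGB.
logits = CNNLSTM()(torch.randn(2, 16, 3, 112, 112))
print(logits.shape)  # torch.Size([2, 9])
```

The same skeleton covers the paper's two scenarios: full-frame clips can be fed in directly, while the hand-cropped variant would simply replace the input frames with cropped hand regions before they reach the per-frame CNN.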