3D Sign language recognition based on multi-path hybrid residual neural network

Authors:
Xiaoyu Shi, Department of Air Defense Early Warning Equipment, Air Force Early Warning Academy, China
Xiaoli Jiao, Department of Air Defense Early Warning Equipment, Air Force Early Warning Academy, China
Cangzhen Meng, Department of Air Defense Early Warning Equipment, Air Force Early Warning Academy, China
Zhiyun Bian, Department of Technical Service Support, Air Force Radar Unit 95174, China

ICMLC '22: Proceedings of the 2022 14th International Conference on Machine Learning and Computing, February 2022, Pages 413–418
https://doi.org/10.1145/3529836.3529943

Published: 21 June 2022

ABSTRACT

Sign language is an important means of communication for deaf-mute people. In recent years, hybrid models that combine Bi-directional Long Short-Term Memory (BiLSTM) with 3D convolutional networks have exploited both the feature-extraction ability of convolutional neural networks and the time-series classification strengths of recurrent neural networks to achieve more accurate recognition. However, high precision, scalability, and robustness remain important challenges for sign language recognition research. Current research directions and the corresponding methods aim to improve the accuracy and speed of hybrid-model recognition of 3D poses and continuous sign language sentences as computer hardware and networks are upgraded. This paper improves a residual neural network and uses it, together with BiLSTM, to extract features and build models; the proposed hybrid model combines the improved network with BiLSTM. To validate the proposed algorithm, we use the ChaLearn dataset and the Sports-1M dataset captured with depth, color, and stereo-IR sensors. On these two challenging datasets, our multi-path hybrid residual neural network achieves accuracies of 78.9% and 82.7%, outperforming other state-of-the-art algorithms and approaching the human accuracy of 88.4%.
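The abstract names two ingredients, residual feature extraction and bi-directional sequence modelling, without giving architectural details. The following is only a minimal sketch of those two general ideas, not the authors' multi-path network. It assumes a dense layer in place of 3D convolution and a plain tanh recurrence in place of an LSTM cell, and every name and dimension in it (`residual_block`, `rnn_pass`, `bidirectional_features`, `feat_dim`, `hidden_dim`) is hypothetical.

```python
import math
import random


def residual_block(x, weights):
    """Return x + tanh(W x); the skip connection adds the input back."""
    transformed = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]
    return [xi + ti for xi, ti in zip(x, transformed)]


def rnn_pass(frames, w_in, w_rec):
    """Plain tanh recurrence (assumed stand-in for an LSTM cell); one state per frame."""
    hidden = [0.0] * len(w_rec)
    states = []
    for frame in frames:
        hidden = [
            math.tanh(
                sum(a * f for a, f in zip(w_in[j], frame))
                + sum(b * h for b, h in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        states.append(hidden)
    return states


def bidirectional_features(frames, fwd_params, bwd_params):
    """Concatenate forward-in-time and backward-in-time hidden states per frame."""
    forward = rnn_pass(frames, *fwd_params)
    backward = rnn_pass(frames[::-1], *bwd_params)[::-1]
    return [f + b for f, b in zip(forward, backward)]


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
feat_dim, hidden_dim, num_frames = 4, 3, 5  # hypothetical sizes

# Hypothetical per-frame features; a real system would compute them from video.
frames = [[rng.uniform(-1, 1) for _ in range(feat_dim)] for _ in range(num_frames)]

# Residual feature extraction applied to every frame.
block_weights = random_matrix(feat_dim, feat_dim, rng)
frames = [residual_block(frame, block_weights) for frame in frames]

# Bi-directional recurrence over the frame sequence.
fwd_params = (random_matrix(hidden_dim, feat_dim, rng), random_matrix(hidden_dim, hidden_dim, rng))
bwd_params = (random_matrix(hidden_dim, feat_dim, rng), random_matrix(hidden_dim, hidden_dim, rng))
sequence_features = bidirectional_features(frames, fwd_params, bwd_params)

print(len(sequence_features), len(sequence_features[0]))  # prints: 5 6
```

Each frame ends up with a feature vector of size `2 * hidden_dim` that reflects both earlier and later frames, which is the property that makes a bi-directional layer attractive for classifying a complete sign sequence.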

Published in

ICMLC '22: Proceedings of the 2022 14th International Conference on Machine Learning and Computing
February 2022
570 pages
ISBN: 9781450395700
DOI: 10.1145/3529836

Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Artificial intelligence
Deep learning algorithms
Sign language recognition
Qualifiers
- research-article
- Research
- Refereed limited