Abstract
Sign language plays a significant role in communication for people with hearing or speech impairments, and sign language recognition lowers the communication barriers between them and people without such impairments. However, sign language recognition has been difficult for artificial intelligence because complex gestures must be recognized in real time and with high accuracy. Thanks to advances in deep learning, fingerspelling recognition methods based on convolutional neural networks have steadily gained popularity in recent years, and fingerspelling has become a central topic in sign language recognition. This study proposes an optimized eight-layer convolutional neural network based on blocks (CNN-BB) for fingerspelling recognition of Chinese sign language. Three different blocks (Conv-BN-ReLU-Pooling, Conv-BN-ReLU, and Conv-BN-ReLU-BN) were adopted, and techniques such as batch normalization, dropout, pooling, and data augmentation were employed. The results show that CNN-BB achieved an MSD of 93.32 ± 1.42%, which is superior to eight state-of-the-art approaches.
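The three block types named in the abstract differ only in which operations follow the convolution. The following is a minimal pure-Python sketch of that composition on a toy batch of 1-D signals with a single channel. It is not the authors' implementation: the function names, the toy input, the kernel, and the fixed batch-normalization scale and shift are all assumptions made for illustration, and the abstract does not state how the blocks are ordered within the eight-layer network.

```python
import math


def conv1d(batch, kernel):
    """Valid 1-D cross-correlation of every signal in the batch with one kernel."""
    k = len(kernel)
    return [
        [sum(sig[i + j] * kernel[j] for j in range(k))
         for i in range(len(sig) - k + 1)]
        for sig in batch
    ]


def batch_norm(batch, eps=1e-5):
    """Normalize one channel with the mean and variance taken over the
    whole batch and all positions. The learnable scale and shift are
    fixed at 1 and 0 here for brevity (an assumption of this sketch)."""
    values = [v for sig in batch for v in sig]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in sig] for sig in batch]


def relu(batch):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in sig] for sig in batch]


def max_pool(batch, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        [max(sig[i:i + size]) for i in range(0, len(sig) - size + 1, size)]
        for sig in batch
    ]


def conv_bn_relu(batch, kernel):
    return relu(batch_norm(conv1d(batch, kernel)))


def conv_bn_relu_pooling(batch, kernel, size=2):
    return max_pool(conv_bn_relu(batch, kernel), size)


def conv_bn_relu_bn(batch, kernel):
    return batch_norm(conv_bn_relu(batch, kernel))


# Hypothetical toy input: two signals of length 6 and a difference kernel.
signals = [[0.0, 1.0, 3.0, 2.0, 5.0, 4.0],
           [1.0, 1.0, 0.0, 2.0, 2.0, 6.0]]
kernel = [-1.0, 1.0]

pooled = conv_bn_relu_pooling(signals, kernel)  # 2 signals, length 2 each
plain = conv_bn_relu(signals, kernel)           # 2 signals, length 5 each
renormed = conv_bn_relu_bn(signals, kernel)     # 2 signals, length 5 each
```

In this sketch the pooling variant shortens each signal, the plain variant keeps the convolution's output length, and the trailing-BN variant re-centres the activations after the ReLU. A real image model would use 2-D convolutions with many channels and learned parameters.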
Acknowledgements
This work was supported by the National Philosophy and Social Sciences Foundation (20BTQ065) and the Natural Science Foundation of Jiangsu Higher Education Institutions of China (19KJA310002).
Copyright information
© 2024 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Chu, H., Jiang, C., Xu, J., Ye, Q., Jiang, X. (2024). An Optimized Eight-Layer Convolutional Neural Network Based on Blocks for Chinese Fingerspelling Sign Language Recognition. In: Wang, B., Hu, Z., Jiang, X., Zhang, YD. (eds) Multimedia Technology and Enhanced Learning. ICMTEL 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 535. Springer, Cham. https://doi.org/10.1007/978-3-031-50580-5_2
DOI: https://doi.org/10.1007/978-3-031-50580-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50579-9
Online ISBN: 978-3-031-50580-5
eBook Packages: Computer Science, Computer Science (R0)