Abstract
We present a unimodal, comprehensive, and easy-to-use dataset for visual free-hand gesture recognition. We call it GestureMNIST because its images use the 28 \(\times \) 28 grayscale format and because the number of samples, approximately 80,000, is similar to MNIST. Each gesture in the six classes is recorded as a sequence of 12 images taken by a 3D camera. As a distinguishing feature compared with other datasets, all sequences are recorded by a single person, which ensures high sample uniformity and quality. A particular focus is to provide a vision-based dataset that can be used “out of the box” for sequence classification without any preprocessing, segmentation, or feature extraction steps. We present classification experiments on GestureMNIST with different types of DNNs, establishing a performance baseline for sequence classification algorithms. We place particular emphasis on ahead-of-time classification, i.e., the correct identification of a gesture's class before the gesture is completed. We show that CNN- and LSTM-based deep learning achieves near-perfect performance, whereas ahead-of-time classification performance offers ample scope for future research with GestureMNIST. GestureMNIST contains visual samples only, but other modalities, namely acceleration and sound data, are available upon request.
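To make the notion of ahead-of-time classification concrete, the following minimal Python sketch measures accuracy as a function of the number of frames observed so far. Only the sequence length (12), the frame size (28 x 28), and the number of classes (6) come from the abstract. The toy data generator, the toy classifier, the constant ONSET, and all function names are hypothetical placeholders; they are not the dataset's file format, the DNNs, or the evaluation protocol used in the paper.

```python
SEQ_LEN = 12      # frames per gesture sequence (from the abstract)
FRAME_SIDE = 28   # each frame is 28 x 28 grayscale (from the abstract)
NUM_CLASSES = 6   # gesture classes (from the abstract)
ONSET = 6         # hypothetical: toy frames carry no class signal before this index


def make_toy_sequence(label):
    """Hypothetical stand-in for one sample: SEQ_LEN flattened frames.

    Frames before ONSET are all black; later frames are a uniform gray
    whose level encodes the class. Real gestures do not look like this.
    """
    n_pixels = FRAME_SIDE * FRAME_SIDE
    return [[0 if t < ONSET else 40 * label] * n_pixels for t in range(SEQ_LEN)]


def toy_classify(prefix):
    """Hypothetical classifier: reads the class off the last observed frame."""
    last = prefix[-1]
    return round(sum(last) / len(last) / 40)


def ahead_of_time_accuracy(sequences, labels, classify):
    """Accuracy when the classifier sees only the first k frames, for k = 1..SEQ_LEN."""
    accuracies = []
    for k in range(1, SEQ_LEN + 1):
        correct = sum(classify(seq[:k]) == y for seq, y in zip(sequences, labels))
        accuracies.append(correct / len(labels))
    return accuracies


# One toy sequence per class, evaluated at every prefix length.
labels = list(range(NUM_CLASSES))
sequences = [make_toy_sequence(y) for y in labels]
curve = ahead_of_time_accuracy(sequences, labels, toy_classify)
for k, acc in enumerate(curve, start=1):
    print(f"frames seen: {k:2d}  accuracy: {acc:.2f}")
```

In this toy setting the curve stays at chance level until the class signal appears and then jumps to 1.0. With a real model, any function that maps a list of observed frames to a class label could be passed in place of toy_classify.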
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Schak, M., Gepperth, A. (2022). Gesture MNIST: A New Free-Hand Gesture Dataset. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13532. Springer, Cham. https://doi.org/10.1007/978-3-031-15937-4_55
DOI: https://doi.org/10.1007/978-3-031-15937-4_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15936-7
Online ISBN: 978-3-031-15937-4
eBook Packages: Computer Science, Computer Science (R0)