Abstract
There is an urgent need, accelerated by the COVID-19 pandemic, for methods that allow clinicians and neuroscientists to remotely evaluate hand movements. This would help detect and monitor degenerative brain disorders that are particularly prevalent in older adults. With the wide accessibility of computer cameras, a vision-based real-time hand gesture detection method would facilitate online assessments in home and clinical settings. However, motion blur is one of the most challenging problems in the fast-moving hands data collection. The objective of this study was to develop a computer vision-based method that accurately detects older adults’ hand gestures using video data collected in real-life settings. We invited adults over 50 years old to complete validated hand movement tests (fast finger tapping and hand opening–closing) at home or in clinic. Data were collected without researcher supervision via a website programme using standard laptop and desktop cameras. We processed and labelled images, split the data into training, validation and testing, respectively, and then analysed how well different network structures detected hand gestures. We recruited 1,900 adults (age range 50–90 years) as part of the TAS Test project and developed UTAS7k—a new dataset of 7071 hand gesture images, split 4:1 into clear: motion-blurred images. Our new network, RGRNet, achieved 0.782 mean average precision (mAP) on clear images, outperforming the state-of-the-art network structure (YOLOV5-P6, mAP 0.776), and mAP 0.771 on blurred images. A new robust real-time automated network that detects static gestures from a single camera, RGRNet, and a new database comprising the largest range of individual hands, UTAS7k, both show strong potential for medical and research applications.








Similar content being viewed by others
Explore related subjects
Discover the latest articles and news from researchers in related subjects, suggested using machine learning.Data availability
The datasets generated during and/or analysed during the current study are available from the authors on reasonable request.
References
Alex K, Sutskever I, Hinton GE Imagenet classification with deep convolutional networks. In: NIPS’12 Proceedings of the 25th international conference on neural information processing systems, Vol. 1; pp. 1097–1105
LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551
Al-Hammadi M, Muhammad G, Abdul W, Alsulaiman M, Bencherif MA, Mekhtiche MA (2020) Hand gesture recognition for sign language using 3dcnn. IEEE Access 8:79491–79509
Zadikoff C, Lang AE (2005) Apraxia in movement disorders. Brain 128(7):1480–1497
Alty J, Bai Q, Li R, Lawler K, St George RJ, Hill E, Bindoff A, Garg S, Wang X, Huang G et al (2022) The TAS Test project: a prospective longitudinal validation of new online motor-cognitive tests to detect preclinical alzheimer’s disease and estimate 5-year risks of cognitive decline and dementia. BMC Neurol 22(1):1–13
Alty J, Bai Q, George RJS, Bindoff A, Li R, Lawler K, Hill E, Garg S, Bartlett L, King AE, Vickers JC (2021) Tastest: moving towards a digital screening test for pre-clinical Alzheimer’s disease. Alzheimer’s Dementia 17(S5):058732. https://doi.org/10.1002/alz.058732 (https://alz-journals.onlinelibrary.wiley.com/doi/pdf/10.1002/alz.058732)
Goetz CG, Fahn S, Martinez-Martin P, Poewe W, Sampaio C, Stebbins GT, Stern MB, Tilley BC, Dodel R, Dubois B et al (2007) Movement disorder society-sponsored revision of the unified Parkinson’s disease rating scale (mds-updrs): process, format, and clinimetric testing plan. Movement Disorders 22(1):41–47
Bochkovskiy A, Wang C-Y, Liao H-YM (2020) Yolov4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934
Lee M, Bae J (2020) Deep learning based real-time recognition of dynamic finger gestures using a data glove. IEEE Access 8:219923–219933. https://doi.org/10.1109/ACCESS.2020.3039401
Jung P-G, Lim G, Kim S, Kong K (2015) A wearable gesture recognition device for detecting muscular activities based on air-pressure sensors. IEEE Trans Ind Inf 11(2):485–494
Premaratne P (2014) Historical development of hand gesture recognition. Springer, Cham, pp 5–29
Ahmed M, Zaidan B, Zaidan A, Alamoodi A, Albahri O, Al-Qaysi Z, Albahri A, Salih MM (2021) Real-time sign language framework based on wearable device: analysis of msl, dataglove, and gesture recognition. Soft Comput, 1–22
Zhu Y, Yang Z, Yuan B (2013) Vision based hand gesture recognition. In: 2013 international conference on service sciences (ICSS), pp. 260–265. IEEE
Lee H-K, Kim J-H (1999) An hmm-based threshold model approach for gesture recognition. IEEE Trans Pattern Anal Mach Intell 21(10):961–973
Marcel S, Bernier O, Viallet J-E, Collobert D (2000) Hand gesture recognition using input-output hidden Markov models. In: proceedings fourth IEEE international conference on automatic face and gesture recognition (Cat. No. PR00580), pp. 456–461. IEEE
Ng CW, Ranganath S (2002) Real-time gesture recognition system and application. Image Vis Comput 20(13–14):993–1007
Chen Q, Georganas ND, Petriu EM (2008) Hand gesture recognition using haar-like features and a stochastic context-free grammar. IEEE Trans Instrum Meas 57(8):1562–1571
Mohanty A, Rambhatla SS, Sahay RR (2017) Deep gesture: static hand gesture recognition using CNN. In: proceedings of international conference on computer vision and image processing, pp. 449–461. Springer
Bose SR, Kumar VS (2020) Efficient inception v2 based deep convolutional neural network for real-time hand action recognition. IET Image Process 14(4):688–696
Yi C, Zhou L, Wang Z, Sun Z, Tan C (2018) Long-range hand gesture recognition with joint ssd network. In: 2018 IEEE international conference on robotics and biomimetics (ROBIO), pp. 1959–1963. IEEE
Mujahid A, Awan MJ, Yasin A, Mohammed MA, Damaševičius R, Maskeliūnas R, Abdulkareem KH (2021) Real-time hand gesture recognition based on deep learning yolov3 model. Appl Sci 11(9):4164
Benitez-Garcia G, Prudente-Tixteco L, Castro-Madrid LC, Toscano-Medina R, Olivares-Mercado J, Sanchez-Perez G, Villalba LJG (2021) Improving real-time hand gesture recognition with semantic segmentation. Sensors 21(2):356
Benitez-Garcia G, Olivares-Mercado J, Sanchez-Perez G, Yanai K (2021) IPN hand: a video dataset and benchmark for real-time continuous hand gesture recognition. In: 2020 25th international conference on pattern recognition (ICPR), pp. 4340–4347. IEEE
Gupta P, Kautz K, et al (2016) Online detection and classification of dynamic hand gestures with recurrent 3d convolutional neural networks. In: CVPR, vol 1, p. 3
Köpüklü O, Gunduz A, Kose N, Rigoll G (2019) Real-time hand gesture detection and classification using convolutional neural networks. In: 2019 14th IEEE international conference on automatic face & gesture recognition (FG 2019), pp. 1–8. IEEE
Do N-T, Kim S-H, Yang H-J, Lee G-S (2020) Robust hand shape features for dynamic hand gesture recognition using multi-level feature lstm. Appl Sci 10(18):6293
Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788
Ni Z, Chen J, Sang N, Gao C, Liu L (2018) Light yolo for high-speed gesture recognition. In: 2018 25th IEEE international conference on image processing (ICIP), pp. 3099–3103. IEEE
Redmon J, Farhadi A (2017) Yolo9000: better, faster, stronger. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7263–7271
Redmon J, Farhadi A (2018) Yolov3: an incremental improvement. arXiv preprint arXiv:1804.02767
Jocher G, et al. (2021) ultralytics/yolov5: V5.0 - YOLOv5-P6 1280 Models, AWS, Supervise.ly and YouTube integrations. https://doi.org/10.5281/zenodo.4679653
Xianbao C, Guihua Q, Yu J, Zhaomin Z (2021) An improved small object detection method based on yolo v3. Pattern Anal Appl 1–9
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778
Ross T-Y, Dollár G (2017) Focal loss for dense object detection. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2980–2988
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) SSD: single shot multibox detector. In: European conference on computer vision, pp. 21–37. Springer
Tan M, Le Q (2019) Efficientnet: rethinking model scaling for convolutional neural networks. In: international conference on machine learning, pp. 6105–6114. PMLR
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, et al (2020) An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929
Wang C-Y, Liao H-YM, Wu Y-H, Chen P-Y, Hsieh J-W, Yeh I-H (2020) CSPNet: a new backbone that can enhance learning capability of CNN. In: proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp. 390–391
Lin T-Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2117–2125
Wang K, Liew JH, Zou Y, Zhou D, Feng J (2019) Panet: few-shot image semantic segmentation with prototype alignment. In: proceedings of the IEEE/CVF international conference on computer vision, pp. 9197–9206
Ridnik T, Lawen H, Noy A, Ben Baruch E, Sharir G, Friedman I (2021) TRESNet: high performance GPU-dedicated architecture. In: proceedings of the IEEE/CVF winter conference on applications of computer vision, pp. 1400–1409
Elfwing S, Uchibe E, Doya K (2018) Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw 107:3–11
Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7132–7141
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30:5998–6008
Bartlett L, Doherty K, Farrow M, Kim S, Hill E, King A, Alty J, Eccleston C, Kitsos A, Bindoff A et al (2022) Island study linking aging and neurodegenerative disease (island) targeting dementia risk reduction: protocol for a prospective web-based cohort study. JMIR Res Protoc 11(3):34688
Afifi M (2019) 11k hands: gender recognition and biometric identification using a large dataset of hand images. Multimed Tools Appl. https://doi.org/10.1007/s11042-019-7424-8
Sun Z, Tan T, Wang Y, Li S (2005) Ordinal palmprint representation for personal identification. In: proceedings of the IEEE conference on computer vision and pattern recognition
Abdesselam A, Al-Busaidi A (2012) Person identification prototype using hand geometry. https://doi.org/10.13140/2.1.2181.9844
Kumar A (2008) Incorporating cohort information for reliable palmprint authentication. In: 2008 Sixth Indian conference on computer vision, graphics & image processing, pp. 583–590. IEEE
Ferrer MA, Morales A, Travieso CM, Alonso JB (2007) Low cost multimodal biometric identification system based on hand geometry, palm and finger print texture. In: 2007 41st annual IEEE international Carnahan conference on security technology, pp. 52–58. IEEE
Pech-Pacheco JL, Cristóbal G, Chamorro-Martinez J, Fernández-Valdivia J (2000) Diatom autofocusing in brightfield microscopy: a comparative study. In: proceedings 15th international conference on pattern recognition. ICPR-2000, vol. 3, pp. 314–317. IEEE
Han K, Wang Y, Tian Q, Guo J, Xu C, Xu C (2020) GhostNet: more features from cheap operations
Howard A, Sandler M, Chu G, Chen L-C, Chen B, Tan M, Wang W, Zhu Y, Pang R, Vasudevan V, Le QV, Adam H (2019) Searching for MobileNetV3
Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: European conference on computer vision, pp. 740–755. Springer
Xie T, Deng J, Cheng X, Liu M, Wang X, Liu M (2022) Feature mining: a novel training strategy for convolutional neural network. Appl Sci 12(7):3318
Acknowledgements
We would like to thank all the participants of The ISLAND Project who provided the video data for this study. We would also like to thank Professor James Vickers and all the staff at the University of Tasmania who work on The ISLAND Project for their support. We acknowledge the funding contributions for this project from the Medical Research Future Fund, St Lukes Health, Tasmanian Masonic Medical Research Foundation and the J.O. and J.R. Wicking Trust (Equity Trustees). We acknowledge funding from the National Health and Medical Research Council for the TAS Test Project.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file 1 (mp4 8104 KB)
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Huang, G., Tran, S.N., Bai, Q. et al. Real-time automated detection of older adults' hand gestures in home and clinical settings. Neural Comput & Applic 35, 8143–8156 (2023). https://doi.org/10.1007/s00521-022-08090-8
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00521-022-08090-8