Abstract
Human action recognition (HAR) is an active research area. Hand gesture recognition, a subcategory of HAR, plays an important role in communicating with deaf people. Convolutional neural networks (CNNs) are frequently used to recognize human actions. In this study, the hyperparameters of CNN structures based on the AlexNet model are optimized by heuristic optimization algorithms. The proposed method is tested on the sign language digits dataset and Thomas Moeslund’s gesture recognition dataset. Because heuristic algorithms are stochastic, the training procedure is repeated 30 times for each dataset. According to the experimental results, the proposed artificial bee colony-based method achieves an average classification accuracy of 98.40% on the sign language digits dataset, compared with 94.2% for the existing work. On Thomas Moeslund’s gesture recognition dataset, the proposed approach achieves an average accuracy of 98.09%, outperforming the best existing work, which reported 94.33%.
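As a rough illustration of the kind of search the abstract describes, the sketch below runs a basic artificial bee colony (ABC) loop over two hyperparameters and averages the result over 30 seeded runs. It is not the authors' implementation: the objective `toy_validation_error` is a hypothetical stand-in for training a CNN and measuring its validation error, and the search bounds, colony size, `limit`, and cycle count are all assumed values chosen only to keep the example small.

```python
import random
import statistics


def abc_minimize(objective, bounds, n_sources=10, limit=20, n_cycles=100, seed=0):
    """Basic artificial bee colony search that minimizes `objective` within `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def fitness(cost):
        # Standard ABC mapping from cost to a positive selection weight.
        return 1.0 / (1.0 + cost) if cost >= 0 else 1.0 + abs(cost)

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbor(i):
        # Perturb one coordinate of source i relative to a random partner k.
        k = rng.choice([m for m in range(n_sources) if m != i])
        j = rng.randrange(dim)
        lo, hi = bounds[j]
        candidate = list(sources[i])
        phi = rng.uniform(-1.0, 1.0)
        step = phi * (candidate[j] - sources[k][j])
        candidate[j] = min(hi, max(lo, candidate[j] + step))
        cost = objective(candidate)
        if cost < costs[i]:  # greedy selection
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    for _ in range(n_cycles):
        # Employed bees: one neighbor search per food source.
        for i in range(n_sources):
            try_neighbor(i)
        # Onlooker bees: sources are revisited in proportion to their fitness.
        weights = [fitness(c) for c in costs]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbor(i)
        # Scout bee: abandon the most exhausted source once it exceeds `limit`.
        worn = max(range(n_sources), key=lambda i: trials[i])
        if trials[worn] > limit:
            sources[worn] = random_source()
            costs[worn] = objective(sources[worn])
            trials[worn] = 0
        # Memorize the best solution seen so far.
        cycle_best = min(costs)
        if cycle_best < best_cost:
            best_cost = cycle_best
            best = list(sources[costs.index(cycle_best)])
    return best, best_cost


def toy_validation_error(params):
    # Hypothetical surrogate: smallest at log10(learning rate) = -3, momentum = 0.9.
    log_lr, momentum = params
    return (log_lr + 3.0) ** 2 + (momentum - 0.9) ** 2


# Assumed search ranges for (log10 learning rate, momentum).
BOUNDS = [(-5.0, -1.0), (0.0, 1.0)]

# Repeat the stochastic search 30 times, mirroring the repetition in the abstract.
run_costs = [abc_minimize(toy_validation_error, BOUNDS, seed=s)[1] for s in range(30)]
mean_cost = statistics.mean(run_costs)
print(f"mean best cost over 30 runs: {mean_cost:.6f}")
```

In a real setting each call to the objective would be a full training run, so the number of sources and cycles would have to be far smaller than in this toy example. Averaging over independent seeds is what makes results from a randomized search comparable.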
Ethics declarations
Conflict of interest
We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criterion for authorship, but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us. We understand that the Corresponding Author is the sole contact for the Editorial process (including Editorial Manager and direct communications with the office). He is responsible for communicating with the other authors about progress, submissions of revisions, and the final approval of proofs.
Cite this article
Ozcan, T., Basturk, A. Transfer learning-based convolutional neural networks with heuristic optimization for hand gesture recognition. Neural Comput & Applic 31, 8955–8970 (2019). https://doi.org/10.1007/s00521-019-04427-y
DOI: https://doi.org/10.1007/s00521-019-04427-y