Abstract
In this paper, we present a system that detects fast gestural motion using a linear predictor of hand movements, and we use this detection scheme to implement a virtual drumkit simulator. A database of drum-hitting motions is gathered, and two feature sets are proposed to discriminate between drum-hitting gestures. The two sets derive from observations of a different nature: the trajectory of the hand and the pose of the arm. Both are used to train classifier models with a variety of machine learning techniques in order to analyse which features and techniques are most suitable for the classification task. Finally, the system was validated with a Kinect application and an experimental performance evaluation involving 12 subjects. Results show a successful discrimination rate above 95% for six different gestures per hand, and a good user experience.
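The abstract does not state the order of the linear predictor or how its coefficients are obtained. As a rough illustration only, the sketch below assumes a second-order predictor fitted by least squares to a single coordinate of the hand trajectory, then extrapolates a few frames ahead. The function names, the predictor order, and the sample heights are all hypothetical and are not taken from the paper.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: trajectory too short or degenerate")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_linear_predictor(samples, order=2):
    """Least-squares coefficients a[k] such that x[n] ~ sum_k a[k] * x[n-1-k]."""
    rows = [[samples[n - 1 - k] for k in range(order)]
            for n in range(order, len(samples))]
    targets = [samples[n] for n in range(order, len(samples))]
    # Normal equations: (rows^T rows) a = rows^T targets.
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(order)]
            for i in range(order)]
    cross = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(order)]
    return solve_linear_system(gram, cross)


def predict_ahead(samples, coeffs, steps):
    """Extrapolate `steps` future samples by feeding predictions back in."""
    history = list(samples)
    for _ in range(steps):
        recent = history[-1:-len(coeffs) - 1:-1]  # most recent sample first
        history.append(sum(c * x for c, x in zip(coeffs, recent)))
    return history[len(samples):]


if __name__ == "__main__":
    # Hypothetical hand heights in cm, one per frame, during a downward stroke.
    heights = [100, 95, 90, 85, 80, 75]
    coeffs = fit_linear_predictor(heights, order=2)
    print(coeffs, predict_ahead(heights, coeffs, steps=2))
```

Predicting a few frames ahead is one plausible way to offset sensor latency when detecting a fast hit, but the predictor structure and detection rule actually used are described in the body of the paper, not in this sketch.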
Acknowledgments
This work has been funded by the Ministerio de Economía y Competitividad of the Spanish Government under Project No. TIN2013-47276-C6-2-R and by the Junta de Andalucía under Project No. P11-TIC-7154. This work was carried out at Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech.
About this article
Cite this article
Rosa-Pujazón, A., Barbancho, I., Tardón, L.J. et al. Fast-gesture recognition and classification using Kinect: an application for a virtual reality drumkit. Multimed Tools Appl 75, 8137–8164 (2016). https://doi.org/10.1007/s11042-015-2729-8
DOI: https://doi.org/10.1007/s11042-015-2729-8