Fast-gesture recognition and classification using Kinect: an application for a virtual reality drumkit

Abstract

In this paper, we present a system for detecting fast gestural motion using a linear predictor of hand movements, and we apply the proposed detection scheme to the implementation of a virtual drumkit simulator. A database of drum-hitting motions was gathered, and two sets of features of different nature are proposed to discriminate between drum-hitting gestures: one describing the trajectory of the hand and one describing the pose of the arm. Both sets were used to train classifier models with a variety of machine learning techniques in order to analyse which features and which techniques are best suited to our classification task. Finally, the system was validated through the implemented Kinect application in an experimental evaluation with 12 subjects. Results showed a discrimination rate higher than 95 % for six different gestures per hand, together with a good user experience.
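
To make the detection idea concrete, the following is a minimal sketch (in Python, not the authors' implementation) of how a linear predictor over the last few Kinect hand samples can extrapolate the hand position a short interval ahead, so that a fast drum hit can be triggered with reduced perceived latency. The window size, look-ahead horizon, velocity threshold and function names are illustrative assumptions, not values taken from the article.

```python
import numpy as np

# Illustrative parameters (assumed, not from the paper).
WINDOW = 5            # number of most recent hand samples used for the fit
HORIZON = 0.10        # look-ahead in seconds
HIT_VELOCITY = 1.5    # downward speed (m/s) treated as a candidate hit

def predict_hand_position(times, positions, horizon=HORIZON):
    """Fit a first-order (linear) model to the last few 3-D hand samples
    and extrapolate the position `horizon` seconds past the last sample."""
    t = np.asarray(times[-WINDOW:], dtype=float)
    p = np.asarray(positions[-WINDOW:], dtype=float)   # shape (n, 3)
    t0 = t - t[-1]                                      # last sample at t = 0
    slope, intercept = np.polyfit(t0, p, deg=1)         # per-axis linear fit
    return intercept + slope * horizon, slope           # predicted position, velocity

def is_hit_candidate(times, positions, drum_height):
    """Flag a fast downward movement whose predicted hand position crosses
    the horizontal plane of a virtual drum (y axis assumed vertical)."""
    predicted, velocity = predict_hand_position(times, positions)
    return velocity[1] < -HIT_VELOCITY and predicted[1] <= drum_height
```

In this sketch the prediction simply compensates for sensor and processing latency: the hit is declared when the extrapolated trajectory, rather than the current measurement, reaches the drum plane.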

Acknowledgments

This work has been funded by the Ministerio de Economía y Competitividad of the Spanish Government under Project No. TIN2013-47276-C6-2-R and by the Junta de Andalucía under Project No. P11-TIC-7154. This work was carried out at Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech.

Author information

Corresponding author

Correspondence to Ana M. Barbancho.

About this article

Cite this article

Rosa-Pujazón, A., Barbancho, I., Tardón, L.J. et al. Fast-gesture recognition and classification using Kinect: an application for a virtual reality drumkit. Multimed Tools Appl 75, 8137–8164 (2016). https://doi.org/10.1007/s11042-015-2729-8
