
A natural and synthetic corpus for benchmarking of hand gesture recognition systems

  • Original Paper
  • Published in Machine Vision and Applications

Abstract

The use of hand gestures offers an alternative to common human–computer interfaces (keyboard, mouse, gamepad, voice, etc.), providing a more intuitive way of navigating menus and multimedia applications. This paper presents a dataset for the evaluation of hand gesture recognition approaches in human–computer interaction scenarios. It includes natural and synthetic data from several state-of-the-art dictionaries. The dataset considers single-pose and multiple-pose gestures, as well as gestures defined by pose and motion or by motion alone. Data types include static pose videos and gesture execution videos, performed by a set of eleven users and recorded with a time-of-flight camera, together with synthetically generated gesture images. A novel collection of critical factors involved in the creation of a hand gesture dataset is proposed: capture technology, temporal coherence, nature of gestures, representativeness, pose issues and scalability. Special attention is given to the scalability factor: a simple method for the synthetic generation of depth images of gestures is proposed, making it possible to extend a dataset with new dictionaries and gestures without the need to recruit new users, and providing more flexibility in point-of-view selection. The method is validated on the presented dataset. Finally, a separability study of the pose-based gestures of a dictionary is performed. The resulting corpus, which exceeds existing datasets in representativeness and scalability, provides a significant evaluation scenario for different kinds of hand gesture recognition solutions.






Corresponding author

Correspondence to Javier Molina.

Additional information

This work has been supported by the Spanish Administration Agency CDTI under Project CENIT-VISION 2007-1007.



Cite this article

Molina, J., Pajuelo, J.A., Escudero-Viñolo, M. et al. A natural and synthetic corpus for benchmarking of hand gesture recognition systems. Machine Vision and Applications 25, 943–954 (2014). https://doi.org/10.1007/s00138-013-0576-z
