Abstract
The Kinect™ camera has revolutionized the field of computer vision by making available low-cost 3D cameras that record both RGB and depth data using a structured-light infrared sensor. We recorded and made available a large database of 50,000 hand and arm gestures. With these data, we organized a challenge emphasizing the problem of learning from very few examples. The data are split into subtasks, each using a small vocabulary of 8 to 12 gestures related to a particular application domain: hand signals used by divers, finger codes to represent numerals, signals used by referees, marshalling signals to guide vehicles or aircraft, etc. We limited the problem to a single user per task and to the recognition of short sequences of gestures punctuated by returning the hands to a resting position. This situation is encountered in computer interface applications, including robotics, education, and gaming. The challenge setting fosters progress in transfer learning by providing, for training, a large number of subtasks related to, but different from, the tasks on which the competitors are tested.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guyon, I., Athitsos, V., Jangyodsuk, P., Escalante, H.J., Hamner, B. (2013). Results and Analysis of the ChaLearn Gesture Challenge 2012. In: Jiang, X., Bellon, O.R.P., Goldgof, D., Oishi, T. (eds) Advances in Depth Image Analysis and Applications. WDIA 2012. Lecture Notes in Computer Science, vol 7854. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40303-3_19
DOI: https://doi.org/10.1007/978-3-642-40303-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40302-6
Online ISBN: 978-3-642-40303-3
eBook Packages: Computer Science, Computer Science (R0)