A Study of Feature Combination in Gesture Recognition with Kinect

Pham, Ngoc-Quan; Le, Hai-Son; Nguyen, Duc-Dung; Ngo, Truong-Giang

doi:10.1007/978-3-319-11680-8_37

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 326)

Abstract

Human gesture recognition is an interdisciplinary problem with many important applications. Within a gesture recognition system, feature extraction is one of the most important factors affecting performance. In this paper, we aim to improve the covariance feature, the current state-of-the-art feature extraction method, by integrating other frame-level features extracted from data captured by Microsoft Kinect, and we evaluate the resulting features with several classification methods: Random Forest (RF), Multi-Layer Perceptron (MLP), and Support Vector Machines (SVM). Leave-person-out experiments show that feature combination is beneficial, especially with Random Forest, which achieves the highest recognition score, an improvement of about 2 percentage points, from 90.9% to 93.0%. However, the increase in dimensionality sometimes degraded performance, indicating a side effect of feature combination.
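The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the combination is done by stacking frame-level features before taking the covariance, that the extra feature is a per-frame velocity, and that the data are synthetic. The function names (`covariance_descriptor`, `combined_descriptor`, `leave_person_out_splits`, `nearest_centroid_accuracy`) are hypothetical. A nearest-centroid classifier stands in for RF, MLP, or SVM so that the example depends only on NumPy.

```python
import numpy as np


def covariance_descriptor(frames):
    """Flatten the upper triangle of the covariance of per-frame vectors.

    frames: array of shape (T, D), one D-dimensional vector per frame.
    Returns a vector of length D * (D + 1) // 2.
    """
    cov = np.atleast_2d(np.cov(frames, rowvar=False))  # (D, D), symmetric
    rows, cols = np.triu_indices(cov.shape[0])
    return cov[rows, cols]


def combined_descriptor(joints, extra):
    """Assumed combination: stack frame-level features, then take covariance."""
    return covariance_descriptor(np.hstack([joints, extra]))


def leave_person_out_splits(person_ids):
    """Yield (held_out_person, train_idx, test_idx), one split per person."""
    person_ids = np.asarray(person_ids)
    for person in np.unique(person_ids):
        test = person_ids == person
        yield person, np.flatnonzero(~test), np.flatnonzero(test)


def nearest_centroid_accuracy(X, y, person_ids):
    """Mean leave-person-out accuracy of a nearest-centroid stand-in classifier."""
    scores = []
    for _, train, test in leave_person_out_splits(person_ids):
        classes = np.unique(y[train])
        centroids = np.stack(
            [X[train][y[train] == c].mean(axis=0) for c in classes]
        )
        # Distance from every held-out sample to every class centroid.
        dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
        scores.append(np.mean(classes[dists.argmin(axis=1)] == y[test]))
    return float(np.mean(scores))


# Synthetic stand-in data: 4 people x 6 sequences, 20 frames, 6 coordinates each.
rng = np.random.default_rng(0)
person_ids = np.repeat(np.arange(4), 6)
y = np.tile(np.array([0, 1]), 12)
X = []
for label in y:
    joints = rng.normal(scale=1.0 + label, size=(20, 6))  # class changes the spread
    velocity = np.gradient(joints, axis=0)                # hypothetical extra feature
    X.append(combined_descriptor(joints, velocity))
X = np.stack(X)

print(X.shape)  # (24, 78): 12 stacked dimensions give 12 * 13 / 2 entries
print(nearest_centroid_accuracy(X, y, person_ids))
```

Stacking 6 extra dimensions onto 6 joint coordinates grows the descriptor from 21 to 78 entries, which illustrates the quadratic growth in dimensionality that the abstract identifies as a possible side effect of feature combination.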

Author information

Authors and Affiliations

Institute of Information Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
Ngoc-Quan Pham, Hai-Son Le & Duc-Dung Nguyen
Hai Phong Private University, Hai Phong, Vietnam
Truong-Giang Ngo

Corresponding author

Correspondence to Ngoc-Quan Pham.

Editor information

Editors and Affiliations

Faculty of Information Technology, VNU University of Engineering and Technology, Hanoi, Vietnam
Viet-Ha Nguyen
Faculty of Information Technology, VNU University of Engineering and Technology, Hanoi, Vietnam
Anh-Cuong Le
School of Knowledge Science, Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan
Van-Nam Huynh

About this paper

Cite this paper

Pham, NQ., Le, HS., Nguyen, DD., Ngo, TG. (2015). A Study of Feature Combination in Gesture Recognition with Kinect. In: Nguyen, VH., Le, AC., Huynh, VN. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 326. Springer, Cham. https://doi.org/10.1007/978-3-319-11680-8_37

DOI: https://doi.org/10.1007/978-3-319-11680-8_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11679-2
Online ISBN: 978-3-319-11680-8
eBook Packages: Engineering, Engineering (R0)
