DOI: 10.1145/3197768.3197788
short-paper

Enabling Early Gesture Recognition by Motion Augmentation

Published: 26 June 2018

Abstract

In real-time gesture recognition, accurately classifying gestures early, when they are only partially observed, is advantageous because it reduces latency and improves user experience. This work investigates a novel approach for improving the results of an early gesture classification model. The method augments the input sequence of human poses from a partially observed gesture with a series of poses predicted by an auxiliary recurrent neural network sequence-to-sequence motion prediction model; the augmented sequence is then fed into a random forest gesture classifier. By concatenating the partially observed ground truth sequence with the forecasted motion sequence, we are able to significantly improve early gesture recognition accuracy. When forecasting 25 future frames from a partially observed input gesture sequence of 50 frames, recognition accuracy improves from 45% to 87% on average when evaluated on the MSRC-12 gesture dataset.
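
The concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `augment`, `flatten`, and `constant_velocity_forecast` are hypothetical, each pose is assumed to be a flat list of joint coordinates, and a simple constant-velocity extrapolation stands in for the recurrent sequence-to-sequence motion prediction model.

```python
from typing import Callable, List

Pose = List[float]      # one frame: flattened joint coordinates (assumed layout)
Sequence = List[Pose]   # a gesture: list of frames in temporal order


def constant_velocity_forecast(observed: Sequence, n_future: int) -> Sequence:
    """Hypothetical stand-in forecaster: repeats the last frame-to-frame
    displacement. The paper uses an RNN sequence-to-sequence model here."""
    if len(observed) < 2:
        raise ValueError("need at least two observed frames")
    last, prev = observed[-1], observed[-2]
    velocity = [a - b for a, b in zip(last, prev)]
    return [[x + (k + 1) * v for x, v in zip(last, velocity)]
            for k in range(n_future)]


def augment(observed: Sequence, n_future: int,
            forecaster: Callable[[Sequence, int], Sequence]) -> Sequence:
    """Concatenate the observed frames with the forecast frames."""
    return observed + forecaster(observed, n_future)


def flatten(sequence: Sequence) -> List[float]:
    """Flatten a pose sequence into one feature vector for a classifier."""
    return [value for frame in sequence for value in frame]


# Toy example: 50 observed frames of a 2-value pose, 25 forecast frames.
observed = [[float(t), 2.0 * t] for t in range(50)]
augmented = augment(observed, 25, constant_velocity_forecast)
features = flatten(augmented)
```

In this toy setting the augmented sequence holds 75 frames, matching the 50 observed plus 25 forecast frames reported in the abstract. The flattened vector is one plausible input format for a random forest classifier; the feature representation actually used in the paper is not stated in the abstract.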

Cited By

  • (2024) Affect Behavior Prediction: Using Transformers and Timing Information to Make Early Predictions of Student Exercise Outcome. Artificial Intelligence in Education, 194-208. DOI: 10.1007/978-3-031-64299-9_14. Online publication date: 2-Jul-2024
  • (2020) Robust Real-Time Hand Gestural Recognition for Non-Verbal Communication with Tabletop Robot Haru. 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 891-898. DOI: 10.1109/RO-MAN47096.2020.9223566. Online publication date: Aug-2020

Published In

PETRA '18: Proceedings of the 11th PErvasive Technologies Related to Assistive Environments Conference
June 2018
591 pages
ISBN:9781450363907
DOI:10.1145/3197768
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

In-Cooperation

  • NSF: National Science Foundation

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. early prediction
  2. gesture recognition

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

PETRA '18
