Human Motion prediction based on attention mechanism

Sang, Hai-Feng; Chen, Zi-Zhen; He, Da-Kuo

doi:10.1007/s11042-019-08269-7

Published: 06 December 2019

Volume 79, pages 5529–5544, (2020)

Multimedia Tools and Applications

Hai-Feng Sang¹,
Zi-Zhen Chen¹ &
Da-Kuo He²


Abstract

Human motion prediction is of great significance in human-computer interaction, personnel tracking, autonomous driving, and other fields. However, it is affected by uncertainties such as motion speed and amplitude, so the first predicted frame is often discontinuous and the time span over which predictions remain accurate is short. This paper proposes a method that combines a sequence-to-sequence (seq2seq) structure with an attention mechanism to address these problems of current methods. We refer to the proposed structure as the At-seq2seq model, a sequence-to-sequence model based on the GRU (Gated Recurrent Unit). We add an attention mechanism to the decoder of the seq2seq model, which further encodes the output of the encoder into a vector sequence containing multiple subsets so that the decoder can select the most relevant part of the sequence for decoding and prediction. The At-seq2seq model has been validated on the Human3.6M dataset. The experimental results show that the proposed model not only reduces the error of short-term motion prediction but also significantly extends the time span of accurate prediction.
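To make the idea of a decoder that "selects the most relevant part" of the encoder output concrete, the following is a minimal sketch of a single attention step. It is not the authors' implementation: the abstract does not state the scoring function, so dot-product scoring is assumed here, and the names `attend`, `decoder_state`, and `encoder_outputs` are hypothetical. In a full model, the encoder outputs and the decoder state would come from GRU layers and the context vector would feed the next decoding step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_outputs):
    """Compute attention weights and a context vector for one decoding step.

    Assumption: dot-product scoring between the decoder state and each
    encoder output; the paper's actual scoring function may differ.
    """
    # One relevance score per encoder time step.
    scores = [
        sum(d * h for d, h in zip(decoder_state, enc))
        for enc in encoder_outputs
    ]
    weights = softmax(scores)
    # Context vector: weighted sum of the encoder outputs.
    dim = len(encoder_outputs[0])
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_outputs))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three encoder time steps with 2-dimensional hidden states.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
weights, context = attend(decoder_state, encoder_outputs)
```

In this toy input, the second and third encoder steps align with the decoder state equally well, so they receive equal weight and dominate the context vector, while the first step contributes little.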



Acknowledgments

Thanks are due to the National Natural Science Foundation of China under grant nos. 61773105 and 61374147 and the Fundamental Research Funds for the Central Universities under grant no. N182008004 for supporting this research work.

Author information

Authors and Affiliations

School of Information Science & Engineering, Shenyang University of Technology, Shenyang, 110870, Liaoning, China
Hai-Feng Sang & Zi-Zhen Chen
College of Information Science & Engineering, Northeastern University, Shenyang, 110819, Liaoning, China
Da-Kuo He


Corresponding author

Correspondence to Zi-Zhen Chen.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Sang, HF., Chen, ZZ. & He, DK. Human Motion prediction based on attention mechanism. Multimed Tools Appl 79, 5529–5544 (2020). https://doi.org/10.1007/s11042-019-08269-7


Received: 30 September 2018
Revised: 27 June 2019
Accepted: 22 September 2019
Published: 06 December 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s11042-019-08269-7
