Human Motion prediction based on attention mechanism

Abstract

Human motion prediction is of great significance in fields such as human-computer interaction, personnel tracking, and autonomous driving. However, it is affected by uncertainties such as motion speed and amplitude, which cause a discontinuity at the first predicted frame and limit the horizon over which predictions remain accurate. This paper proposes a method that combines a sequence-to-sequence (seq2seq) structure with an attention mechanism to address these problems. We refer to the proposed structure as the At-seq2seq model, a sequence-to-sequence model based on the GRU (Gated Recurrent Unit). We add an attention mechanism to the decoder of the seq2seq model, which further encodes the encoder output into a vector sequence containing multiple subsets so that the decoder selects the most relevant part of the sequence for decoding and prediction. The At-seq2seq model has been validated on the Human3.6M dataset. The experimental results show that the proposed model not only reduces the error of short-term motion prediction but also significantly extends the period over which predictions remain accurate.
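
The described architecture amounts to a GRU encoder that consumes the observed frames and a GRU decoder that, at each prediction step, attends over the encoder outputs to select the most relevant part of the observed sequence before emitting the next pose. The following is a minimal PyTorch sketch of such an attention-based GRU seq2seq predictor; the pose dimensionality, hidden size, dot-product attention scoring, and residual output connection are illustrative assumptions, not details taken from the paper.

    # Minimal sketch of an attention-based GRU seq2seq motion predictor (PyTorch).
    # Pose dimensionality, hidden size, dot-product attention, and the residual
    # output connection are illustrative assumptions, not taken from the paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AtSeq2Seq(nn.Module):
        def __init__(self, pose_dim=54, hidden=1024):
            super().__init__()
            self.encoder = nn.GRU(pose_dim, hidden, batch_first=True)
            self.decoder_cell = nn.GRUCell(pose_dim + hidden, hidden)
            self.out = nn.Linear(hidden, pose_dim)

        def forward(self, observed, horizon):
            # observed: (batch, T_in, pose_dim) conditioning frames
            enc_outputs, h = self.encoder(observed)          # (B, T_in, H), (1, B, H)
            h = h.squeeze(0)                                 # decoder state (B, H)
            frame = observed[:, -1]                          # start from the last seen pose
            predictions = []
            for _ in range(horizon):
                # Dot-product attention: score each encoder step against the
                # current decoder state and build a context vector.
                scores = torch.bmm(enc_outputs, h.unsqueeze(2)).squeeze(2)         # (B, T_in)
                weights = F.softmax(scores, dim=1)
                context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)  # (B, H)
                h = self.decoder_cell(torch.cat([frame, context], dim=1), h)
                frame = frame + self.out(h)                  # residual: predict a pose offset
                predictions.append(frame)
            return torch.stack(predictions, dim=1)           # (B, horizon, pose_dim)

    # Example usage: condition on 50 observed frames, predict the next 25.
    # model = AtSeq2Seq(); future = model(torch.randn(8, 50, 54), horizon=25)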

Acknowledgments

This work was supported by the National Natural Science Foundation of China under grant nos. 61773105 and 61374147 and by the Fundamental Research Funds for the Central Universities under grant no. N182008004.

Author information

Corresponding author

Correspondence to Zi-Zhen Chen.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sang, HF., Chen, ZZ. & He, DK. Human Motion prediction based on attention mechanism. Multimed Tools Appl 79, 5529–5544 (2020). https://doi.org/10.1007/s11042-019-08269-7
