Implicit regularization of a deep augmented neural network model for human motion prediction

Applied Intelligence

Abstract

Predicting future human motion from past observed motion is a challenging problem in computer vision and graphics. Existing work addresses it with discriminative models and reports results only for the in-distribution case, where training and testing data follow the same distribution. It does not discuss domain shift, where training and testing data follow different distributions (out of distribution, OoD), which is the situation such models face in practice. Recent research has addressed domain shift by augmenting the discriminative model with a generative model, and obtained better results. In the present investigation, we propose regularizing this augmented network by inserting extra linear layers into the network encoder to minimize the rank of the latent space, and we train the entire network end to end. This regularization strengthens the model so that it deals effectively with domain shift scenarios, in which training and testing data come from different distributions. We tested our model on two benchmark datasets, CMU Motion Capture (CMU MoCap) and Human3.6M (H3.6M), and show that it outperforms state-of-the-art methods on 14 OoD actions of H3.6M and 7 OoD actions of CMU MoCap in terms of the Euclidean distance between predicted and ground-truth joint angle values. Our average errors over the 14 OoD actions of H3.6M for short-term prediction (80, 160, 320, 400 ms) are 0.34, 0.6, 0.96, and 1.07; over the 7 OoD actions of CMU MoCap, for short-term and long-term prediction (80, 160, 320, 400, 1000 ms), they are 0.28, 0.45, 0.77, 0.89, and 1.46. All of these results are better than the other state-of-the-art results.
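
The two technical ingredients named in the abstract, extra linear layers in the encoder and a Euclidean error on joint angles, can be illustrated with a minimal sketch. It is not the authors' implementation: the latent size, the number of extra layers, the random weights, and all function names are hypothetical, and the error is assumed to be a plain per-frame Euclidean distance. The sketch only shows that a stack of linear layers with no activation between them is equivalent to a single matrix, so the layers add no expressive power. The rank-minimizing effect reported for such stacks arises during gradient-based training, which this sketch does not perform.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matvec(w, x):
    """Multiply matrix w by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def random_matrix(rows, cols, rng):
    """Hypothetical weight initialisation, uniform in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def apply_linear_stack(layers, z):
    """Pass z through each linear layer in order, with no activation."""
    for w in layers:
        z = matvec(w, z)
    return z


def collapse(layers):
    """Fold the stack [W_1, ..., W_k] into one matrix W = W_k ... W_1."""
    w = layers[0]
    for nxt in layers[1:]:
        w = matmul(nxt, w)
    return w


def euclidean_error(pred, truth):
    """Euclidean distance between predicted and ground-truth joint angles."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)))


rng = random.Random(0)
latent_dim = 4        # assumption: illustrative latent size
num_extra_layers = 3  # assumption: illustrative number of extra layers

extra_layers = [random_matrix(latent_dim, latent_dim, rng)
                for _ in range(num_extra_layers)]
z = [rng.uniform(-1.0, 1.0) for _ in range(latent_dim)]

stacked = apply_linear_stack(extra_layers, z)
collapsed = matvec(collapse(extra_layers), z)

# The two outputs agree up to floating-point rounding.
gap = euclidean_error(stacked, collapsed)
print(f"difference between stacked and collapsed output: {gap:.2e}")
```

Because the stack collapses to one matrix, a model trained with such layers could in principle replace them with that single matrix at inference time; whether the authors do so is not stated in the abstract.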

Data Availability Statement

No associated data are available.

Author information

Corresponding author

Correspondence to Gaurav Kumar Yadav.

Ethics declarations

Conflict of Interests

The authors have no conflicts of interest to declare.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yadav, G.K., Abdel-Nasser, M., Rashwan, H.A. et al. Implicit regularization of a deep augmented neural network model for human motion prediction. Appl Intell 53, 18027–18040 (2023). https://doi.org/10.1007/s10489-022-04419-x

  • DOI: https://doi.org/10.1007/s10489-022-04419-x
