Fusion of Temporal Transformer and Spatial Graph Convolutional Network for 3-D Skeleton-Parts-Based Human Motion Prediction


Abstract:

Human motion prediction has gained prominence, with applications in domains such as intelligent surveillance and human–robot interaction. However, predicting full-body human motion is challenging: a model must capture joint interactions, handle diverse movement patterns, manage occlusions, and run in real time. To address these challenges, the proposed model adopts a skeleton-parted strategy that divides the skeleton structure into body parts, enhancing coordination and fusion between them. The method combines a transformer with graph convolutional networks to predict human motion from 3-D skeleton data: a temporal transformer (T-Transformer) extracts temporal features, and a spatial graph convolutional network (S-GCN) captures the spatial characteristics of human motion. The model is evaluated on two human motion datasets, Human3.6M and CMU motion capture (CMU Mocap), which contain numerous videos encompassing short and long human motion sequences. Results indicate that the proposed model outperforms state-of-the-art methods on both datasets. On Human3.6M, it improves the average mean per joint positional error (avg-MPJPE) by 3.50% and 11.45% for short-term and long-term motion prediction, respectively. On CMU Mocap, it achieves avg-MPJPE improvements of 2.69% and 1.05% for short-term and long-term motion prediction, respectively, demonstrating its accuracy in predicting human motion over extended periods. The study also investigates the impact of different numbers of T-Transformers and S-GCNs and examines the specific roles and contributions of the T-Transformer, S-GCN, and cross-part components.
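As a rough illustration of the metric named in the abstract, the sketch below computes a mean per joint position error and a percentage improvement over a baseline. It is not the authors' evaluation code. The function names `mpjpe` and `relative_improvement`, the `(frames, joints, 3)` array layout, and the reading of the reported percentages as relative reductions in error are all assumptions made for this example.

```python
import numpy as np


def mpjpe(pred, target):
    """Mean Euclidean distance between predicted and true joint positions.

    Assumed layout (not taken from the paper): arrays of shape
    (..., joints, 3), for example (frames, joints, 3).
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    if pred.shape != target.shape or pred.shape[-1] != 3:
        raise ValueError("expected two arrays of identical shape (..., joints, 3)")
    # Per-joint distance along the coordinate axis, then average over everything.
    return float(np.linalg.norm(pred - target, axis=-1).mean())


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to a baseline (assumed definition)."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy data: 2 frames, 3 joints, every predicted joint offset by (3, 4, 0).
target = np.zeros((2, 3, 3))
pred = target + np.array([3.0, 4.0, 0.0])
toy_error = mpjpe(pred, target)
```

An "avg" variant would average this value over several prediction horizons; which horizons the paper averages over is not stated in the abstract, so that step is left out here.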
Published in: IEEE Transactions on Human-Machine Systems (Volume: 54, Issue: 6, December 2024)
Page(s): 788 - 797
Date of Publication: 11 September 2024