AZY-GCN: Multi-scale feature suppression attentional diagram convolutional network for human pose prediction

Authors:

Gang He

CSAI '21: Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence

Pages 112 - 121

https://doi.org/10.1145/3507548.3507565

Published: 09 March 2022

Abstract

Because future human poses are stochastic and non-periodic, human pose prediction has long been a challenging task. Recent research has shown graph convolution to be an effective way to capture the dynamic relationships between body joints, which benefits pose prediction. Graph convolution can also abstract the human pose into a multi-scale set of poses; as the level of abstraction increases, the motion becomes more stable. Although average prediction accuracy has improved significantly in recent years, the application of graph convolution to pose prediction still leaves much room for exploration. In this work, we propose a new multi-scale feature suppression attention graph convolutional network (AZY-GCN) for end-to-end human pose prediction. We use GCNs to extract features from the fine-grained scale to the coarse-grained scale and then from the coarse-grained scale back to the fine-grained scale. We then combine and decode the features extracted at each scale to obtain the residual between the input and the target pose. We also apply intermediate supervision to all predicted poses so that the network learns more representative features. In addition, we propose a new feature suppression attention module (FISA-block), which effectively extracts relevant information from neighboring nodes while suppressing the noise introduced by poor GCN learning. We evaluate the proposed method on the public Human3.6M and CMU Mocap datasets, and extensive experiments show that it achieves competitive performance.
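
The fine-to-coarse-to-fine feature flow and the residual prediction described in the abstract can be illustrated with a small sketch. This is not the authors' AZY-GCN: the abstract gives no layer definitions, so the sketch assumes a standard normalized graph convolution (in the style of Kipf and Welling), average pooling of joints into body-part groups, and a linear decoder. The FISA-block and intermediate supervision are omitted, and all function names, the 3-joint skeleton, the grouping, and the weights are hypothetical.

```python
# Illustrative sketch only; NOT the authors' AZY-GCN implementation.
# Skeleton, joint grouping, weights, and all names here are hypothetical.

def normalized_adjacency(edges, n):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph with n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def graph_conv(a_hat, feats, weight):
    """One graph convolution: ReLU(A_hat @ X @ W)."""
    out = matmul(matmul(a_hat, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def coarsen(feats, groups):
    """Fine-to-coarse: average the joints of each group into one coarse node."""
    dim = len(feats[0])
    return [[sum(feats[j][d] for j in g) / len(g) for d in range(dim)]
            for g in groups]

def refine(coarse, groups, n):
    """Coarse-to-fine: copy each coarse node's features back to its joints."""
    fine = [None] * n
    for gi, g in enumerate(groups):
        for j in g:
            fine[j] = list(coarse[gi])
    return fine

def predict_pose(last_pose, edges, groups, w_in, w_out):
    """Predict the next pose as last_pose plus a decoded residual."""
    n = len(last_pose)
    a_hat = normalized_adjacency(edges, n)
    fine = graph_conv(a_hat, last_pose, w_in)       # fine-scale features
    coarse = coarsen(fine, groups)                  # fine -> coarse
    back = refine(coarse, groups, n)                # coarse -> fine
    combined = [[f + b for f, b in zip(fr, br)]     # combine both scales
                for fr, br in zip(fine, back)]
    residual = matmul(combined, w_out)              # linear decoder (assumed)
    return [[p + r for p, r in zip(pr, rr)]
            for pr, rr in zip(last_pose, residual)]

# Hypothetical 3-joint chain (0-1-2) with 3D coordinates per joint.
pose = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0], [2.0, 0.0, 1.0]]
edges = [(0, 1), (1, 2)]
groups = [[0, 1], [2]]                              # two coarse "body parts"
w_in = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]        # 3 -> 2 hidden features
w_out = [[0.05, 0.0, -0.05], [0.0, 0.1, 0.0]]       # 2 -> 3 residual
next_pose = predict_pose(pose, edges, groups, w_in, w_out)
```

Predicting a residual rather than the absolute pose means that a decoder with all-zero weights returns the input pose unchanged, so the network only has to learn the change between the observed and target poses.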

Index Terms

1. Computing methodologies
2. Theory of computation
  1. Theory and algorithms for application domains


Published In

CSAI '21: Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence

December 2021

437 pages

ISBN:9781450384155

DOI:10.1145/3507548

Copyright © 2021 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

CSAI 2021

CSAI 2021: 2021 5th International Conference on Computer Science and Artificial Intelligence

December 4 - 6, 2021

Beijing, China
