Research article
DOI: 10.1145/3507548.3507565

AZY-GCN: Multi-scale feature suppression attentional diagram convolutional network for human pose prediction

Published: 09 March 2022

Abstract

Because future human poses are stochastic and aperiodic, human pose prediction remains a challenging task. Recent work has shown that graph convolution effectively captures the dynamic relationships between body joints, which benefits pose prediction. Graph convolution can also abstract the human pose into a multi-scale set of poses; as the level of abstraction increases, the motion becomes more stable. Although average prediction accuracy has improved significantly in recent years, the use of graph convolution in pose prediction still leaves much room for exploration. In this work, we propose a new multi-scale feature suppression attention graph convolutional network (AZY-GCN) for end-to-end human pose prediction. We use GCNs to extract features from the fine-grained scale to the coarse-grained scale and then back from the coarse-grained scale to the fine-grained scale. We then combine and decode the features extracted at each scale to obtain the residual between the input and the target pose. We also apply intermediate supervision to all predicted poses so that the network learns more representative features. In addition, we propose a new feature suppression attention module (FISA-block), which effectively extracts relevant information from neighboring nodes while suppressing the noise introduced by poor GCN learning. We evaluated the proposed method on the public Human3.6M and CMU Mocap datasets. Extensive experiments show that it achieves competitive performance.
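As a rough illustration of the ideas named in the abstract, the NumPy sketch below applies one graph convolution in the Kipf and Welling form to a toy 4-joint chain, average-pools the joints into two coarser groups (fine to coarse), copies the coarse features back to the joints (coarse to fine), and adds the combined features to the input pose as a residual. The joint layout, the grouping, the random weights, and all function names are assumptions made for illustration; this is not the AZY-GCN architecture, and it contains no FISA-block or intermediate supervision.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 (Kipf and Welling form)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, adj_norm, weight):
    """One graph convolution: aggregate neighbor features, then project."""
    return adj_norm @ x @ weight

def pool_joints(x, groups):
    """Fine-to-coarse step: average the joints in each (hypothetical) group."""
    return np.stack([x[idx].mean(axis=0) for idx in groups])

def unpool_joints(x_coarse, groups, num_joints):
    """Coarse-to-fine step: copy each group's feature back to its joints."""
    x_fine = np.zeros((num_joints, x_coarse.shape[1]))
    for g, idx in enumerate(groups):
        x_fine[idx] = x_coarse[g]
    return x_fine

# Toy skeleton (assumed): 4 joints in a chain, 3D coordinates per joint.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
groups = [[0, 1], [2, 3]]                       # assumed joint grouping
rng = np.random.default_rng(0)
pose = rng.normal(size=(4, 3))                  # last observed pose
weight = rng.normal(scale=0.1, size=(3, 3))     # untrained, random weights

adj_norm = normalize_adjacency(adj)
fine = graph_conv(pose, adj_norm, weight)       # fine-scale features
coarse = pool_joints(fine, groups)              # coarse-scale features
residual = fine + unpool_joints(coarse, groups, num_joints=4)
predicted_pose = pose + residual                # residual added to the input
```

Predicting a residual rather than the pose itself means an untrained or near-zero network starts out close to "repeat the last observed pose", which is a common design choice in motion prediction.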


Published In

CSAI '21: Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence
December 2021
437 pages
ISBN: 9781450384155
DOI: 10.1145/3507548

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. Attention
2. Attitude prediction
3. Behavior analysis
4. Graph convolution

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

CSAI 2021
