Abstract:
To improve the accuracy and robustness of human pose estimation, a step deep convolutional neural network is proposed, consisting of a feed-forward module and several step modules. Image features output by the last four layers of the feed-forward module are fused with context features, and the fused features serve as the input to the first step module. The image features of each step module, fused with context features, serve as the input to the next step module. The confidence map output by the last step module is used to predict joint positions. This stepwise approach enlarges the receptive field, which helps the network learn long-distance relationships between joints and predict the positions of occluded joints. At the same time, the confidence maps computed by each module give the subsequent modules increasingly accurate estimates of each joint's position. In addition, the network strengthens intermediate supervision through its learning objective, which supplements the back-propagated gradient and adjusts the learning process, effectively addressing the vanishing gradient problem during training. Our approach is tested on two standard datasets, Leeds Sports Poses (LSP) and Frames Labeled In Cinema (FLIC), and the results indicate that our network achieves better performance in human pose estimation.
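The intermediate supervision and the final joint prediction described above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the squared-error loss, the toy 3x3 confidence maps, and the argmax read-out are all assumptions chosen for illustration, since the abstract does not specify them.

```python
# Hypothetical sketch: names, loss form, and toy data are assumptions,
# not taken from the paper.

def stage_loss(pred, target):
    """Sum of squared differences between two confidence maps (lists of rows)."""
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )


def total_loss(stage_preds, target):
    """Intermediate supervision: penalize every step module's output,
    not only the last one, so each module receives its own gradient signal."""
    return sum(stage_loss(pred, target) for pred in stage_preds)


def predict_joint(conf_map):
    """Return (row, col) of the highest-confidence cell in a confidence map."""
    best = max(
        ((value, r, c) for r, row in enumerate(conf_map) for c, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return best[1], best[2]


# Toy ground-truth map for one joint located at the centre cell.
target = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

# Assumed outputs of two successive step modules: the later one is sharper.
stage_preds = [
    [[0.2, 0.1, 0.0], [0.1, 0.5, 0.1], [0.0, 0.1, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 0.0]],
]

loss = total_loss(stage_preds, target)    # summed over both modules
joint = predict_joint(stage_preds[-1])    # read from the last module only
```

Summing the per-module losses is one common way to realize intermediate supervision; the joint location is then read from the last module's confidence map alone, as the abstract describes.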
Published in: 2018 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)
Date of Conference: 13-15 October 2018
Date Added to IEEE Xplore: 03 February 2019