Abstract
Tracking human motion from video sequences is a well-known video surveillance technique, and many commercially available motion capture devices can now recognize human poses using a depth camera. However, depth camera systems are complicated and have limited optical fields of view. To overcome this problem, it is necessary to develop techniques for recognizing human motion in wide-angle images. In this study, we devised a method for tracking human motion that is robust to wide-angle image distortion. To do so, we developed a new multilayered convolutional neural network architecture for estimating the locations of human body parts in images along with associated transformation parameters that can be applied to a distorted wide-angle image on a frame-by-frame basis. The proposed method was applied to distorted wide-angle images, and its robustness was demonstrated via a quantitative evaluation of human joint prediction and a comparative analysis with a commercially available depth camera-based motion capture system.
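The paper's transformation parameters are learned per frame by the network, but the underlying idea of mapping between distorted wide-angle coordinates and rectilinear coordinates can be illustrated with a simple parametric model. The sketch below assumes an equidistant fisheye projection (r = f·θ) with a known focal length `f` and optical `center`; these are illustrative assumptions, not the method described in the paper.

```python
import numpy as np

def undistort_points(pts, center, f):
    """Map pixel coordinates from an assumed equidistant fisheye image
    to the corresponding pinhole (rectilinear) coordinates.

    pts    : (N, 2) array of fisheye pixel coordinates
    center : (2,) optical center in pixels
    f      : focal length in pixels (assumed known here, not estimated)
    """
    d = pts - center                    # offsets from the optical center
    r = np.linalg.norm(d, axis=1)       # fisheye radial distance
    theta = r / f                       # equidistant model: r = f * theta
    # pinhole projection has radius f * tan(theta); rescale each offset
    scale = np.where(r > 0, np.tan(theta) * f / np.maximum(r, 1e-12), 1.0)
    return center + d * scale[:, None]

# A point on the optical axis is unchanged; off-axis points move outward,
# which is the characteristic "stretching" of fisheye correction.
center = np.array([320.0, 240.0])
pts = np.array([[320.0, 240.0], [420.0, 240.0]])
out = undistort_points(pts, center, f=300.0)
```

Applying such a mapping (or its inverse) per frame is one way to relate joint locations predicted in distorted image coordinates to an undistorted view.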
Acknowledgements
This work was supported by a Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Scientific Research (Grant No. JP19K20310).
Cite this article
Miki, D., Abe, S., Chen, S. et al. Robust human pose estimation from distorted wide-angle images through iterative search of transformation parameters. SIViP 14, 693–700 (2020). https://doi.org/10.1007/s11760-019-01602-5