
Robust human pose estimation from distorted wide-angle images through iterative search of transformation parameters

  • Original Paper
  • Published:
Signal, Image and Video Processing

Abstract

Tracking human motion from video sequences is a well-known video surveillance technique, and many commercially available motion capture devices can now recognize human poses using a depth camera. However, depth camera systems are complicated and have limited optical fields of view. To overcome this problem, it is necessary to develop techniques for recognizing human motion in wide-angle images. In this study, we devised a method for tracking human motion that is robust to wide-angle image distortion. To do so, we developed a new multilayered convolutional neural network architecture for estimating the locations of human body parts in images along with associated transformation parameters that can be applied to a distorted wide-angle image on a frame-by-frame basis. The proposed method was applied to distorted wide-angle images, and its robustness was demonstrated via a quantitative evaluation of human joint prediction and a comparative analysis with a commercially available depth camera-based motion capture system.
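The abstract describes estimating, frame by frame, the transformation parameters that relate a distorted wide-angle image to an undistorted view, found by iterative search. As a minimal sketch of that idea — assuming a hypothetical single-coefficient radial distortion model and a simple grid search, neither of which is specified by the paper — a candidate parameter can be scored by how well undistorted joint detections align with reference joint locations:

```python
import numpy as np

def distort(points, k):
    """Apply a simple radial distortion model x_d = x_u * (1 + k * r^2),
    with coordinates normalized so the image center is the origin."""
    r2 = np.sum(points ** 2, axis=1, keepdims=True)
    return points * (1.0 + k * r2)

def undistort(points, k, iters=10):
    """Invert the distortion by fixed-point iteration: start from the
    distorted coordinates and repeatedly divide out the distortion factor."""
    undistorted = points.copy()
    for _ in range(iters):
        r2 = np.sum(undistorted ** 2, axis=1, keepdims=True)
        undistorted = points / (1.0 + k * r2)
    return undistorted

def search_distortion_param(detected, reference, k_grid):
    """Iteratively search the transformation parameter k whose undistortion
    best aligns detected joint positions with reference joint positions."""
    best_k, best_err = None, np.inf
    for k in k_grid:
        err = np.mean(np.linalg.norm(undistort(detected, k) - reference, axis=1))
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err

# Synthetic check: distort known joint positions, then recover the parameter.
rng = np.random.default_rng(0)
reference = rng.uniform(-0.6, 0.6, size=(14, 2))   # 14 hypothetical joints
detected = distort(reference, 0.2)                  # ground-truth k = 0.2
best_k, err = search_distortion_param(detected, reference, np.linspace(0.0, 0.5, 51))
```

The fixed-point inversion in `undistort` avoids an analytic inverse, which the polynomial distortion model lacks; all function names and the grid-search strategy here are illustrative assumptions, not the authors' implementation, which uses a multilayered convolutional neural network to predict the parameters.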




Acknowledgements

This work was supported by a Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Scientific Research (Grant No. JP19K20310).

Author information

Corresponding author

Correspondence to Daisuke Miki.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Miki, D., Abe, S., Chen, S. et al. Robust human pose estimation from distorted wide-angle images through iterative search of transformation parameters. SIViP 14, 693–700 (2020). https://doi.org/10.1007/s11760-019-01602-5
