Skip to main content

Multiple-Branches Faster RCNN for Human Parts Detection and Pose Estimation

  • Conference paper
  • First Online:
Computer Vision – ACCV 2016 Workshops (ACCV 2016)

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 10118))

Included in the following conference series:

Abstract

In this work, we primarily address multiple people pose estimation challenge by exploring the performance of Faster RCNN on human parts detection. We develop a multiple-branches Faster RCNN model for our specific task of detecting persons and their parts. Our model can improve the performance of detecting human parts and the whole persons, meanwhile speeding up detection process with shared weights. A part-based method is proposed to estimate multiple people poses, bringing recent advances on object detection to this task. Experiments demonstrate that our model achieves better performance than the original Faster RCNN model on our task. Compared with other pose estimation approaches, our approach achieves fair or better results.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Andriluka, M., Roth, S., Schiele, B.: Pictorial structures revisited: people detection and articulated pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 1014–1021. IEEE (2009)

    Google Scholar 

  2. Andriluka, M., Pishchulin, L., Gehler, P., Schiele, B.: 2D human pose estimation: new benchmark and state of the art analysis. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3686–3693. IEEE (2014)

    Google Scholar 

  3. Bourdev, L., Malik, J.: Poselets: body part detectors trained using 3D human pose annotations. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 1365–1372. IEEE (2009)

    Google Scholar 

  4. Chen, X., Yuille, A.L.: Articulated pose estimation by a graphical model with image dependent pairwise relations. In: Advances in Neural Information Processing Systems, pp. 1736–1744 (2014)

    Google Scholar 

  5. Dantone, M., Gall, J., Leistner, C., Van Gool, L.: Human pose estimation using body parts dependent joint regressors. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3041–3048. IEEE (2013)

    Google Scholar 

  6. Felzenszwalb, P.F., Huttenlocher, D.P.: Pictorial structures for object recognition. Int. J. Comput. Vis. 61, 55–79 (2005)

    Article  Google Scholar 

  7. Gkioxari, G., Girshick, R., Malik, J.: Actions and attributes from wholes and parts. arXiv preprint arXiv:1412.2604 (2014)

  8. Gkioxari, G., Hariharan, B., Girshick, R., Malik, J.: Using k-poselets for detecting people and localizing their keypoints. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3582–3589. IEEE (2014)

    Google Scholar 

  9. Gkioxari, G., Hariharan, B., Girshick, R., Malik, J.: R-CNNs for pose estimation and action detection. arXiv preprint arXiv:1406.5212 (2014)

  10. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 580–587. IEEE (2014)

    Google Scholar 

  11. Girshick, R.: Fast R-CNN. arXiv preprint arXiv:1504.08083 (2015)

  12. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8691, pp. 346–361. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10578-9_23

    Google Scholar 

  13. Johnson, S., Everingham, M.: Clustered pose and nonlinear appearance models for human pose estimation. In: BMVC, vol. 2, p. 5 (2010)

    Google Scholar 

  14. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp. 1097–1105 (2012)

    Google Scholar 

  15. Pfister, T., Charles, J., Zisserman, A.: Flowing convNets for human pose estimation in videos. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1913–1921 (2015)

    Google Scholar 

  16. Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.: Poselet conditioned pictorial structures. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 588–595 (2013)

    Google Scholar 

  17. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)

    Google Scholar 

  18. Rogez, G., Rihan, J., Ramalingam, S., Orrite, C., Torr, P.H.: Randomized trees for human pose detection. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008, pp. 1–8. IEEE (2008)

    Google Scholar 

  19. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014)

    Google Scholar 

  20. Toshev, A., Szegedy, C.: DeepPose: human pose estimation via deep neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1653–1660. IEEE (2014)

    Google Scholar 

  21. Uijlings, J.R., Sande, K.E., Gevers, T., Smeulders, A.W.: Selective search for object recognition. Int. J. Comput. Vis. 104, 154–171 (2013)

    Article  Google Scholar 

  22. Yang, Y., Ramanan, D.: Articulated human detection with flexible mixtures of parts. IEEE Trans. Pattern Anal. Mach. Intell. 35, 2878–2890 (2013)

    Article  Google Scholar 

Download references

Acknowledgement

We gratefully acknowledge that this work is supported by the fundings from National Natural Science Foundation of China (61273285, 61375019).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Xu Zhao .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Wei, K., Zhao, X. (2017). Multiple-Branches Faster RCNN for Human Parts Detection and Pose Estimation. In: Chen, CS., Lu, J., Ma, KK. (eds) Computer Vision – ACCV 2016 Workshops. ACCV 2016. Lecture Notes in Computer Science(), vol 10118. Springer, Cham. https://doi.org/10.1007/978-3-319-54526-4_33

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-54526-4_33

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54525-7

  • Online ISBN: 978-3-319-54526-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics