Abstract
Deep hand pose estimation from a single depth image plays a significant role in human-computer interaction. This paper proposes a novel method based on a multiple transfer net that estimates hand pose using only single-channel depth images. A channel-extending process is applied to the original single-channel depth image to extend the hand and palm regions, so that the input matches the format of a pre-trained network and the network's parameters can be fully utilized. A multiple transfer network refinement of the preceding convolutional neural network is then made to obtain a variety of feature maps. A region ensemble is also used to merge all output feature maps and integrate the results. Experimental results demonstrate that the proposed method achieves considerable accuracy, outperforming state-of-the-art results on the NYU [1] and ICVL [2] datasets.
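The channel-extending step can be illustrated with a minimal sketch. It assumes the three channels are the full depth map, the depth restricted to a hand mask, and the depth restricted to a palm mask; the abstract does not specify this layout, and the function name `extend_channels` and both masks are hypothetical.

```python
import numpy as np


def extend_channels(depth, hand_mask, palm_mask):
    """Stack a single-channel depth map into an (H, W, 3) array.

    Channel 0: the full depth map.
    Channel 1: depth inside the hand mask, zero elsewhere.
    Channel 2: depth inside the palm mask, zero elsewhere.
    This channel layout is an illustrative assumption, not the
    paper's exact recipe.
    """
    depth = np.asarray(depth, dtype=np.float32)
    hand = np.where(hand_mask, depth, 0.0).astype(np.float32)
    palm = np.where(palm_mask, depth, 0.0).astype(np.float32)
    # A 3-channel layout matches the input format that networks
    # pre-trained on RGB images expect.
    return np.stack([depth, hand, palm], axis=-1)


# Toy 4x4 depth map with hypothetical hand and palm masks.
depth = np.arange(16, dtype=np.float32).reshape(4, 4)
hand_mask = depth >= 4    # assumed hand region
palm_mask = depth >= 10   # assumed palm region (a subset of the hand)
image = extend_channels(depth, hand_mask, palm_mask)
print(image.shape)
```

In practice the masks would come from a hand segmentation step, and the stacked array would be resized and normalized to whatever the chosen pre-trained network requires.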
References
Tompson, J., Stein, M., Lecun, Y., Perlin, K.: Real-time continuous pose recovery of human hands using convolutional networks. ACM Trans. Graph. (TOG) 33, 1935–1946 (2014)
Tang, D., Chang, H.J., Tejani, A., Kim, T.K.: Latent regression forest: structured estimation of 3D articulated hand posture. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3786–3793. IEEE (2014)
Supancic, J., Rogez, G., Yang, Y., Shotton, J., Ramanan, D.: Depth-based hand pose estimation: data, methods, and challenges. In: International Conference on Computer Vision (ICCV). IEEE (2015)
Zhang, Y., Xu, C., Cheng, L.: Learning to search on manifolds for 3D pose estimation of articulated objects. In: arXiv preprint arXiv (2016)
Qian, C., Sun, X., Wei, Y., Tang, X., Sun, J.: Realtime and robust hand tracking from depth. In: Computer Vision and Pattern Recognition (CVPR), pp. 1106–1113. IEEE (2014)
Makris, A., Kyriazis, N., Argyros, A.A.: Hierarchical particle filtering for 3D hand tracking. In: Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 8–17. IEEE (2015)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 38(1), 142–158 (2016)
Chen, L., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. In: arXiv preprint arXiv (2016)
Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Carreira, J., Agrawal, P., Fragkiadaki, K., Malik, J.: Human pose estimation with iterative error feedback. In: Computer Vision and Pattern Recognition (CVPR), pp. 4733–4742. IEEE (2016)
Haque, A., Peng, B., Luo, Z., Alahi, A., Yeung, S., Fei-Fei, L.: Towards viewpoint invariant 3D human pose estimation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 160–177. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_10
He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8691, pp. 346–361. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10578-9_23
Hengkai, G., Guijin, W., Xinghao, C., Cairong, Z., Fei, Q., Huazhong, Y.: Region ensemble network: improving convolutional network for hand pose estimation. In: International Conference on Image Processing (ICIP). IEEE (2017)
Xinghao, C., Guijin, W., Hengkai, G., Cairong, Z.: Pose guided structured region ensemble network for cascaded hand pose estimation. In: arXiv preprint arXiv (2017)
Bar, Y., Diamant, I., Greenspan, H., Wolf, L.: Chest pathology detection using deep learning with non-medical training. In: Biomedical Imaging (ISBI), vol. 13. IEEE (2015)
Maxime, O., Leon, B., Ivan, L., Josef, S.: Learning and transferring mid-level image representations using convolutional neural networks. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1717–1724 (2014)
Andrej, K., George, T., Sanketh, S., Thomas, L., Rahul, S., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1725–1732 (2014)
Alexandre, L.A.: 3D object recognition using convolutional neural networks with transfer learning between input channels. In: Menegatti, E., Michael, N., Berns, K., Yamaguchi, H. (eds.) Intelligent Autonomous Systems 13. AISC, vol. 302, pp. 889–898. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-08338-4_64
Li, P., Ling, H., Li, X., Liao, C.: 3D hand pose estimation using randomized decision forest with segmentation index points. In: International Conference on Computer Vision (ICCV), pp. 819–827. IEEE (2015)
Taylor, J., Shotton, J., Sharp, T., Fitzgibbon, A.: The vitruvian manifold: inferring dense correspondences for one-shot human pose estimation. In: Computer Vision and Pattern Recognition (CVPR). IEEE (2012)
Madadi, M., Escalera, S., Baro, X., Gonzalez, J.: End-to-end global to local CNN learning for hand pose recovery in depth data. In: arXiv Preprint (2017)
Markus, O., Vincent, L.: DeepPrior++: improving fast and accurate 3D hand pose estimation. In: International Conference on Computer Vision (ICCV) Workshops. IEEE (2017)
Sun, X., Wei, Y., Liang, S., Tang, X., Sun, J.: Cascaded hand pose regression. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 824–832. IEEE (2015)
Oberweger, M., Wohlhart, P., Lepetit, V.: Hands deep in deep learning for hand pose estimation. In: Computer Vision Winter Workshop (CVWW), pp. 21–30 (2015)
Deng, X., Yang, S., Zhang, Y., Tan, P., Chang, L., Wang, H.: Hand3D: hand pose estimation using 3D neural network. In: arXiv Preprint (2017)
Wan, C., Yao, A., Van Gool, L.: Hand pose estimation from local surface normals. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 554–569. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_34
Zhou, X., Wan, Q., Zhang, W., Xue, X., Wei, Y.: Model-based deep hand pose estimation. In: International Joint Conference on Artificial Intelligence (IJCAI) (2016)
Wan, C., Probst, T., Van Gool, L., Yao, A.: Crossing nets: dual generative models with a shared latent space for hand pose estimation. In: Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2017)
Fourure, D., Emonet, R., Fromont, E., Muselet, D., Neverova, N., Tremeau, A., Wolf, C.: Multi-task, multi-domain learning: application to semantic segmentation and pose regression. Neurocomputing 1(251), 68–80 (2017)
Krejov, P., Gilbert, A., Bowden, R.: Guided optimisation through classification and regression for hand pose estimation. Comput. Vis. Image Underst. 155(2), 124–138 (2016)
Oberweger, M., Wohlhart, P., Lepetit, V.: Training a feedback loop for hand pose estimation. In: International Conference on Computer Vision (ICCV). IEEE (2015)
Xu, C., Govindarajan, L., Zhang, Y., Cheng, L.: Lie-X: depth image based articulated object pose estimation, tracking, and action recognition on lie groups. In: International Journal of Computer Vision (IJCV) (2016)
Neverova, N., Wolf, C., Nebout, F., Taylor, G.: Hand pose estimation through semi-supervised and weakly-supervised learning. In: arXiv Preprint (2015)
Tompson, J., Stein, M., Lecun, Y., Perlin, K.: Real-time continuous pose recovery of human hands using convolutional networks. ACM Trans. Graph. 33, 169 (2014)
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, H., Li, D., Wang, X. (2018). Mutiple Transfer Net with Region Ensemble for Deep Hand Pose Estimation. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11164. Springer, Cham. https://doi.org/10.1007/978-3-030-00776-8_58
DOI: https://doi.org/10.1007/978-3-030-00776-8_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00775-1
Online ISBN: 978-3-030-00776-8
eBook Packages: Computer Science, Computer Science (R0)