
Tracking a hand in interaction with an object based on single depth images

Multimedia Tools and Applications

Abstract

Vision-based tracking of a hand interacting with an object is a challenging research topic, because the occlusions that arise during hand-object interaction make it difficult to build an effective tracking system. To reduce the impact of occlusions, we build 3D models of both the hand and the manipulated object and propose a model-based method that tracks the hand and the object simultaneously from single depth images. The most likely hand-object state is searched for in the high-dimensional hand-object space by an improved particle filtering (PF) algorithm, which uses the Gaussian particle swarm optimization (Gaussian PSO) algorithm to improve particle sampling by moving the particles toward regions of higher likelihood. Based on the proposed tracking algorithm, two kinds of hand-object tracking prototype systems are developed using the graphics rendering engine OSG and off-screen rendering techniques. Experimental results demonstrate that the proposed method can track hand-object motion robustly with few particles.
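The idea of refining particle-filter samples with a swarm step can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a low-dimensional toy state, replaces the depth-image comparison against rendered hand and object models with a hypothetical Gaussian `log_likelihood`, and uses a Gaussian-swarm style update (absolute values of standard normal draws as attraction coefficients). All function names and parameters (`gaussian_pso_refine`, `track_step`, `sigma`, `motion_noise`, `iterations`) are assumptions made for illustration.

```python
import math
import random


def log_likelihood(state, observation, sigma=0.1):
    """Toy stand-in for the depth-image likelihood (assumption):
    an isotropic Gaussian centred on the observed state."""
    sq = sum((s - o) ** 2 for s, o in zip(state, observation))
    return -sq / (2.0 * sigma ** 2)


def gaussian_pso_refine(particles, observation, iterations=10, rng=random):
    """Move particles toward higher-likelihood regions.

    Each coordinate is pulled toward the particle's own best position and
    the swarm's best position, with |N(0, 1)| draws as coefficients.
    Returns the best position found by each particle and its score.
    """
    swarm = [list(p) for p in particles]
    pbest = [list(p) for p in particles]
    pbest_score = [log_likelihood(p, observation) for p in particles]
    g = max(range(len(swarm)), key=lambda i: pbest_score[i])
    gbest, gbest_score = list(pbest[g]), pbest_score[g]

    for _ in range(iterations):
        for i, x in enumerate(swarm):
            for d in range(len(x)):
                c1 = abs(rng.gauss(0.0, 1.0))
                c2 = abs(rng.gauss(0.0, 1.0))
                x[d] += c1 * (pbest[i][d] - x[d]) + c2 * (gbest[d] - x[d])
            score = log_likelihood(x, observation)
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = list(x), score
                if score > gbest_score:
                    gbest, gbest_score = list(x), score
    return pbest, pbest_score


def track_step(particles, observation, motion_noise=0.05, rng=random):
    """One frame: predict, refine with the swarm step, weight, resample."""
    # Predict: diffuse each particle with Gaussian noise (assumed motion model).
    predicted = [[v + rng.gauss(0.0, motion_noise) for v in p] for p in particles]

    # Refine: shift samples toward regions of higher likelihood.
    refined, scores = gaussian_pso_refine(predicted, observation, rng=rng)

    # Weight: normalise likelihoods, subtracting the maximum for stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]

    # Estimate: weighted mean of the refined particles.
    dim = len(refined[0])
    estimate = [sum(w * p[d] for w, p in zip(weights, refined)) for d in range(dim)]

    # Resample: draw a new particle set in proportion to the weights.
    drawn = rng.choices(refined, weights=weights, k=len(refined))
    return [list(p) for p in drawn], estimate
```

A caller would invoke `track_step` once per frame, feeding the returned particle set into the next call. In a real tracker the likelihood would come from comparing the observed depth image with a rendering of the hypothesised hand-object state, which is the expensive part and the reason a small particle count matters.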



Acknowledgements

This work was funded by the National Natural Science Foundation of China (Nos. 51475251 and 51705273).

Author information

Corresponding author

Correspondence to Chengjun Chen.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, D., Chen, C. Tracking a hand in interaction with an object based on single depth images. Multimed Tools Appl 78, 6745–6762 (2019). https://doi.org/10.1007/s11042-018-6452-0

  • DOI: https://doi.org/10.1007/s11042-018-6452-0
