Abstract
Vision-based tracking of a hand interacting with an object is a challenging research topic, because the occlusions that occur during hand-object interaction make it difficult to build an effective tracking system. To reduce the impact of occlusions, we build 3D models of both the hand and the manipulated object and propose a model-based method that tracks the hand and the object simultaneously from single depth images. The most likely hand-object state is searched for in the high-dimensional hand-object space by an improved particle filtering (PF) algorithm, which uses Gaussian particle swarm optimization (Gaussian PSO) to improve particle sampling by moving the particles toward regions of higher likelihood. Based on the proposed algorithm, two hand-object tracking prototype systems are developed using the graphics rendering engine OpenSceneGraph (OSG) and off-screen rendering techniques. Experimental results demonstrate that the proposed method tracks hand-object motion robustly with few particles.
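The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the state is a toy low-dimensional vector rather than a full hand-object pose, the `likelihood` function is a hypothetical Gaussian stand-in for comparing a rendered model with the observed depth image, and the function names, noise levels, and iteration counts are assumptions chosen for illustration. The velocity update follows the commonly cited Gaussian swarm form, in which the acceleration coefficients are absolute values of standard normal draws.

```python
import math
import random


def likelihood(state, observation, sigma=0.5):
    """Toy Gaussian likelihood (assumption): a stand-in for a depth-image comparison."""
    d2 = sum((s - o) ** 2 for s, o in zip(state, observation))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def gaussian_pso_refine(particles, observation, iterations=10, rng=random):
    """Move particles toward higher-likelihood regions with Gaussian PSO updates."""
    pbest = [list(p) for p in particles]
    pbest_fit = [likelihood(p, observation) for p in particles]
    g = max(range(len(particles)), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = list(pbest[g]), pbest_fit[g]
    for _ in range(iterations):
        for i, p in enumerate(particles):
            for d in range(len(p)):
                # Coefficients are |N(0, 1)| draws; no inertia term is used.
                c1 = abs(rng.gauss(0.0, 1.0))
                c2 = abs(rng.gauss(0.0, 1.0))
                p[d] += c1 * (pbest[i][d] - p[d]) + c2 * (gbest[d] - p[d])
            fit = likelihood(p, observation)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = list(p), fit
                if fit > gbest_fit:
                    gbest, gbest_fit = list(p), fit
    return particles, gbest


def pf_step(particles, observation, noise=0.3, rng=random):
    """One filtering step: diffuse, refine with Gaussian PSO, weight, resample."""
    predicted = [[x + rng.gauss(0.0, noise) for x in p] for p in particles]
    refined, best = gaussian_pso_refine(predicted, observation, rng=rng)
    weights = [likelihood(p, observation) for p in refined]
    if sum(weights) <= 0.0:
        # Fall back to uniform weights if every likelihood underflowed to zero.
        weights = [1.0] * len(refined)
    resampled = rng.choices(refined, weights=weights, k=len(refined))
    return [list(p) for p in resampled], best
```

In a real tracker, `likelihood` would render the hand and object models at the hypothesised pose and score the result against the observed depth image. Because the swarm step concentrates particles near likelihood peaks before weighting, a filter of this kind can plausibly work with a small particle set.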









Acknowledgements
This work was funded by the National Natural Science Foundation of China (Grant Nos. 51475251 and 51705273).
Cite this article
Li, D., Chen, C. Tracking a hand in interaction with an object based on single depth images. Multimed Tools Appl 78, 6745–6762 (2019). https://doi.org/10.1007/s11042-018-6452-0
DOI: https://doi.org/10.1007/s11042-018-6452-0