Abstract
A three-dimensional virtual object can be manipulated by hand and finger movements using an optical hand tracking device that recognizes hand posture. Many conventional hand posture recognition methods rely on the three-dimensional coordinates of the fingertips and a skeletal model of the hand. These methods have difficulty estimating hand posture when a fingertip is hidden from the optical camera, and self-occlusion often hides fingertips. Our study therefore proposes estimating hand posture from a hand dorsal image, which can be captured even when the hand occludes its own fingertips. Manipulating a virtual object requires recognizing movements such as pinching, and many such movements can be recognized from the distance between the fingertips of the thumb and forefinger. We therefore use a regression model, constructed with convolutional neural networks (CNNs), to estimate the distance between the fingertips of the thumb and forefinger from hand dorsal images. Our study proposes Silhouette and Texture methods for this estimation and combines them into two further methods: the Clipping method and the Aggregation method. The Root Mean Squared Error (RMSE) of the Aggregation method's distance estimates was 1.98 mm or less on hand dorsal images that do not contain any fingertips, smaller than the RMSE of the other methods. This result shows that the proposed Aggregation method can be an effective approach that is robust to self-occlusion.
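The abstract evaluates the regression models by the Root Mean Squared Error (RMSE) of the estimated fingertip distances, reported in millimetres. As a minimal sketch, RMSE over a batch of distance estimates can be computed as follows; the sample prediction and ground-truth values below are hypothetical, not the paper's data.

```python
import math


def rmse(predicted, actual):
    """Root Mean Squared Error between predicted and ground-truth
    fingertip distances (both in mm)."""
    assert len(predicted) == len(actual) and len(actual) > 0
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical thumb-forefinger distances (mm) for four test images.
pred = [10.2, 25.1, 4.8, 31.0]
true = [11.0, 24.0, 5.5, 30.2]
print(f"RMSE: {rmse(pred, true):.3f} mm")
```

An RMSE at or below the paper's reported 1.98 mm would indicate accuracy comparable to the Aggregation method on fingertip-free dorsal images.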
Shimizume, T., Umezawa, T., Osawa, N. (2019). Estimation of the Distance Between Fingertips Using Silhouette and Texture Information of Dorsal of Hand. In: Bebis, G., et al. (eds.) Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol. 11844. Springer, Cham. https://doi.org/10.1007/978-3-030-33720-9_36