Abstract
Differences in orientation during manual assembly have a significant impact on assembly accuracy. Reliable posture estimation in turn depends on accurate six degree-of-freedom (6DoF) position and orientation (pose) estimation of the tracked objects. In this paper, we present an improved EfficientPose algorithm, a single-shot, learning-based approach to hand and object pose estimation. Starting from the original EfficientPose algorithm, we add a subnetwork for hand prediction, replace some MBConv modules with Fused-MBConv modules, modify the number of network layers, and use different training strategies. Experimental results on a public dataset for monocular red-green-blue (RGB) 6DoF marker-less hand and surgical instrument pose tracking show that the improved algorithm achieves better performance and a shorter training time than other methods.
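To make the term 6DoF pose concrete, the following minimal sketch shows how a pose made of three rotational and three translational degrees of freedom maps a point from object coordinates to camera coordinates. The axis-angle parameterisation and the function names `rotate_axis_angle` and `apply_pose` are illustrative assumptions, not the implementation used in the paper.

```python
import math


def rotate_axis_angle(point, axis, angle):
    """Rotate a 3D point about an axis by `angle` radians (Rodrigues' formula)."""
    px, py, pz = point
    # Normalise the axis so the caller does not have to pass a unit vector.
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    # Cross product k x p.
    cx = ky * pz - kz * py
    cy = kz * px - kx * pz
    cz = kx * py - ky * px
    # Dot product k . p.
    dot = kx * px + ky * py + kz * pz
    return (
        px * cos_t + cx * sin_t + kx * dot * (1.0 - cos_t),
        py * cos_t + cy * sin_t + ky * dot * (1.0 - cos_t),
        pz * cos_t + cz * sin_t + kz * dot * (1.0 - cos_t),
    )


def apply_pose(point, axis, angle, translation):
    """Apply a 6DoF pose (hypothetical helper): rotate the point, then translate it."""
    rx, ry, rz = rotate_axis_angle(point, axis, angle)
    tx, ty, tz = translation
    return (rx + tx, ry + ty, rz + tz)


# Example: a point on the object's x-axis, rotated 90 degrees about z
# and moved 0.5 units along the camera's z-axis.
camera_point = apply_pose((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2, (0.0, 0.0, 0.5))
```

A pose estimator of the kind described in the abstract would predict the rotation and translation parameters from an RGB image; here they are simply passed in by hand.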
Acknowledgement
This work was supported by the National Natural Science Foundation of China (Nos. 62072002 and 62172004) and the Special Fund for Anhui Agriculture Research System.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Niu, Z., Xia, Y., Zhang, J., Wang, B., Chen, P. (2023). Improved Deep Learning-Based Efficientpose Algorithm for Egocentric Marker-Less Tool and Hand Pose Estimation in Manual Assembly. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14090. Springer, Singapore. https://doi.org/10.1007/978-981-99-4761-4_25
DOI: https://doi.org/10.1007/978-981-99-4761-4_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4760-7
Online ISBN: 978-981-99-4761-4
eBook Packages: Computer Science, Computer Science (R0)