Improved Deep Learning-Based Efficientpose Algorithm for Egocentric Marker-Less Tool and Hand Pose Estimation in Manual Assembly

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14090)

Abstract

Orientation during manual assembly has a significant impact on assembly accuracy. The success and confidence of posture estimation depend on accurate six-degree-of-freedom (6DoF) position and orientation (pose) estimation of the tracked objects. In this paper, we present an improved EfficientPose algorithm, a single-shot, learning-based approach to hand and object pose estimation. Starting from the original EfficientPose algorithm, we add a subnetwork for hand prediction, replace some MBConv modules with Fused-MBConv modules, modify the number of network layers, and use different training strategies. Experimental results on a public dataset for monocular red-green-blue (RGB) 6DoF marker-less hand and surgical instrument pose tracking show that the improved algorithm achieves better performance and shorter training time than the compared methods.
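
To make the notion of a 6DoF pose concrete, the following minimal sketch shows how a pose (three rotational plus three translational degrees of freedom) maps a point from an object's own coordinate frame into the camera frame. It assumes an axis-angle rotation parameterization converted with the Rodrigues formula; the abstract does not state which parameterization the authors use, and the function names `rodrigues` and `apply_pose` are hypothetical, introduced here only for illustration.

```python
import math


def rodrigues(axis_angle):
    """Convert an axis-angle vector (radians) into a 3x3 rotation matrix.

    The vector's direction is the rotation axis and its length is the angle.
    This parameterization is an assumption made for this illustration.
    """
    rx, ry, rz = axis_angle
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    if theta < 1e-12:
        # No rotation: return the identity matrix.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = rx / theta, ry / theta, rz / theta
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def apply_pose(point, axis_angle, translation):
    """Map a 3D point from the object frame to the camera frame: R @ p + t."""
    rot = rodrigues(axis_angle)
    return tuple(
        sum(rot[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Example: rotate 90 degrees about the z-axis, then shift 0.5 units along z.
moved = apply_pose((1.0, 0.0, 0.0), (0.0, 0.0, math.pi / 2), (0.0, 0.0, 0.5))
```

In this example the point (1, 0, 0) is carried to approximately (0, 1, 0.5). A pose estimator of the kind described in the abstract would predict the six numbers passed as `axis_angle` and `translation` for each tracked hand or tool.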

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Nos. 62072002 and 62172004) and the Special Fund for Anhui Agriculture Research System.

Corresponding author

Correspondence to Peng Chen.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Niu, Z., Xia, Y., Zhang, J., Wang, B., Chen, P. (2023). Improved Deep Learning-Based Efficientpose Algorithm for Egocentric Marker-Less Tool and Hand Pose Estimation in Manual Assembly. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14090. Springer, Singapore. https://doi.org/10.1007/978-981-99-4761-4_25

  • DOI: https://doi.org/10.1007/978-981-99-4761-4_25

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4760-7

  • Online ISBN: 978-981-99-4761-4

  • eBook Packages: Computer Science, Computer Science (R0)