
A Self-supervised Pose Estimation Approach for Construction Machines

  • Conference paper
Advances in Visual Computing (ISVC 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14362)


Abstract

Pose estimation is a computer vision task that estimates the skeleton of a dynamic system in order to predict its future movements. Most research in this direction follows a supervised learning approach, which requires massive amounts of labeled data. In this paper, a self-supervised three-stage model based on contrastive learning is introduced for estimating the skeleton of dynamic construction machines, such as excavators, without using any labeled images in the first stage. The model is structured in three stages: a pre-training stage using the SimCLR contrastive approach, followed by two fine-tuning stages for transfer learning and the downstream task. The model leverages features learned from a large unlabeled dataset, ACID, transfers them to two small datasets generated with the NVIDIA Isaac and MATLAB Simscape simulators, and further transfers the knowledge to a smaller labeled subset amounting to 3.5% of the original ACID dataset. The results show that the proposed approach improves the accuracy of pose estimation for heavy construction machines in real images by 11% and 13% compared to the plain self-supervised approach with the ResNet-50 and HRNet-W32 backbones, respectively.
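
As a rough illustration only (the paper's own implementation is not reproduced here), the sketch below shows what a SimCLR-style contrastive pre-training stage followed by a pose-regression fine-tuning stage might look like in PyTorch with a ResNet-50 backbone. Every name in it (SimCLREncoder, nt_xent_loss, KeypointHead, the number of keypoints, and the random stand-in batches) is an assumption made for illustration, not the authors' code.

```python
# Minimal, illustrative sketch (not the authors' implementation) of SimCLR-style
# pre-training followed by keypoint-regression fine-tuning, assuming PyTorch,
# torchvision, and a ResNet-50 backbone.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class SimCLREncoder(nn.Module):
    """ResNet-50 backbone plus a small projection head for contrastive pre-training."""

    def __init__(self, proj_dim: int = 128):
        super().__init__()
        backbone = models.resnet50(weights=None)
        feat_dim = backbone.fc.in_features            # 2048 for ResNet-50
        backbone.fc = nn.Identity()                   # keep pooled features, drop the classifier
        self.backbone = backbone
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(inplace=True), nn.Linear(512, proj_dim)
        )

    def forward(self, x):
        h = self.backbone(x)                          # representation reused during fine-tuning
        z = F.normalize(self.projector(h), dim=1)     # embedding fed to the contrastive loss
        return h, z


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss (SimCLR) for two augmented views of the same image batch."""
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                    # (2n, d), already L2-normalized
    sim = z @ z.t() / temperature                     # scaled cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, -1e9)                 # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)              # positive pair = the other view


class KeypointHead(nn.Module):
    """Toy regression head mapping pooled features to K machine keypoints (x, y)."""

    def __init__(self, feat_dim=2048, num_keypoints=6):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_keypoints * 2)

    def forward(self, h):
        return self.fc(h).view(h.size(0), -1, 2)


# Stage 1: contrastive pre-training on unlabeled images (two augmented views per image).
encoder = SimCLREncoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
view1, view2 = torch.randn(8, 3, 224, 224), torch.randn(8, 3, 224, 224)  # stand-in batches
_, z1 = encoder(view1)
_, z2 = encoder(view2)
nt_xent_loss(z1, z2).backward()
opt.step()

# Stages 2 and 3: attach a pose head and fine-tune, first on the small simulated
# datasets, then on the small labeled subset, with an ordinary supervised keypoint loss.
head = KeypointHead()
images, keypoints = torch.randn(4, 3, 224, 224), torch.randn(4, 6, 2)    # stand-in labels
h, _ = encoder(images)
F.mse_loss(head(h), keypoints).backward()
```

In the paper itself the backbone is either ResNet-50 or HRNet-W32 and the downstream stage estimates the full machine skeleton; the regression head and loss above are placeholders for whichever pose head is actually used.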



Author information

Corresponding author

Correspondence to Ala’a Alshubbak.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Alshubbak, A., Görges, D. (2023). A Self-supervised Pose Estimation Approach for Construction Machines. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2023. Lecture Notes in Computer Science, vol 14362. Springer, Cham. https://doi.org/10.1007/978-3-031-47966-3_31

  • DOI: https://doi.org/10.1007/978-3-031-47966-3_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-47965-6

  • Online ISBN: 978-3-031-47966-3

  • eBook Packages: Computer Science, Computer Science (R0)
