A Self-supervised Pose Estimation Approach for Construction Machines

Alshubbak, Ala’a; Görges, Daniel

doi:10.1007/978-3-031-47966-3_31

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14362)

Included in the following conference series:

International Symposium on Visual Computing

Abstract

Pose estimation is a computer vision task that estimates the skeleton of a dynamic system, which can then be used to predict its future movements. Most research in this direction relies on supervised learning, which requires a massive amount of labeled data. In this paper, a self-supervised three-stage model based on contrastive learning is introduced for estimating the skeleton of dynamic construction machines, such as excavators, without using any labeled images in the first stage. The model is divided into three stages: a pre-training stage using the SimCLR contrastive approach, followed by two fine-tuning stages for transfer learning and the downstream task. The model learns features from a large unlabeled dataset called ACID and transfers them to two small datasets generated with the NVIDIA Isaac and MATLAB Simscape simulators, as well as to a smaller dataset amounting to 3.5% of the original ACID dataset. The results show that the proposed approach improves the accuracy of pose estimation for heavy construction machines in real images by 11% and 13% compared with the standard self-supervised approach for the ResNet-50 and HRNet-W32 backbones, respectively.
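The pre-training stage described above uses SimCLR, whose published objective is the normalized temperature-scaled cross-entropy (NT-Xent) loss. The following is a minimal pure-Python sketch of that loss, not the authors' implementation. The function names, the pairing convention (rows 2k and 2k+1 are two augmented views of the same image), and the temperature of 0.5 are assumptions chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(embeddings, temperature=0.5):
    """NT-Xent loss over 2N embeddings.

    Assumed layout (hypothetical convention): rows 2k and 2k+1 hold the
    embeddings of two augmented views of image k, so they form a positive pair.
    Every other row in the batch acts as a negative.
    """
    n = len(embeddings)
    if n < 2 or n % 2 != 0:
        raise ValueError("expected an even number of embeddings (two views per image)")

    total = 0.0
    for i in range(n):
        j = i ^ 1  # index of the other view of the same image
        positive = cosine_similarity(embeddings[i], embeddings[j]) / temperature
        # Scaled similarities to every other sample (positive and negatives).
        logits = [
            cosine_similarity(embeddings[i], embeddings[k]) / temperature
            for k in range(n)
            if k != i
        ]
        # Log-sum-exp with max subtraction for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - positive
    return total / n


if __name__ == "__main__":
    # Two images, two views each; views of the same image point the same way.
    aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    # Same vectors, but paired so that the "views" disagree.
    shuffled = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    print(nt_xent_loss(aligned), nt_xent_loss(shuffled))
```

The loss is lower when the two views of each image map to nearby embeddings than when they disagree, which is the signal that lets the first stage learn features without labels.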

Author information

Authors and Affiliations

Institute of Electromobility, University of Kaiserslautern-Landau, Kaiserslautern, Germany
Ala’a Alshubbak & Daniel Görges
German Jordanian University, Amman, Jordan
Ala’a Alshubbak

Corresponding author

Correspondence to Ala’a Alshubbak.

Editor information

Editors and Affiliations

University of Nevada Reno, Reno, NV, USA
George Bebis
Google Research, Mountain View, CA, USA
Golnaz Ghiasi
New York University, New York, USA
Yi Fang
Ben-Gurion University, Be'er Sheva, Israel
Andrei Sharf
Microsoft Research, Beijing, China
Yue Dong
The University of Oklahoma, Norman, OK, USA
Chris Weaver
University of Maryland, College Park, MD, USA
Zhicheng Leo
University of Central Florida, Orlando, FL, USA
Joseph J. LaViola Jr.
InnerOptic Technology, Hillsborough, NC, USA
Luv Kohli

About this paper

Cite this paper

Alshubbak, A., Görges, D. (2023). A Self-supervised Pose Estimation Approach for Construction Machines. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2023. Lecture Notes in Computer Science, vol 14362. Springer, Cham. https://doi.org/10.1007/978-3-031-47966-3_31

DOI: https://doi.org/10.1007/978-3-031-47966-3_31
Published: 03 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47965-6
Online ISBN: 978-3-031-47966-3
eBook Packages: Computer Science, Computer Science (R0)
