DOI:10.1145/3663976.3664227
research-article

Real-Time 3D Skeleton Reconstruction: A Comprehensive Approach from Multiple Views

Published: 27 June 2024

Abstract

This paper presents a comprehensive method for real-time 3D human skeleton reconstruction from calibrated camera sets, addressing challenges in scenes with multiple individuals. Accurate 3D pose estimation is crucial for various applications such as 3D model animation, augmented reality, and human-computer interaction. The approach involves initial 2D skeleton estimation, followed by skeleton identification through a matching algorithm and reconstruction via triangulation. Three key contributions are made: refining the matching algorithm using 3D reconstruction reprojection, accelerating execution with skeleton tracking, and validating the method on a diverse dataset with over 9,000 frames. The method achieves accurate 3D reconstruction and robust performance in multi-individual scenarios, making it suitable for real-world applications. Project page: https://instant-skeleton.github.io.
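
The triangulation and reprojection steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard linear (DLT) triangulation of a single joint, and the camera intrinsics `K`, the projection matrices `P1` and `P2`, and the test joint are hypothetical values chosen only for illustration. The reprojection error it computes is the kind of quantity a matching step could compare against a threshold to accept or reject a candidate association between views.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X (3,) to pixel coordinates with a 3x4 matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate_dlt(proj_mats, points_2d):
    """Linear (DLT) triangulation of one joint from N >= 2 calibrated views."""
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def reprojection_error(P, X, point_2d):
    """Pixel distance between an observed 2D joint and the reprojected 3D joint."""
    return float(np.linalg.norm(project(P, X) - np.asarray(point_2d)))

# Hypothetical two-camera rig: shared intrinsics, second camera shifted along x.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

joint = np.array([0.2, -0.1, 4.0])              # assumed ground-truth joint
obs = [project(P1, joint), project(P2, joint)]  # noise-free 2D detections
estimate = triangulate_dlt([P1, P2], obs)
errors = [reprojection_error(P, estimate, o) for P, o in zip([P1, P2], obs)]
```

With noise-free detections the estimate coincides with the original joint up to numerical precision. With real 2D detections the reprojection errors are nonzero, and a large error in one view suggests that the 2D skeleton from that view belongs to a different person.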

    Published In

    CVIPPR '24: Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition
    April 2024
    373 pages
    ISBN:9798400716607
    DOI:10.1145/3663976

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. 3D Human Pose Estimation
    2. Multi-view Image Processing
    3. Real-time 3D Skeleton Reconstruction
    4. Temporal Coherence in 3D Reconstruction

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CVIPPR 2024

    Acceptance Rates

    Overall Acceptance Rate 14 of 38 submissions, 37%
