DOI: 10.1145/3581783.3612663
demonstration

HumVis: Human-Centric Visual Analysis System

Published: 27 October 2023

Abstract

Human-centric visual analysis is a fundamental task for many multimedia and computer vision applications, such as self-driving, multimedia retrieval, and augmented reality. Building on our recent research on fine-grained human visual analysis, we develop a robust and efficient human-centric visual analysis system named HumVis. HumVis is built on a simple yet efficient contextual instance decoupling (CID) module, which separates the different persons in an input image and outputs the corresponding person structure information for visual analysis. Based on CID, HumVis performs accurate multi-person pose estimation, multi-person foreground segmentation, multi-person part segmentation, and 3D human mesh recovery for user-uploaded images and videos, and supports live stream presentation.
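
The abstract does not spell out how CID separates persons, so the following is only a toy sketch of the general idea of instance decoupling, not the HumVis implementation. It assumes a shared feature map and known person centers, and it uses a simple sigmoid channel gate as a stand-in for whatever re-weighting the actual module applies. The names `sample_center_feature` and `decouple_instances` are hypothetical.

```python
import math


def sample_center_feature(features, center):
    """Read the C-dimensional feature vector at a person's center (row, col)."""
    row, col = center
    return [channel[row][col] for channel in features]


def decouple_instances(features, centers):
    """Return one re-weighted copy of the shared feature map per person.

    features: C x H x W nested lists (shared backbone output, assumed given).
    centers:  list of (row, col) person centers (assumed given by a detector).
    The sigmoid channel gate is an illustrative assumption, not the paper's design.
    """
    instance_maps = []
    for center in centers:
        vec = sample_center_feature(features, center)
        # Channels that respond strongly at this person's center get a gate near 1.
        gates = [1.0 / (1.0 + math.exp(-v)) for v in vec]
        instance_maps.append([
            [[gate * value for value in row] for row in channel]
            for gate, channel in zip(gates, features)
        ])
    return instance_maps


# Toy 2-channel, 2x2 feature map containing two people.
features = [
    [[4.0, 0.0], [0.0, -4.0]],   # channel 0: strong at the top-left person
    [[-4.0, 0.0], [0.0, 4.0]],   # channel 1: strong at the bottom-right person
]
maps = decouple_instances(features, centers=[(0, 0), (1, 1)])
```

Each entry of `maps` has the same shape as `features`, so per-person heads (pose, mask, part labels) could in principle run on it independently of the other persons in the image.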

Cited By

  • (2024) Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 13838-13848. DOI: 10.1109/CVPR52733.2024.01313. Online publication date: 16-Jun-2024
  • (2024) Spatial-Aware Regression for Keypoint Localization. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 624-633. DOI: 10.1109/CVPR52733.2024.00066. Online publication date: 16-Jun-2024

    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    ISBN:9798400701085
    DOI:10.1145/3581783
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. 3D human mesh recovery
    2. multi-person part segmentation
    3. multi-person pose estimation
    4. multi-person segmentation

    Qualifiers

    • Demonstration

    Funding Sources

    • This work is supported in part by the Natural Science Foundation of China under Grant Nos. U20B2052 and 61936011, and in part by the National Key Research and Development Program of China under Grant No. 2018YFE0118400.

    Conference

    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (Last 12 months): 67
    • Downloads (Last 6 weeks): 4
    Reflects downloads up to 05 Mar 2025
