Abstract
Motion capture is gaining new possibilities from inertial sensing technologies, which, unlike vision-based solutions, do not suffer from occlusion or restrictions on the recording volume. However, because the recorded signals are sparse and noisy, online performance and global translation estimation remain two key difficulties. In this paper, we present TransPose, a DNN-based approach that performs full motion capture (both global translations and body poses) from only six Inertial Measurement Units (IMUs) at over 90 fps. For body pose estimation, we propose a multi-stage network that estimates leaf-to-full joint positions as intermediate results. This design makes pose estimation considerably easier, yielding both higher accuracy and lower computational cost. For global translation estimation, we propose a supporting-foot-based method and an RNN-based method, and robustly combine their outputs with a confidence-based fusion technique. Quantitative and qualitative comparisons show that our method outperforms state-of-the-art learning- and optimization-based methods by a large margin in both accuracy and efficiency. As a purely inertial sensor-based approach, our method is not limited by environmental settings (e.g., fixed cameras), freeing the capture from common difficulties such as wide-range motion spaces and strong occlusion.
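The abstract does not specify the form of the confidence-based fusion. As an illustration only, a hypothetical per-frame blend of the two translation estimates, weighted by a foot-contact confidence (the function name, array shapes, and linear weighting are assumptions, not the paper's method), might look like:

```python
import numpy as np

def fuse_translations(t_foot, t_rnn, contact_prob):
    """Blend two per-frame global-translation estimates (illustrative sketch).

    t_foot: (N, 3) translations from a supporting-foot-based method.
    t_rnn:  (N, 3) translations predicted by an RNN.
    contact_prob: (N,) confidence in [0, 1] that a foot is firmly planted;
        high confidence favors the foot-based estimate, which is reliable
        exactly when a foot is in static ground contact.
    """
    # Clamp the confidence and reshape to (N, 1) so it broadcasts over xyz.
    w = np.clip(contact_prob, 0.0, 1.0)[:, None]
    return w * t_foot + (1.0 - w) * t_rnn
```

For example, with `contact_prob = 1.0` the blend returns the foot-based estimate unchanged, and with `contact_prob = 0.0` it falls back entirely to the RNN prediction.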