
BodyTrak: Inferring Full-body Poses from Body Silhouettes Using a Miniature Camera on a Wristband

Published: 07 September 2022

Abstract

In this paper, we present BodyTrak, an intelligent sensing technology that estimates full-body poses from a wristband. It requires only one miniature RGB camera to capture body silhouettes, which are fed to a customized deep learning model that estimates the 3D positions of 14 joints on the arms, legs, torso, and head. We conducted a user study with 9 participants in which each participant performed 12 daily activities, such as walking, sitting, or exercising, in varying scenarios (wearing different clothes, outdoors/indoors) and under different camera settings on the wrist. The results show that our system can infer the full-body pose (3D positions of 14 joints) with an average error of 6.9 cm using only one miniature RGB camera (11.5 mm × 9.5 mm) on the wrist pointing towards the body. Based on these results, we discuss possible applications, challenges, and limitations of deploying our system in real-world scenarios.
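The reported accuracy can be read as a mean per-joint 3D position error over the 14 tracked joints. The sketch below shows one plausible way to compute such a metric; the paper's exact evaluation protocol (e.g., any alignment or per-activity averaging) is not specified in this abstract, and the function and variable names are illustrative.

```python
import numpy as np

def mean_joint_error(pred, gt):
    """Mean Euclidean distance (in the same unit as the inputs, e.g. cm)
    between predicted and ground-truth 3D joint positions.

    pred, gt: arrays of shape (num_frames, 14, 3) -- 14 body joints per frame.
    """
    assert pred.shape == gt.shape and pred.shape[-2:] == (14, 3)
    per_joint = np.linalg.norm(pred - gt, axis=-1)   # (num_frames, 14)
    return per_joint.mean()

# Example with synthetic data: two frames of 14 joints, with a small
# perturbation standing in for model prediction error.
rng = np.random.default_rng(0)
gt = rng.normal(size=(2, 14, 3))
pred = gt + rng.normal(scale=0.05, size=gt.shape)
print(f"mean 3D joint error: {mean_joint_error(pred, gt):.3f}")
```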



• Published in

  Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 6, Issue 3
  September 2022
  1612 pages
  EISSN: 2474-9567
  DOI: 10.1145/3563014

      Copyright © 2022 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Qualifiers

      • research-article
      • Research
      • Refereed
