Enabling Gait Analysis in the Telemedicine Practice through Portable and Accurate 3D Human Pose Estimation
Introduction
Human pose estimation (HPE) is a key component of human motion analysis from RGB images and videos [1]. Among its many application fields, such as sport performance analysis, human-computer interaction, and action recognition, there is growing interest in applying this computer vision technology to the detection and analysis of pathological gait [2], [3], [4], [5], [6]. The interest is mainly due to the fact that, unlike other solutions such as multi-camera motion capture systems, multiple inertial measurement units (IMUs), force plates, and pressure insoles, HPE does not require participants to wear markers or specialized sensors and can be applied rapidly without a complicated setup.
Thanks to advances in convolutional neural networks (CNNs), the accuracy of 3D HPE has improved significantly. Some techniques train end-to-end CNNs to predict the 3D human pose from each image (i.e., video frame) [7]. Other techniques achieve comparable accuracy by estimating the 2D pose first and then lifting it to a view-invariant 3D pose [8].
On the other hand, long image-processing latencies (i.e., the execution time of the inference phase) force the system to work at a low frequency and, as a consequence, appreciably reduce the overall estimation accuracy [9]. As both direct (3D) and indirect (2D + depth) methods are computationally demanding, they require well-equipped, advanced architectures (i.e., multi-core CPUs supported by a GPU accelerator) to achieve both accuracy and real-time processing.
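To make the indirect (2D + depth) route concrete, the following is a minimal sketch, not the authors' implementation, of how a 2D keypoint and its depth reading can be back-projected to a 3D point in the camera frame with the standard pinhole model. The function name `deproject` and the intrinsics values (`FX`, `FY`, `CX`, `CY`) are illustrative assumptions, and lens distortion is ignored.

```python
def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to camera-frame XYZ.

    Standard pinhole model without lens distortion. Returns None when the
    depth reading is missing (assumed here to be encoded as a value <= 0).
    """
    if depth_m <= 0:
        return None
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Hypothetical intrinsics for a 1280x720 depth stream (illustrative values only).
FX, FY, CX, CY = 900.0, 900.0, 640.0, 360.0

# A keypoint at the principal point lies on the optical axis.
centre = deproject(640, 360, 2.0, FX, FY, CX, CY)
# A keypoint 90 px to the right of the principal point, seen at 2 m.
offset = deproject(730, 360, 2.0, FX, FY, CX, CY)
# A keypoint with no valid depth cannot be lifted to 3D.
missing = deproject(730, 360, 0.0, FX, FY, CX, CY)
```

In a real pipeline the intrinsics would come from the camera calibration, and keypoints without valid depth would have to be recovered by other means, for example by interpolating over neighbouring frames.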
One of the main challenges is to adopt markerless gait analysis through CNN-based software (SW) applications in telemedicine. Telemedicine comprises the medical care and the professionals that adopt information and communication technologies to provide assessment, diagnosis, goal setting, treatment education, and monitoring through remote devices [10], [11]. As for motion analysis, telehealth approaches are useful for providing evaluations in the management of conditions affecting movement (for example, orthopedic or neurologic diseases, including fractures, amputations, stroke, Parkinson's disease, and multiple sclerosis) and in their follow-up. Furthermore, telehealth technologies for motion analysis may be useful in rehabilitation to assess body function limitations due to posture or movement alterations, as well as to monitor the execution and effects of physical exercises in patients suffering from disabling conditions.
Nevertheless, delivering gait analysis at a distance through cameras and HPE software requires the system to satisfy, besides accuracy and real-time performance, portability and privacy compliance at the same time [12]. Addressing all these non-functional constraints seamlessly requires the HPE software to execute on Internet-of-Things (IoT) devices at the edge [13], so that the video streams (i.e., the sensitive information) are processed close to the input sensor (i.e., the camera), while only the processing results are stored or sent over the communication network (e.g., Ethernet, WiFi) [14].
In this context, IoT devices for edge computing are resource-constrained architectures, in which memory and computing capability are bounded to guarantee portability, energy efficiency, and low power consumption. State-of-the-art software solutions for HPE either do not run on these devices due to memory limitations or achieve very low working frequencies (i.e., far from real-time), which can lead to very low accuracy in the kinetic gait analysis. Although some solutions have been proposed to apply HPE on mobile devices [16], they achieve real-time performance at the cost of coarse pose estimation (i.e., not suitable for clinical gait analysis) due to memory or computing power limitations [17].
In this work, we address the problem of satisfying multiple and different non-functional constraints like real-time computing, accuracy, privacy compliance, and portability for remote gait analysis. We present a portable embedded system (see Fig. 1(a)) that implements a real-time accurate 3D HPE SW application with a low-cost camera and a low-power computing device.
The main contributions of the work, which are the key elements of the proposed system, are the following:
- We reimplemented a 2D HPE engine to take advantage of the heterogeneous computing elements (i.e., CPU and integrated GPU) of an off-the-shelf portable computing device (i.e., NVIDIA Jetson board) to achieve real-time computation at the edge.
- We combined the inference phase with different image elaboration blocks to improve the accuracy while guaranteeing real-time performance and a reduced memory footprint of the software. We selected a set of spatio-temporal filters and an interpolation procedure to extend the HPE from 2D to 3D, and we implemented them through optimized parallel code to take advantage of multi-level parallelism on the device.
- We present a quantitative analysis of the accuracy of the proposed platform in extrapolating both the 3D positions of human joints and the joint angles. We present an extended comparison with (i) an infrared marker-based motion capture system (i.e., Vicon) and (ii) the OpenPose software application executed on a high-performance computing server, to evaluate the applicability of the platform for gait analysis.
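As an illustration of the quantities named in the last contribution, the sketch below shows one common way to derive a joint angle from three 3D keypoints and to summarize the disagreement between two angle series with a root-mean-square error. It is a minimal example under stated assumptions, not the evaluation procedure of the article: the helper names `joint_angle_deg` and `rmse`, the choice of RMSE as the error measure, and the keypoint coordinates are all hypothetical.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle at joint b, in degrees, between segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(p * q for p, q in zip(ba, bc))
    norm = math.hypot(*ba) * math.hypot(*bc)
    if norm == 0:
        raise ValueError("degenerate segment: two keypoints coincide")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def rmse(estimated, reference):
    """Root-mean-square error between two equally long series."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((e - r) ** 2 for e, r in zip(estimated, reference))
    return math.sqrt(squared / len(estimated))


# Made-up hip, knee, and ankle positions in metres (illustrative only).
hip, knee, ankle = (0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)
straight_leg = joint_angle_deg(hip, knee, ankle)        # fully extended leg
bent_leg = joint_angle_deg(hip, knee, (0.5, 0.5, 0.0))  # shank perpendicular to thigh

# Made-up knee-angle series from two systems, compared frame by frame.
error = rmse([170.0, 150.0, 120.0], [172.0, 149.0, 123.0])
```

Comparing two systems in this way presupposes that their series are time-aligned and expressed with the same angle convention; both steps are omitted here.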
The article is organized as follows. Section 2 presents the analysis of the related work. Section 3 presents the proposed portable HPE platform. Section 4 reports the experimental results. Section 5 presents the discussion of the obtained results, while Section 6 is devoted to the conclusions.
Related Work
Measuring gait variables using computer vision has been increasingly applied in recent years to assess mobility and fall risk [5], [18] as well as to identify gait features in parkinsonism and other neurological diseases [19], [20]. Quantifying gait pathology with commodity cameras increases access to quantitative motion analysis in clinics and at home and enables researchers to conduct large-scale studies of neurological and musculoskeletal disorders [21]. The same technology has been
Method
Fig. 2 shows an overview of the proposed platform (MAEVE). It consists of communicating and concurrent elaboration nodes that implement a pipeline of computer vision primitives and inference-based applications. The pipeline allows for efficient processing of the frame sequence (i.e., video stream), which is generated by an RGB-D camera and elaborated, in real time, on the heterogeneous computing elements of the embedded board (i.e., CPU and GPU). The result is a set of 3D keypoints (),
Experimental results
We evaluated MAEVE in terms of real-time computing performance on different off-the-shelf embedded boards. In every configuration, the input camera is an Intel RealSense D415 providing 1920x1080 RGB frames and 1280x720 depth frames at 60 frames per second (FPS). We then evaluated the accuracy of MAEVE in two ways. First, by comparing the absolute 3D coordinates of the joint keypoints extrapolated by MAEVE and the marker coordinates extrapolated by an infrared motion capture system during the
Discussion
In our experimental analysis, the state-of-the-art HPE software (i.e., OpenPose) has been shown to achieve very high accuracy at the cost of a high computational load and memory footprint. This is because, to achieve high accuracy, the core of the pose estimation framework requires a deep convolutional neural network architecture. As a consequence, such an inference application either does not run on memory-bounded portable devices or, when run on the most efficient and
Conclusion
One of the main challenges of RGB-D-based 3D HPE is to achieve the accuracy required for clinical applications. Even more challenging is HPE applied in the telemedicine practice, which requires the system to satisfy, besides accuracy, portability, real-time, and privacy compliance constraints at the same time. In this article, we presented MAEVE, a 3D HPE platform that runs in real time on an off-the-shelf portable computing board. The experimental results demonstrated that MAEVE achieves real time
Declaration of Competing Interest
We confirm that no conflict of interest, financial or otherwise, exists. This manuscript is entirely original and has not been copyrighted, published, submitted, or accepted for publication elsewhere.
References
- et al. Edge computing: A survey. Future Generation Computer Systems (2019)
- et al. Verification of validity of gait analysis systems during treadmill walking and running using human pose tracking algorithm. Gait and Posture (2021)
- et al. Extraction of gait parameters from marker-free video recordings of timed up-and-go tests: Validity, inter- and intra-rater reliability. Gait and Posture (2021)
- NVIDIA AI IoT, Tensor RT Pose Estimation, 2020,...
- Biomechanics and Motor Control of Human Movement: Fourth Edition (2009)
- et al. A review of 3D human pose estimation algorithms for markerless motion capture. Computer Vision and Image Understanding (2021)
- et al. Mcdcd: Multi-source unsupervised domain adaptation for abnormal human gait detection. IEEE Journal of Biomedical and Health Informatics (2021)
- et al. Cross-subject and cross-modal transfer for generalized abnormal gait pattern recognition. IEEE Transactions on Neural Networks and Learning Systems (2021)
- et al. EMG-based abnormal gait detection and recognition. Proceedings of the 2020 IEEE International Conference on Human-Machine Systems, ICHMS 2020 (2020)
- et al. Measuring gait variables using computer vision to assess mobility and fall risk in older adults with dementia. IEEE Journal of Translational Engineering in Health and Medicine (2020)
- 3-D canonical pose estimation and abnormal gait recognition with a single RGB-D camera. IEEE Robotics and Automation Letters
- Coarse-to-fine volumetric prediction for single-image 3D human pose. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Learning pose grammar to encode human body configuration for 3D pose estimation. Proc. of AAAI Conference on Artificial Intelligence
- View invariant 3D human pose estimation. IEEE Transactions on Circuits and Systems for Video Technology
- D-sorm: A digital solution for remote monitoring based on the attitude of wearable devices. Computer Methods and Programs in Biomedicine
- Heart failure non-invasive home telemonitoring systems: A systematic review. Computer Methods and Programs in Biomedicine
- Internet of things in health: Requirements, issues, and gaps. Computer Methods and Programs in Biomedicine
- Security and privacy protection in the e-health system: Remote monitoring of COVID-19 patients as a use case. Smart Innovation, Systems and Technologies
- Realtime multi-person 2D pose estimation using part affinity fields. Proc. of IEEE Conference on Computer Vision and Pattern Recognition
- Mobipose: Real-time multi-person pose estimation on mobile devices. SenSys 2020 - Proc. of ACM Conference on Embedded Networked Sensor Systems
- Human identification in health care systems using mobile edge computing. Transactions on Emerging Telecommunications Technologies
- A clinically interpretable computer-vision based method for quantifying gait in Parkinson's disease. Sensors