Abstract
Hand pose estimation is a fundamental task for many applications related to human-robot interaction. In this work, we propose novel hand pose estimation models that leverage two types of attention mechanisms: self-attention and channel attention. By incorporating a simple yet efficient Squeeze-and-Excitation (SE) block into Res152-CondPose, our best method, SERes152-CondPoseSE, successfully models interdependencies between channels. It outperforms the baseline, Res152-CondPose, by an absolute 9.56% in mean Average Precision (mAP) and 17.78% in Multiple Object Tracking Accuracy (MOTA).
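To make the channel-attention idea concrete, the following is a minimal sketch of a generic Squeeze-and-Excitation block: squeeze each channel by global average pooling, pass the result through two fully connected layers (ReLU, then sigmoid), and rescale each channel by its gate. It is written in plain Python for illustration only. The function name `se_block`, the random weights, the tensor shape, and the reduction ratio `r = 2` are all assumptions, and the sketch does not show how the block is wired into Res152-CondPose.

```python
import math
import random


def se_block(feature_map, w1, w2):
    """Generic SE block on a feature map given as nested lists [C][H][W].

    w1 has shape [C // r][C] and w2 has shape [C][C // r] (hypothetical
    weights; biases are omitted for brevity).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: fully connected layer (C -> C // r) with ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w1
    ]
    # Excitation, step 2: fully connected layer (C // r -> C) with sigmoid,
    # producing one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]
    return scaled, gates


# Toy example with assumed sizes: 4 channels, 2x2 spatial grid, r = 2.
random.seed(0)
C, H, W, r = 4, 2, 2, 2
x = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
y, gates = se_block(x, w1, w2)
```

The output `y` has the same shape as `x`; only the per-channel magnitudes change, which is how such a block can emphasize informative channels while adding few parameters.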
Notes
1. Tracking performance comparison can be found at: https://youtu.be/k8ioKqLlSms.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, QD., Bui, AT., Nguyen, TH., Do, TH. (2023). Enhancing 2D Hand Pose Detection and Tracking in Surgical Videos by Attention Mechanism. In: Braubach, L., Jander, K., Bădică, C. (eds) Intelligent Distributed Computing XV. IDC 2022. Studies in Computational Intelligence, vol 1089. Springer, Cham. https://doi.org/10.1007/978-3-031-29104-3_19
DOI: https://doi.org/10.1007/978-3-031-29104-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29103-6
Online ISBN: 978-3-031-29104-3
eBook Packages: Intelligent Technologies and Robotics (R0)