Abstract
This paper introduces the first edition of the Trauma TeleHelper for Operational Medical Procedure Support and Offline Network (Trauma THOMPSON) Challenge, which was organized as a satellite event of the 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2023. The challenge comprised two tracks and four tasks on the automatic analysis of videos and images of emergency care procedures in resource-constrained environments. Track 1 comprised three tasks: (1) action recognition, (2) action anticipation, and (3) activity recognition. Track 2 comprised a single task, visual question answering (VQA). The videos were recorded from the first-person view by a team of doctors and annotated by medical professionals. The data were split into 70% for training and 30% for testing. For Task 1, the best method, which used VideoSwin with Swin-S and ThreeCrop, achieved a Top-1 accuracy of 35.27%. For Task 2, the best method, which used VideoSwin with Swin-S and CenterCrop, achieved a Top-1 accuracy of 23.67%. No submission was received for Task 3. For the VQA task, the best method, which relied on MCAN-large with VinVL and FQCA, obtained an accuracy of 74.35%.
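As a rough illustration of how a Top-1 accuracy figure such as those quoted above is typically computed, the sketch below counts the samples whose highest-scoring class matches the ground-truth label. It is not the challenge's evaluation code: the function name `top1_accuracy`, the toy scores, and the toy labels are all assumptions made for this example.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of samples whose highest-scoring class equals the true label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score; ties resolve to the lowest index.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy data (hypothetical): three clips scored over three action classes.
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
toy_labels = [1, 2, 2]
print(f"Top-1 accuracy: {100 * top1_accuracy(toy_scores, toy_labels):.2f}%")
```

In this toy case the first and third clips are classified correctly and the second is not, so two of the three predictions count toward the score.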
Additional information
Disclaimers: The views expressed are those of the author(s) and do not reflect the official policy of the Department of the Army, the Department of Defense, or the U.S. Government. The investigators have adhered to the policies for the protection of human subjects as prescribed in 45 CFR 46.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhuo, Y. et al. (2025). Overview of the Trauma THOMPSON Challenge at MICCAI 2023. In: Bao, R., Grant, E., Kirkpatrick, A., Wachs, J., Ou, Y. (eds) AI for Brain Lesion Detection and Trauma Video Action Recognition. TTC BONBID-HIE 2023 2023. Lecture Notes in Computer Science, vol 14567. Springer, Cham. https://doi.org/10.1007/978-3-031-71626-3_7
DOI: https://doi.org/10.1007/978-3-031-71626-3_7
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71625-6
Online ISBN: 978-3-031-71626-3
eBook Packages: Computer Science, Computer Science (R0)