Abstract
This paper presents a comprehensive framework of datasets and algorithms for action recognition in scenarios where data is scarce, unstructured, and unscripted. The long-term objective of this work is an intelligent assistant to the medic, a surrogate buddy, that can tell the medic what needs to be done at every step of trauma resuscitation. As an essential part of this objective, we collected datasets and developed algorithms suitable for emergent contexts, such as casualty care in the field, disaster response and recovery, and other high-risk, high-stakes scenarios where real-time decision-making is crucial. The proposed framework enables the development of new algorithms by providing a standardized set of evaluation metrics and test cases for assessing their performance. Ultimately, this research seeks to enhance the capabilities of practitioners and emergency responders by enabling them to better anticipate and recognize actions in challenging and unpredictable situations. Our dataset, referred to as Trauma Thompson, includes Tourniquet Application, Tracheostomy, Tube Thoracostomy, Needle Thoracostomy, and Interosseous Insertion procedures. The proposed algorithm, based on relative position embedding for the Vision Transformer and referred to as ReVit, achieves performance competitive with state-of-the-art algorithms on our dataset.
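To make the core idea behind ReVit concrete, the sketch below shows the general technique of adding a learned 2D relative position bias to the attention logits of a ViT-style self-attention layer. This is a minimal illustration of relative position embedding in that spirit, not the paper's actual implementation; the class name, variable names, and hyperparameters (RelPosSelfAttention, rel_bias, grid_size, etc.) are illustrative assumptions.

```python
# Minimal sketch (assumption, not the paper's code): ViT-style self-attention
# over an h x w grid of patch tokens, with one learned bias per attention head
# for each possible 2D offset between a query patch and a key patch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelPosSelfAttention(nn.Module):
    def __init__(self, dim, num_heads, grid_size):
        super().__init__()
        h, w = grid_size
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # Learnable bias table: one entry per head per 2D relative offset.
        self.rel_bias = nn.Parameter(torch.zeros((2 * h - 1) * (2 * w - 1), num_heads))
        # Precompute, for every (query, key) patch pair, its offset bucket.
        coords = torch.stack(torch.meshgrid(
            torch.arange(h), torch.arange(w), indexing="ij")).flatten(1)  # (2, N)
        rel = coords[:, :, None] - coords[:, None, :]                     # (2, N, N)
        idx = (rel[0] + h - 1) * (2 * w - 1) + (rel[1] + w - 1)           # (N, N)
        self.register_buffer("rel_idx", idx)

    def forward(self, x):                                 # x: (B, N, dim), N = h*w
        B, N, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(B, N, self.num_heads, -1).transpose(1, 2)  # (B, heads, N, d)
        k = k.view(B, N, self.num_heads, -1).transpose(1, 2)
        v = v.view(B, N, self.num_heads, -1).transpose(1, 2)
        attn = (q @ k.transpose(-2, -1)) * self.scale         # content logits
        bias = self.rel_bias[self.rel_idx].permute(2, 0, 1)   # (heads, N, N)
        attn = F.softmax(attn + bias.unsqueeze(0), dim=-1)    # add relative bias
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# Example: a 14 x 14 patch grid (224 px image, 16 px patches), 8 heads.
layer = RelPosSelfAttention(dim=256, num_heads=8, grid_size=(14, 14))
tokens = torch.randn(2, 14 * 14, 256)
print(layer(tokens).shape)  # torch.Size([2, 196, 256])
```

Because the bias depends only on the offset between patches rather than their absolute locations, the same table is shared by all patch pairs at the same displacement, which is what distinguishes relative from absolute position encodings.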
Additional information
Disclaimers: The views expressed are those of the author(s) and do not reflect the official policy of the Department of the Army, the Department of Defense, or the U.S. Government. The investigators have adhered to the policies for the protection of human subjects as prescribed in 45 CFR 46.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jiang, N. et al. (2024). Baseline Models for Action Recognition of Unscripted Casualty Care Dataset. In: Waiter, G., Lambrou, T., Leontidis, G., Oren, N., Morris, T., Gordon, S. (eds) Medical Image Understanding and Analysis. MIUA 2023. Lecture Notes in Computer Science, vol 14122. Springer, Cham. https://doi.org/10.1007/978-3-031-48593-0_16
DOI: https://doi.org/10.1007/978-3-031-48593-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48592-3
Online ISBN: 978-3-031-48593-0
eBook Packages: Computer Science, Computer Science (R0)