Abstract
Anomaly detection approaches in medical imaging show promise in reducing the need for labelled data. However, the question of how to evaluate anomaly detection algorithms remains challenging, both in terms of the data and the metrics. In this work, we take a cohort of inpatient CT head scans from an elderly stroke patient population containing a variety of anomalies, and treat the associated radiology reports as the reference for clinically relevant findings which should be detected by an anomaly detection algorithm. We apply two state-of-the-art anomaly detection methods to the data, namely denoising autoencoder (DAE) and context-to-local feature matching (CLFM) models. We then extract bounding boxes from the predicted anomaly score heatmaps and treat them as candidate anomaly detections. A clinical evaluation is then conducted in which three radiologists rate the candidate anomalies with respect to their detection and localisation accuracy, assigning the corresponding report sentence wherever a clinically relevant anomaly is correctly detected, and rating localisation on a 3-point scale (good, partial, poor). We find that neither method exhibits sufficiently high recall for clinical use, even at low detection thresholds, although anomaly detection shows promise as a scalable approach for detecting clinically relevant findings. We highlight that selection of the optimal thresholds and extraction of discrete anomaly predictions (e.g. bounding boxes) are underexplored topics in anomaly detection.
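As an illustration of the bounding box extraction step, the following is a minimal sketch that assumes a 2D anomaly score heatmap, a fixed score threshold, and connected-component grouping of the thresholded pixels. It is not the pipeline used in this work: the function name `extract_boxes`, the `min_area` filter, and the toy heatmap are hypothetical, and a different grouping strategy may have been used in practice.

```python
import numpy as np
from scipy import ndimage


def extract_boxes(heatmap, threshold, min_area=1):
    """Return one box per connected region of pixels scoring >= threshold.

    Boxes are (row_min, col_min, row_max, col_max) with exclusive maxima.
    `min_area` (a hypothetical filter) drops regions with fewer pixels.
    """
    mask = heatmap >= threshold
    # Group above-threshold pixels into connected regions
    # (scipy's default structuring element gives 4-connectivity in 2D).
    labels, _ = ndimage.label(mask)
    boxes = []
    for label_id, region in enumerate(ndimage.find_objects(labels), start=1):
        rows, cols = region
        # Count only the pixels belonging to this region inside its box.
        area = int((labels[region] == label_id).sum())
        if area >= min_area:
            boxes.append((rows.start, cols.start, rows.stop, cols.stop))
    return boxes


# Toy heatmap with one strong and one weak anomalous region (assumed data).
heatmap = np.zeros((8, 8))
heatmap[1:3, 1:4] = 0.9
heatmap[5:7, 5:7] = 0.4

print(extract_boxes(heatmap, threshold=0.5))
print(extract_boxes(heatmap, threshold=0.3))
```

In this toy case, lowering the threshold from 0.5 to 0.3 adds the weaker region as a second candidate. This shows how the choice of threshold directly changes the set of discrete detections that are then rated.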
Notes
- 1.
- 2. West of Scotland Safe Haven ethical approval number GSH19NE004.
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kascenas, A. et al. (2024). Clinically Focussed Evaluation of Anomaly Detection and Localisation Methods Using Inpatient CT Head Data. In: Xue, Y., Chen, C., Chen, C., Zuo, L., Liu, Y. (eds) Data Augmentation, Labelling, and Imperfections. MICCAI 2023. Lecture Notes in Computer Science, vol 14379. Springer, Cham. https://doi.org/10.1007/978-3-031-58171-7_7
DOI: https://doi.org/10.1007/978-3-031-58171-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58170-0
Online ISBN: 978-3-031-58171-7
eBook Packages: Computer Science (R0)