Abstract
Accurately detecting suspicious regions in medical images is an error-prone and time-consuming task required by many routinely performed diagnostic procedures. To support clinicians in this task, several automated solutions have been proposed, but they rely on complex methods with many hyperparameters. In this study, we investigate the feasibility of detection transformer (DETR) models for volumetric medical object detection. In contrast to previous approaches, these models directly predict a set of objects without relying on anchor design or manual heuristics such as non-maximum suppression. Through extensive experiments with three models (DETR, Conditional DETR, and DINO DETR) on four data sets (CADA, RibFrac, KiTS19, and LIDC), we show that these set prediction models can perform on par with or even better than existing methods. DINO DETR, the best-performing model in our experiments, outperforms a strong anchor-based one-stage detector, Retina U-Net, on three out of four data sets.
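To make the idea of set prediction concrete: in the original DETR formulation, each ground-truth object is matched one-to-one with a predicted box during training, and unmatched predictions are trained toward a "no object" class, so duplicate detections are penalized rather than filtered afterward by non-maximum suppression. The following is only a minimal sketch of that matching step, not the implementation used in this study. It assumes axis-aligned 3D boxes given as `(z1, y1, x1, z2, y2, x2)`, uses `1 - IoU` as the sole cost (DETR's cost also includes classification and L1 box terms), and replaces the Hungarian algorithm with a brute-force search that is only practical for a handful of boxes. The function names `volume`, `iou_3d`, and `match_predictions` are hypothetical.

```python
from itertools import permutations


def volume(box):
    """Volume of an axis-aligned 3D box (z1, y1, x1, z2, y2, x2); 0 if degenerate."""
    z1, y1, x1, z2, y2, x2 = box
    return max(0.0, z2 - z1) * max(0.0, y2 - y1) * max(0.0, x2 - x1)


def iou_3d(a, b):
    """Intersection over union of two axis-aligned 3D boxes."""
    overlap = (
        max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
        min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]),
    )
    inter = volume(overlap)
    union = volume(a) + volume(b) - inter
    return inter / union if union > 0 else 0.0


def match_predictions(pred_boxes, gt_boxes):
    """Return (pred_index, gt_index) pairs minimizing the total 1 - IoU cost.

    Each ground-truth box receives exactly one prediction. Predictions left
    unmatched would be trained toward "no object" in a DETR-style loss.
    Brute force over assignments: illustrative only (assumption, see lead-in).
    """
    if len(pred_boxes) < len(gt_boxes):
        raise ValueError("need at least as many predictions as ground-truth boxes")
    best_cost, best_pairs = float("inf"), []
    for perm in permutations(range(len(pred_boxes)), len(gt_boxes)):
        cost = sum(
            1.0 - iou_3d(pred_boxes[p], gt_boxes[g]) for g, p in enumerate(perm)
        )
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(p, g) for g, p in enumerate(perm)]
    return best_pairs


# Three predictions, two ground-truth objects; prediction 2 duplicates prediction 0.
preds = [(0, 0, 0, 2, 2, 2), (10, 10, 10, 12, 12, 12), (0, 0, 0, 2, 2, 1)]
gts = [(10, 10, 10, 12, 12, 13), (0, 0, 0, 2, 2, 2)]
pairs = match_predictions(preds, gts)
```

In this toy example the duplicate prediction (index 2) is left unmatched, which is how a set prediction loss discourages redundant boxes during training.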
Copyright information
© 2023 The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature
About this paper
Cite this paper
Ickler, M.K., Baumgartner, M., Roy, S., Wald, T., Maier-Hein, K.H. (2023). Taming Detection Transformers for Medical Object Detection. In: Deserno, T.M., Handels, H., Maier, A., Maier-Hein, K., Palm, C., Tolxdorff, T. (eds) Bildverarbeitung für die Medizin 2023. BVM 2023. Informatik aktuell. Springer Vieweg, Wiesbaden. https://doi.org/10.1007/978-3-658-41657-7_39
DOI: https://doi.org/10.1007/978-3-658-41657-7_39
Publisher Name: Springer Vieweg, Wiesbaden
Print ISBN: 978-3-658-41656-0
Online ISBN: 978-3-658-41657-7
eBook Packages: Computer Science and Engineering (German Language)