Abstract
Multi-organ segmentation of the abdominal region plays a vital role in clinical applications such as organ quantification, surgical planning, and disease diagnosis. Because abdominal organs are densely distributed and closely connected, highly accurate labels are required. However, manually annotating these dense and complex structures demands highly specialized medical expertise, which incurs significant costs in time and effort. We identify a cheap and easily accessible form of supervisory information: gaze. An eye tracker records the areas on which a radiologist focuses while reading abdominal images, and this gaze information can steer the network model toward the objects or features relevant to the segmentation task. The problem to be solved is therefore how to integrate image information with gaze information effectively. To address this issue, we propose a novel network for abdominal multi-organ segmentation that incorporates radiologists’ gaze information to improve segmentation precision and reduce the demand for high-cost manual labels. Our network includes three special designs: 1) a dual-path encoder to further integrate gaze information; 2) a cross-attention transformer module (CATM) that embeds human cognitive information about the image into the network model; and 3) a multi-feature skip connection (MSC), which combines spatial information from down-sampling to compensate for lost internal details of the segmentation. Additionally, our network uses the discrete wavelet transform (DWT) to provide further information on organ location and edges in different directions. Extensive experiments on the publicly available Synapse dataset demonstrate that the proposed method integrates gaze information effectively, achieving a Dice similarity coefficient (DSC) of up to 81.87% and reducing the Hausdorff distance (HD) to 11.96%, while also producing high-quality, readable visualizations.
Code will be available at https://github.com/code-Porunacabeza/gaze_seg/.
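To illustrate how a DWT can expose edges in different directions, the following is a minimal sketch of a single-level 2D transform. It assumes the Haar wavelet and NumPy; the abstract does not state which wavelet family or implementation the network uses, and the function name `haar_dwt2` and the sub-band names are hypothetical.

```python
import numpy as np


def haar_dwt2(image):
    """Single-level 2D Haar DWT (illustrative assumption, not the paper's code).

    Returns four half-resolution sub-bands:
      approx      - local averages (coarse location information)
      detail_cols - differences between neighbouring columns (vertical edges)
      detail_rows - differences between neighbouring rows (horizontal edges)
      detail_diag - differences along both axes (diagonal structure)
    """
    x = np.asarray(image, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")

    s = np.sqrt(2.0)
    # Step 1: combine neighbouring rows into a scaled sum and a scaled difference.
    row_sum = (x[0::2, :] + x[1::2, :]) / s
    row_diff = (x[0::2, :] - x[1::2, :]) / s
    # Step 2: repeat the same sum/difference along neighbouring columns.
    approx = (row_sum[:, 0::2] + row_sum[:, 1::2]) / s
    detail_cols = (row_sum[:, 0::2] - row_sum[:, 1::2]) / s
    detail_rows = (row_diff[:, 0::2] + row_diff[:, 1::2]) / s
    detail_diag = (row_diff[:, 0::2] - row_diff[:, 1::2]) / s
    return approx, detail_cols, detail_rows, detail_diag


# Toy 4x4 "image" with one vertical edge between the first two columns.
image = np.zeros((4, 4))
image[:, 1:] = 1.0
approx, detail_cols, detail_rows, detail_diag = haar_dwt2(image)
```

In this toy input only `detail_cols` is non-zero among the detail bands, because the only intensity change runs across columns. In practice a library routine such as `pywt.dwt2` from PyWavelets would likely replace the hand-written transform.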
Acknowledgements
This study was supported by the National Natural Science Foundation (No. 62101249 and No. 62136004), the Natural Science Foundation of Jiangsu Province (No. BK20210291), and the China Postdoctoral Science Foundation (No. 2021TQ0149 and No. 2022M721611).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, C., Zhang, D., Ge, R. (2023). Eye-Guided Dual-Path Network for Multi-organ Segmentation of Abdomen. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14226. Springer, Cham. https://doi.org/10.1007/978-3-031-43990-2_3
DOI: https://doi.org/10.1007/978-3-031-43990-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43989-6
Online ISBN: 978-3-031-43990-2
eBook Packages: Computer Science (R0)