Abstract
Myopic choroidal neovascularization (mCNV) is a vision-threatening complication of high myopia characterized by the growth of abnormal blood vessels in the choroid layer of the eye. In OCT images, mCNV typically presents as a highly reflective area within the subretinal layer, so accurate segmentation of mCNV in OCT images can help clinicians assess disease status and guide treatment decisions. However, such segmentation is highly challenging because of noise interference, complex lesion areas, and low contrast. To address these challenges, we propose a parallel-branch network architecture that combines a super token vision transformer (STViT) with a CNN to capture global dependencies and low-level feature details more efficiently. The super token attention (STA) mechanism in STViT reduces the number of tokens involved in self-attention while preserving global modeling. Additionally, we design a novel feature fusion module that uses depth-wise separable convolutions to efficiently fuse multi-level features from the two pathways. We conduct extensive experiments on an in-house OCT dataset and a public OCT dataset, and the results demonstrate that our proposed method achieves state-of-the-art segmentation performance.
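The two efficiency claims in the abstract can be illustrated with simple counting. The sketch below is not the authors' implementation. It compares the weight count of a standard convolution with that of a depth-wise separable one, and the number of pairwise interactions in self-attention over all tokens with that over a reduced set of super tokens. The channel counts, kernel size, feature-map size, and grid size are hypothetical values chosen for illustration, and the cost of forming the super tokens is deliberately ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def attention_pairs(num_tokens):
    """Pairwise interactions computed by dense self-attention."""
    return num_tokens * num_tokens


# Hypothetical layer sizes (assumptions, not taken from the paper).
c_in, c_out, k = 64, 64, 3
standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)

# Hypothetical 56 x 56 feature map grouped on an 8 x 8 grid,
# giving 7 x 7 super tokens (assumption for illustration only).
height, width, grid = 56, 56, 8
tokens = height * width
super_tokens = (height // grid) * (width // grid)

print(f"standard conv weights:  {standard}")
print(f"separable conv weights: {separable}")
print(f"attention pairs, all tokens:   {attention_pairs(tokens)}")
print(f"attention pairs, super tokens: {attention_pairs(super_tokens)}")
```

Under these assumed sizes the separable convolution needs roughly one eighth of the weights, and attention over super tokens touches several thousand times fewer token pairs. The actual savings depend on the configuration used in the paper, which the abstract does not state.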
Acknowledgements
This work is supported in part by the National Natural Science Foundation of China (Nos. U22A2024, 62106153, 82271103), the Guangdong Basic and Applied Basic Research Foundation (Nos. 2020A1515110605, 2022A1515012326), and the Natural Science Foundation of Shenzhen (No. JCYJ20220818095809021).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dong, X. et al. (2024). A Super Token Vision Transformer and CNN Parallel Branch Network for mCNV Lesion Segmentation in OCT Images. In: Cao, X., Xu, X., Rekik, I., Cui, Z., Ouyang, X. (eds) Machine Learning in Medical Imaging. MLMI 2023. Lecture Notes in Computer Science, vol 14348. Springer, Cham. https://doi.org/10.1007/978-3-031-45673-2_27
DOI: https://doi.org/10.1007/978-3-031-45673-2_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45672-5
Online ISBN: 978-3-031-45673-2
eBook Packages: Computer Science (R0)