Abstract
Robust and accurate segmentation of subcortical structures in MR images is difficult for three reasons: (1) low image contrast and spatial resolution, (2) ambiguous boundaries and large appearance variance, and (3) corrupted annotations in the training samples. In this paper, we propose a novel CNN architecture that addresses these problems from two directions: increasing the discriminative power of the image feature representations and reducing the influence of incorrect annotations. Specifically, we first propose the contextual-wise multi-scale feature aggregation (C-MSFA) module to extract multi-scale context. In contrast to existing methods, the C-MSFA module aggregates the contextual information of each subcortical structure by fusing encoder features from different scales over the corresponding soft regions, and it uses a shifted-window strategy to preserve detailed information. We then propose the Transformer-like decoder feature recalibration (TDFR) module to obtain discriminative decoder feature representations. It learns feature context descriptors through cross-attention between the decoder features and the contextual-wise multi-scale features, and uses these descriptors to refine the decoder features by channel recalibration. Finally, we propose a novel online meta-mask learning method, in which a meta-mask branch evaluates the influence of training pixels and generates a binary meta-mask that excludes unfavorable pixels and labels. The proposed method is evaluated on two benchmark datasets, IBSR and MALC. The experimental results show that it outperforms several state-of-the-art medical image segmentation networks and subcortical structure segmentation methods.
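The abstract gives no equations, so the following is only a minimal NumPy sketch of the three ideas it names, not the authors' implementation. It assumes that soft-region aggregation is a probability-weighted sum of pixel features per structure, that scales are fused by simple averaging, that recalibration is a sigmoid gate on channels, and that the binary mask is supplied as an input (in the paper it is produced by the meta-mask branch). All function names are hypothetical, and the shifted-window strategy is omitted.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_region_context(features, region_logits):
    """Assumed soft-region aggregation.
    features: (C, N) pixel features; region_logits: (K, N) per-structure scores.
    Returns (K, C): one descriptor per structure."""
    weights = softmax(region_logits, axis=1)  # each structure's weights sum to 1 over pixels
    return weights @ features.T

def multi_scale_context(scales):
    """Fuse descriptors from several scales by averaging (an assumption).
    scales: list of (features, region_logits) pairs sharing C and K."""
    return np.mean([soft_region_context(f, l) for f, l in scales], axis=0)

def recalibrate_channels(decoder, context):
    """Cross-attention followed by a channel gate (assumed form).
    decoder: (C, N) queries; context: (K, C) keys and values."""
    c = decoder.shape[0]
    attn = softmax(decoder.T @ context.T / np.sqrt(c), axis=1)  # (N, K)
    attended = attn @ context                                   # (N, C)
    gate = 1.0 / (1.0 + np.exp(-attended.mean(axis=0)))         # (C,), values in (0, 1)
    return decoder * gate[:, None]

def masked_cross_entropy(probs, labels, mask):
    """Cross-entropy averaged over pixels kept by a binary mask.
    probs: (K, N) class probabilities; labels: (N,) ints; mask: (N,) of 0/1."""
    picked = probs[labels, np.arange(labels.size)]
    losses = -np.log(np.clip(picked, 1e-12, None))
    return float((losses * mask).sum() / max(mask.sum(), 1))
```

In this sketch, a pixel whose mask value is 0 contributes nothing to the loss, which is one simple way a binary mask can keep suspect labels from driving the gradient.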
Acknowledgements
This work is supported by the National Natural Science Foundation of China (grant Nos. 61871106 and 61370152), the Key R&D Projects of Liaoning Province, China (grant No. 2020JH2/10100029), and the Open Project Program Foundation of the Key Laboratory of Opto-Electronics Information Processing, Chinese Academy of Sciences (OEIP-O-202002).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
About this article
Cite this article
Li, X., Wei, Y., Wang, C. et al. Contextual-wise discriminative feature extraction and robust network learning for subcortical structure segmentation. Appl Intell 53, 5868–5886 (2023). https://doi.org/10.1007/s10489-022-03848-y
DOI: https://doi.org/10.1007/s10489-022-03848-y