A contrastive consistency semi-supervised left atrium segmentation model
Introduction
Atrial fibrillation (AF) is a common heart disease whose risk increases with age (Feinberg et al., 1995). Patients with AF may experience heart palpitations, breathlessness, low energy, and an increased risk of stroke (Center, 2009). Catheter ablation is a routine therapy for patients with AF (Kalla et al., 2017). However, its success rate is unsatisfactory: AF often recurs after ablation, and a second ablation is frequently required (Chelu et al., 2018). Clinical experience indicates that the ablation strategy and the risk of AF recurrence are largely determined by the degree of atrial fibrosis and the ablation-related scar (Akoum et al., 2011; Wu et al., 2021). Knowledge of the topology of the left atrium (LA) is therefore crucial for evaluating atrial fibrosis and ablation-related scar in patients with AF. To improve the success rate of catheter ablation, accurate segmentation of the LA in medical images is a critical step that helps clinicians understand the topology of the LA, assess the risk of AF, and make patient-specific treatment plans. Late gadolinium-enhanced MRI (LGE MRI) offers a promising way to visualize myocardial scar tissue by brightening scar signal intensities to differentiate them from healthy tissue, but this enhancement also leaves the LA with poorly defined boundaries (Yang et al., 2020). LA segmentation involves the LA cavity, pulmonary veins, LA appendage, etc. These complex structures and the fuzzy boundaries make semantic-level annotation of the LA highly time- and labor-intensive. Accurate, automatic segmentation of the LA in LGE MRI is therefore a challenging and necessary task.
Over the past few years, deep learning models have achieved impressive results on several medical image segmentation tasks (Shen et al., 2017). However, an effective supervised deep learning model requires abundant labeled data, and the need for dense annotations slows the adoption of deep learning in medical image analysis. On the other hand, large amounts of unlabeled data are becoming available with the development of intelligent medical information technology (Cheplygina et al., 2019). Hence, research on leveraging unlabeled data for medical image analysis is in high demand.
In this work, we focus on semi-supervised learning (SSL) to learn representations from both labeled and unlabeled data for LA segmentation. SSL is an intermediate approach between supervised and unsupervised learning (Chapelle et al., 2006), and its efficiency has been verified in many computer vision tasks (Van Engelen and Hoos, 2020). Typically, SSL trains a model with a limited amount of labeled data and a large amount of unlabeled data. The unlabeled data supervise the model in a self-training manner through consistency regularization, which is based on the assumption that the model's predictions should remain consistent under minor perturbations of the same input (Van Engelen and Hoos, 2020). Notably, the scope of this work is standard SSL, in which labeled and unlabeled data share the same categories and modality (e.g., MRI).
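As a concrete illustration, the core of such a consistency objective fits in a few lines. The following is a minimal NumPy sketch with toy probability maps; the perturbation and predictions are stand-ins, not any cited model:

```python
import numpy as np

def consistency_loss(pred_orig, pred_perturbed):
    """Mean-squared consistency between the model's softmax predictions for an
    unlabeled input and for a perturbed version of it (no labels required)."""
    return float(np.mean((pred_orig - pred_perturbed) ** 2))

# Toy probability maps for the same unlabeled patch under two perturbations.
p1 = np.array([[0.9, 0.1], [0.2, 0.8]])
p2 = np.array([[0.85, 0.15], [0.25, 0.75]])
loss = consistency_loss(p1, p2)  # small value: predictions are nearly consistent
```

Minimizing this term pushes the model toward perturbation-invariant predictions without using any annotation.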
Recently, several LA segmentation works have used SSL to relieve deep learning models of the need for expensive dense annotations. Most of these SSL models are based on consistency regularization. Specifically, they either enforce consistency between the model's predictions for the original unlabeled data and randomly perturbed versions of it (e.g., noise, scaling), or use adversarial learning to make the model learn distribution consistency between labeled and unlabeled data. Because the consistency is computed on predictions for unlabeled data (also called pseudo-labels), false predictions can make training unstable. To mitigate the effect of unreliable predictions on training stability, UA-MT leveraged an uncertainty map of the predictions for perturbed data to filter out high-uncertainty regions (Yu et al., 2019). This model adopted the mean-teacher framework (Tarvainen and Valpola, 2017), which requires two networks and multiple forward propagations to estimate the uncertainty. To reduce time and memory cost, Wu et al. designed a network with two decoders and used the discrepancy between the two predictions as uncertainty information to construct an unsupervised loss (Wu et al., 2021b). However, this model only considered consistency at the output level. To embed geometric information into training, Li et al. (2020) took distance-map regression as an auxiliary task and adopted a discriminator to distinguish the source of the predicted distance map, learning representations from unlabeled data while learning shape information. Following this work, Luo et al. (2021a, 2021b) extended the concept of consistency to the task level and proposed a dual-task model that jointly optimizes the segmentation task and a distance-map regression task, exploiting geometric information and unlabeled data at the same time.
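The uncertainty-filtered consistency idea behind UA-MT can be sketched as follows. Everything here (a noisy sigmoid standing in for the teacher's stochastic forward passes, the entropy threshold, the toy input) is an illustrative assumption rather than the published implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mc_predictions(x, n_passes=8, noise=0.1):
    # Stand-in for a teacher network under dropout/input noise: each "forward
    # pass" is the sigmoid of the input plus fresh Gaussian noise.
    return [sigmoid(x + rng.normal(0.0, noise, x.shape)) for _ in range(n_passes)]

def uncertainty_masked_consistency(student_pred, x, threshold=0.5):
    preds = mc_predictions(x)
    mean_p = np.mean(preds, axis=0)
    eps = 1e-8
    # Predictive entropy (binary case) serves as the voxel-wise uncertainty map.
    entropy = -(mean_p * np.log(mean_p + eps)
                + (1.0 - mean_p) * np.log(1.0 - mean_p + eps))
    mask = entropy < threshold  # keep only the confident voxels
    diff = (student_pred - mean_p) ** 2
    return float(np.sum(diff * mask) / (np.sum(mask) + eps))

x = rng.normal(0.0, 2.0, (4, 4))  # toy unlabeled "image"
loss = uncertainty_masked_consistency(sigmoid(x), x)
```

High-entropy (uncertain) voxels are simply excluded from the consistency term, so unreliable pseudo-labels contribute nothing to the gradient.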
Most of these models leverage unlabeled data by forcing the model to be consistent at either the image-/output-level or the feature level (Wang et al., 2020). They ignore class-level information and are therefore class-agnostic. However, class-level information is crucial for improving the distinguishability of the segmentation model.
Contrastive learning has achieved major advances in self-supervised representation learning. Its main idea is to pull positive samples together and push negative samples apart. The sample construction strategy is commonly based on image-level data augmentations: augmentations of the same input are positive samples, and all other data are negative samples (Khosla et al., 2020; Chaitanya et al., 2020). Contrastive learning has shown great potential and achieved state-of-the-art results on downstream visual tasks (He et al., 2020; Chen et al., 2020). However, contrastive representation learning usually operates at the image level, which is too coarse for the semantic segmentation task. To learn more specific representations, Chaitanya et al. (2020) proposed a local version of contrastive learning that encourages the model to learn local representations. Following this idea, Xiang et al. embedded a contrastive loss at the feature level for SSL based on a teacher-student model (Xiang et al., 2021). Although these models construct samples at the local or feature level, class information is still ignored.
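The standard InfoNCE-style contrastive objective underlying these works can be sketched as follows, with hand-made L2-normalized vectors standing in for learned embeddings:

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: pull the positive close, push negatives
    apart. All inputs are assumed to be L2-normalised embedding vectors."""
    logits = np.array([np.dot(anchor, positive)]
                      + [np.dot(anchor, n) for n in negatives])
    logits /= temperature
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    # Cross-entropy with the positive at index 0.
    return float(-np.log(probs[0]))

def normalise(v):
    return v / np.linalg.norm(v)

a = normalise(np.array([1.0, 0.0]))
pos = normalise(np.array([0.9, 0.1]))            # augmentation of the same input
negs = [normalise(np.array([-1.0, 0.2])),        # other data act as negatives
        normalise(np.array([0.0, 1.0]))]
loss = info_nce(a, pos, negs)                    # near zero: pairs already aligned
```

When anchor and positive are nearly aligned and the negatives are far, the loss is close to zero; swapping the roles makes it large.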
Inspired by the idea of contrastive learning (Chen et al., 2020; Chaitanya et al., 2020; Khosla et al., 2020; Chen et al., 2021), we embed a contrastive consistency loss at the class level in an unsupervised manner to enable class-aware SSL. To learn class-level representations, we append a classification model to the segmentation model; it takes the segmentation predictions as input and maps them into a class-vector space. We then treat class-vectors of the same class as intra-class samples and class-vectors of different classes as inter-class samples. Finally, the contrastive consistency loss built on these samples is combined with the supervised segmentation loss to jointly optimize the segmentation framework.
In summary, the main contributions of our model are threefold:
Firstly, we proposed a class-aware semi-supervised LA segmentation framework. Compared with the class-agnostic SSL models, the framework can leverage the class-level information to learn representations from both labeled and unlabeled data to improve the distinguishability of the segmentation model.
Secondly, we proposed a contrastive consistency loss on the class-vector space. Compared with sample construction at the image level, our class-level strategy enables the model to learn more distinguishable representations that benefit the pixel-level segmentation task. Moreover, we use the samples of labeled data as references for the samples of unlabeled data, alleviating the effect of unreliable predictions on unlabeled data.
Thirdly, we verified our framework on the popular left atrial segmentation dataset and performed extensive ablation and comparative experiments. Both quantitative and qualitative results demonstrate the superiority of the proposed framework.
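The class-level sample construction described in the second contribution can be sketched as follows, under simplifying assumptions (global flattening in place of the learned classification head, tiny 2x2 probability maps, hypothetical function names):

```python
import numpy as np

def class_vectors(prob_maps):
    """Flatten each per-class probability map (C, H, W) into an L2-normalised
    class-vector; in the paper a learned classification head does this mapping."""
    v = prob_maps.reshape(prob_maps.shape[0], -1)
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def contrastive_consistency(labeled_vecs, unlabeled_vecs, temperature=0.5):
    """Labeled class-vectors act as references: each unlabeled class-vector is
    pulled toward the labeled vector of the same class (intra-class pair) and
    pushed away from the other classes (inter-class pairs)."""
    n_cls = labeled_vecs.shape[0]
    sims = unlabeled_vecs @ labeled_vecs.T / temperature  # (C, C) similarities
    sims -= sims.max(axis=1, keepdims=True)               # numerical stability
    probs = np.exp(sims) / np.exp(sims).sum(axis=1, keepdims=True)
    idx = np.arange(n_cls)
    return float(-np.mean(np.log(probs[idx, idx])))

# Two classes (background, LA) on tiny 2x2 probability maps.
labeled = class_vectors(np.array([[[0.9, 0.1], [0.1, 0.1]],
                                  [[0.1, 0.9], [0.9, 0.9]]]))
unlabeled = class_vectors(np.array([[[0.8, 0.2], [0.2, 0.1]],
                                    [[0.2, 0.8], [0.8, 0.9]]]))
loss = contrastive_consistency(labeled, unlabeled)
```

Because the labeled samples are the references, a confused unlabeled prediction (e.g., classes swapped) produces a much larger loss than a class-consistent one.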
Section snippets
Materials and methods
In this section, we introduce the details of the proposed semi-supervised LA segmentation framework. We first briefly present the data used in this work. We then describe the framework and its loss functions. Finally, we describe the implementation details and evaluation metrics.
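Among the evaluation metrics commonly reported for LA segmentation, the Dice similarity coefficient is the standard overlap measure. A generic sketch (not the paper's evaluation code):

```python
import numpy as np

def dice(pred, target, eps=1e-8):
    """Dice similarity coefficient between two binary masks: the standard
    overlap metric in the LA segmentation literature."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float((2.0 * inter + eps) / (pred.sum() + target.sum() + eps))

a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [0, 0]])
score = dice(a, b)  # 2*1 / (2 + 1) ≈ 0.667
```

A Dice of 1.0 means perfect overlap; the small epsilon keeps the metric defined when both masks are empty.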
Comparative Experiments and Results
Firstly, we compared our framework with four state-of-the-art LA semi-supervised segmentation works, including the uncertainty-aware mean teacher approach (UA-MT) (Yu et al., 2019), shape-aware adversarial network (SASSNet) (Li et al., 2020), local and global structure-aware entropy regularized mean teacher model (LG-ER-MT) (Hang et al., 2020), and dual-task consistency framework (DTC) (Luo et al., 2021a). Table 1 demonstrates the quantitative comparative result of these methods. The first two
Discussion
In this work, we aimed to develop a class-aware semi-supervised LA segmentation framework on LGE MRI for patients with AF. Extensive experiments demonstrated that learning representations from unlabeled data during training can improve segmentation performance. Current mainstream semi-supervised LA segmentation works focus on consistency regularization strategies to leverage unlabeled data. However, this kind of model usually requires a complex structure, such as a mean teacher with
5. Conclusions
In this study, we constructed a semi-supervised LA segmentation framework consisting of a segmentation model followed by a classification model. The E2DNet takes patches as input and predicts probability maps for each class, and the classification model maps these probability maps into the class-vector space. Finally, the framework is supervised by the segmentation loss on labeled data and self-supervised by the contrastive consistency loss between labeled and unlabeled data. Thanks to the
CRediT authorship contribution statement
Yashu Liu: Conceptualization; Formal analysis; Investigation; Methodology; Validation; Visualization; Writing-original draft. Wei Wang: Funding acquisition; Supervision; Writing-review & editing. Gongning Luo: Funding acquisition; Project administration; Writing-review & editing. Kuanquan Wang: Funding acquisition; Project administration; Supervision; Writing-review & editing. Shuo Li: Investigation; Methodology; Validation; Writing-review & editing.
Funding
This work was supported by the National Natural Science Foundation of China [grant numbers 62001141, 62001144]; and the Science and Technology Innovation Committee of Shenzhen Municipality [grant number JCYJ20210324131800002].
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
We thank the authors of Yu et al. (2019) and Ma et al. (2020); their code repositories are the foundation of our work. We also thank the organizers of the 2018 Atrial Segmentation Challenge for publishing the LA segmentation dataset.
References (39)
- et al. Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. Med. Image Anal. (2019)
- et al. Radiofrequency catheter ablation for atrial fibrillation: approaches and outcomes. Heart, Lung Circ. (2017)
- et al. A global benchmark of algorithms for segmenting the left atrium from late gadolinium-enhanced cardiac magnetic resonance imaging. Med. Image Anal. (2021)
- et al. Simultaneous left atrium anatomy and scar segmentations via deep learning in multiview information with attention. Future Generation Computer Systems (2020)
- et al. Atrial fibrosis helps select the appropriate patient and strategy in catheter ablation of atrial fibrillation: a DE-MRI guided approach. J. Cardiovasc. Electr. (2011)
- Radiofrequency Ablation for Atrial Fibrillation: A Guide for Adults, Comparative Effectiveness Review Summary Guides for Consumers (2009)
- Chaitanya, K., Erdil, E., Karani, N., Konukoglu, E., 2020. Contrastive Learning of Global and Local Features for...
- Chapelle, O., Schölkopf, B., Zien, A., 2006. Semi-Supervised Learning. The MIT Press, Cambridge, Massachusetts, London,...
- et al. Atrial fibrosis by late gadolinium enhancement magnetic resonance imaging and catheter ablation of atrial fibrillation: 5-year follow-up data. J. Am. Heart Assoc. (2018)
- et al. JAS-GAN: generative adversarial network based joint atrium and scar segmentation on unbalanced atrial targets. IEEE J. Biomed. Health (2021)
- Adaptive hierarchical dual consistency for semi-supervised left atrium segmentation on cross-domain data. IEEE T. Med. Imaging
- Measures of the amount of ecologic association between species. Ecology
- Prevalence, age distribution, and gender of patients with atrial fibrillation: analysis and implications. Arch. Intern. Med.