A contrastive consistency semi-supervised left atrium segmentation model

https://doi.org/10.1016/j.compmedimag.2022.102092

Highlights

  • A semi-supervised LA segmentation framework to leverage unlabeled data for automatic and accurate LA segmentation.

  • A contrastive consistency loss function based on the class-vector for learning more distinguishable representations.

  • A classification model based on the class-aware information to improve the performance of segmentation model.

Abstract

Accurate segmentation of the left atrium (LA) is a key step in the clinical diagnosis and therapy of atrial fibrillation. In clinical practice, semantic-level segmentation of the LA consumes considerable time and labor. Although supervised deep learning methods can partly solve this problem, a highly efficient deep learning model requires abundant labeled data that are hard to acquire. Therefore, research on automatic LA segmentation that leverages unlabeled data is highly desirable. In this paper, we propose a semi-supervised LA segmentation framework consisting of a segmentation model and a classification model. The segmentation model takes volumes from both labeled and unlabeled data as input and generates predictions of LAs. A classification model then maps these predictions to class-vectors for each input. Afterward, to leverage the class information, we construct a contrastive consistency loss function based on these class-vectors, so that the model can enlarge the inter-class discrepancy and compact the intra-class similarity to learn more distinguishable representations. Moreover, we set the class-vectors from the labeled data as references for the class-vectors from the unlabeled data to relieve the influence of unreliable predictions on the unlabeled data. Finally, we evaluate our semi-supervised LA segmentation framework on a public LA dataset using four universal metrics and compare it with recent state-of-the-art models. The proposed model achieves the best performance on all metrics, with a Dice score of 89.81 %, Jaccard of 81.64 %, 95 % Hausdorff distance of 7.15 mm, and average surface distance of 1.82 mm. The outstanding performance of the proposed framework shows that it can contribute significantly to assisting the therapy of patients with atrial fibrillation. Code is available at: https://github.com/PerceptionComputingLab/SCC.

Introduction

Atrial fibrillation (AF) is a common heart disease, and its risk increases with age (Feinberg et al., 1995). Patients with AF may experience heart palpitations, breathlessness, low energy, and an increased risk of stroke (Center, 2009). Catheter ablation is a current routine therapy for patients with AF (Kalla et al., 2017). However, the success rate of catheter ablation is unsatisfactory, and AF recurrence and a second ablation often follow (Chelu et al., 2018). According to clinical experience, ablation strategies and AF recurrence are dominated by the degree of atrial fibrosis and the ablation-related scar (Akoum et al., 2011; Wu et al., 2021). Learning the topology of the left atrium (LA) is therefore crucial for evaluating the degree of atrial fibrosis and the ablation-related scar in patients with AF. To improve the success rate of catheter ablation, accurate segmentation of the LA in medical images is a critical process that can help clinicians understand the topology of the LA, assess the risk of AF, and make patient-specific treatment plans. Recently, late gadolinium-enhanced MRI (LGE MRI) has provided promising visualization of myocardial scar tissue by brightening scar signal intensities to differentiate them from healthy tissue, which, however, results in poor boundaries of the LA (Yang et al., 2020). LA segmentation involves the LA cavity, pulmonary veins, LA appendage, etc. These complex structures and the fuzzy-boundary problem make acquiring semantic-level labels of the LA time-consuming and labor-intensive. Therefore, accurate and automatic segmentation of the LA in LGE MRI is a challenging and necessary task.

Over the past few years, deep learning models have achieved impressive improvements on several medical image segmentation tasks (Shen et al., 2017). However, a highly efficient supervised deep learning model requires abundant labeled data, and this need for plenty of densely annotated data slows down the adoption of deep learning in medical image analysis. On the other hand, large amounts of unlabeled data may become available with the development of intelligent medical information technology (Cheplygina et al., 2019). Hence, research on leveraging unlabeled data for medical image analysis is in high demand.

In this work, we focus on semi-supervised learning (SSL) to learn representations from both labeled and unlabeled data for LA segmentation. SSL is an intermediate approach between supervised and unsupervised learning (Chapelle et al., 2006), and its efficiency has been verified in many computer vision tasks (Van Engelen and Hoos, 2020). Typically, SSL attempts to train a model with a limited amount of labeled data and a large amount of unlabeled data. The unlabeled data supervise the model in a self-training manner through consistency regularization, which is based on the assumption that the model's predictions should be consistent under minor perturbations of the same input (Van Engelen and Hoos, 2020); a minimal sketch of this idea follows this paragraph. Notably, the scope of this work is standard SSL, in which the labeled and unlabeled data share the same categories and modality (e.g., MRI).
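To make the consistency assumption concrete, the following is a minimal sketch of consistency regularization for an unlabeled batch, assuming a PyTorch segmentation network `net` that maps a volume batch to per-class logits. The function name and the Gaussian-noise perturbation are illustrative assumptions, not the exact formulation used later in this paper.

```python
import torch
import torch.nn.functional as F

def consistency_loss(net, unlabeled, noise_std=0.1):
    """Penalize disagreement between predictions for an unlabeled volume
    and a slightly perturbed copy of it (additive Gaussian noise here)."""
    with torch.no_grad():
        target = torch.softmax(net(unlabeled), dim=1)      # prediction on the clean input
    perturbed = unlabeled + noise_std * torch.randn_like(unlabeled)
    pred = torch.softmax(net(perturbed), dim=1)            # prediction under perturbation
    return F.mse_loss(pred, target)                        # consistency penalty
```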

Recently, several LA segmentation works have used SSL to relieve the requirement of expensive dense annotations for deep learning models. Most of these SSL models for LA segmentation are based on consistency regularization. Specifically, they either make model predictions consistent between the original unlabeled data and randomly perturbed versions of it (e.g., noise, scaling) or make the model learn distribution consistency between labeled and unlabeled data through adversarial learning. Because the consistency is calculated on predictions for unlabeled data (also called pseudo labels), false predictions can make the training unstable. To mitigate the effect of unreliable predictions on the stability of training, UA-MT leveraged an uncertainty map of the predictions for perturbed data to filter out high-uncertainty regions (Yu et al., 2019); a hedged sketch of this mechanism follows this paragraph. This model adopted the mean-teacher framework (Tarvainen and Valpola, 2017), which requires two networks and multiple forward passes to estimate the uncertainty. To reduce the time and memory cost, Wu et al. designed a network with two decoders and used the discrepancy between the two predictions as uncertainty information to construct an unsupervised loss (Wu et al., 2021b). However, this model only considered consistency at the output level. To embed geometric information into training, Li et al. (2020) took distance map regression as an auxiliary task and adopted a discriminator to distinguish the source of the predicted distance map, learning representations from unlabeled data while learning shape information. Following this work, Luo et al. (2021a, 2021b) extended the concept of consistency to the task level and proposed a dual-task model that jointly optimizes the segmentation task and a distance map regression task to utilize geometric information and unlabeled data at the same time. Most of these models leverage unlabeled data by forcing the model to be consistent at either the image/output level or the feature level (Wang et al., 2020). They ignore class-level information and are therefore class-agnostic approaches, yet class-level information is crucial for improving the distinguishability of the segmentation model.
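As an illustration of the uncertainty-filtering idea described above, the sketch below masks the consistency term with the teacher's predictive entropy so that only confident voxels contribute. The `student`/`teacher` names, the Monte Carlo noise sampling, and the threshold value are assumptions for illustration, not the published UA-MT implementation.

```python
import torch

def uncertainty_masked_consistency(student, teacher, unlabeled,
                                   n_samples=4, noise_std=0.1, threshold=0.75):
    """Consistency loss on unlabeled volumes, keeping only voxels whose
    teacher prediction has entropy below a threshold."""
    with torch.no_grad():
        # Monte Carlo estimate of the teacher prediction under input noise.
        probs = torch.stack([
            torch.softmax(teacher(unlabeled + noise_std * torch.randn_like(unlabeled)), dim=1)
            for _ in range(n_samples)
        ]).mean(dim=0)                                           # (B, C, D, H, W)
        entropy = -(probs * torch.log(probs + 1e-6)).sum(dim=1, keepdim=True)
        mask = (entropy < threshold).float()                     # confident voxels only
    student_probs = torch.softmax(student(unlabeled), dim=1)
    per_voxel = (student_probs - probs) ** 2
    return (mask * per_voxel).sum() / (mask.sum() * per_voxel.shape[1] + 1e-6)
```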

Contrastive learning has achieved major advances in self-supervised representation learning. Its main idea is to pull positive samples together and push negative samples apart. The sample construction strategy is commonly based on data augmentations at the image level: augmentations of the same input are positive samples, and all other data are negative samples (Khosla et al., 2020; Chaitanya et al., 2020). Contrastive learning has shown great potential and achieved state-of-the-art results in downstream visual tasks (He et al., 2020; Chen et al., 2020); a minimal sketch of this image-level loss follows this paragraph. However, the representations learned by contrastive learning are usually at the image level, which is too coarse for the semantic segmentation task. To learn more specific representations, Chaitanya et al. (2020) proposed a local version of contrastive learning that encourages the model to learn local representations. Following this local contrastive learning idea, Xiang et al. embedded a contrastive loss at the feature level for SSL based on a teacher-student model (Xiang et al., 2021). Although these models construct samples at the local or feature level, class information is still ignored.
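For reference, the image-level contrastive idea summarized above can be written as an NT-Xent/InfoNCE-style loss over two augmented views of each image in a batch. This is an illustrative sketch under common conventions, not the exact loss of the cited works.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """z1, z2: (N, D) embeddings of two augmented views of the same N images.
    Each embedding's positive is its other view; all remaining embeddings
    in the batch act as negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                          # (2N, D)
    sim = z @ z.t() / temperature                           # pairwise cosine similarities
    n = z1.shape[0]
    self_mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))         # exclude self-similarity
    targets = torch.arange(2 * n, device=z.device).roll(n)  # index of each positive
    return F.cross_entropy(sim, targets)
```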

Inspired by the idea of contrastive learning (Chen et al., 2020; Chaitanya et al., 2020; Khosla et al., 2020; Chen et al., 2021), we embed a contrastive consistency loss at the class level in an unsupervised manner to enable class-aware SSL. To learn class-level representations, we construct a classification model that follows the segmentation model, takes the segmentation predictions as input, and maps them into a class-vector space. Then, we treat class-vectors of the same class as intra-class samples and class-vectors of different classes as inter-class samples. Finally, the contrastive consistency loss based on these samples is combined with the supervised segmentation loss to jointly optimize the segmentation framework; a hedged sketch is given after this paragraph.
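In the sketch below, a classification head is assumed to produce one class-vector per class for each prediction, and class-vectors from unlabeled data are pulled toward the labeled class-vector of the same class (intra-class) and pushed away from those of other classes (inter-class). The function name, the cosine-similarity form, and the margin are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def class_contrastive_consistency(vec_labeled, vec_unlabeled, margin=0.0):
    """vec_labeled, vec_unlabeled: (C, D) tensors holding one D-dimensional
    class-vector per class (C >= 2). Labeled vectors act as fixed references."""
    ref = F.normalize(vec_labeled.detach(), dim=1)   # references: no gradient to labeled branch
    qry = F.normalize(vec_unlabeled, dim=1)
    sim = qry @ ref.t()                              # (C, C) cosine similarities
    pos = sim.diag()                                 # intra-class (same-class) pairs
    C = sim.shape[0]
    off_diag = ~torch.eye(C, dtype=torch.bool, device=sim.device)
    neg = sim[off_diag].view(C, C - 1)               # inter-class pairs
    # Pull intra-class pairs together and push inter-class pairs apart.
    return (1.0 - pos).mean() + F.relu(neg - margin).mean()
```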

In summary, the main contributions of our model are threefold:

Firstly, we propose a class-aware semi-supervised LA segmentation framework. Compared with class-agnostic SSL models, the framework can leverage class-level information to learn representations from both labeled and unlabeled data and thus improve the distinguishability of the segmentation model.

Secondly, we propose a contrastive consistency loss on the class-vector space. Compared with sample construction strategies at the image level, our class-level sample construction strategy enables the model to learn more distinguishable representations that benefit the pixel-level segmentation task. Moreover, we set the samples of labeled data as references for the samples of unlabeled data to alleviate the effect of unreliable predictions for unlabeled data.

Thirdly, we verified our framework on the popular left atrial segmentation dataset and performed extensive ablation and comparative experiments. Both quantitative and qualitative results demonstrate the superiority of the proposed framework.

Section snippets

Materials and methods

In this section, we introduce the details of the proposed semi-supervised LA segmentation framework. We first briefly present the data involved in this work. Afterward, we describe the details of our framework and the loss functions. Finally, the implementation details and metrics are described.

Comparative Experiments and Results

First, we compared our framework with four state-of-the-art semi-supervised LA segmentation works, including the uncertainty-aware mean teacher approach (UA-MT) (Yu et al., 2019), the shape-aware adversarial network (SASSNet) (Li et al., 2020), the local and global structure-aware entropy regularized mean teacher model (LG-ER-MT) (Hang et al., 2020), and the dual-task consistency framework (DTC) (Luo et al., 2021a). Table 1 shows the quantitative comparison of these methods. The first two

Discussion

In this work, we aimed to develop a class-aware semi-supervised LA segmentation framework on LGE MRI for patients with AF. Extensive experiments demonstrate that learning representations from the unlabeled data during training can improve segmentation performance. Current mainstream semi-supervised LA segmentation works focus on a consistency regularization strategy to leverage the unlabeled data. However, this kind of model usually requires a complex structure, such as a mean teacher with

5. Conclusions

In this study, we constructed a semi-supervised LA segmentation framework with a segmentation model followed by a classification model. The E2DNet takes patches as input and predicts probability maps for each class, and the classification model maps these probability maps into the class-vector space. Finally, the framework is supervised by the segmentation loss on labeled data and self-supervised by the contrastive consistency loss between labeled and unlabeled data. Thanks to the

CRediT authorship contribution statement

Yashu Liu: Conceptualization; Formal analysis; Investigation; Methodology; Validation; Visualization; Writing - original draft. Wei Wang: Funding acquisition; Supervision; Writing - review & editing. Gongning Luo: Funding acquisition; Project administration; Writing - review & editing. Kuanquan Wang: Funding acquisition; Project administration; Supervision; Writing - review & editing. Shuo Li: Investigation; Methodology; Validation; Writing - review & editing.

Funding

This work was supported by the National Natural Science Foundation of China [grant numbers 62001141, 62001144]; and the Science and Technology Innovation Committee of Shenzhen Municipality [grant number JCYJ20210324131800002].

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We thank the authors of Yu et al. (2019) and Ma et al. (2020); their code repositories are the foundation of our work. We also thank the organizers of the 2018 Atrial Segmentation Challenge for publishing the LA segmentation dataset.

References

  • Chen, J., et al., 2022. Adaptive hierarchical dual consistency for semi-supervised left atrium segmentation on cross-domain data. IEEE Trans. Med. Imaging.
  • Chen, T., Kornblith, S., Norouzi, M., Hinton, G., 2020. A Simple Framework for Contrastive Learning of Visual...
  • Chen, T., Luo, C., Li, L., 2021. Intriguing properties of contrastive losses. In: Advances in Neural Information...
  • Dice, L.R., 1945. Measures of the amount of ecologic association between species. Ecology.
  • Feinberg, W.M., et al., 1995. Prevalence, age distribution, and gender of patients with atrial fibrillation: analysis and implications. Arch. Intern. Med.
  • Milletari, F., Navab, N., Ahmadi, S.-A., 2016. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image...
  • Gerig, G., Jomier, M., Chakos, M., 2001. Valmet: A New Validation Tool for Assessing and Improving 3D Object...
  • Hang, W., Feng, W., Liang, S., Yu, L., Wang, Q., Choi, K., Qin, J., 2020. Local and Global Structure-Aware Entropy...
  • He, K., Fan, H., Wu, Y., Xie, S., Girshick, R., 2020. Momentum Contrast for Unsupervised Visual Representation...