Knowledge transfer between brain lesion segmentation tasks with increased model capacity

doi:10.1016/j.compmedimag.2020.101842

Computerized Medical Imaging and Graphics

Volume 88, March 2021, 101842

Highlights

• We address the problem of scarce annotated data for brain lesion segmentation.
• Knowledge transfer between brain lesion segmentation tasks is proposed.
• A fine-tuning strategy with increased model capacity is developed.
• We also introduce a spatially adaptive mechanism for the model capacity increase.
• Our method achieves better performance on ischemic stroke lesion segmentation.

Abstract

Convolutional neural networks (CNNs) have become an increasingly popular tool for brain lesion segmentation in recent years due to their accuracy and efficiency. However, CNN-based brain lesion segmentation generally requires a large amount of annotated training data, which can be costly to obtain for medical imaging. In many scenarios, only a few annotations of brain lesions are available. One common strategy to address the issue of limited annotated data is to transfer knowledge from a different yet relevant source task, where training data is abundant, to the target task of interest. Typically, a model is pretrained for the source task and then fine-tuned with the scarce training data associated with the target task. However, classic fine-tuning tends to make small modifications to the pretrained model, which could hinder its adaptation to the target task. Fine-tuning with increased model capacity has been shown to alleviate this negative impact in image classification problems. In this work, we extend the strategy of fine-tuning with increased model capacity to the problem of brain lesion segmentation, and then develop an advanced version that is better suited to segmentation problems. First, we propose a vanilla strategy of increasing the capacity, where, as in the classification problem, the width of the network is augmented during fine-tuning. Second, because, unlike in image classification, each voxel in a segmentation problem is associated with a labeling result, we further develop a spatially adaptive augmentation strategy during fine-tuning. Specifically, in addition to the vanilla width augmentation, we incorporate a module that computes a spatial map of the contribution of the information given by width augmentation to the final segmentation.
For demonstration, the proposed method was applied to ischemic stroke lesion segmentation, where a model pretrained for brain tumor segmentation was fine-tuned, and the experimental results indicate the benefit of our method.

Introduction

Automated brain lesion segmentation has great potential for guiding clinical diagnosis and treatment strategies. In particular, convolutional neural networks (CNNs) have achieved state-of-the-art segmentation performance. For example, Zhao et al. (2018) have proposed a unified CNN-based framework for brain tumor segmentation with appearance and spatial consistency. In Nair et al. (2020), multiple sclerosis (MS) lesions are segmented with CNNs and lesion-level uncertainties are explored. Kamnitsas et al. (2017) have developed an efficient multi-scale CNN, which achieves remarkable segmentation performance for traumatic brain injuries, brain tumors, and stroke lesions. Kervadec et al. (2019) have developed a CNN integrated with a boundary loss to improve stroke lesion segmentation and white matter hyperintensity (WMH) segmentation. However, the training of CNNs generally requires a large number of annotations, which can be costly to obtain for medical imaging. In many scenarios, only a few annotated images are available. Thus, it is highly desirable to develop CNN-based brain lesion segmentation methods that allow efficient network training with scarce annotated data.

Transfer learning is a commonly used strategy to address the problem of scarce annotated training data for deep networks, where knowledge learned from a source task with abundant annotations is transferred to a target task of interest with scarce annotations. Normally, the source task and the target task are different yet relevant. Fine-tuning is a typical and widely used strategy of transfer learning, where a model is pretrained for the source task and then the knowledge learned for the source task is transferred to the target task by fine-tuning the pretrained model with the limited annotations from the target dataset (Girshick et al., 2014). In the process of fine-tuning, parameters that are specific to the target task, such as the network weights in the last classification layer, are randomly initialized, and they are optimized together with the other pretrained parameters. This fine-tuning strategy has been successfully applied to image classification and segmentation tasks. For example, in Agrawal et al. (2014), the effectiveness of fine-tuning a network is shown, where the network is pretrained with the ImageNet dataset (Deng et al., 2009) and fine-tuned for image classification and object detection on the SUN dataset (Xiao et al., 2010) and PASCAL VOC dataset (Everingham et al., 2010). In Tajbakhsh et al. (2016), based on distinct medical imaging applications that include classification, detection, and segmentation, the impact of fine-tuning on the performance of CNNs is investigated, and the possibility of knowledge transfer from natural images to medical images is demonstrated. In Ghafoorian et al. (2017), fine-tuning is used for domain adaptation between two WMH datasets with different intensity distributions, and it has achieved accurate WMH segmentation with only a few annotated training scans for the target task.
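
The initialization step of classic fine-tuning described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function `init_for_fine_tuning`, the layer names, and the toy weight values are all hypothetical, and each layer is reduced to a flat list of numbers so that the example runs with the Python standard library alone. It only shows which parameters start from pretrained values, which are randomly initialized, and that all of them remain trainable.

```python
import random


def init_for_fine_tuning(pretrained, task_specific, seed=0, scale=0.01):
    """Build the starting point for classic fine-tuning (illustrative only).

    pretrained: dict mapping layer name -> list of weights learned on the source task
    task_specific: names of layers tied to the target task
                   (e.g. the last classification layer)
    Returns (params, trainable_names).
    """
    rng = random.Random(seed)
    params = {}
    for name, weights in pretrained.items():
        if name in task_specific:
            # Target-specific layers start from random values.
            params[name] = [rng.gauss(0.0, scale) for _ in weights]
        else:
            # All other layers start from the source-task solution.
            params[name] = list(weights)
    # Nothing is frozen: random and pretrained parameters are optimized together.
    trainable_names = set(params)
    return params, trainable_names


# Toy source model with made-up weights (hypothetical layer names).
source_model = {
    "encoder": [0.5, -1.2, 0.3],
    "decoder": [0.8, 0.1],
    "classifier": [2.0, -2.0],
}
params, trainable_names = init_for_fine_tuning(
    source_model, task_specific={"classifier"}
)
```

In a real network the same pattern would apply to weight tensors instead of lists, with the optimizer then updating every entry of `params` on the target dataset.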

The transfer learning strategy is also possible for brain lesion segmentation when only limited annotations are available for a task of interest, because there are publicly available datasets that include a decent number of scans with annotated brain lesions (Menze et al., 2015; Maier et al., 2017; Kuijf et al., 2019). However, despite the widespread use of the classic fine-tuning strategy described above, this strategy has been observed to be suboptimal for image classification. Classic fine-tuning tends to make small modifications to the pretrained model, which could hinder its adaptation to the target task (Wang et al., 2017). Instead, an increase in the network capacity during fine-tuning can improve the performance for the target classification task. In Wang et al. (2017), width augmented and depth augmented networks with additional randomly initialized units are proposed. Such an increase in model capacity is inspired by developmental learning in cognitive science (Luciana and Nelson, 2001), and it allows existing units to better adapt to the target task. The strategy of increasing model capacity may also benefit brain lesion segmentation problems, yet this has not been previously explored. In addition, existing strategies of model capacity increase were developed for image classification and may not be optimal for image segmentation problems.

In this paper, to address the problem of scarce annotated training data for brain lesion segmentation, we explore the strategy of fine-tuning with increased model capacity for knowledge transfer between tasks of brain lesion segmentation. In particular, we focus on the situation where only a pretrained model is available and access to the training data for the source task is not guaranteed, which is likely to happen due to privacy or other practical concerns (Burton et al., 2015, Micaelli and Storkey, 2019). In this setting, multi-task learning (Caruana, 1997) or retraining a different model for the source task is not feasible. Since it has been shown in Wang et al. (2017) that width augmentation is superior to depth augmentation, we focus on the development of width augmented networks. First, we develop a vanilla width augmentation strategy motivated by Wang et al. (2017) for brain lesion segmentation, where the number of channels in the penultimate layer—the layer before the final prediction layer—is increased during fine-tuning. Moreover, unlike image classification problems, in brain lesion segmentation each voxel is associated with a labeling result, and different spatial locations may require different contributions of information from the augmented units. Therefore, we further develop a spatially adaptive width augmentation strategy. Specifically, we propose a module that computes a spatial map of the contribution of information from the augmented units that should be used at each voxel. In this way, the spatially adaptive strategy of model capacity increase is more suitable for brain lesion segmentation than the vanilla width augmentation strategy.
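
To make the difference between the two strategies concrete, the following is a minimal sketch written from the description above, not the module used in this work. It assumes that the final prediction layer acts independently at each voxel (as a 1x1x1 convolution would), and that the spatial map is a single sigmoid-gated value per voxel that scales the augmented features. The gate form and the names `segment_voxel`, `gate_w`, and `gate_b` are assumptions made for illustration only.

```python
import math
import random


def linear(weights, bias, x):
    """Per-voxel linear map, i.e. a 1x1x1 convolution evaluated at one voxel."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def segment_voxel(pre, aug, head_w, head_b, gate_w=None, gate_b=0.0):
    """Class scores at one voxel.

    pre: penultimate-layer features from the pretrained channels
    aug: penultimate-layer features from the newly added (augmented) channels
    gate_w, gate_b: hypothetical gate parameters; gate_w=None gives the
                    vanilla width augmentation
    Returns (scores, m), where m is the contribution of `aug` at this voxel.
    """
    if gate_w is None:
        m = 1.0  # vanilla: augmented channels are used fully at every voxel
    else:
        # Assumed gate: one value in (0, 1) computed from all features.
        m = sigmoid(sum(w * v for w, v in zip(gate_w, pre + aug)) + gate_b)
    scores = linear(head_w, head_b, pre + [m * a for a in aug])
    return scores, m


# Toy setup with random weights and features (all sizes are arbitrary).
rng = random.Random(0)
n_pre, n_aug, n_classes = 4, 2, 2
head_w = [[rng.gauss(0.0, 0.1) for _ in range(n_pre + n_aug)] for _ in range(n_classes)]
head_b = [0.0] * n_classes
gate_w = [rng.gauss(0.0, 0.1) for _ in range(n_pre + n_aug)]
voxels = [
    ([rng.random() for _ in range(n_pre)], [rng.random() for _ in range(n_aug)])
    for _ in range(3)
]

# One contribution value per voxel; over a whole volume this forms the spatial map.
contribution_map = [segment_voxel(p, a, head_w, head_b, gate_w)[1] for p, a in voxels]
```

With `gate_w=None` every voxel uses the augmented channels in full, which corresponds to the vanilla strategy; with a gate, the contribution can differ from voxel to voxel. How the map is actually computed and combined with the features is specified in Section 2, which is not reproduced here.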

The contributions of our work are summarized as follows:

• We investigate knowledge transfer with increased model capacity between two different brain lesion segmentation tasks, which has not been explored previously. Compared with the classic fine-tuning method, this strategy enables the segmentation network to better adapt to the target task.
• We further propose a spatially adaptive mechanism of model capacity increase, which takes into consideration the fact that each voxel is associated with a labeling result and thus is more appropriate for segmentation problems.
• We show experimentally the benefit of the proposed strategy of fine-tuning with increased model capacity. Specifically, we applied the proposed method to the segmentation of ischemic stroke lesions, where a model was pretrained using the BraTS dataset (Menze et al., 2015) for brain tumor segmentation and then fine-tuned with annotated stroke lesions. Results indicate that the segmentation accuracy is improved with the proposed method.

The rest of this paper is organized as follows. Section 2 introduces the proposed strategy of knowledge transfer between brain lesion segmentation tasks. Section 3 provides experimental evidence that demonstrates the benefit of the proposed approach. In Section 4, discussion on the results and future work is given. Section 5 gives a summary of this paper.

Section snippets

Methods

In this section, we first formulate the problem of knowledge transfer between brain lesion segmentation tasks and introduce the classic fine-tuning strategy. Then, we describe the proposed method of fine-tuning with increased model capacity, which addresses the limitations of classic fine-tuning. Finally, we present the details of implementation.

Results

In this section, we present the evaluation of the proposed approach. For demonstration, the proposed method was applied to ischemic stroke lesion segmentation, which is the target task. We selected brain tumor segmentation as the source task due to the publicly available tumor annotations in the BraTS dataset (Menze et al., 2015). All experiments were performed on an NVIDIA Tesla V100 GPU.

Discussion

For brain lesion segmentation tasks, the availability of a large number of annotated training scans may not always be guaranteed. In such cases, fine-tuning a model that is pretrained with abundant annotations for a different yet relevant task could substantially improve the segmentation performance. However, since classic fine-tuning may limit the adaptation of the pretrained model to the task of interest, motivated by Wang et al. (2017) we explore model capacity increase using network width

Conclusion

We have explored knowledge transfer between tasks of brain lesion segmentation for improved segmentation performance when a limited number of annotated training scans are available. Specifically, a fine-tuning strategy with network augmentation is developed, where the network width is increased during fine-tuning. In addition, a spatially adaptive mechanism is proposed to allow a more flexible use of the augmented information. Using a model pretrained by publicly available brain tumor

Authors’ contribution

Yanlin Liu: writing – original draft, methodology, software, validation, investigation, formal analysis, visualization. Wenhui Cui: writing – original draft, methodology, validation. Qing Ha: investigation, resources. Xiaoliang Xiong: investigation, resources. Xiangzhu Zeng: conceptualization, investigation, data curation. Chuyang Ye: conceptualization, methodology, writing – review & editing, supervision, project administration, funding acquisition.

Acknowledgment

This work is supported by the Beijing Natural Science Foundation (L192058 and 7192108) and Beijing Institute of Technology Research Fund Program for Young Scholars.

References (43)

K. Kamnitsas et al. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med. Image Anal. (2017)
O. Maier et al. ISLES 2015 – a public evaluation benchmark for ischemic stroke lesion segmentation from multispectral MRI. Med. Image Anal. (2017)
T. Nair et al. Exploring uncertainty measures in deep networks for multiple sclerosis lesion detection and segmentation. Med. Image Anal. (2020)
S.M. Smith et al. Advances in functional and structural MR image analysis and implementation as FSL. NeuroImage (2004)
M.W. Woolrich et al. Bayesian analysis of neuroimaging data in FSL. NeuroImage (2009)
X. Zhao et al. A deep learning model integrating FCNNs and CRFs for brain tumor segmentation. Med. Image Anal. (2018)
P. Agrawal et al. Analyzing the performance of multilayer neural networks for object recognition. European Conference on Computer Vision (2014)
W. Bai et al. Semi-supervised learning for network-based cardiac MR image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention (2017)
B. Billot et al. A learning strategy for contrast-agnostic MRI segmentation.
P.R. Burton et al. Data safe havens in health research and healthcare. Bioinformatics (2015)
R. Caruana. Multitask learning. Mach. Learn. (1997)
X. Chen et al. Catastrophic forgetting meets negative transfer: batch spectral shrinkage for safe transfer learning. Adv. Neural Inform. Process. Syst. (2019)
Ö. Çiçek et al. 3D U-Net: learning dense volumetric segmentation from sparse annotation. International Conference on Medical Image Computing and Computer-Assisted Intervention (2016)
W. Cui et al. Semi-supervised brain lesion segmentation with an adapted mean teacher model. International Conference on Information Processing in Medical Imaging (2019)
J. Deng et al. ImageNet: a large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2009)
M. Everingham et al. The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vision (2010)
M. Ghafoorian et al. Transfer learning for domain adaptation in MRI: application in brain lesion segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention (2017)
R. Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2014)
H. Kervadec et al. Boundary loss for highly unbalanced segmentation. International Conference on Medical Imaging with Deep Learning (2019)
D.P. Kingma et al. Adam: a method for stochastic optimization. (2014)
J. Kirkpatrick et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. U.S.A. (2017)

¹: These authors have contributed equally to this work.
