
1 Introduction

Machine learning methods, especially neural networks, have proven to excel at many image processing and analysis tasks in the medical imaging domain. Yet, their success strongly relies on the availability of large training datasets with high-quality ground truth annotations, e.g. expert segmentations of anatomical/pathological structures. Therefore, generating realistic synthetic data with ground truth labels has become crucial for boosting the performance of neural networks. In [21], the authors use a statistical shape and appearance model to generate brain and heart MRIs with corresponding ground truth displacement fields for the augmentation of a registration network. In [14], GAN-based image generation is used as an augmentation technique for a cell segmentation network.

Other methods enable the generation of pathological data to boost the segmentation or classification of pathologies [8, 24]. In [19], synthetic tumors are simulated on normal-appearance brain MRIs using conditional GANs, leading to an improved performance of a tumor segmentation network. Further approaches generate normal-appearance images from pathological data to enable unsupervised pathology segmentation or dataset balancing [2, 23]. These methods target specific pathological structures; however, the presence of pathologies in medical images also strongly influences image analysis tasks targeting normal anatomical structures [13]. Data with ground truth annotations of both the normal and the abnormal structures would be required to train machine learning methods addressing such tasks. However, most of the large publicly available datasets containing some type of pathology are designed for the segmentation (detection/localization) of the particular pathological structure and thus only contain expert segmentations of the latter, e.g. [15]. On the other hand, datasets containing ground truth annotations of normal anatomy (as used, e.g., for atlas generation) are usually generated from healthy populations [18]. This leads to two main problems: 1) the lack of ground truth annotations to evaluate the accuracy of standard algorithms on pathological data, and 2) the lack of data to train algorithms that target anatomical structures in pathological data.

In this work, we propose a method for generating realistic pathological data with ground truth labels of both anatomical and pathological structures. This is achieved by a GAN-based domain translation approach that retains the topology of a healthy source domain while recreating the appearance of a pathological target domain. This way, the anatomical annotations of the source domain can be directly applied to the generated images. Our method also includes an explicit pathology simulation, such that tumors can be injected into the images in a controlled manner and their ground truth segmentations are available. Still, simply overlaying pathological tissue on the healthy structures as in [19] is not sufficient for a realistic appearance, since brain tumors distort their surrounding tissue due to the tumor mass effect. For this reason, a novel inverse probabilistic approach to simulate tumor-induced deformations is proposed here. The feasibility of the method is demonstrated on brain MRIs by generating images containing brain tumors based on the topology of healthy brains. In our experiments, the generated images serve as training datasets for segmentation and registration neural networks. The results show a significant improvement on both tasks when using our synthetic pathological data and underline the importance of mass effect simulation.

Fig. 1. Method overview. Left: training; right: inference. 1) Learn the pathological appearance; 2) learn inverse tumor deformations; 3) generate a pathological appearance from the topology of a healthy image (pathology injection possible); 4) extract the inverse pathology displacement from a real pathological image; 5) warp a generated tumor image with the inverse predicted displacement.

2 Methods

Generative adversarial networks (GANs) are models able to generate realistic images of high quality [9]. GANs learn to map a random noise vector \(\mathbf {z}\) to an output image y using a generator function \(G: \mathbf {z}\rightarrow y\). To ensure that the generator produces realistic-looking images that cannot be distinguished from real ones, an adversarial discriminator D is included in the training process, aiming to perfectly distinguish between real images and the generator's fakes. Conditional GANs (cGANs) extend regular GANs by additionally conditioning on an observed image x, \(G:\{x,\mathbf {z}\}\rightarrow y\). A widespread application of cGANs is style and domain transfer [22].
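For concreteness, a minimal sketch of this conditional adversarial objective in PyTorch follows; the generator G, discriminator D, and the plain binary cross-entropy formulation are generic placeholders, not the exact networks and losses used in this work.

```python
import torch
import torch.nn.functional as F

def cgan_losses(G, D, x, y_real, z):
    """One step's losses for a standard cGAN: D scores condition/image
    pairs, G tries to make fake pairs indistinguishable from real ones."""
    y_fake = G(x, z)
    real_logits = D(x, y_real)
    fake_logits = D(x, y_fake.detach())          # detach: do not update G here
    d_loss = (
        F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
        + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    )
    g_fake_logits = D(x, y_fake)                 # gradients flow into G
    g_loss = F.binary_cross_entropy_with_logits(g_fake_logits, torch.ones_like(g_fake_logits))
    return d_loss, g_loss
```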

2.1 Topology-Aware Domain Translation with Pathology Integration

To preserve the ground truth annotations of labelled images while mimicking the appearance of unlabelled (e.g. pathological) images, a domain translation method that preserves the topology of the source is required. However, with no paired data available, this is a significant hurdle. Here, we follow the approach proposed in [20] and establish unpaired domain translation by using intensity-independent shapes as the condition for a cGAN while learning to generate the appearance of the target image domain (Fig. 1, step 1). This way, in the inference phase the shape of a source image can be translated to the domain of the training data while preserving the topology of the input (Fig. 1, step 3). Since the shape information needs to be as accessible and easy to generate as possible for any source and target domain, extracted image edges are used, following [20].

When translating from a healthy to a pathological appearance, the training data is (mostly) strictly pathological. Simply extracting the edges of tumors thus leads to pathology hallucination [3], making it impossible to control the presence and position of the tumors. Hence, in this work, the masks of the pathological structures are used as a second condition to the cGAN. This makes it possible to inject pathologies of a desired size, position, and shape into healthy tissue, as well as to generate images without pathological structures.
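A minimal sketch of how the two conditions can be combined, assuming simple channel-wise stacking (the actual input layout of the implementation may differ):

```python
import torch

def build_condition(edge_map: torch.Tensor, tumor_mask: torch.Tensor) -> torch.Tensor:
    """Stack the intensity-independent edge map and the binary tumor mask
    channel-wise as the cGAN condition. An all-zero mask yields a tumor-free
    image; a mask drawn at a chosen position/size/shape injects a tumor."""
    return torch.cat([edge_map, tumor_mask], dim=1)  # (B, 2, H, W[, D])
```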

For the domain translation GAN, a ResNet generator and a fully-convolutional patch discriminator are found to deliver satisfying results. For efficient 3D image generation, the patch-based, memory-efficient approach from [20] is used. Furthermore, in our experience, enriching the binary Canny edge information with gradient magnitude weighting enhances the image quality considerably.
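One plausible implementation of this weighting with scikit-image is sketched below (2D for brevity; the parameter choices are illustrative assumptions):

```python
import numpy as np
from skimage import feature, filters

def weighted_edges(img: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Binary Canny edges enriched with gradient-magnitude weights:
    edge pixels carry the normalized local gradient magnitude
    instead of a plain 0/1 value."""
    edges = feature.canny(img, sigma=sigma)   # binary edge map
    grad = filters.sobel(img)                 # gradient magnitude
    grad = grad / (grad.max() + 1e-8)         # normalize to [0, 1]
    return edges.astype(np.float32) * grad
```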

2.2 Inverse Probabilistic Tissue Deformation Prediction

Brain tumors deform their surrounding tissue due to the tumor mass effect, so simply overlaying a tumor does not yield realistic images. A variety of sophisticated biophysical modelling approaches exist that simulate tumor growth and its mass effect, e.g. [6, 11]; however, they are typically time-consuming and of limited accuracy due to unknown parameters. Still, such distortions can be of crucial importance for downstream image processing methods. Here, a novel idea to simulate pathology-induced tissue deformations is presented. Determining the deformation of a pathological image at a given state is challenging due to the typical lack of corresponding healthy images. Furthermore, different pathologies at different stages influence their surroundings in strongly varying ways. Thus, straightforward learning of the deformation is not feasible.

The key idea of this work is to deduce the inverse tissue deformation of a pathological image by learning the healthy tissue shape distribution. Thus, a network is designed to directly predict a displacement field from a given deformed shape (marked red in Fig. 1). Formally, let \(S_p \in \mathbb {R}^{n}\) be the shape describing a pathological image and \(S_{h_i}\in \mathbb {R}^{n}\) a possible healthy shape, and assume that \(S_p\) is a deformed version of \(S_{h_i}\). Then \(f: \mathbb {R}^n\rightarrow \mathbb {R}^{n \times d}\) is a probabilistic neural network such that \(f(S_p)=\varphi _i\) and \(S_p\circ \varphi _i\approx S_{h_i}\), where \(\varphi _{i}\) is a displacement field and d is the image dimension. A probabilistic U-Net [12] is used to estimate a distribution over the unknown deformation parameters and thus allow for many possible normal shapes corresponding to one deformed shape. In addition, some biophysical properties are encoded in a regularisation function added to the network loss: firstly, we assume constant tissue diffusivity and apply a global diffusion regularizer; secondly, the locality of the mass effect is enforced by a weighted sparsity regularisation whose weights increase with the distance from the tumor center.
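A sketch of how these two regularisation terms could look in PyTorch (2D, with illustrative weights; the exact formulation in the implementation may differ):

```python
import torch

def deformation_regularizer(disp, dist_to_tumor, alpha=1.0, beta=0.1):
    """disp: (B, 2, H, W) displacement field; dist_to_tumor: (B, 1, H, W)
    distance map from the tumor center.
    Term 1: global diffusion regularizer (squared spatial finite
    differences), assuming constant tissue diffusivity.
    Term 2: sparsity with weights growing with the distance from the
    tumor, enforcing the locality of the mass effect."""
    dx = disp[..., :, 1:] - disp[..., :, :-1]
    dy = disp[..., 1:, :] - disp[..., :-1, :]
    diffusion = (dx ** 2).mean() + (dy ** 2).mean()
    sparsity = (dist_to_tumor * disp.abs()).mean()
    return alpha * diffusion + beta * sparsity
```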

More specifically, simplified shape representations that capture deformations are required for training. Here, threshold-based ventricle segmentations of healthy patients are considered and deformed with a naive approach [16]. A probabilistic U-Net is then trained to learn the displacement field that transforms a deformed shape into a normal one (Fig. 1, step 2). The tumor deformation is therefore the inverse of the learned displacement field. To easily invert the displacements and ensure diffeomorphisms, the network learns velocity fields [17], from which the displacement fields are approximated. The diffeomorphic method used here is based on the static velocity fields described in [1]: given the velocities v, the displacement fields \(\varphi \) are calculated as \(\varphi =\exp {(v)}\) and \(\varphi ^{-1}= \exp {(-v)}\). Thus, inverse deformations are computed simply by negating the velocities and approximating \(\exp (\cdot )\) with the scaling-and-squaring algorithm. This strategy enables deducing a possible displacement from the shape of a real pathological image in the inference phase and applying its inverse to the generated domain-translated image (Fig. 1, step 4).
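A minimal NumPy/SciPy sketch of the scaling-and-squaring approximation; the step count and interpolation settings are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def exp_velocity(v: np.ndarray, steps: int = 6) -> np.ndarray:
    """Scaling and squaring: phi = exp(v), for v of shape (d, *spatial).
    The inverse deformation is obtained as exp_velocity(-v)."""
    phi = v / (2 ** steps)                       # scale: phi_0 ~ v / 2^N
    grid = np.indices(v.shape[1:]).astype(float)
    for _ in range(steps):                       # square: phi <- phi o phi
        warped = np.stack([
            map_coordinates(phi[c], grid + phi, order=1, mode='nearest')
            for c in range(phi.shape[0])
        ])
        phi = phi + warped                       # u(x) + u(x + u(x))
    return phi
```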

3 Experiments and Results

3.1 Data

Pathological: 220 3D T2 brain MRIs of patients with high-grade glioblastomas and their ground truth segmentation masks from the BRATS challenge [15]. For evaluation purposes, well-visible anatomical structures (ventricles and caudate nuclei) of 20 randomly selected images are manually segmented and used exclusively for testing.

Healthy: 3D T1 brain MRI scans of healthy patients with labelled anatomical regions from two freely available datasets: 30 images from the IXI dataset [10] and 40 from the LONI LPBA40 dataset [18]. Only the segmentations of the structures available in the pathological test set are considered.

Atlas: T1 and T2 sequences of the ICBM 152 brain atlas [7].

3.2 Experimental Setup

Our experimental setup represents the following scenario: given a set of images containing pathologies and labelled images from some other, healthy patients' domain, generate labelled images resembling the pathological domain in order to, firstly, estimate the performance of algorithms applied to the pathological dataset (exp. 1) and, secondly, train such algorithms (exp. 2 and 3). For all experiments, synthetic data is generated according to the pipeline shown in Fig. 1. In the first step, a GAN is trained on T2 images from the BRATS dataset in order to emphasize the difference between the healthy and pathological domains and to show that domain translation between different MRI acquisition parameters is possible. In the second step, a probabilistic U-Net is trained on ca. 350 deformed shapes of IXI images to learn the deformation from pathological to healthy shapes. The inference in steps 3 and 4 results in three kinds of images that are used as training or testing data in our experiments: domain-translated healthy, domain-translated with an overlaid tumor, and the latter combined with a predicted tumor deformation (ca. 440 images of each type; examples in Fig. 2 and Fig. 3).

Fig. 2. Example of 2D generated images. From left to right: real T1 MRI of a healthy patient depicting the source topology; real T2 MRI containing the target tumor (and appearance); generated T2 images: without tumor; with a tumor and no deformation; with a deformation created by the naive approach; with a deformation predicted by our approach. For better visibility, the edges of the non-deformed segmentations are overlaid.

Fig. 3. Example of 3D generated images: axial, coronal, and sagittal slices. Left: injected tumor without deformation; right: applied tumor-induced deformation. For better visibility, the edges of the non-deformed segmentations are overlaid.

1) Evaluation of algorithm accuracy. Missing ground truth annotations of pathological data inhibit the assessment of image processing methods. Using atlas-based segmentation with a pre-trained 3D registration neural network [25] as an example, we hypothesize that it yields comparable results on our synthetic dataset and on real pathological data, making it possible to estimate the algorithm's accuracy for real data.

2) Image registration. Here, the impact of tumors and the tumor mass effect in the training and testing data is explored for the registration use case. A supervised method is required to underline the necessity of ground truth data from pathological domains; hence, our architecture of choice is FlowNet [4], which predicts a registration displacement field from an input image pair. The ground-truth displacements are generated by pairwise registration [5] of the LPBA40 data and directly transferred to the domain-translated images. Since the registration of two pathological images is infeasible, a fixed image with a healthy appearance is chosen whenever the moving image contains tumors. Predicted tumor-induced deformations are directly integrated into the ground-truth displacement where applicable. In the test phase, an image-to-atlas registration is established, registering the test BRATS images to the T2 atlas. As the FlowNet architecture is extremely memory-consuming, this experiment is carried out on 2D image slices only.
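Integrating a predicted tumor-induced displacement into a ground-truth displacement amounts to a composition of displacement fields; a sketch under the backward-warping convention (the composition order depends on the warping convention used):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def compose_displacements(phi_reg: np.ndarray, phi_tumor: np.ndarray) -> np.ndarray:
    """Compose the registration displacement with the tumor-induced one:
    (phi_reg o phi_tumor)(x) = phi_tumor(x) + phi_reg(x + phi_tumor(x)).
    Both fields have shape (d, *spatial)."""
    grid = np.indices(phi_tumor.shape[1:]).astype(float)
    warped_reg = np.stack([
        map_coordinates(phi_reg[c], grid + phi_tumor, order=1, mode='nearest')
        for c in range(phi_reg.shape[0])
    ])
    return phi_tumor + warped_reg
```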

3) Semantic segmentation. In this experiment, the influence of pathologies on the semantic segmentation of 2D and 3D data is explored. For this purpose, a 2D and a 3D U-Net are trained on the different synthetic datasets. The training is strictly supervised: the labels of the original healthy IXI data are directly used as ground truth for the domain-translated images. When deformations are used, the labels are transformed accordingly, and anatomical labels overlaid by tumor tissue are excluded.
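A sketch of this label transfer, assuming nearest-neighbor warping and a binary tumor mask:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_labels(labels: np.ndarray, phi: np.ndarray, tumor_mask: np.ndarray) -> np.ndarray:
    """Warp an anatomical label map with displacement phi (order=0, i.e.
    nearest-neighbor, keeps labels discrete) and discard labels overlaid
    by tumor tissue, where no anatomical ground truth exists."""
    grid = np.indices(labels.shape).astype(float)
    warped = map_coordinates(labels, grid + phi, order=0, mode='nearest')
    warped[tumor_mask > 0] = 0
    return warped
```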

Additionally, for experiments 2) and 3), the influence of random elastic data augmentation on the datasets containing tumors is explored.
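Random elastic augmentation here refers to the classic scheme of warping images with smoothed random displacement fields; a minimal sketch with illustrative parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def random_elastic(img: np.ndarray, alpha: float = 10.0, sigma: float = 4.0,
                   rng=None) -> np.ndarray:
    """Random elastic augmentation: Gaussian-smoothed random noise scaled
    by alpha yields a smooth displacement field applied to the image."""
    rng = np.random.default_rng() if rng is None else rng
    phi = np.stack([
        gaussian_filter(rng.uniform(-1, 1, img.shape), sigma) * alpha
        for _ in range(img.ndim)
    ])
    grid = np.indices(img.shape).astype(float)
    return map_coordinates(img, grid + phi, order=1, mode='nearest')
```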

Table 1. Results of the 3D segmentation and 2D registration experiments, given as mean Dice coefficients. Training and testing datasets: real healthy (real IXI/LPBA40 T1 MRIs of healthy patients); real tumor (real BRATS images with manual anatomical annotations); gen. healthy (generated T2 MRIs with no tumors); gen. tumor (generated T2 MRIs with tumors); gen. tumor def. (generated T2 MRIs with tumors and predicted tumor-induced deformations); subscript \(^A\) indicates random elastic data augmentation. Italic numbers correspond to the baseline; bold marks the best and statistically significant (\(p<0.005\) in a two-tailed paired t-test) result for each experiment.

3.3 Results

Examples of the generated images are shown in Fig. 2 and Fig. 3 for 2D and 3D, respectively. Consistent with [20], the resulting 2D and 3D images are of sufficiently realistic appearance and high quality. The first experiment shows that the images can be used to assess a registration network pre-trained on healthy T1 images. When used for atlas registration on real pathological T2 data, the registration yields mean Dice values of \(0.43(\pm 0.14)\)/\(0.40(\pm 0.20)\) (ventricles/caudate nuclei), and for the generated deformed tumor images \(0.47(\pm 0.19)\)/\(0.47(\pm 0.21)\). These results are comparable and show no significant difference in an unpaired t-test (\(p>0.1\)), which indicates the plausibility of the generated data and suggests its suitability for evaluating the accuracy of pre-trained neural networks. When testing the registration network on the generated T2 tumor images without deformations, mean Dice values of \(0.62(\pm 0.11)\)/\(0.63(\pm 0.10)\) are achieved (see supplementary). This emphasizes the importance of integrating tumor-induced deformations into synthetic images.
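Such significance checks reduce to standard statistical tests; a sketch with placeholder Dice scores (not the reported data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical per-case Dice scores, for illustration only.
dice_real = rng.normal(0.43, 0.14, size=20)
dice_generated = rng.normal(0.47, 0.19, size=20)

# Unpaired two-tailed t-test: real vs. generated pathological data (exp. 1).
t, p = stats.ttest_ind(dice_real, dice_generated)
print(f"t={t:.2f}, p={p:.3f}, comparable={p > 0.1}")

# Paired two-tailed t-test across setups on the same test cases (Table 1).
t2, p2 = stats.ttest_rel(dice_real, dice_generated)
print(f"t={t2:.2f}, p={p2:.3f}, significant={p2 < 0.005}")
```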

Fig. 4. Example segmentations of 3D pathological images. Columns 1-2: ground truth, whole image and zoomed to the tumor region; columns 3-7: segmentations of the U-Net trained with different types of generated images (the best setup is marked bold). Note that when tumors do not directly impact the target tissue, the influence of the different training types is marginal (second row).

For experiments 2) and 3), each training setup is run ten times with different random seeds to ensure the stability of the results. All methods are evaluated in terms of Dice overlaps of the ventricles and caudate nuclei, averaged over all 20 test BRATS images and seeds. The segmentation and registration results are shown in Table 1. The 2D segmentation results are analogous to the 3D results, with the best setup yielding \(0.71 (\pm 0.12)/0.64 (\pm 0.24)\). Deformations created with the naive approach [16] do not deliver significantly better results than using no deformations (see supplementary). The results clearly show that generated domain-translated images with pathologies and their corresponding deformations improve training considerably. Adding tumors to the generated images enhances the segmentation of smaller structures, yet impairs the segmentation of the larger ventricles. This is due to subtle tumors in the test images that do not impact the segmentation of the ventricles (Fig. 4); it might therefore be useful to mix training datasets with and without tumors. However, extending the training dataset with our proposed deformation method yields significantly better results in all experiments. This shows that pathological tissue deformations are crucial for both image registration and segmentation. Adding random elastic augmentation significantly enhances the segmentation results. In contrast, this type of augmentation strongly disrupts the training process of the registration network, which is consistent with [21]. Overall, the best setups for registration and segmentation deliver Dice coefficients in a reasonable range compared to the baseline of training and testing on the real healthy T1 MRIs.

4 Discussion and Conclusion

In this work, we propose a method to generate images from pathological domains with ground truth anatomical annotations directly transferred from healthy patients. Our approach includes a tumor synthesis technique and a novel idea to simulate the tumor mass effect. The performed experiments emphasize the necessity of generating ground truth pathological data, and we show that the synthesized data is suitable for training neural networks applied to real pathological data. Moreover, a significant improvement of the networks' performance is achieved by additionally considering tumor-induced deformations. Future work could improve the method by mixing healthy and pathological images in a single dataset and by a more elaborate assessment of the tissue deformations.