1 Introduction

Prostate cancer is the most frequently diagnosed cancer in American men, with 181,000 new cases in 2016 resulting in more than 26,000 deaths [10]. Early diagnosis has resulted in improved long-term survival but depends on invasive multi-core biopsies done under trans-rectal ultrasound (TRUS) imaging guidance. Recently, multi-parametric magnetic resonance imaging (MRI) has shown promising results as a non-invasive alternative for prostate cancer detection and classification [3].

Two specific tasks are required in the examination of multi-parametric MRI (mpMRI) images. First, cancer regions must be detected, and second, these suspicious areas must be classified as either benign or otherwise actionable, where biopsy is recommended for further tissue interrogation. This approach could potentially reduce the overall number of biopsies. The comprehensive assessment of mpMRI, which may consist of 8 or more different volumetric datasets, can be tedious for daily clinical readings. Furthermore, subtle and collective signatures of cancerous lesions expressed within mpMRI are difficult to detect consistently even for experienced radiologists, a challenge that is amplified for small lesions. A further challenge is characterizing these lesions and relating the results to biopsy findings with Gleason scores. There have been a number of attempts to provide an automatic solution that quantifies these contrast changes and uses them to detect and classify suspicious lesions; Chung et al. provide a good overview of the challenges [3]. The majority of proposed methods are based on various quantifiable image features, which are hypothesized to be important for the detection and classification tasks. For example, in [9], level-set methods were used to segment the prostate and a set of features was extracted from multiple diffusion-weighted images (DWI). These features were then fed into a stacked auto-encoder and finally classified into benign and malignant classes via logistic regression. The result was a 100% correct classification rate on data from 53 patients. Also, in [3], a deep learning network composed of stochastically realized receptive fields ending in fully connected sequencing layers was proposed, achieving a sensitivity of 64.00% and specificity of 82.48% on a dataset from 20 patients.
We present a novel approach for prostate cancer classification based on image-to-image networks, inspired by [6]. In this work, the multi-parametric images are fed directly into the network and no explicit feature-extraction preprocessing step is required. We evaluate the classification performance of multiple image-to-image architectures and input channel variations on a 202-patient dataset.

2 Methods

We formulate the task as a multi-object segmentation problem. In this approach, the “segmentation” is in fact a response map: unlike a binary segmentation, the fractional response peaks at the tumor location and falls off as a Gaussian in its vicinity. Two independent response channels are considered to accommodate both benign and malignant lesions’ characteristics. This approach has multiple advantages. First, the spatial uncertainty of the marked lesion is inherently modeled through the choice of the Gaussian standard deviation. Second, there is no need to choose a patch size to interrogate the neighborhood around the lesion. The implementation is a type of encoder-decoder architecture [2]. However, instead of an anticipated binary segmentation output, local maxima within an output response map suggest the tumors’ locations. Additional analysis comparing the intensity of the response maps from different output channels (e.g., benign and malignant) at particular locations is done to further characterize the detected lesion. This architecture naturally allows multiple tumors and multiple classes of tumors to be detected and characterized within a series of multi-parametric input images. Depending on the availability of ground truth, one may simply add the tumor boundaries and extend the approach to not only detect and characterize but also segment lesions. In the following, we make use of 2D, as opposed to 3D, convolutional architectures, which have fewer parameters and allow for additional training data, with superior results in our experience.
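The two-channel Gaussian response target described above can be sketched as follows. This is a minimal illustration only: the grid size, the σ value, and the rule of taking the element-wise maximum where lesions overlap are our assumptions, not specified by the paper.

```python
import numpy as np

def gaussian_response_map(shape, lesions, sigma_px):
    """Build a two-channel target map: channel 0 = benign, channel 1 = malignant.

    `lesions` is a list of (row, col, channel) lesion-centre annotations.
    Each lesion contributes a 2D Gaussian that peaks (value 1.0) at its centre.
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    target = np.zeros((2,) + shape, dtype=np.float32)
    for r0, c0, ch in lesions:
        g = np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2) / (2.0 * sigma_px ** 2))
        # Where lesions overlap, keep the stronger response
        target[ch] = np.maximum(target[ch], g)
    return target

# Example: one malignant lesion (channel 1) at (32, 40) on a 64x64 slice
t = gaussian_response_map((64, 64), [(32, 40, 1)], sigma_px=3.3)
```

The fractional (rather than binary) target is what lets a mean-squared-error regression loss drive the network toward localized peaks instead of hard segmentation masks.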

2.1 Data Preparation

Data have been collected from patients with a suspicion of prostate cancer. Overall, we processed 202 multi-parametric prostate MRI (mpMRI) datasets from the ProstateX challenge database [7]. The patients were all imaged using 3T MRI scanners without an endo-rectal coil. The scan protocol included axial T2-weighted 2D turbo spin-echo images providing an anatomical overview of the prostate and the zonal structure. Furthermore, diffusion-weighted imaging (DWI), depicting water molecule diffusion variations due to microscopic changes in tissue structures, was included. DWI is acquired using different diffusion weightings (b-values) that depict tissue with increased cellularity and thus restricted diffusion. The apparent diffusion coefficient (ADC) is derived from the signal intensity changes across at least two b-values and provides a quantitative map of the degree of water molecule diffusion. It is believed that tumors have restricted diffusion and hence appear hypointense on the ADC map. Finally, a calculated b-value image at b = 1400 s/mm\(^2\) was extrapolated. Additionally, the data include dynamic contrast-enhanced (DCE) images. These images consist of a series of T1-weighted acquisitions taken during intravenous gadolinium-based contrast agent injection. It is known that prostate cancer tissue often induces some level of angiogenesis, which is followed by increased vascular permeability as compared to normal prostatic tissue. Pharmacokinetic modeling was applied to the DCE-MRI series in order to estimate the K-trans parameter of the Tofts model as an indicator of tissue permeability [12].
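The ADC derivation from two b-values follows the standard mono-exponential diffusion model \(S(b) = S_0 e^{-b \cdot \mathrm{ADC}}\). The sketch below illustrates this; the particular b-values (50 and 800 s/mm\(^2\)) and signal intensities are our own illustrative choices, not those of the ProstateX protocol.

```python
import numpy as np

def adc_map(s_low, s_high, b_low, b_high):
    """Derive the ADC map (mm^2/s) from two DWI acquisitions.

    Mono-exponential model: S(b) = S0 * exp(-b * ADC), so
    ADC = ln(S(b_low) / S(b_high)) / (b_high - b_low).
    """
    eps = 1e-8  # guard against log/division by zero in background voxels
    return np.log((s_low + eps) / (s_high + eps)) / (b_high - b_low)

# Restricted diffusion (tumour) gives a smaller ADC, hence a darker
# (hypointense) region on the ADC map, as described in the text.
s0 = np.array([1000.0, 1000.0])
adc_true = np.array([0.8e-3, 2.0e-3])      # tumour vs. normal tissue (mm^2/s)
s_b50 = s0 * np.exp(-50 * adc_true)
s_b800 = s0 * np.exp(-800 * adc_true)
adc = adc_map(s_b50, s_b800, 50, 800)

# A "calculated" high-b image can then be extrapolated from the fitted model:
s_b1400 = s0 * np.exp(-1400 * adc)
```

The same extrapolation step is what produces the synthetic b = 1400 s/mm\(^2\) image mentioned above without acquiring it directly.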

For annotation, the lesions’ center locations and corresponding classifications were available [7]. The two class labels were clinically relevant cancer (Gleason score >6) and non-relevant (Gleason score \(\le \)6). We use a cascaded 3D elastic registration as a first preprocessing step to compensate for any motion that may have occurred between acquisitions [13]. In order to increase robustness, a pairwise registration between the T2-weighted image and the corresponding low-b diffusion image, as the representative of the DWI set, is performed. We then apply the computed deformation to compensate for motion in both the ADC map and the high-b diffusion image. Similarly, we perform a pairwise registration between the T2-weighted image and the late contrast-enhanced image as the representative of the DCE set. Additionally, an 80 mm \(\times \) 80 mm region of interest (ROI) mask was applied on each slice to ensure that only the prostate and surrounding areas were considered. After intra-patient registration, all images are reformatted into a T2-weighted 100 mm \(\times \) 100 mm \(\times \) 60 mm image grid, which corresponds to roughly 200 \(\times \) 200 pixel 2D slices. Two ground-truth maps corresponding to benign and malignant tumor labels were created for each dataset by placing a Gaussian distribution (with \(3\sigma \) of 10 mm) at each lesion point in 2D, as shown in Fig. 1. The Gaussian distribution was also propagated through-plane with a standard deviation adjusted to the acquisition slice thickness. Only slices containing tumor labels were selected for processing. This final set totals 824 slices from the 202 patient cases.

Fig. 1.

Sample slices from T2-weighted images with marked tumor point locations shown as Gaussian responses. The green and red regions correspond to benign and malignant tumors, respectively.

2.2 Network Design and Training

We designed three convolutional-deconvolutional image-to-image networks (Models 0, 1, and 2) with increasing complexity in terms of number of features and layers as shown in Fig. 2. Compared to the 13 convolutional and 13 deconvolutional layers of SegNet [2], these models contain fewer layers and features to avoid over-fitting. Each model’s output consists of two channels signifying the malignant and benign tumor categories. Batch normalization was used after each convolutional layer during training. A \(256\times 256\) input image was used. In addition to the three networks, the following modifications were also evaluated:

Fig. 2.

The three network architectures evaluated in the proposed method.

  • Input image combinations (T2, ADC, high b-value, K-trans)

  • Activation function

    • Rectified Linear Unit (ReLU)

    • Leaky ReLU (\(\alpha =0.01\))

    • Very Leaky ReLU (\(\alpha =0.3\)) - improved classification performance in [1]

  • Adding skip-connections [4]

  • Training data augmentation (Gaussian noise addition, rotation, shifting)
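The activation variants listed above differ only in the slope \(\alpha \) applied to negative inputs; a minimal sketch (the unified helper function below is ours, not from the paper):

```python
import numpy as np

def leaky_relu(x, alpha=0.0):
    """alpha=0 -> standard ReLU; alpha=0.01 -> leaky; alpha=0.3 -> very leaky."""
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
relu_out = leaky_relu(x)                # [ 0.0,   0.0,   0.0, 1.5]
leaky_out = leaky_relu(x, alpha=0.01)   # [-0.02, -0.005, 0.0, 1.5]
very_leaky = leaky_relu(x, alpha=0.3)   # [-0.6,  -0.15,  0.0, 1.5]
```

A non-zero \(\alpha \) lets gradients flow through negative activations, which is the motivation behind the very leaky variant evaluated in [1].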

All networks were trained using Theano [11] with batch gradient descent (batch size = 10). A mean-squared error loss function computed within a mask of the original image slice size was used. Training was performed for a maximum of 100 epochs, and the model with minimal loss on a small validation set was selected. A constant learning rate of 0.005 was used throughout. In order to assess sampling variability, we performed 5-fold cross-validation bootstrapped five times with different sets of data chosen randomly for training and testing; hence 20% of the data was used for testing in each fold. Using this approach, we obtain a range of results and can compute a largely sampling-independent average performance. As a performance indicator, we use the area under the curve (AUC) for each classification run. We also ensure that no slices from a single patient fall into both the training/validation and test datasets. Classification was determined by the intensity ratios between the output channels at the given location.
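The masked loss and the channel-ratio decision rule described above might be sketched as follows. The window radius and the exact score definition are illustrative assumptions; the paper does not specify them.

```python
import numpy as np

def masked_mse(pred, target, mask):
    """Mean-squared error computed only inside the original slice mask."""
    return float(((pred - target) ** 2)[mask].mean())

def lesion_score(response, r, c, radius=2):
    """Compare benign (channel 0) and malignant (channel 1) responses in a
    small window around the annotated lesion; higher score -> more malignant."""
    win = response[:, r - radius:r + radius + 1, c - radius:c + radius + 1]
    benign, malignant = win[0].max(), win[1].max()
    return malignant / (benign + malignant + 1e-8)

# Toy two-channel response map with a strong malignant peak at (5, 5)
resp = np.zeros((2, 11, 11))
resp[1, 5, 5], resp[0, 5, 5] = 0.9, 0.1
score = lesion_score(resp, 5, 5)  # ~0.9, i.e. classified as malignant
```

Sweeping a threshold over such scores across all annotated lesions yields the ROC curves from which the reported AUCs are computed.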

3 Results

Our first aim is to assess performance in lesion characterization. The second aim is to better understand the contribution of the different mpMRI contrasts to the overall characterization performance. It is desirable to find a compromise between the acquisition length (smaller number of channels) and the performance. The performance results using varying numbers of multi-parametric channels are shown in Table 1 and plotted in Fig. 3. It is clear that the aggregate of all modalities produced the best result across all models. However, it is clinically desirable to eliminate the dynamic sequence scan, both to save time and to avoid contrast agent injection. The performance in this case may still provide a clinically acceptable negative predictive value (NPV) to rule out malignant lesions and avoid invasive biopsies (by selecting an appropriate operating point on the ROC curve). This hypothesis must be further investigated and validated. Model 1 produces the best average AUC with the least variability, while Model 0 achieved the best single-fold AUC among all the folds tested. Sample ROC curves are shown in Figs. 4 and 5.
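The rule-out use case mentioned above amounts to picking a high-sensitivity operating point on the ROC curve and checking the NPV there. A minimal sketch, with entirely made-up scores and labels for illustration:

```python
import numpy as np

def npv_at_threshold(scores, labels, threshold):
    """Negative predictive value: of the lesions scored below the threshold
    (i.e. called negative), the fraction that are truly benign (label 0)."""
    negatives = scores < threshold
    if not negatives.any():
        return float("nan")
    return float((labels[negatives] == 0).mean())

# Hypothetical classifier scores; labels: 1 = clinically relevant cancer
scores = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
labels = np.array([0,   0,   0,   1,   1,   1])
npv = npv_at_threshold(scores, labels, threshold=0.5)  # -> 1.0 here
```

An operating point with a sufficiently high NPV would allow lesions below the threshold to skip biopsy, which is the clinical motivation discussed in the text.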

Table 1. Average AUC results of the three networks used with different combinations of input channels without data augmentation.
Fig. 3.

AUC results of the three networks with differing input modalities.

Fig. 4.

ROC Curve of Model 1 using all four MRI modalities in our dataset.

Fig. 5.

ROC Curve of Model 1 using skip connections and training data augmentation.

Results based on all four input channels with variations adding skip connections or changing the activation function are shown in Table 2 and Fig. 6. Using the leaky and very leaky ReLUs resulted in inferior performance compared to standard ReLUs. However, skip connections improved performance for the most complex model, with an average AUC of 83.3% and reduced variability across folds.

Table 2. Average AUC results of the three networks used with architecture changes.
Fig. 6.

Average AUC results of the three networks with skip connections and modified activation functions. In this case, the leaky and very leaky ReLUs had \(\alpha \) parameters of 0.01 and 0.3, respectively.

Training data augmentation by translation and rotation coupled with Gaussian noise addition resulted in a consistent improvement. An average AUC of 95% was reached when we applied the data augmentation along with skip connections to Model 1. We also coupled the image-to-image localization and classification network with a discriminator that aims to distinguish real from generated probability maps. The resulting network drives the evolution of the image-to-image localization and classification network by the weighted sum of the regression cost and the binary classification cost stemming from the use of the discriminator. The training is conducted using approaches recently provided in the generative adversarial networks (GAN) literature [5, 8]. In our limited experiments, we found that this adversarial setup yielded performance similar to the image-to-image network alone. Use of different adversarial approaches is part of our future work (Fig. 6).

4 Conclusions and Future Work

We have presented a convolutional image-to-image deep learning pipeline for performing classification without the fully connected layers used in conventional classification pipelines. The same network could also be used for localization of suspicious regions by examining the responses across different channels. We have experimented and shown results by varying input channels and network parameters to arrive at a recommended architecture for optimal performance. An average AUC of 83.4% for classification without data augmentation is promising, and improvements are possible, for instance, by inclusion of a prostate segmentation region. This would allow the network to focus solely on regions within the prostate and not be penalized for responses outside this region. We also plan to develop and evaluate localization of tumors from the individual channel responses.

Although optimal classification was achieved using all four input images, in practice it is undesirable to inject patients with contrast agent to obtain K-trans and DCE images. Therefore, we hypothesize that methods developed without the use of K-trans or DCE images could find greater utility in an early-diagnosis scenario as a gatekeeper to a more invasive biopsy approach.