3TP-CNN: Radiomics and Deep Learning for Lesions Classification in DCE-MRI

Gravina, Michela; Marrone, Stefano; Piantadosi, Gabriele; Sansone, Mario; Sansone, Carlo

doi:10.1007/978-3-030-30645-8_60

Conference paper
First Online: 02 September 2019

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11752)

Abstract

Dynamic Contrast Enhanced-Magnetic Resonance Imaging (DCE-MRI) is a diagnostic method for the detection and diagnosis of breast cancer. Because it requires the acquisition of images before and after the injection of a paramagnetic contrast agent, it produces a large amount of data that can hardly be analyzed without a Computer Aided Diagnosis (CAD) system, whose aim is to support radiologists in the interpretation of medical images. One of the major issues in developing a CAD for breast DCE-MRI is lesion diagnosis, namely the classification of lesioned tissues according to tumour aggressiveness. Several studies have explored the applicability of Deep Learning (DL) approaches to automatic breast lesion classification. However, we argue that solutions relying only on DL are less effective than they could be, since the experience accumulated in the radiomics field should also be taken into account to better exploit the dynamics of the contrast agent and its effect on the acquired images. To this aim, we propose an approach that exploits the well-known Three Time Points (3TP) idea to select the specific time points that best highlight the tissues under analysis. Our findings show that promising results can then be obtained by using transfer learning, resulting in an approach that outperforms both the classical (non-deep) and some very recent deep proposals.

1 Introduction

The worldwide number of breast cancer cases has increased significantly since the 1970s. This phenomenon is partly due to modern lifestyles, with recent studies showing that tumours are mostly an environmental rather than a genetic disease, resulting from factors such as pollution, smoking, nutrition, radiation, stress, and traumas. Tumours grow and expand without evident signs, producing symptoms only at an advanced stage of the disease. For this reason, early detection is the key factor in improving breast neoplasm prognosis.

In recent years, Dynamic Contrast Enhanced-Magnetic Resonance Imaging (DCE-MRI) has demonstrated great potential in screening different tumour tissues, gaining increasing popularity as an important complementary diagnostic methodology for the early detection of breast cancer [9]. It involves the intravenous injection of a contrast agent (CA) in order to highlight both the physiological and morphological characteristics of the tissue. The contrast agent is a paramagnetic or super-paramagnetic substance (such as a Gadolinium-based one) characterized by a specific absorption time. Because it spreads at a speed that depends on the tissue vascularization, it makes the damaged tissues stand out from the surrounding healthy ones.

A DCE-MRI study consists of MRI images taken before (pre-contrast series) and after (post-contrast series) the intravenous injection of the contrast agent, involving the acquisition of 3D volumes at different times, thus resulting in a 4D volume (Fig. 1a) with 3 spatial dimensions (x, y, z) and one temporal dimension. Each DCE-MRI voxel (a pixel in three-dimensional space) is associated with a Time Intensity Curve (TIC) which reflects the absorption and the release of the contrast agent (Fig. 1b), following the vascularisation characteristics of the tissue under analysis [14].
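The relationship between the 4D volume and the Time Intensity Curve can be illustrated with a minimal sketch. The array layout (time, z, y, x), the random toy data and the helper names `time_intensity_curve` and `mean_roi_curve` are assumptions made for illustration only; averaging the curves over a region is likewise an illustrative choice and is not described in this paper.

```python
import numpy as np

# Hypothetical toy study: 9 time points (1 pre-contrast + 8 post-contrast),
# stored with the assumed axis order (t, z, y, x). Real studies are far larger.
rng = np.random.default_rng(0)
volume_4d = rng.random((9, 4, 8, 8))

def time_intensity_curve(volume_4d, z, y, x):
    """Return the intensity of a single voxel at every acquired time point."""
    return volume_4d[:, z, y, x]

def mean_roi_curve(volume_4d, roi_mask):
    """Average the TICs of all voxels selected by a boolean (z, y, x) mask."""
    return volume_4d[:, roi_mask].mean(axis=1)

# TIC of one voxel and mean TIC over a small cuboid region
tic = time_intensity_curve(volume_4d, z=2, y=3, x=5)
roi = np.zeros((4, 8, 8), dtype=bool)
roi[1:3, 2:5, 2:5] = True
roi_tic = mean_roi_curve(volume_4d, roi)
```

Each curve has one value per acquisition, so both `tic` and `roi_tic` have length 9 in this toy example.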

Although a visual assessment of the lesion malignancy could be performed by analyzing the TIC, lesion diagnosis is a hard and time-consuming task because (i) real curves are much noisier than the illustrative ones and (ii) the amount of data involved is so large that it can hardly be inspected without the use of a Computer Aided Detection/Diagnosis (CAD) system. Focusing on the automatic CAD system, lesion diagnosis can be considered as the binary classification task of distinguishing between benign and malignant tumours.

Performing lesion diagnosis by means of a classifier model requires extracting the features that best suit the task and, to this aim, new hand-crafted features are continuously proposed by domain experts. In recent years, Deep Learning (DL) based approaches have gained popularity in many pattern recognition tasks, with Convolutional Neural Networks (CNNs) performing particularly well on images. CNNs are artificial neural networks consisting of several convolutional layers stacked to form a deep architecture that automatically learns a compact hierarchical representation of the input. Although this characteristic suggests exploring CNNs for biomedical image processing as well, in line with the radiomics point of view (medical images are more than pictures [5]) our idea is that the underlying physiological characteristics of DCE-MR images should also be taken into account in order to effectively exploit all the available information.

In 1997 the study conducted by Degani [2] showed that it is possible to effectively analyze DCE-MRI data considering only volumes at very specific time points (3TP method), bringing a huge contribution to research in the radiomics field. Despite this, works in the literature do not seem to consider this methodology, with authors mostly using deep learning approaches to extract the features that best contribute to solving the task.

In this work, we aim to combine the radiomics methodology and CNNs, in order to exploit both medical experience and deep learning capabilities for the automatic breast lesion classification task in DCE-MRI. To this aim we propose 3TP-CNN, a methodology that guides the choice of the DCE-MRI volumes to feed to CNNs, exploring breast DCE-MRI as a case study. Finally, since the amount of available training data is usually small, we propose to fine-tune a pre-trained CNN after a replication-based data augmentation stage, which has proved effective when dealing with biomedical images.

The rest of the paper is organized as follows: Sect. 2 introduces the proposed approach, the dataset used and the experimental setup; Sect. 3 reports the obtained results, comparing them with those obtained by some competitors; finally, Sect. 4 discusses those results and provides some conclusions.

2 3TP-CNN for Lesions Diagnosis

Lesion diagnosis consists of classifying Regions of Interest (ROIs) according to the aggressiveness of the included tumour. The task can be addressed as the binary classification problem of distinguishing between malignant and benign lesions. To this aim, most literature proposals rely on hand-crafted features to describe ROI characteristics such as the TIC behaviour (Dynamic Features), the lesion's texture (Textural Features) or its shape (Morphological Features). The works proposed so far mostly exploit the DCE-MRI volumes in three ways: by using all the available time series [11], by searching for the best combination of acquisitions [6] or by arbitrarily fixing one of them [1]. Although all these approaches show interesting performance, their main limitation is that their applicability is strongly affected by the dataset characteristics.

To overcome this limitation, in this paper we propose to exploit the well-known Three Time Points (3TP) [2] approach to select the specific time points that best highlight the contrast agent absorption and then fine-tune a pre-trained CNN for the actual slice-by-slice classification. In particular, we propose to extract slices along the projection having the highest resolution, treating the different acquisitions of the same slice over time as different channels of a single image that is fed to the CNN. This makes it possible to perform the classification on images that always relate to the same physiological characteristics of the tissues under analysis, making our approach independent of the acquisition protocol. To the best of our knowledge, this is the first work that exploits the 3TP method for lesion diagnosis in DCE-MRI. The proposed approach consists of three main steps (Fig. 2):

3TP Lesion Image extraction, in which, for each slice containing a lesion, a 3-channel image is created by stacking the three instances acquired at the three time points suggested by Degani et al. [2]
Slice Classification, during which each slice is classified as malignant or benign
Lesion Classification, in which each lesion is classified by combining results of all its slices, producing a unique label for each lesion

2.1 3TP Lesion Image Extraction

As mentioned in Sect. 1, a DCE-MRI study is a 4D volume having 3 spatial dimensions and a temporal one that represents the acquisition of 3D volumes over time. Starting from it, we propose to extract 3TP images by cutting the sequence of 3D volumes along the axis having the highest resolution. This process generates a set of 3D volumes, each representing the same section (slice) of the tissue seen at different temporal instants. These volumes are extracted only for slices containing a lesion. This is made possible by the lesion segmentation module (one of the stages of a typical CAD system [12]), which localizes the lesion by identifying the Region of Interest (ROI), namely a binary mask that bounds the portion of the tissue in which the lesion lies.

Each 3D volume can be interpreted as a multi-channel image (since it is made of slices referring to the very same portion of the tissue) whose number of channels depends on the temporal instants considered during the extraction procedure. In this work we propose to fix the number of temporal instants by following the 3TP method proposed by Degani [2], according to which lesion classification can be improved by taking contrast-enhanced images (DCE-MRI) at three time points identified by the time (in seconds) elapsed since the contrast agent injection. Only three time points are taken into account: a pre-contrast one (\(t_{0}\)), one 2 min after the contrast agent injection (\(t_{1}\), corresponding to the peak of contrast agent levels in tissues) and one 6 min after the contrast agent injection (\(t_{2}\), corresponding to the end of the CA washout). For each slice, the resulting 3TP image is a 3-channel image composed of the same slice extracted from the three volumes acquired at the time instants nearest to \(t_{0}\), \(t_{1}\) and \(t_{2}\) (first block of Fig. 2).
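The selection of the acquisitions nearest to the three reference times can be sketched as follows. This is only an illustration under stated assumptions: the acquisition timestamps (a pre-contrast series at 0 s followed by 8 series spaced 110 s apart) are hypothetical, since only the per-series acquisition time is reported for the dataset, and the helper names `nearest_index`, `select_3tp_indices` and `build_3tp_image` are not taken from the authors' code.

```python
def nearest_index(acquisition_times_s, target_s):
    """Index of the acquisition whose time is closest to target_s."""
    return min(range(len(acquisition_times_s)),
               key=lambda i: abs(acquisition_times_s[i] - target_s))

def select_3tp_indices(acquisition_times_s, targets_s=(0, 120, 360)):
    """Pick the series nearest to pre-contrast, 2 min and 6 min after injection."""
    return tuple(nearest_index(acquisition_times_s, t) for t in targets_s)

def build_3tp_image(series, slice_index, indices):
    """Stack the same slice taken from the three selected series as channels."""
    return [series[i][slice_index] for i in indices]

# Hypothetical timestamps in seconds: 0, 110, 220, ..., 880
times = [110 * k for k in range(9)]
indices = select_3tp_indices(times)
print(indices)  # (0, 1, 3)

# Toy 4D study: series[t] is a volume holding a single 1x1 slice with value t
series = [[[[t]]] for t in range(9)]
image_3tp = build_3tp_image(series, slice_index=0, indices=indices)
```

With these hypothetical timestamps the series at 110 s is the closest to the 2 min target and the series at 330 s is the closest to the 6 min target, so the three channels come from series 0, 1 and 3.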

The obtained images are further pre-processed by extracting only the portion of the data within a square box centred on the lesion centre and having a side 1.5 times the maximum diameter of the lesion itself. Image values are then normalized between 0 and 1, ensuring that, in the next stage, the CNN operates on images having the same scale across different lesions. Finally, all the images are resized to match the input layer size of the CNN used.
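The cropping and normalization steps can be sketched as below. Several details are assumptions made for illustration: the maximum diameter is approximated by the longer side of the ROI bounding box, the lesion centre by the bounding-box centre, and the box is clipped at the image border (so it may not remain square near the edges). The final resizing step is omitted, and the function names are hypothetical.

```python
import numpy as np

def crop_lesion_box(image, roi_mask, scale=1.5):
    """Crop a box centred on the lesion with side about scale * max diameter.

    image: array of shape (channels, H, W); roi_mask: boolean array (H, W).
    """
    rows, cols = np.nonzero(roi_mask)
    centre_r = (rows.min() + rows.max()) / 2.0
    centre_c = (cols.min() + cols.max()) / 2.0
    # Assumption: max diameter approximated by the longer bounding-box side
    diameter = max(rows.max() - rows.min() + 1, cols.max() - cols.min() + 1)
    half = int(np.ceil(scale * diameter / 2.0))
    # Clip the box to the image borders
    r0 = max(int(round(centre_r)) - half, 0)
    c0 = max(int(round(centre_c)) - half, 0)
    r1 = min(int(round(centre_r)) + half, image.shape[1])
    c1 = min(int(round(centre_c)) + half, image.shape[2])
    return image[:, r0:r1, c0:c1]

def min_max_normalise(image):
    """Rescale intensities to the range [0, 1]."""
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image, dtype=float)
    return (image - lo) / (hi - lo)

# Toy 3TP image (3 channels, 64x64) with a 10x6 pixel lesion mask
rng = np.random.default_rng(0)
image = rng.random((3, 64, 64)) * 1000.0
mask = np.zeros((64, 64), dtype=bool)
mask[20:30, 30:36] = True
patch = min_max_normalise(crop_lesion_box(image, mask))
```

In this toy case the lesion spans 10 pixels at most, so the crop has a side of 16 pixels (twice the half-side of 8, obtained by rounding 7.5 up).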

2.2 Slices Classification

In order to assess the malignancy of each 3TP image, we propose to fine-tune a CNN pre-trained on ImageNet [3]. Note that we do not prescribe a specific CNN: any network with a 3-channel input layer can be used. We propose to exploit fine-tuning since biomedical image datasets rarely contain enough data to effectively train a large CNN from scratch.

Despite the use of fine-tuning, the training procedure may still be unable to properly learn the image characteristics, because the available images may be too few even for fine-tuning and because classes are usually very unbalanced. The small size is mostly due to the small number of patients involved in DCE-MRI programs, while the dataset imbalance arises because malignant and benign lesions usually have very different sizes, resulting in a different number of slices per lesion type.

As a consequence, both a data augmentation and a balancing phase are needed. In this work, two variants of data augmentation are explored. The first consists of applying random rotation and flipping, while the second simply consists of replicating the data (slice replication). In both variants, the dataset is balanced by replicating some randomly chosen slices belonging to the minority class.
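Slice replication and class balancing can be sketched in a few lines. The sketch assumes that minority-class slices are drawn at random with replacement until both classes have the same size; the text does not specify the sampling details, and the function names and toy labels are hypothetical.

```python
import random
from collections import Counter

def replicate(samples, times):
    """Slice replication: repeat the whole list of samples `times` times."""
    return [s for _ in range(times) for s in samples]

def balance_by_replication(samples, labels, seed=0):
    """Replicate randomly chosen samples until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Assumption: extra samples are drawn with replacement
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels

# Toy training set: 4 malignant slices (label 1) and 2 benign slices (label 0)
slices = ["m1", "m2", "m3", "m4", "b1", "b2"]
labels = [1, 1, 1, 1, 0, 0]
balanced_slices, balanced_labels = balance_by_replication(slices, labels)
augmented = replicate(balanced_slices, times=3)
print(Counter(balanced_labels))  # both classes now have 4 samples
```

Balancing and replication should be applied to the training folds only, so that replicated slices never appear in the evaluation data.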

2.3 Lesion Classification

At the end of the previous stage, each slice is associated with a probability of being malignant or benign. However, since the final aim of the work is to classify each lesion, as a final step we combine the classes of all the slices from a given lesion into a single class. In this work, among all the possible combining strategies (CS) we considered:

Majority voting (MV), in which the class of the lesion is the most common class over all its slices
Weighted Majority (WMV), that acts as MV, but in which each slice contribution is weighted by its probability
Biggest Slice (BS), in which the lesion is associated with the class of the slice containing the biggest portion of the lesion
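The three combining strategies listed above can be sketched as follows. The sketch assumes that each slice carries a predicted label, the probability assigned to that label and the ROI area in pixels; how ties are broken is also an assumption, and the toy values are hypothetical.

```python
from collections import Counter

def majority_voting(slice_labels):
    """MV: the most common slice label becomes the lesion label."""
    return Counter(slice_labels).most_common(1)[0][0]

def weighted_majority_voting(slice_labels, slice_probs):
    """WMV: each slice votes with a weight equal to its predicted probability."""
    scores = {}
    for label, prob in zip(slice_labels, slice_probs):
        scores[label] = scores.get(label, 0.0) + prob
    return max(scores, key=scores.get)

def biggest_slice(slice_labels, roi_areas):
    """BS: take the label of the slice containing the largest ROI area."""
    biggest = max(range(len(roi_areas)), key=roi_areas.__getitem__)
    return slice_labels[biggest]

# Toy lesion made of five slices
labels = ["malignant", "malignant", "benign", "benign", "benign"]
probs = [0.95, 0.90, 0.55, 0.55, 0.60]
areas = [120, 340, 80, 60, 40]

print(majority_voting(labels))                  # benign
print(weighted_majority_voting(labels, probs))  # malignant
print(biggest_slice(labels, areas))             # malignant
```

The toy lesion shows how the strategies can disagree: three weakly confident benign slices win the plain vote, while the two confident malignant slices win once the votes are weighted (1.85 against 1.70).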

2.4 Experimental Setup

The proposed approach is general and can be applied to the classification of lesions of different organs and with different DCE-MRI protocols. The same holds for the CNN used for slice classification and for the other hyperparameters. The experiments were carried out using PyTorch, running the code on a physical server hosted in our university HPC center (see Note 1) equipped with 2 \(\times \) Intel(R) Xeon(R) 2.13 GHz CPUs (4 cores), 32 GB RAM and an Nvidia Titan XP GPU (Pascal family) with 12 GB GRAM. The slice extraction step and the non-deep competitor approaches (Sect. 2.4) were implemented in MATLAB.

Dataset. In this work, we focus on breast lesion diagnosis. The dataset consists of breast DCE-MRI studies of 39 women (average age 50 years, range 31–74) with histopathologically proven benign or malignant lesions: 36 lesions were malignant and 22 were benign. All patients underwent imaging with a 1.5 T scanner (SymphonyTim, Siemens Medical System, Erlangen, Germany) equipped with a breast coil. DCE T1-weighted FLASH 3D axial fat-saturated images were acquired (TR/TE: 5.08/2.39 ms; flip angle: \(15^{\circ }\); matrix: \(384 \times 384\); thickness: 1.6 mm; acquisition time: 110 s; 128 slices spanning entire breast volume). One series (\(t_{0}\)) was acquired before and 8 series (\(t_{1}\)–\(t_{8}\)) after intravenous injection of a positive paramagnetic contrast agent (gadolinium-diethylene-triamine penta-acetic acid, Gd-DOTA, Dotarem, Guerbet, Roissy CdG Cedex, France).

An experienced radiologist delineated suspect ROIs using the original and subtractive image series, the latter obtained by subtracting the \(t_0\) series from the \(t_1\) series. The manual segmentation stage was performed in OsiriX [13], which allows the user to define ROIs at a sub-pixel level.

Related Works. In this work, we compare the performance of our approach with two classical (non-deep) and two deep learning based works proposed in the literature. Fusco et al. [4] propose to use both Dynamic and Morphological features, combining them by using a Multiple Classifier System, in order to take into account the contrast agent concentration and the lesion shape. Piantadosi et al. [11] propose to use the Local Binary Patterns on Three Orthogonal Planes (LBP-TOP) descriptor, which provides a set of features by thresholding the neighbourhood of each pixel and considering the result as a binary number. The luminance value of the pixel in the centre of the neighbourhood is used as the threshold. In [1], Antropova et al. explore the use of a CNN (AlexNet, pre-trained on ImageNet) as a feature extractor and then use an SVM for the actual classification. To match the 3-channel input layer, the authors propose to replicate slices extracted from the second post-contrast series. Finally, Haarburger et al. [6] propose the fine-tuning of a ResNet34 [7] CNN. To match the 3-channel input layer, the authors propose to perform a grid search among all the possible combinations of time series.
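The thresholding idea behind Local Binary Patterns can be illustrated on a single 2D plane. This is a basic 8-neighbour sketch, not the LBP-TOP implementation of [11]: the bit ordering, the use of "greater than or equal" as the comparison and the function names are assumptions. LBP-TOP applies the same idea on three orthogonal planes.

```python
def lbp_code(image, row, col):
    """8-neighbour LBP code of an interior pixel of a 2D list-of-lists image."""
    centre = image[row][col]
    # Neighbours visited clockwise, starting from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit to 1
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (the feature vector)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
print(lbp_code(patch, 1, 1))  # 26
```

In the toy patch only the neighbours 9, 7 and 8 reach the centre value 6, which sets bits 1, 3 and 4 and gives the code 2 + 8 + 16 = 26.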

3 Results

In the protocol considered in this work, the axial plane is the one having the highest resolution, therefore we extracted the 3TP images along this plane. Performance is evaluated using a 10-fold cross-validation. Since the classification stage is performed slice-by-slice, it is very important to perform a patient-based instead of a slice-based cross-validation, in order to reliably compare different models by avoiding mixing slices of the same patient between training and evaluation. Slices were replicated three times (obtaining a training dataset 3 times bigger than the original one). As CNN we used AlexNet [8] since in our previous investigations [10] it showed the best trade-off between classification performance and training time. Performance is evaluated in terms of Accuracy (ACC), Sensitivity (SEN), Specificity (SPE), F1-Score (F1) and Area under the ROC curve (AUC).
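A patient-based split can be sketched as follows. The round-robin assignment of patients to folds is an assumption chosen for brevity (no shuffling or stratification), and the function name and toy identifiers are hypothetical; the point is only that all slices of a patient fall in the same fold.

```python
def patient_folds(patient_ids, n_folds=10):
    """Return (train_indices, test_indices) pairs with whole patients per fold."""
    unique_patients = sorted(set(patient_ids))
    # Assumption: patients are assigned to folds in round-robin order
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique_patients)}
    folds = []
    for k in range(n_folds):
        test = [i for i, pid in enumerate(patient_ids) if fold_of[pid] == k]
        train = [i for i, pid in enumerate(patient_ids) if fold_of[pid] != k]
        folds.append((train, test))
    return folds

# Toy example: 7 slices belonging to 4 patients, split into 2 folds
slice_patients = ["p1", "p1", "p2", "p3", "p3", "p3", "p4"]
folds = patient_folds(slice_patients, n_folds=2)
for train, test in folds:
    train_patients = {slice_patients[i] for i in train}
    test_patients = {slice_patients[i] for i in test}
    # No patient contributes slices to both sides of the split
    assert not (train_patients & test_patients)
```

A slice-based split would instead allow neighbouring slices of the same lesion to appear in both training and test sets, which tends to inflate the measured performance.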

Table 1. Comparing different 3TP-AlexNet training modalities, by varying the slice combining rule and batch size.

Table 2. Comparing different 3TP-AlexNet Slice Replication training modalities, by varying the slice combining rule and batch size.

Tables 1 and 2 compare the proposed approach varying the model parameters, such as batch size, combining strategy and data augmentation in order to find their best configuration. The fine-tuning of AlexNet has been performed replacing the last fully connected layers. The best result was achieved by using a learning rate of \(10^{-5}\).

Table 3 compares our best configuration with some literature proposals (Sect. 2.4) and with our proposal without the use of 3TP images as input (1TP AlexNet with Slice Replication) to assess how the 3TP approach affects the performance. In the latter case, the same parameter configuration as our best model was used, but only the second post-contrast series from the 4D DCE-MRI data was taken. It is worth noting that, since Antropova et al. [1] do not provide enough information about the SVM hyper-parameter settings, we performed an optimization of the classification stage: the best results were obtained by using an SVM with a polynomial kernel of degree 1 and C = 1. Majority voting (MV) is considered as the combining strategy.

Table 3. Comparison of the best results obtained by our approach with those achieved by other state-of-the-art approaches and with the results obtained without exploiting the 3TP idea.

4 Discussion and Conclusions

The aim of this paper was to investigate automatic lesion malignancy classification in DCE-MRI, proposing a solution that combines the radiomics methodology and Convolutional Neural Networks (CNNs) in order to exploit both medical experience and deep learning capabilities. For this reason, the Three Time Points (3TP) approach was applied in the slice extraction step in order to highlight contrast agent absorption, which is decisive in the discrimination between malignant and benign lesions. In our opinion, past experience should always be taken into account because it can provide information that may improve classifier performance. Breast DCE-MRI was considered as a case study.

Results presented in Tables 1 and 2 compare all the CNN-based approaches obtained by varying the slice combining strategies and batch size. 3TP-AlexNet Slice Replication with a batch size equal to 1 reaches the best results. The most effective slice combining technique is to consider as lesion class the one predicted by the slice containing the biggest ROI. This is reasonable since the biggest ROI in a lesion is likely to bring the majority of the lesion malignancy information.

Table 3 compares our best approach with some methods proposed in the literature, showing that our proposal is able to outperform both the classical (non-deep) approaches and the deep proposals. Haarburger et al. [6] defined the best set of contrast images by exploring all the combinations of the images provided by the acquisition protocol, while, in our case, the set of contrast images to be considered is suggested by medical knowledge. This implies that our proposal can be applied to all protocols involving at least 3 acquisitions: the only constraint is the need to have acquisitions close to the times suggested by Degani. Furthermore, Table 3 shows the significant impact that the 3TP method had on system performance, by also reporting the results obtained with a methodology that does not exploit the 3TP method.

The obtained results confirm our idea of exploiting past experience in order to provide the network with the medical knowledge that contributes to lesion diagnosis. In addition, it is worth noting that our methodology is independent not only of the protocol, but also of the CNN used for lesion classification: in fact, AlexNet [8] was chosen only as a case study.

Since contrast agent absorption is decisive for lesion diagnosis, future work will focus on exploring approaches that are able to further enhance the temporal dynamics of the acquired signal, reflecting the absorption and release of contrast agent. We argue that when performing lesion diagnosis by means of a classifier system, performance depends on the dynamic or spatio-temporal information coming from DCE-MRI data rather than on the CNN used for classification.

Notes

1. http://www.scope.unina.it

References

1. Antropova, N., Huynh, B., Giger, M.: SU-D-207B-06: predicting breast cancer malignancy on DCE-MRI data using pre-trained convolutional neural networks. Med. Phys. 43(6), 3349–3350 (2016). https://doi.org/10.1118/1.4955674. http://www.ncbi.nlm.nih.gov/pubmed/28048384
2. Degani, H., Gusis, V., Weinstein, D., Fields, S., Strano, S.: Mapping pathophysiological features of breast tumors by MRI at high spatial resolution. Nature Med. 3(7), 780–782 (1997)
3. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition 2009, CVPR 2009, pp. 248–255. IEEE (2009)
4. Fusco, R., Sansone, M., Petrillo, A., Sansone, C.: A multiple classifier system for classification of breast lesions using dynamic and morphological features in DCE-MRI. In: Gimel’farb, G., et al. (eds.) Structural, Syntactic, and Statistical Pattern Recognition, pp. 684–692. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34166-3_75
5. Gillies, R.J., Kinahan, P.E., Hricak, H.: Radiomics: images are more than pictures, they are data. Radiology 278(2), 563–577 (2015)
6. Haarburger, C., et al.: Transfer learning for breast cancer malignancy classification based on dynamic contrast-enhanced MR images. Bildverarbeitung für die Medizin 2018. I, pp. 216–221. Springer, Heidelberg (2018). https://doi.org/10.1007/978-3-662-56537-7_61
7. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016
8. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 25, pp. 1097–1105. Curran Associates, Inc. (2012)
9. Lehman, C.D., et al.: MRI evaluation of the contralateral breast in women with recently diagnosed breast cancer. N. Engl. J. Med. 356(13), 1295–1303 (2007). PMID: 17392300
10. Marrone, S., Piantadosi, G., Fusco, R., Petrillo, A., Sansone, M., Sansone, C.: An investigation of deep learning for lesions malignancy classification in breast DCE-MRI. In: Battiato, S., Gallo, G., Schettini, R., Stanco, F. (eds.) ICIAP 2017. LNCS, vol. 10485, pp. 479–489. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68548-9_44
11. Piantadosi, G., Fusco, R., Petrillo, A., Sansone, M., Sansone, C.: LBP-TOP for volume lesion classification in breast DCE-MRI. In: Murino, V., Puppo, E. (eds.) ICIAP 2015. LNCS, vol. 9279, pp. 647–657. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23231-7_58
12. Piantadosi, G., Marrone, S., Fusco, R., Sansone, M., Sansone, C.: Comprehensive computer-aided diagnosis for breast T1-weighted DCE-MRI through quantitative dynamical features and spatio-temporal local binary patterns. IET Comput. Vis. 12(7), 1007–1017 (2018)
13. Rosset, A., Spadola, L., Ratib, O.: OsiriX: an open-source software for navigating in multidimensional DICOM images. J. Digit. Imaging 17, 205–216 (2004)
14. Tofts, P.S.: T1-weighted DCE imaging concepts: modelling, acquisition and analysis. Magneton Flash Siemens 3, 30–39 (2010)

Acknowledgments

The authors gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research, and the availability of the Calculation Centre SCoPE of the University of Naples Federico II and its staff. The authors also thank Dr. Antonella Petrillo, Head of Division of Radiology and PhD Roberta Fusco, Department of Diagnostic Imaging, Radiant and Metabolic Therapy, “Istituto Nazionale dei Tumori Fondazione G. Pascale”, Naples, Italy, for providing data. This work is part of the “Synergy-net: Research and Digital Solutions against Cancer” project (funded in the framework of the POR Campania FESR 2014-2020 - CUP B61C17000090007).

Author information

Authors and Affiliations

DIETI - University of Naples Federico II, Naples, Italy
Michela Gravina, Stefano Marrone, Gabriele Piantadosi, Mario Sansone & Carlo Sansone

Corresponding author

Correspondence to Stefano Marrone.

Editor information

Editors and Affiliations

University of Trento, Povo, Italy
Elisa Ricci
Mapillary Research, Graz, Austria
Samuel Rota Bulò
University of Amsterdam, Amsterdam, The Netherlands
Cees Snoek
Fondazione Bruno Kessler, Povo, Italy
Oswald Lanz
Fondazione Bruno Kessler, Povo, Italy
Stefano Messelodi
University of Trento, Povo, Italy
Nicu Sebe

Cite this paper

Gravina, M., Marrone, S., Piantadosi, G., Sansone, M., Sansone, C. (2019). 3TP-CNN: Radiomics and Deep Learning for Lesions Classification in DCE-MRI. In: Ricci, E., Rota Bulò, S., Snoek, C., Lanz, O., Messelodi, S., Sebe, N. (eds) Image Analysis and Processing – ICIAP 2019. ICIAP 2019. Lecture Notes in Computer Science, vol 11752. Springer, Cham. https://doi.org/10.1007/978-3-030-30645-8_60

DOI: https://doi.org/10.1007/978-3-030-30645-8_60
Published: 02 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30644-1
Online ISBN: 978-3-030-30645-8
eBook Packages: Computer Science, Computer Science (R0)
