1 Introduction

It is estimated that more than 50,000 new cases of pancreatic cancer occur in the United States each year, accounting for about 3% of all cancers and about 7% of all cancer deaths [6]. Pancreatic ductal adenocarcinoma (PDAC) is the most common type of pancreatic cancer, making up more than 80% of cases. It originates in the cells lining small tubes in the pancreas called ducts, which transport juices containing important digestive enzymes into the small intestine [1]. Despite substantial research efforts and gradual diagnostic and therapeutic improvements, PDAC remains highly lethal, with a five-year survival rate of only around 5–7% and one-year survival achieved in less than 20% of cases, reflecting its aggressive metastatic spread and the lack of effective and reliable biomarkers for early diagnosis [2]. There is therefore a pressing need to develop novel and effective strategies for early detection and therapy of PDAC.

Both molecular and imaging techniques have been developed for the diagnosis and prognosis of PDAC. Recent studies have applied deep and targeted genomic analysis to pancreatic cancer [7]. These molecular techniques have been successful in characterizing gene-expression signatures in cancer cells and the molecular landscape of PDAC, but they may not sufficiently capture the heterogeneous nature of PDAC in vivo. Imaging techniques, especially computed tomography (CT) accompanied by three-dimensional (3D) reconstruction, are widely used for preoperative staging of PDAC [8, 9] and can provide quantitative phenotypic features that reveal the characteristics of tumors. Other clinical variables can also serve as reliable prognostic biomarkers; for example, evaluation of CA 19-9 levels is recommended to correctly stage PDAC and to assess the response to therapy [3].

Prognostic models for predicting the overall survival of PDAC patients are of vital importance because they capture the biological aggressiveness of the cancer and help evaluate patient outcome. Survival analysis, which originated in clinical research, establishes a connection between covariates and the time until an event occurs. It differs from traditional regression in that part of the training data can only be partially observed, owing to patient dropout and the limited duration of studies. By fitting the survival and hazard functions to covariates in a regression model, we can predict the prognosis of patients.

In this paper, we developed a prognostic model to predict overall survival based on predictors derived from contrast-enhanced pancreas CT scans and patients' clinical variables.

2 Method

2.1 Image Acquisition

The image dataset used in this paper comes from the MICCAI 2018 Computational Precision Medicine challenge, which focuses on the quantitative assessment of pancreatic cancer. The dataset consists of a consecutive series of 212 patients who underwent pancreas resection at Memorial Sloan Kettering Cancer Center, with clinical variables and high-quality annotated CT imaging. Of these, 159 portal venous phase CT images of the pancreas parenchyma and tumor in MetaImage (MHD) format are provided as a training set. The images are fully anonymized, with tumors and liver parenchyma segmented by expert radiologists to eliminate segmentation-related uncertainty. In addition to the image data, the training set includes a data dictionary describing the clinical variables and a list of the variables for each patient, including preoperative CA 19-9 and neoadjuvant chemotherapy.

2.2 Data Processing

Since the gray values of CT scans represent absolute physical quantities in Hounsfield units (HU) and should therefore be comparable between scanners, we do not perform intensity normalization on the CT images. Missing entries in preoperative CA 19-9 are imputed with the median value.
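A minimal sketch of this preprocessing step is given below, assuming the clinical variables are loaded into a pandas DataFrame; the file and column names are illustrative rather than taken from the challenge data dictionary.

```python
import pandas as pd

# Load the clinical variables (hypothetical file and column names).
clinical = pd.read_csv("clinical_variables.csv")

# Impute missing preoperative CA 19-9 values with the median of the observed values.
median_ca199 = clinical["CA19-9"].median(skipna=True)
clinical["CA19-9"] = clinical["CA19-9"].fillna(median_ca199)
```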

2.3 Image Analysis

We extract radiomics features from the medical images to serve as covariates for survival analysis. For each CT scan, we extract features in three categories: first order statistics, shape-based features and texture features. In addition, we apply Laplacian of Gaussian (LoG) and wavelet filters to the images and extract features from the filtered images as well. Each feature is extracted from two kinds of region of interest (ROI): the tumor region segmented by experts, and a region drawn to encompass the whole pancreas based on intensity values.

As noted above, we apply feature extraction to the wavelet-filtered and LoG-filtered images in addition to the original images. Wavelet filtering yields eight decompositions per level, covering all possible combinations of applying either a high-pass or a low-pass filter in each of the three dimensions. The LoG filter enhances edges by emphasizing areas of gray level change; its sigma value defines how coarse the emphasized texture is. We extracted features from LoG-filtered images with sigma values of 1.0, 2.0, 3.0, 4.0 and 5.0.

In total, we extracted 2436 features from each CT image. The features were extracted using custom code and the Pyradiomics toolbox [4].
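The sketch below shows how the Pyradiomics part of this pipeline could be configured with the image types and feature classes described in this section; the file paths are placeholders, and the custom-code portion of the extraction is not shown.

```python
from radiomics import featureextractor

# Configure the extractor with all feature classes (first order, shape, GLCM,
# GLRLM, GLSZM, NGTDM, GLDM) and the filtered image types described above.
extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.enableAllFeatures()
extractor.enableImageTypes(Original={},
                           LoG={"sigma": [1.0, 2.0, 3.0, 4.0, 5.0]},
                           Wavelet={})

# One call per (image, ROI) pair; the expert tumor mask is shown here, and the
# pancreas-region mask would be passed in the same way for the second ROI.
features = extractor.execute("case_001.mhd", "case_001_tumor_mask.mhd")
```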

First Order Statistics. We extracted 19 features for first order statistics, which are described as follows.

Basic Features. For basic first order statistics, we extract the maximum intensity, minimum intensity, mean, median, 10th percentile, 90th percentile, standard deviation and variance of the intensity values in the ROI.

Energy. Energy is a measure of the magnitude of voxel values in an image. A larger value implies a greater sum of the squares of these values.

$$\begin{aligned} \textit{energy} = \displaystyle \sum ^{N_p}_{i=1}{(\mathbf X (i) )^2} \end{aligned}$$
(1)

Total Energy. Total Energy is the value of the Energy feature scaled by the volume of a voxel, where c is an optional offset applied to the intensity values.

$$\begin{aligned} \textit{total energy} = V_{voxel}\displaystyle \sum ^{N_p}_{i=1}{(\mathbf X (i) + c)^2} \end{aligned}$$
(2)

Entropy. Entropy specifies the uncertainty/randomness in the image values. It measures the average amount of information required to encode the image values.

$$\begin{aligned} \textit{entropy} = -\displaystyle \sum ^{N_g}_{i=1}{p(i)\log _2\big (p(i)+\epsilon \big )} \end{aligned}$$
(3)

where \(\epsilon \) is an arbitrarily small positive number.

Interquartile Range

$$\begin{aligned} \textit{interquartile range} = \mathbf P _{75} - \mathbf P _{25} \end{aligned}$$
(4)

where \(\mathbf P _{25}\) and \(\mathbf P _{75}\) are the \(25^{th}\) and \(75^{th}\) percentile of the gray level array, respectively.

Range

$$\begin{aligned} \textit{range} = \max (\mathbf X ) - \min (\mathbf X ) \end{aligned}$$
(5)

Mean Absolute Deviation (MAD)

$$\begin{aligned} \textit{MAD} = \frac{1}{N_p}\displaystyle \sum ^{N_p}_{i=1}{|\mathbf X (i)-\bar{X}|} \end{aligned}$$
(6)

Mean Absolute Deviation is the mean distance of all intensity values from the Mean Value of the image array.

Robust Mean Absolute Deviation (rMAD)

$$\begin{aligned} \textit{rMAD} = \frac{1}{N_{10-90}}\displaystyle \sum ^{N_{10-90}}_{i=1} {|\mathbf X _{10-90}(i)-\bar{X}_{10-90}|} \end{aligned}$$
(7)

Robust Mean Absolute Deviation is the mean distance of all intensity values from the mean value calculated on the subset of the image array with gray levels between, or equal to, the \(10^{th}\) and \(90^{th}\) percentiles.

Root Mean Squared (RMS)

$$\begin{aligned} \textit{RMS} = \sqrt{\frac{1}{N_p}\sum ^{N_p}_{i=1}{(\mathbf X (i) + c)^2}} \end{aligned}$$
(8)

RMS is the square root of the mean of all squared intensity values. It is another measure of the magnitude of the image values. This feature is volume-confounded; a larger value of c increases the effect of volume confounding.

Skewness

$$\begin{aligned} \textit{skewness} = \displaystyle \frac{\mu _3}{\sigma ^3} = \frac{\frac{1}{N_p}\sum ^{N_p}_{i=1}{(\mathbf X (i)-\bar{X})^3}}{\left( \sqrt{\frac{1}{N_p}\sum ^{N_p}_{i=1}{(\mathbf X (i)-\bar{X})^2}}\right) ^3} \end{aligned}$$
(9)

Skewness measures the asymmetry of the distribution of values about the Mean value. Depending on where the tail is elongated and the mass of the distribution is concentrated, this value can be positive or negative.

Kurtosis

$$\begin{aligned} \textit{kurtosis} = \displaystyle \frac{\mu _4}{\sigma ^4} = \frac{\frac{1}{N_p}\sum ^{N_p}_{i=1}{(\mathbf X (i)-\bar{X})^4}}{\left( \frac{1}{N_p}\sum ^{N_p}_{i=1}{(\mathbf X (i)-\bar{X}})^2\right) ^2} \end{aligned}$$
(10)

Kurtosis is a measure of the ‘peakedness’ of the distribution of values in the image ROI. A higher kurtosis implies that the mass of the distribution is concentrated towards the tail(s) rather than towards the mean. A lower kurtosis implies the reverse: that the mass of the distribution is concentrated towards a spike near the Mean value.

Uniformity

$$\begin{aligned} \textit{uniformity} = \displaystyle \sum ^{N_g}_{i=1}{p(i)^2} \end{aligned}$$
(11)

Uniformity is a measure of the sum of the squares of each intensity value. This is a measure of the homogeneity of the image array, where a greater uniformity implies a greater homogeneity or a smaller range of discrete intensity values.
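For illustration, the following NumPy sketch computes several of the first order statistics defined in Eqs. (1)–(11) directly from the array of ROI intensities; Pyradiomics discretizes the gray levels before computing entropy and uniformity, so the histogram-based probabilities used here are only an approximation.

```python
import numpy as np

def first_order_stats(x, n_bins=256, eps=2.2e-16):
    """Compute a subset of the first order statistics for a 1D array of ROI intensities."""
    x = np.asarray(x, dtype=float)
    # Approximate gray level probabilities p(i) via a histogram.
    counts, _ = np.histogram(x, bins=n_bins)
    p = counts / counts.sum()
    mean = x.mean()
    lo, hi = np.percentile(x, [10, 90])
    x_sub = x[(x >= lo) & (x <= hi)]                  # 10th-90th percentile subset
    return {
        "energy": np.sum(x ** 2),                                            # Eq. (1)
        "entropy": -np.sum(p * np.log2(p + eps)),                            # Eq. (3)
        "interquartile_range": np.percentile(x, 75) - np.percentile(x, 25),  # Eq. (4)
        "range": x.max() - x.min(),                                          # Eq. (5)
        "MAD": np.mean(np.abs(x - mean)),                                    # Eq. (6)
        "rMAD": np.mean(np.abs(x_sub - x_sub.mean())),                       # Eq. (7)
        "RMS": np.sqrt(np.mean(x ** 2)),                                     # Eq. (8), c = 0
        "skewness": np.mean((x - mean) ** 3) / np.std(x) ** 3,               # Eq. (9)
        "kurtosis": np.mean((x - mean) ** 4) / np.var(x) ** 2,               # Eq. (10)
        "uniformity": np.sum(p ** 2),                                        # Eq. (11)
    }
```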

2.4 Shape Features

We extracted 14 features to characterize the shape of ROI, which are described as follows.

Basic Features. For basic shape features, we extract the volume, surface area, surface-area-to-volume ratio, maximum 3D diameter, maximum 2D diameters in the axial, coronal and sagittal planes, and the major, minor and least axis lengths of the ROI.

Sphericity

$$\begin{aligned} \textit{sphericity} = \frac{\root 3 \of {36 \pi V^2}}{A} \end{aligned}$$
(12)

Sphericity is a measure of the roundness of the shape of the tumor region relative to a sphere. It is a dimensionless measure, independent of scale and orientation. The value of 1 indicates a perfect sphere.

Spherical Disproportion

$$\begin{aligned} \textit{spherical disproportion} = \frac{A}{4\pi R^2} = \frac{A}{\root 3 \of {36 \pi V^2}} \end{aligned}$$
(13)

where R is the radius of a sphere with the same volume as the tumor, which is equal to \(\root 3 \of {\frac{3V}{4\pi }}\).

Elongation

$$\begin{aligned} \textit{elongation} = \sqrt{\frac{\lambda _{\text {minor}}}{\lambda _{\text {major}}}} \end{aligned}$$
(14)

where \(\lambda _{\text {major}}\) and \(\lambda _{\text {minor}}\) are the lengths of the largest and second largest principal component axes.

Flatness

$$\begin{aligned} \textit{flatness} = \sqrt{\frac{\lambda _{\text {least}}}{\lambda _{\text {major}}}} \end{aligned}$$
(15)

where \(\lambda _{\text {major}}\) and \(\lambda _{\text {least}}\) are the lengths of the largest and smallest principal component axes.
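The sphericity, elongation and flatness computations can be sketched as follows; the binary mask, voxel spacing and a precomputed surface area A are assumed as inputs (Pyradiomics estimates A from a mesh built on the ROI).

```python
import numpy as np

def shape_features(mask, spacing, surface_area):
    """Sphericity (Eq. 12), elongation (Eq. 14) and flatness (Eq. 15) of a binary 3D ROI."""
    V = mask.sum() * np.prod(spacing)                         # ROI volume
    sphericity = (36.0 * np.pi * V ** 2) ** (1.0 / 3.0) / surface_area

    # Principal component eigenvalues of the physical voxel coordinates.
    coords = np.argwhere(mask > 0) * np.asarray(spacing, dtype=float)
    eig = np.sort(np.linalg.eigvalsh(np.cov(coords, rowvar=False)))   # least <= minor <= major
    elongation = np.sqrt(eig[1] / eig[2])
    flatness = np.sqrt(eig[0] / eig[2])
    return sphericity, elongation, flatness
```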

2.5 Texture Features

We also extracted texture features using the standard pipeline of the Pyradiomics toolbox, including 22 gray level co-occurrence matrix (GLCM) features, 16 gray level run length matrix (GLRLM) features, 16 gray level size zone matrix (GLSZM) features, five neighbouring gray tone difference matrix (NGTDM) features and 14 gray level dependence matrix (GLDM) features. Due to page limitations we do not present the full list here; it can be found at https://pyradiomics.readthedocs.io/en/latest/features.html#module-radiomics.glcm

2.6 Survival Analysis

Feature Selection. Since the number of covariates is large relative to the number of observations, we performed feature selection to remove redundant or irrelevant features and to reduce overfitting in the model. More specifically, we fit a Cox model to each variable individually and record its c-index on the training set.

Next, we build a parsimonious model by excluding irrelevant features: we keep only the top-ranking features from the univariate screening described above, and we determine how many features to keep by five-fold cross-validation on the training set.
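A sketch of this univariate screening step is shown below, using the scikit-survival library; X is assumed to be the (patients × features) matrix and y a structured array with boolean "event" and numeric "time" fields (the variable and field names are illustrative).

```python
import numpy as np
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.metrics import concordance_index_censored

def rank_features_by_univariate_cox(X, y):
    """Fit a one-covariate Cox model per feature and rank features by training c-index."""
    c_indices = []
    for j in range(X.shape[1]):
        model = CoxPHSurvivalAnalysis().fit(X[:, j:j + 1], y)
        risk = model.predict(X[:, j:j + 1])
        c_indices.append(concordance_index_censored(y["event"], y["time"], risk)[0])
    return np.argsort(c_indices)[::-1]      # feature indices, most predictive first
```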

Survival Prediction. To perform accurate and robust prediction of patients' overall survival, we utilize gradient boosting with a component-wise Cox proportional hazards model as the base learner [5].

We denote by \(D_i\) the time from disease onset to death and by \(C_i\) the potential censoring time for patient i, \(i=1,\cdots ,n\). The observed survival time is \(T_{i}=\min \{ D_{i},C_{i}\}\), and the death indicator is given by \(\delta _{i}=I(D_{i} \le C_{i}) \). Let \(X_{i}=(X_{i1},\cdots ,X_{ip})^{T}\) be the p-dimensional covariate vector containing all selected features for the ith patient. We assume that, conditional on \(X_i\), \(D_i\) is independently censored by \(C_i\). The death hazard can then be modeled as

$$\begin{aligned} {\lambda _i}(t|{X_i}) = \mathop {\lim }\limits _{dt \rightarrow 0} \frac{1}{{dt}}\Pr (t \le {D_i} < t + dt|{D_i} \ge t,{X_i}) = {\lambda _0}(t)\exp (X_i^T\beta ) \end{aligned}$$
(16)

where \(\lambda _{0}(t)\) denotes the baseline hazard function and \(\beta = (\beta _{1},\cdots ,\beta _{p})\) is the vector of regression parameters. Writing \(R_i = \{\ell : T_\ell \ge T_i\}\) for the risk set at time \(T_i\), the corresponding log-partial likelihood is given by

$$\begin{aligned} l_n(\beta ) = \sum \limits _{i = 1}^n {{\delta _i}\left[ {X_i^T\beta - \log \{ \sum \limits _{\ell \in {R_i}} {\exp (X_\ell ^T\beta )} \} } \right] } \end{aligned}$$
(17)
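As a concrete illustration of Eq. (17), the log-partial likelihood can be written directly in NumPy; the loop formulation below favors clarity over efficiency.

```python
import numpy as np

def cox_log_partial_likelihood(beta, X, T, delta):
    """Eq. (17): T are observed times, delta the death indicators, X the covariates."""
    eta = X @ beta                              # linear predictors X_i^T beta
    ll = 0.0
    for i in range(len(T)):
        if delta[i]:
            at_risk = T >= T[i]                 # risk set R_i at time T_i
            ll += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return ll
```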

The idea of gradient boosting is to perform an iterative steepest ascent of the log-partial likelihood. At each step, given the current estimate \({\hat{\beta }}\) of \(\beta \), let \(\hat{\eta }= X^{T}\hat{\beta }\). The algorithm computes the gradient of the log-partial likelihood with respect to \(\eta _{i}\), the ith component of \(\eta \),

$$\begin{aligned} {U_i} = \frac{\partial }{{\partial {\eta _i}}}{l_n}(\eta ){|_{\eta = \hat{\eta }}} = {\delta _i} - \sum \limits _{\ell = 1}^n {\frac{{{\delta _\ell }I({T_i} \ge {T_\ell })\exp ({{\hat{\eta }}_i})}}{{\sum \limits _{k = 1}^n {I({T_k} \ge {T_\ell })\exp ({{\hat{\eta }}_k})} }}} \end{aligned}$$
(18)

A component-wise algorithm can be implemented by restricting the search direction to a single component at a time. Specifically, we fit the component-wise model

$$\begin{aligned} {{\tilde{\beta }}_j} = \mathop {\arg \min }\limits _{{\beta _j}} \frac{1}{n}\sum \limits _{i = 1}^n {{{({U_i} - {X_{ij}}{{\tilde{\beta }}_j})}^2}} \end{aligned}$$
(19)

for \(j=1,\cdots ,p\), and then compute

$$\begin{aligned} {j^*} = \mathop {\arg \min }\limits _{1 \le j \le p} \frac{1}{n}\sum \limits _{i = 1}^n {{{({U_i} - {X_{ij}}{{\tilde{\beta }}_j})}^2}} \end{aligned}$$
(20)

and update \({{\hat{\beta }}_{j^*}} = {{\hat{\beta }}_{j^*}} + v{{\tilde{\beta }}_{j^*}}\) until a stopping criterion is reached. Here v denotes the learning rate. In our model, we set the learning rate to 0.1 and the number of boosting stages to 100.
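A minimal sketch of this procedure with the settings above is given below; it relies on the component-wise gradient boosting estimator of the scikit-survival library rather than a custom implementation, and the data arrays are random placeholders.

```python
import numpy as np
from sksurv.ensemble import ComponentwiseGradientBoostingSurvivalAnalysis
from sksurv.util import Surv

rng = np.random.default_rng(0)
X_selected = rng.normal(size=(159, 16))     # placeholder for the 16 selected features
time = rng.uniform(1.0, 60.0, size=159)     # placeholder survival times
event = rng.random(159) < 0.7               # placeholder death indicators
y = Surv.from_arrays(event=event, time=time)

booster = ComponentwiseGradientBoostingSurvivalAnalysis(
    loss="coxph",        # Cox partial likelihood, Eq. (17)
    learning_rate=0.1,   # step size v in the update rule
    n_estimators=100,    # number of boosting stages
)
booster.fit(X_selected, y)
risk_scores = booster.predict(X_selected)
```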

3 Results

Based on the methods described above, we performed feature extraction, feature selection and survival prediction. For feature extraction, we extracted 2438 features per case, comprising 2436 image features and 2 clinical covariates. For feature selection, we fitted a Cox model to each variable individually, recorded its c-index on the training set, and used five-fold cross-validation on the training set to determine the number of features to use. In the end, we selected the following 16 features (Table 1):

Table 1. Selected most predictive features

Finally, we evaluated the performance of our model by predicting overall survival on the training set from the selected predictors and calculating the concordance index. The experiments were carried out with five-fold cross-validation, yielding a mean concordance index of 0.7016.
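The cross-validated evaluation can be sketched as follows, continuing from the placeholder X_selected and y defined in the previous snippet; each held-out fold is scored with the concordance index.

```python
import numpy as np
from sklearn.model_selection import KFold
from sksurv.ensemble import ComponentwiseGradientBoostingSurvivalAnalysis
from sksurv.metrics import concordance_index_censored

c_indices = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X_selected):
    model = ComponentwiseGradientBoostingSurvivalAnalysis(learning_rate=0.1, n_estimators=100)
    model.fit(X_selected[train_idx], y[train_idx])
    risk = model.predict(X_selected[test_idx])
    c_indices.append(concordance_index_censored(y["event"][test_idx],
                                                y["time"][test_idx], risk)[0])
print("mean c-index:", np.mean(c_indices))
```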

4 Discussion

In this paper, we developed a prognostic model to predict overall survival based on predictors derived from contrast-enhanced pancreas CT scans and patients' clinical variables. We extracted radiomics features from the CT scans and fitted a Cox model to each variable individually to select the most predictive features. Finally, we used gradient boosting with a component-wise Cox proportional hazards base learner to predict the overall survival of patients. Our model achieves a mean concordance index of 0.7016 under five-fold cross-validation on the training set. In future work, we will explore further feature selection and survival prediction methods to improve the performance of our model.