1 Introduction

With estimated 40,710 new cases and 28,920 deaths in 2017, liver cancer remains one of the deadliest cancers in the US [1]. The modern treatment approach called - stereotactic body radiation therapy (RT) (SBRT) - has been growing into one of the main treatment option for liver cancer patients, especially for patients suffering from unresectable and multifocal tumors. Treatment fractionation allows higher doses to be delivered to the tumor during SBRT in comparison to other RT types. Higher doses to the tumor are however accompanied by irradiation of the healthy organs-at-risks (OARs) located in the close proximity to the tumor [2]. Designing a dose delivery plan, where the tumor control probabilities are maximized while the risks of radiation-induced side effects are kept on the clinically acceptable level, is the main goal of SBRT planning.

In practice, dose delivery optimization is based on computing dose/volume histograms (DVHs), which estimate the volumes of organs receiving certain dose levels. A clinically acceptable liver SBRT plan must spare \(\ge \)700 cm\(^3\) of liver from receiving \(\ge \)15 Gy and \(\ge \)500 cm\(^3\) of liver from receiving \(\ge \)7 Gy. These two and other similar constraints can be encoded through the DVH models. Despite it universality and applicability, the DVH concept suffers from several shortcomings. First, DVHs are averaged over the patient population and therefore do not take into account the anatomic and morphological properties of a patient. Second, DVHs tacitly assume that all organs are homogenous tissues and two dose plans with same DVHs are expected to impose the same risks. Limitations of DVHs compromise the quality of RT treatments with up to 27% of liver RT and more than 50% of head-and-neck RT resulting in radiation-induced toxicities of healthy OARs [2]. Toxicities eventually reduce the quality of patients’ post-RT life and increase morbidity and mortality rates. Recently, the existence of critical-to-spare regions of OARs has been demonstrated on the example of head-and-neck RT, where sparing of the specific regions of parotid glands with stem cells reduces the risks of post-RT xerostomia toxicity [3]. Similar biochemical analysis is very time consuming and therefore unlikely to be repeated for abdominal OARs in the recent future. At the same time, identification of critical-to-spare regions is urgently needed for the next generation RTs.

Fig. 1.
figure 1

A schematic illustration of the proposed framework for the analysis of radiation therapy dose patterns associated with negative post-treatment outcomes.

This paper pioneers the concept of deep 3D dose analysis for survival and local cancer progression prediction after liver SBRTs (Fig. 1). Our first innovation is adaptation of convolutional neural networks (CNNs) for detection of consistent patterns in three-dimensional (3D) dose plans computed over patients’ livers, and associating such patterns with post-SBRT outcomes. Our second innovation is identification of critical liver regions irradiation of which is associated with the poorest prognosis. Finally, the critical regions are assembled into the first radiation-sensitivity liver atlases for both primary and metastatic cancer.

2 Methodology

During SBRT treatment planning, clinical specialists annotate tumors and surrounding OARs that may be affected by radiation during the treatment, prescribe the optimal doses and SBRT fractionation and finally develop the optimal dose delivery plan. As a results, a 3D dose plan superimposed with patient’s anatomy image, and the anatomy segmentation are generated. To account for dose SBRT fractionation, the absolute delivered doses are converted into biologically effective doses, and the dose values are normalized according to the biological properties of individual OARs. The intensity of a voxel with x coordinate in the 3D dose plan indicates how much dose a tissue at the same x coordinate in the anatomy image is expected to receive during the treatment.

2.1 Registration of 3D Dose Plans into Anatomy-Unified Space

The 3D CNNs that can potentially be used for classification of 3D anatomy images cannot straightforwardly be applied for classification of 3D dose plan. First, a 3D dose plan by itself is not sufficient for treatment outcome prediction, as the dose plan must be paired with the patient’s anatomy, and the same dose plan cannot be delivered to a patient other than the patient for which the plan was generated. Second, the location of dose hot spots in the plan is critical, and shifting a hot spot to few centimeters in any directions will completely invalidate the treatment plan. At the same time, CNNs are generally translation-invariant and would predict the same outcome from a dose plan even if a hot spot is repositioned from the tumor to a surrounding structure.

To address the above mentioned obstacles, we first registered all cases to the same anatomy defined by a randomly selected segmentation \(I_{REF}\). All liver masks were registered and the corresponding dose plans were deformed according to the obtained transformation matrices. Registration consisted of two linear, namely affine and similarity, transformations and one nonlinear B-spline transformation with mutual information used as the registration similarly measure. Registration used a multi-resolution image pyramid with smoothing sigmas of 4, 2, 1 and 0 and shrinking factors of 4, 2, 2 and 1. To facilitate repetition of the framework, the publicly available ANTs toolbox [4] was utilized with the parameters similar to the parameters that won EMPIRE10 challenge [5]. The dose plan for a case M was then deformed according to the registration results:

$$\begin{aligned} D_{M}' \leftarrow T_M(D_M) : T_M=\underset{T}{\arg \min } \left( S(I_{REF},T(I_M))\right) , \end{aligned}$$
(1)

where function S estimates similarity between two images, function T transforms images, \(T_M\) is the optimal transformation of liver mask \(I_M\) to the reference liver mask \(I_{REF}\), \(D_M\) is the 3D dose plan delivered to \(I_M\) and \(D_{M}'\) is the deformed dose plan aligned towards the anatomy of \(I_{REF}\). After registering all segmentations and deforming the corresponding dose plans, we obtained the set of anatomically-aligned dose plans \(D'\) where the dose at voxel \(\varvec{x}\) on different dose plans is always delivered to the similar anatomical location of the liver.

All registered dose plan are cropped to the same size that corresponds to the smallest bounding box encompassing all registered liver volumes. A non-zero padding was artificially added around each dose plan. The purpose of the padding is to compensate for translation invariance of CNNs and allow CNNs to learn the consistent dose patterns of dose plans with respect to the plan borders. As all plans are registered, the plan borders approximately correspond to the same anatomical locations near the liver, and therefore knowing the location of a dose pattern with respect to the padding is equivalent to knowing the location of the pattern against the liver anatomy. Registration and padding serve to address the obstacles towards CNN-based analysis of 3D dose plans.

2.2 CNN Transfer Learning for Outcome Prediction After Liver SBRTs

We designed a CNN to discover the consistent dose patterns in registered dose plans and associate these patterns with SBRT outcomes. To account for a potentially limited number of SBRTs, we utilized the CNN transfer learning mechanism. The CNN was first trained on a large collection of general 3D medical images and then tuned on dose plans. The CNN architecture consisted of three sets of convolution, rectified linear unit and dropout layers, separated by two max-pooling layers. All convolution layers have filters of \(2^3\) size with 15 filters for the first and third layers and 10 filters for the second bottle-neck convolution layer. All max-pooling layers have the size of \(3^3\). The last network layers were fully-connected layers with softmax activation. The parameters of the CNN were set as follows: learning rate = 0.001, momentum = 0.9 and dropout rate = 0.33. To facilitate repetition of the proposed research, the Computational Network Toolkit (CNTK) developed by Microsoft was used for CNN implementation [6].

The database for transfer learning consisted of 3D CT images of 2644 human organs of 28 types, collected at our institution for RT planning, and additionally using three public databases [7, 8]. The database included the following organs (number of samples): aorta (50), adrenal gland (100), bladder (50), chest wall (25), duodenum (53), esophagus (105), eye globe (93), gall bladder (50), inferior vena cava (50), kidney (394), large bowel (51), larynx (45), liver (173), mandible (50), optic chiasm (46), optic nerve (94), pancreas (132), parotid gland (91), pharynx (39), portal and splenic veins (175), rectum (50), small bowel (72), spinal cord (205), spleen (78), stomach (126), submandibular gland (62), uterus (50) and vertebra (135). From these organ images, we generated 26440 samples of \(19^3\) size of 5 mm\(^3\) resolution depicting different parts of each organ. The CNN was trained to recognize which organ is depicted in each sample.

After training the CNN to recognize organs in abdominal, spine and head-and-neck regions, the network was tuned on the registered dose plans. Each dose plan selected for training was duplicated 10 times with added up Gaussian noise from the interval \(0.02\cdot \max (D)\) where \(\max (D)\) is the maximum delivered dose, up to 5 mm random shift, and were isotropically rescaled into 2 cm\(^3\) resolution. The survival and local progression from the SBRT database were converted into binary values using predefined cut-off thresholds. The generated CNNs then made predictions from the dose plans from unfamiliar-to-it liver SBRTs.

2.3 Treatment Outcome Atlases

Apart from making predictions about SBRT outcomes, we estimated which regions of the liver are most critical to be spared during SBRT. To understand how irradiation of different liver regions and different tumor positioning affects treatment outcome, we created a set of artificial treatment plans with artificial liver tumors. Let’s \(D_0\) be an empty dose map and \(I_{REF}\) is the reference liver anatomy. An artificial tumor is defined as a spherical region of a size s located at an arbitrary position \(\varvec{x}\) in \(I_{REF}\). The artificial doses delivered to the artificial tumor volume were generated according to the following equation:

$$\begin{aligned} D_{s, \varvec{x}}(\varvec{y})=p \left( \left( 1 - \frac{\Vert \varvec{y} - \varvec{x} \Vert }{s} \right) \left( p_{max}-p_{min}\right) + p_{min} \right) d_{max}, \end{aligned}$$
(2)

where \(\forall \varvec{y} \in Sphere(s, \varvec{x})\), \(D_{s, \varvec{x}}(\varvec{y})\) is an artificial dose at point \(\varvec{y}\), p(t) is a binary variable of Bernoulli distribution, and \(p_{min}\) and \(p_{max}\) are predefined constants representing probabilities that maximal prescribed dose \(d_{max}\) is applied to point \(\varvec{y}\). As a result, artificial doses \(D_{s, \varvec{x}}\) delivered to the tumor \((s, \varvec{x})\) are represented by a homogeneous sphere of \(d_{max}\) intensity with impulse noise occurring more often closer to the borders of the tumor. To finalize the artificial doses to the tumor, \(D_{s, \varvec{x}}\) was smoothed with a Gaussian kernel \(\mathbf {G}\):

$$\begin{aligned} D_{s, \varvec{x}}' = \mathbf {G}(D_{s, \varvec{x}}, 1), \end{aligned}$$
(3)

Apart from the tumor doses \((s,\varvec{x})\), we needed to add doses to the surrounding structures to create a realistic treatment plan. Such surrounding doses were defined by the Gaussian distribution with sigma \(=1.5s\). The doses to the tumor \(D_{s, \varvec{x}}'\) and its surroundings formed the complete artificial dose plan \(D_{s, \varvec{x}}''\).

The artificial dose volumes were generated for all possible positionings of the tumor inside the reference liver mask \(I_{REF}\). The CNN-based predictions were then calculated from the corresponding artificial treatments:

$$\begin{aligned} V_{s, \varvec{x}} = f\left( D_{s, \varvec{x}}''\right) , \end{aligned}$$
(4)

into the volumetric outcome atlas V. The risk of a negative outcome from treating a tumor centered at point \(\varvec{x}\) is stored at the corresponding point of the atlas V. The reference CT image corresponding to liver mask \(I_{REF}\) was manually segmented into eight liver segments and the CNN-predicted risks for tumors located at different liver segments were summarized.

3 Experiments and Results

We assembled a database of patients treated with liver SBRT from 2004 to 2015 at our institution. The database consisted of 125 (63 males and 62 females) patients with complete follow-ups, where 58 were treated for liver metastases, 36 for hepatocellular carcinoma, 27 for cholangiocarcinoma and 4 for other primary liver cancers. Each patient was treated in 1–5 SBRT fractions with the total prescribed dose ranging from 25 to 54 Gy. During treatment planning, all doses were converted into biologically effective doses (\(\mathrm {BED}_{10}\)) using the linear quadratic model with an \(\alpha / \beta \) ratio of 10. Patients were followed for 1 to 98 months after the treatment with the median time of 13 months. Framework validation followed 20-fold cross-validation schema, ensuring that traning and testing was always perfromed on different non-intersection subsets of patients.

Fig. 2.
figure 2

Kaplan-Meier plots showing separation of patients into classes with positive and negative post-treatment outcome prediction on the [0, 2.5] year time frame.

3.1 Post-SBRT Survival and Local Progression Prediction

The SBRT outcomes of interest were survival time and local progression start after the treatment. The CNNs were designed as binary classifiers with the 2 year cut-off time separating positive and negative survival outcomes, and the 1 year cut-off time separating positive and negative local progression outcomes. A couple of patients who were alive at the time of the last follow-up and whose last follow-ups took place less than 2 years after the treatment were removed from the survival prediction analysis. Similarly, a couple of patients who died less than 1 year from the SBRT and did not develop local cancer progression were removed from the local progression prediction analysis.

The accuracy of CNN-based outcome prediction for primary cancer cases was of 0.76 and 0.67 for survival and local progression prediction, respectively, estimated by the area under the ROC curve (AUC). For metastatic cancer, CNN resulted in AUC of 0.68 and 0.56 for survival and local progression prediction, respectively. Figure 2 presents the Kaplan-Meier estimators.

3.2 Survival and Local Progression Atlases

Prior to evaluating the developed dose atlases, we computed the average distribution of doses and tumor locations for primary and metastatic liver cancers. The doses delivered to eight liver segments significantly correlated \((\rho =0.02)\) for primary and metastatic cancer cases. The occurrence of tumors at individual liver segments also significantly correlated \((\rho =0.02)\) for both cancers.

Fig. 3.
figure 3

The CNN-generated risk maps superimposed over liver mesh, where the red colors correspond to the critical-to-spare regions for survival (\(1^{st}\) row) and local progression (\(2^{nd}\) row) for primary (\(1^{st}\) column) and metastatic (\(2^{nd}\) column) liver cancer.

The risk scores associated with negative outcomes were computed for individual liver segments for both primary and metastatic cancers. Risk scores are defined on the interval [0, 1] where 0 corresponds to the lowest risks of negative treatment outcome. The risk scores are summarized in Table 1. To additionally visualize the liver regions associated with high risks of negative treatment outcome, we superimposed the liver mesh generated from liver segmentation \(I_{REF}\) with the atlas V. The color of a mesh face with the coordinate \(\varvec{x}\) shows the predicted risk score from the dose delivered to the coordinate \(\varvec{x}\). Figure 3 shows such colored meshes computed for primary and metastatic cancer cases.

4 Discussion and Conclusion

In this paper, we investigated the possibilities to adapt CNNs for prediction of RT outcomes from 3D volumetric dose plans. We developed a framework consisting of the dose plans registration and CNN-based plans classification steps and validated it on a clinical database of SBRTs with follow-ups. We additionally assembled a large collection of 3D CT images for CNN transfer learning.

Our first conclusion is that the outcome prediction accuracy is higher for primary liver cancer than for metastatic liver cancer. We attribute this to the fact that survival of patients with liver metastases considerably depends on the primary cancer site, which is not indicated on the liver SBRT dose plans. The risks score for survival and local progression (Table 1) correlated with each other for both primary (coefficient = 0.56) and metastatic (coefficient = 0.85) cases. This observation is in agreement with clinical studies showing correlation between survival and progression-free survival [9]. As expected, the survival risks scores for primary and metastatic cancers do not correlate with each other (coefficient = 0.19). The tumors and doses are distributed in a statistically similar ways for both primary and metastatic cancer, so we cannot attribute the observed lack of correlation to the database being biased towards some cancer type.

In summary, we pioneered the concept of 3D dose analysis for prediction of RT outcomes. The only previous attempt of studying dose patterns with deep learning was developed for 2D dose plans and was therefore limited to the OARs with the simple cylindrical shape [10]. Highly promising results were observed on the example of liver SBRTs. The deep dose analysis concept is a fundamentally new approach in RT planning, which has potential to move the field from population-averaged to anatomy-driven personalized treatments.

Table 1. The predicted post-SBRT outcome computed for each liver segment. Lower values correspond to lower risks of negative SBRT outcomes.