
1 Introduction

Recently, automated breast ultrasound (ABUS) has emerged as a promising tool for diagnosing breast cancer. By automatically scanning the whole breast, ABUS provides 3D views of breast tissue and offers several advantages over conventional 2D handheld ultrasound: higher reproducibility, less operator dependence, and shorter image acquisition time [8]. However, reviewing ABUS images is extremely time-consuming, because a typical exam often consists of three volumes per breast in order to cover the whole breast. Furthermore, the large size of the ABUS volumes increases the risk of oversight errors, so some malignancies may be missed. Therefore, automated cancer detection in ABUS is highly desirable to assist clinicians in identifying breast cancer.

Fig. 1. Example ABUS images. Blue arrows indicate biopsy-proven cancer regions, and orange arrows indicate cancer mimics whose appearance is very similar to that of true cancer regions.

As shown in Fig. 1, computer-aided detection (CADe) of cancer in ABUS images remains very challenging. First, cancers often exhibit high intra-class appearance variation caused by factors such as acoustic shadow and speckle artifacts, deformation of soft tissues, and large differences in lesion size. Second, malignant lesions may appear similar to other structures, such as benign lesions and normal hypoechoic tissue. Third, small cancers are the most likely to be missed, even by clinical experts, because of the relatively low quality of ultrasound imaging, the existence of similar normal structures, and oversight errors. Finally, the severe class imbalance between cancer and non-cancer voxels is another challenge, because a cancer is extremely small relative to the whole ABUS volume. Under such imbalance, predictive models built with machine learning methods can become biased and inaccurate.

To better assist clinicians with cancer screening, a number of CADe approaches have been developed. Moon et al. developed a CADe system based on a two-stage multi-scale blob analysis method [5]; it achieved sensitivities of 100%, 90%, and 70% with 17.4, 8.8, and 2.7 false positives (FPs) per volume, respectively. Tan et al. proposed a multi-stage system using an ensemble of neural networks to classify cancers [9]; although the FPs per volume were kept at 1, the sensitivity was only 64%. Lo et al. employed watershed segmentation to extract potential abnormalities in ABUS and reduced FPs using various quantitative features [3], reporting sensitivities of 100%, 90%, and 80% with 9.44, 5.42, and 3.33 FPs per volume, respectively. In general, maintaining high sensitivity at low FP rates remains a key open problem in ABUS CADe.

Recently, deep learning methods have become dominant over traditional CADe approaches [7]. We propose a novel 3D convolutional neural network (CNN) for automatic cancer detection in ABUS; to the best of our knowledge, we are the first to apply deep learning to this problem. Our contribution is twofold. First, we propose a threshold loss function by adding a threshold map (TM) layer to the CNN. The TM provides a voxel-level adaptive threshold for classifying voxels as cancer or non-cancer, thus achieving high sensitivity with low FPs. Second, we propose a densely deep supervision (DDS) mechanism that significantly improves sensitivity by utilizing multi-scale discriminative features from all layers [2]. We employ two loss functions to enhance the DDS performance: a class-balanced cross entropy loss that tackles the issue of limited positive training samples, and an overlap loss that encourages discriminative cancer representations. The proposed network was extensively evaluated on a 196-patient dataset containing 661 cancer regions.

Fig. 2. Illustration of the proposed network for cancer detection in ABUS.

2 Methods

Figure 2 illustrates the proposed network, which leverages the DDS to learn more discriminative cancer representations and thereby improve detection sensitivity, and utilizes the proposed TM to adaptively refine the probability map, reducing FPs while maintaining high sensitivity.

2.1 Network Architecture

We choose the highly successful segmentation network 3D U-net [6, 11] as our backbone architecture and make the following modifications (Fig. 2): (a) we initialize from the pre-trained C3D model [10] and fine-tune the network parameters to alleviate the over-fitting induced by limited ABUS training samples; (b) we design a DDS mechanism to effectively learn discriminative features for cancer identification and, at the same time, improve the gradient flow through the whole network; (c) we add a TM layer that provides a voxel-level adaptive threshold for optimizing the probability map, thus achieving high sensitivity with low FPs; specifically, the TM is automatically learned from the complementary information of the learned features, the label information, and the predicted probability map; (d) other customizations: each convolutional layer is followed by a batch normalization (BN) layer and a rectified linear unit (ReLU), and each of stages 3–7 uses 3 convolutional layers to enlarge the receptive field and exploit more global information, as sketched below.
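As a rough illustration of modification (d) only, and not the authors' exact implementation, the following minimal Keras sketch builds one encoder stage in which every Conv3D is followed by BN and ReLU; the function names and filter counts are assumptions for illustration.

```python
# Minimal sketch of one encoder stage: Conv3D -> BN -> ReLU blocks,
# with deeper stages (3-7) stacking three such convolutions to
# enlarge the receptive field. Names and filter counts are illustrative.
from tensorflow.keras import layers

def conv_bn_relu(x, filters, name):
    x = layers.Conv3D(filters, kernel_size=3, padding="same", name=name + "_conv")(x)
    x = layers.BatchNormalization(name=name + "_bn")(x)
    return layers.ReLU(name=name + "_relu")(x)

def encoder_stage(x, filters, n_convs, name):
    # stages 3-7 would use n_convs=3; shallower stages use fewer convolutions
    for i in range(n_convs):
        x = conv_bn_relu(x, filters, "{}_{}".format(name, i))
    return x  # followed by pooling / upsampling in the full 3D U-net
```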

2.2 Densely Deep Supervision

Deep neural networks are powerful at generating abundant multi-scale features for object detection in natural images. Nevertheless, this is challenging in ABUS images, because breast cancers exhibit high intra-class appearance variation and some are relatively subtle. Furthermore, owing to the vanishing-gradient issue, parameter tuning of a 3D CNN may suffer from low efficiency and over-fitting. Taking advantage of deeply supervised nets (DSN) [2], we integrate the DDS into our 3D U-net to alleviate the above problems by fully exploiting the multi-scale features from all stages. Specifically, we feed each of stages 1–9, as well as the concatenation of all stages, into the DDS pool (10 DSN branches in total), and introduce a DDS loss function to supervise the generation of the cancer probability map. The DDS loss function is defined as

$$\begin{aligned} \mathcal {L}_{dds}(X,Y;\mathcal {W},\omega ) = \sum _{t=1}^{T-1} \Big (\theta _t * \big ( \mathcal {L}_{cbce}^{(t)}(X,Y;\mathcal {W},\omega ^{(t)}) + \mathcal {L}_{ol}^{(t)}(X,Y;\mathcal {W},\omega ^{(t)}) \big ) \Big ), \end{aligned}$$
(1)

where X is the training image and Y is the corresponding label image, \(\mathcal {W}\) denotes the weights of the main network, \( \omega =(\omega ^{(1)},\omega ^{(2)},\cdots ,\omega ^{(T)}) \), where \((\omega ^{(1)},\omega ^{(2)},\cdots ,\omega ^{(T-1)})\) are the weights of the individual DSN branches and \(\omega ^{(T)}\) (with \(T=11\)) is the weight of the TM layer, and \( \theta =(\theta _1,\theta _2,\cdots ,\theta _T) \) are the coefficients weighting each DSN loss and the threshold loss in the total loss. The class-balanced cross entropy (CBCE) loss \(\mathcal {L}_{cbce}\) and the overlap (OL) loss \(\mathcal {L}_{ol}\) in Eq. (1) are explained in the following paragraphs.
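A minimal TensorFlow-style sketch of Eq. (1), under the assumption that each DSN branch yields a cancer probability map, is given below; `cbce_loss` and `ol_loss` are sketched in the following paragraphs, and all names are illustrative rather than the authors' code.

```python
# Hedged sketch of the DDS loss in Eq. (1): a weighted sum of the CBCE and
# overlap losses over the T-1 deeply supervised branches. `branch_probs`
# holds the per-branch cancer probability maps, `theta` the branch weights.
import tensorflow as tf

def dds_loss(y_true, branch_probs, theta):
    total = tf.constant(0.0)
    for t, p_fg in enumerate(branch_probs):   # branches t = 1 ... T-1
        total += theta[t] * (cbce_loss(y_true, p_fg) + ol_loss(y_true, p_fg))
    return total
```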

Class-Balanced Cross Entropy Loss. Because many breast cancers are relatively subtle, the distribution of cancer/non-cancer voxels in a large ABUS volume is heavily biased. We employ a CBCE loss to tackle the issue of limited positive training samples. Specifically, we introduce a class-balancing weight \(\alpha \) to offset the imbalance between cancer and non-cancer voxels and define the CBCE loss function as

$$\begin{aligned} \begin{aligned} \mathcal {L}^{(t)}_{cbce}(\mathcal {W},\omega ^{(t)})=-\alpha \sum _{i\in Y_+}\log Ps(y_i=1| X;\mathcal {W},\omega ^{(t)}) \\ -(1-\alpha )\sum _{i\in Y_-}\log Ps(y_i=0| X;\mathcal {W},\omega ^{(t)}), \end{aligned} \end{aligned}$$
(2)

where \(y_i\in \{0,1\}\) is the label of Y at location i, \( \alpha =sum(Y_-)/sum(Y) \) and \( 1-\alpha =sum(Y_+)/sum(Y) \), where \(Y_-\) and \(Y_+\) are the non-cancer and cancer label sets, respectively. \(Ps(y_i| X;\mathcal {W},\omega ^{(t)}) = e^{z_j}/\sum _ke^{z_k}\in (0,1)\) is the probability obtained by the softmax function, where \(z_j\) is the score of class j and the sum runs over the \(k=2\) classes.
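As a hedged sketch of Eq. (2), assuming a binary label volume `y_true` and a softmax cancer probability `p_fg` (so that \(Ps(y_i=0)=1-Ps(y_i=1)\)), the CBCE loss could be written as follows; the function name and small epsilon are our own additions.

```python
# Hedged sketch of the CBCE loss in Eq. (2). alpha is the fraction of
# non-cancer voxels, so the scarce cancer voxels receive the larger weight.
import tensorflow as tf

def cbce_loss(y_true, p_fg, eps=1e-7):
    y_true = tf.cast(y_true, tf.float32)
    alpha = tf.reduce_sum(1.0 - y_true) / tf.cast(tf.size(y_true), tf.float32)
    pos = -alpha * tf.reduce_sum(y_true * tf.math.log(p_fg + eps))
    neg = -(1.0 - alpha) * tf.reduce_sum((1.0 - y_true) * tf.math.log(1.0 - p_fg + eps))
    return pos + neg
```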

Overlap Loss. To further improve the detection sensitivity, especially for subtle cancers, we design a new loss function, the overlap loss, to learn more discriminative features for cancer representation. The OL loss function is defined as

$$\begin{aligned} \mathcal {L}^{(t)}_{ol}(\mathcal {W},\omega ^{(t)})=\sum _{i=1}^{\left| X\right| }(Ps(y_i=1|X;\mathcal {W},\omega ^{(t)}) * Ps(y_i=0|X;\mathcal {W},\omega ^{(t)})). \end{aligned}$$
(3)

This loss encourages cancer regions to have as little overlap with non-cancer regions as possible; the optimal overlap is zero. By minimizing the overlap loss, the network learns more discriminative features for distinguishing cancer from non-cancer regions.
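A minimal sketch of Eq. (3), under the same two-class assumption as above (so \(Ps(y_i=0)=1-p_{fg}\)), is shown below; the unused label argument is kept only so the signature matches the DDS sketch.

```python
# Hedged sketch of the overlap loss in Eq. (3): the voxel-wise product of the
# cancer and non-cancer probabilities, summed over the volume. It is minimal
# when each voxel is assigned confidently to a single class.
import tensorflow as tf

def ol_loss(y_true, p_fg):
    del y_true  # the overlap loss depends only on the predicted probabilities
    return tf.reduce_sum(p_fg * (1.0 - p_fg))
```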

2.3 Threshold Map

Although the probability map generated by the proposed DDS mechanism indicates cancer locations with high sensitivity, it may still contain high-probability regions that are actually normal tissue. Further post-processing of the probability map is therefore essential for better detection. However, conventional methods often fail to achieve high sensitivity and low FPs simultaneously: a fixed threshold is sensitive to the selected value, the softmax method often yields high FPs on challenging cases, and directly applying a conditional random field tends to lower the sensitivity.

To address these issues, we concatenate and train a threshold map (TM) layer in our network to adaptively refine the probability map for better detection. The proposed TM provides a voxel-level adaptive threshold to classify voxels as cancer or non-cancer by exploiting all the information from the learned features, the label information, and the probability map, thus achieving a good balance between high sensitivity and low FPs. To train the TM, we design a new loss function, the threshold loss, computed as follows:

$$\begin{aligned} \mathcal {L}_{threshold}(\mathcal {W},\omega ^{(t)})=1-\frac{2* \left| Mask(y_i=1|X;\mathcal {W},\omega ^{(t)}) * Y \right| }{\left| Mask(y_i;\mathcal {W},\omega ^{(t)})\right| +\left| Y \right| }, \end{aligned}$$
(4)
$$\begin{aligned} Mask(y_i;\mathcal {W},\omega ^{(t)})=1/(1+e^{-tmp}), \end{aligned}$$
(5)
$$\begin{aligned} tmp=\left\{ \begin{array}{ll} Ps(y_i=1|X;\mathcal {W},\omega ^{(t)}), &{} Ps(y_i)>Threshold\_Map(y_i), \\ -e^{10}, &{} \text {otherwise}. \end{array} \right. \end{aligned}$$
(6)

The objective of the threshold loss is to learn a voxel-wise threshold map, which adaptively refines the probability map by suppressing non-cancer regions while preserving cancer regions. To the best of our knowledge, we are the first to design a threshold map for adaptively optimizing the probability map. The efficacy of the proposed TM is demonstrated in our experiments.
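A hedged sketch of Eqs. (4)–(6), assuming `p_fg` is the predicted cancer probability and `threshold_map` is the TM layer output of the same shape, is given below; the epsilon term and function name are our own additions.

```python
# Hedged sketch of the threshold loss. Voxels whose probability exceeds the
# learned threshold keep a sigmoid-transformed score (Eq. 5); the rest are
# pushed toward 0 by the large negative constant -e^10 (Eq. 6). The loss is
# one minus the Dice overlap between this soft mask and the label Y (Eq. 4).
import tensorflow as tf

def threshold_loss(y_true, p_fg, threshold_map, eps=1e-7):
    y_true = tf.cast(y_true, tf.float32)
    tmp = tf.where(p_fg > threshold_map, p_fg, -tf.exp(10.0) * tf.ones_like(p_fg))
    mask = tf.sigmoid(tmp)                        # ~0 below the threshold
    inter = tf.reduce_sum(mask * y_true)
    return 1.0 - 2.0 * inter / (tf.reduce_sum(mask) + tf.reduce_sum(y_true) + eps)
```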

We summarize the total loss function for our cancer detection as

$$\begin{aligned} \mathcal {L}_{total}=\mathcal {L}_{dds}+\theta _{T}*\mathcal {L}_{threshold}. \end{aligned}$$
(7)
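Combining the sketches above, Eq. (7) could be assembled as follows, with `p_final` the probability map refined by the TM branch; this is an illustrative composition, not the authors' code.

```python
# Hedged sketch of Eq. (7): the DDS loss over the DSN branches plus the
# threshold loss, weighted by the final coefficient theta_T.
def total_loss(y_true, branch_probs, p_final, threshold_map, theta):
    return dds_loss(y_true, branch_probs, theta[:-1]) + theta[-1] * threshold_loss(
        y_true, p_final, threshold_map)
```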
Fig. 3. Example results of cancer detection in ABUS. Top row: ABUS images with annotated cancer locations (red contours); the purple arrow indicates a cancer mimic. Middle row: cancer probability maps refined by the proposed threshold map. Bottom row: 3D visualization of our detected cancers (gray) and the ground truth (red).

3 Experiments

Materials. Experiments were carried out on a dataset acquired with the Invenia ABUS system (GE, USA) at Sun Yat-Sen University Cancer Center. Institutional review board approval was obtained for this retrospective study. To cover the whole breast, three volumes (anterior-posterior, medial, and lateral passes) were obtained for each breast; thus, six ABUS volumes were acquired per patient. The voxel resolutions of the acquired 3D ABUS volumes were 0.511 mm, 0.082 mm, and 0.200 mm in the transverse, sagittal, and coronal directions, respectively.

In this study, ABUS data from 196 women (age range: 30–75 years; mean, 49 years) with biopsy-proven breast cancers were collected. From these data, 559 volumes were annotated by an experienced clinician, containing 661 cancer regions in total (volume: 0.01–86.54 cm\(^3\); mean: 2.84 cm\(^3\)). Four-fold cross-validation was conducted to evaluate the detection performance. As a control, 119 ABUS volumes with no abnormal findings were also included in the evaluation.

Implementation Details. Our framework was implemented in Keras with the TensorFlow backend. To tackle the limited number of cancer samples and the demanding 3D computational cost, we divided each ABUS volume into multiple \(96\times 64\times 96\) cubes and adopted data augmentation (translation, rotation, cropping, flipping) for training; the predicted cubes were then recombined into a full volume as the detection result. The framework was trained on a server with 8 NVIDIA Tesla GPUs. Adaptive moment estimation was used to train the whole framework, with a learning rate of 1e\(-4\); training stopped after 30000 iterations.
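As a hedged illustration of the cube-based inference described above, the following sketch tiles a volume into \(96\times 64\times 96\) cubes and stitches the per-cube predictions back together; padding and overlap handling are simplified, and `model` stands for any trained Keras model with a matching input shape.

```python
# Hedged sketch of inference tiling: split an ABUS volume into 96x64x96
# cubes, predict each cube, and stitch the predictions back into a
# full-volume probability map. All names are illustrative.
import numpy as np

CUBE = (96, 64, 96)

def predict_volume(model, volume):
    d, h, w = volume.shape
    out = np.zeros(volume.shape, dtype=np.float32)
    for z in range(0, d, CUBE[0]):
        for y in range(0, h, CUBE[1]):
            for x in range(0, w, CUBE[2]):
                cube = volume[z:z + CUBE[0], y:y + CUBE[1], x:x + CUBE[2]]
                pad = [(0, c - s) for c, s in zip(CUBE, cube.shape)]
                cube = np.pad(cube, pad, mode="constant")
                pred = model.predict(cube[np.newaxis, ..., np.newaxis])[0, ..., 0]
                dz, dy, dx = out[z:z + CUBE[0], y:y + CUBE[1], x:x + CUBE[2]].shape
                out[z:z + CUBE[0], y:y + CUBE[1], x:x + CUBE[2]] = pred[:dz, :dy, :dx]
    return out
```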

Detection Performance. We extensively compared our method with state-of-the-art approaches, including SegNet [1], FCN [4], and U-net [11]. To illustrate the efficacy of the proposed combination of losses, we further evaluated the proposed network with different loss functions: dice loss (DL), cross entropy (CE) loss, CBCE loss, CBCE-OL loss, and the full CBCE-OL-TM loss.

Table 1. Sensitivities and corresponding FPs per volume for different methods
Fig. 4. Left: the cancer volume distribution of all 661 cancer regions. Right: detection sensitivities for different ranges of cancer volume.

Figure 3 visualizes cancer detection results from our network. By utilizing the proposed DDS and threshold map, our network outputs accurate cancer probability maps even when cancers are subtle or cancer mimics are present. Table 1 lists the sensitivities and corresponding FPs per volume for the different methods. Our network achieved a sensitivity of 93% with 2.2 FPs per ABUS volume. Compared with SegNet and FCN, our network significantly improved detection sensitivity while keeping the FPs at about 2 per volume. Although U-net achieved fewer than 1 FP per volume, its sensitivity was below 80%. The results of our network with different loss functions in Table 1 show that the designed DDS and TM both contributed to the improved detection performance: the DDS with the CBCE-OL loss helped select discriminative cancer representations, and the TM loss adaptively optimized the probability map to reduce FPs while maintaining high sensitivity. Table 1 also reports the difference between FPs in cancer-containing and normal volumes; the number of FPs for normal volumes was slightly lower than that for abnormal volumes.

Figure 4 further illustrates the volume distribution of all 661 cancer regions, along with the corresponding detection sensitivities for different ranges of cancer volume. Our network achieved a sensitivity above 85% even when the cancer volume was smaller than 1 cm\(^3\), and a sensitivity of 100% when the cancer volume was larger than 5 cm\(^3\).

4 Conclusion

In this paper, we propose a novel 3D convolutional network for automatic cancer detection in ABUS. To the best of our knowledge, we are the first to employ deep learning techniques for this problem. In the proposed network, a novel threshold map provides a voxel-level adaptive threshold for classifying voxels as cancer or non-cancer, thus achieving high sensitivity with low FPs. Furthermore, a densely deep supervision mechanism greatly improves the sensitivity by utilizing multi-scale discriminative features from all layers. Experiments show that our network obtains a sensitivity of 93% with 2.2 FPs per ABUS volume. Our method can therefore serve as an accurate and automatic cancer detection tool for breast cancer screening, maintaining high sensitivity with low FPs.