
1 Introduction

In contrast to other types of medical imaging (e.g. Magnetic Resonance Imaging, MRI, diffusion Magnetic Resonance Imaging, dMRI, and X-ray computed tomography), ultrasound imaging (UI) offers several advantages, such as real-time imaging, no radiation, and small movable devices [1]. However, even the best ultrasound images have a lower resolution compared with the corresponding MRI and X-ray computed tomography images [2]. The loss of spatial resolution in ultrasound images is mainly caused by two types of limitations: intrinsic and extrinsic. Intrinsic resolution limitations are related to the characteristics of the interaction between a tissue and the ultrasound wave, such as attenuation and scattering. Extrinsic resolution limitations are given by spontaneous and non-spontaneous movements of the analyzed tissue or the patient (e.g. respiration and heartbeat) [3].

The loss of spatial resolution in medical imaging is a critical issue for clinical analysis, since it complicates different procedures such as the diagnosis of diseases, segmentation (tissue, nerves and bone), and needle guidance in peripheral nerve blocking (PNB) procedures [4]. For this reason, several approaches have been developed to enhance the spatial resolution in ultrasound images through post-processing of low-resolution (LR) images. For instance, in [1] the ultrasonic image is modeled as a convolution of the point spread function of the imaging process and the tissue function. Spatial resolution is increased by deconvolving the observed radio frequency image with the point spread function using homomorphic filtering. However, this method requires considerable computational effort. Another approach is proposed in [5]. Here, spatial resolution enhancement is treated as an ill-posed inverse problem whose regularization is performed using anisotropic diffusion. Nevertheless, its accuracy is degraded in the presence of high levels of speckle noise. Recently, a new approach based on patch learning has been proposed. For instance, in [6, 7], a patch-based Gaussian Process Regression (GPR) scheme for enhancing spatial resolution was introduced. Nevertheless, due to the naturally smooth behavior of Gaussian Processes (GPs), the method tends to generate a blurring effect over the edges in the High Resolution (HR) images. This drawback was minimized for MRI in [6] by adding a post-processing step, where a specific 2D filter is used to highlight the edges.

As we have pointed out above, different approaches have been proposed to deal with the issue of low spatial resolution in medical images. However, each of these approaches exhibits its own strengths and weaknesses. In this sense, the HR images generated with these approaches correspond to noisy versions of the actual (and unknown) HR image. This problem can be minimized using supervised learning with multiple annotators to estimate the ground truth (the actual HR image) from multiple noisy images.

Learning with multiple annotators is an emergent area in the context of supervised learning. Its aim is to deal with supervised problems where the gold standard is not available and, instead, we only have access to a set of noisy annotations provided by multiple annotators [8, 9]. Recently, many approaches have been proposed to address different problems, such as classification [8], regression [9], and sequence labeling [10], under the multiple-annotators framework. In this area we can recognize two main goals when using the training data: first, to estimate the ground truth from the multiple annotations; second, to build a supervised scheme (e.g. a classifier or a regressor). In this work we focus on the first goal, in order to estimate an HR image from multiple (possibly noisy) HR images. The idea of our methodology is to use different interpolation methods (e.g. bilinear, bicubic and Lanczos interpolation) to generate HR images. Then, we consider each pixel intensity value in each of these HR images as a corrupted version of the pixel intensity value in the corresponding hidden and true HR image (considered as the gold standard). Finally, we use the regression scheme for multiple annotators based on Gaussian processes proposed in [9] to compute an estimate of the actual HR image from the noisy HR images given by the interpolation algorithms. We compare our approach against two super-resolution schemes based on Gaussian process regression: one uses the pixel intensities of the nearest neighbors as features [7]; the other uses the position of each pixel as features [6]. Validation is carried out based on the Mean Squared Error (MSE) metric and the Dice coefficient (DC) for the morphological validation (nerve segmentation).
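As an illustration of this idea, the minimal sketch below generates several HR candidates from a single LR ultrasound image, each one playing the role of an annotator. It assumes OpenCV (cv2) and NumPy; the file name, up-sampling factor, and variable names are placeholders and not part of the original experimental code.

```python
# Minimal sketch: build multiple HR "annotations" from one LR ultrasound image.
# Assumes OpenCV and NumPy; the image path and scale factor are placeholders.
import cv2
import numpy as np

lr = cv2.imread("ultrasound_lr.png", cv2.IMREAD_GRAYSCALE)  # low-resolution input
scale = 2
hr_size = (lr.shape[1] * scale, lr.shape[0] * scale)        # cv2.resize expects (width, height)

# Each interpolation method acts as one "annotator" of the unknown HR image.
annotators = {
    "nearest":  cv2.INTER_NEAREST,
    "bilinear": cv2.INTER_LINEAR,
    "bicubic":  cv2.INTER_CUBIC,
    "lanczos":  cv2.INTER_LANCZOS4,
}
hr_annotations = np.stack([
    cv2.resize(lr, hr_size, interpolation=flag).astype(np.float64)
    for flag in annotators.values()
])  # shape (R, H, W): R noisy versions of the hidden HR image
```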

2 Materials and Methods

2.1 Dataset

The dataset used in this work was collected by the Universidad Tecnológica de Pereira, Pereira, Colombia, and the Santa Mónica Hospital, Dosquebradas, Colombia. The dataset comprises recordings of ultrasound images from patients who underwent regional anesthesia using the peripheral nerve blocking procedure. The dataset is composed of 31 ultrasound images: 18 images of the ulnar nerve and 13 of the median nerve. Each ultrasound image was collected using a Sonosite Nano-Maxx device (the resolution of each image is 360\(\,\times \,\)370 pixels). Each image in the dataset was labeled by an expert in anesthesiology to indicate the location of the nerve structures. Figure 1 shows the types of images in the dataset.

Fig. 1. Images belonging to the dataset. The ulnar nerve is shown on the left and the median nerve on the right. Each image has been labeled by an expert in anesthesiology to locate the nerve structures.

2.2 Spatial Resolution Enhancement Based on Gaussian Processes Regression with Multiple Annotators

We follow the regression model with multiple annotators proposed in [9]. We assume that there are R HR images, generated by R different interpolation methods. In this sense, the training set comprises \(\mathscr{D}=\left\{ \mathbf{x}_i, y_i^{1}, \dots, y_i^{R} \right\}_{i=1}^{N}\), where \(\mathbf{x}_i\) is the feature vector (in this case we consider as features the pixel coordinates x and y) and \(y_i^{j}\) is the intensity value of pixel i in the HR image generated by the \(j\)-th interpolation algorithm. The model assumes that each annotation is generated following \(y_i^{j} = f_i + \epsilon^{j}\), where \(f_i\) is the unknown ground truth (in this case, the pixel intensity value in the actual but hidden HR image) and \(\epsilon^{j} \sim \mathscr{N}(0, \sigma_{j}^{2})\) is the noise associated with the \(j\)-th annotator.
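To make the noise model concrete, the following toy simulation (with illustrative sizes and noise levels, not values from the paper) draws annotations \(y_i^{j}\) for a vector of hidden intensities \(f_i\) corrupted by annotator-specific Gaussian noise:

```python
# Toy simulation of the annotation model y_i^j = f_i + eps^j.
# The sizes and noise levels below are illustrative.
import numpy as np

rng = np.random.default_rng(0)
N, R = 100, 3                                  # pixels and annotators
f = rng.uniform(0, 255, size=N)                # hidden true pixel intensities
sigma = np.array([5.0, 10.0, 20.0])            # per-annotator noise standard deviations
Y = f[None, :] + rng.normal(0.0, sigma[:, None], size=(R, N))  # noisy annotations, shape (R, N)
```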

Assuming that each annotator (in this case, each interpolation method) labels the observation \(\mathbf{x}_i\) independently, and that the annotators are independent of each other, the likelihood is given by

$$\begin{aligned} p(\mathbf {y}|\mathbf {f}) = \prod _{j}\prod _{i}\mathscr {N}(y_{i}^{j}|f_i, \sigma _{j}^2), \end{aligned}$$

where \(\mathbf{y}= \left\{ y_i^{1}, \dots, y_i^{R} \right\}_{i=1}^{N}\). Assuming a Gaussian process prior over \(\mathbf{f}\) such that \(p(\mathbf{f}) = \mathscr{N}\left( \mathbf{0},\mathbf{K}\right)\), where the covariance matrix \(\mathbf{K}\) is computed using a specific kernel function, it can be shown that for a new input \(\mathbf{x}_{*}\) the posterior distribution of \(f(\mathbf{x}_{*})\) is given by

$$\begin{aligned} p(f(\mathbf {x}_*)|\mathbf {y}) = \mathscr {N}\left( f(\mathbf {x}_*)|\bar{f}(\mathbf {x}_*), k(f(\mathbf {x}_*), f(\mathbf {x}_*'))\right) , \end{aligned}$$
(1)

where, \(\bar{f}(\mathbf {x}_*) = k(\mathbf {x}_*, \mathbf {X})\left( \mathbf {K}+ \hat{\Sigma }\right) ^{-1}\hat{\mathbf {y}}\),

\(k(f(\mathbf {x}_*), f(\mathbf {x}_*')) = k(\mathbf {x}_*, \mathbf {x}_*') - k(\mathbf {x}_*, \mathbf {X})\left( \mathbf {K}+ \hat{\Sigma }\right) ^{-1} k(\mathbf {X}, \mathbf {x}_*)\) and

$$\begin{aligned} \frac{1}{\hat{\sigma }_{i}^{2}} = \sum _{j}\frac{1}{\hat{\sigma }_{j}^{2}}, \quad \hat{y}_i = \hat{\sigma }_{i}^{2} \sum _{j}\frac{y_i^j}{\hat{\sigma }_{j}^{2}}, \quad \hat{{\Sigma }} = {\text {diag}}\left( \hat{\sigma }_{1}^{2}, \dots , \hat{\sigma }_{N}^{2}\right) . \end{aligned}$$

However, in this work we are not interested in making predictions for new samples \(\mathbf{x}_{*}\), but in computing a gold-standard estimate from the multiple annotations. Hence, we consider as new instances all the samples in the training data \(\mathbf{X}\) and use expression (1) to compute a probabilistic estimate of the gold standard (i.e. the actual intensity value of each pixel in the HR image). The unknown parameters can be estimated by minimizing the negative log of the evidence (see [9]).
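The following minimal sketch summarizes this computation with NumPy. It takes the annotator variances \(\sigma_j^2\) and the RBF hyperparameters as given (in the paper they are estimated by minimizing the negative log evidence) and evaluates the posterior mean of Eq. (1) at the training pixels themselves; all function names are illustrative.

```python
# Sketch of the gold-standard estimate: fuse the annotations into y_hat and
# Sigma_hat, then evaluate the GP posterior mean at the training inputs.
# Annotator variances and kernel hyperparameters are assumed known here.
import numpy as np

def rbf_kernel(Xa, Xb, length_scale=1.0, variance=1.0):
    """RBF kernel k(x, x') = variance * exp(-||x - x'||^2 / (2 * length_scale^2))."""
    d2 = np.sum(Xa**2, 1)[:, None] + np.sum(Xb**2, 1)[None, :] - 2.0 * Xa @ Xb.T
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gprma_gold_standard(X, Y, sigma2, length_scale=1.0, variance=1.0):
    """X: (N, 2) pixel coordinates; Y: (R, N) annotations; sigma2: (R,) annotator variances."""
    prec = np.sum(1.0 / sigma2)                        # 1 / sigma_hat_i^2 (equal for every pixel)
    y_hat = (Y / sigma2[:, None]).sum(axis=0) / prec   # precision-weighted fused annotation
    Sigma_hat = np.eye(X.shape[0]) / prec              # diag(sigma_hat_1^2, ..., sigma_hat_N^2)
    K = rbf_kernel(X, X, length_scale, variance)
    alpha = np.linalg.solve(K + Sigma_hat, y_hat)
    return K @ alpha                                   # posterior mean at the training pixels
```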

2.3 Procedure

GPRMA for Resolution Enhancement. To develop our methodology, we generate five HR images using five different interpolation methods: Nearest Neighbor interpolation (NN) [11], Bilinear interpolation (Bil) [12], Bicubic interpolation (Bic) [12], Lanczos interpolation (Lan) [11], and a method where the image columns are interpolated linearly and the rows are estimated using nearest-neighbor interpolation (LN). Then, each pixel intensity value in each of these HR images is considered as a corrupted version of the pixel intensity value in the corresponding hidden and true HR image (considered as the gold standard). Finally, we perform a patch-based Gaussian process regression with multiple annotators (GPRMA) to compute an estimate of the actual HR image from the annotations (i.e. the HR images generated using the interpolation methods described above). We use 10\(\,\times \,\)10 patches with no overlap; the patch size was chosen by cross-validation. A GPRMA is trained on each patch, considering as features the relative positions x and y with respect to the beginning of the patch, and as annotations the pixel intensity values in each HR image. We use the well-known radial basis function (RBF) kernel for all the experiments. Parameter estimation is performed by minimizing the negative log of the evidence (see [9]) using gradient descent.
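A possible implementation of this patch-wise procedure is sketched below, reusing the hypothetical gprma_gold_standard function from the previous sketch. Kernel hyperparameters and annotator variances are fixed by hand for brevity, whereas the paper estimates them by gradient descent on the negative log evidence; the image size is assumed to be a multiple of the patch size.

```python
# Sketch of the patch-based GPRMA enhancement: the HR grid is tiled into
# non-overlapping 10x10 patches, and a GPRMA estimate is computed per patch
# using the relative (x, y) pixel coordinates as features.
import numpy as np

def gprma_enhance(hr_annotations, sigma2, patch=10):
    """hr_annotations: (R, H, W) interpolated HR images; sigma2: (R,) annotator variances."""
    R, H, W = hr_annotations.shape
    out = np.zeros((H, W))
    # Relative coordinates inside a patch (shared by all patches).
    rows, cols = np.meshgrid(np.arange(patch), np.arange(patch), indexing="ij")
    X = np.column_stack([cols.ravel(), rows.ravel()]).astype(float)
    for r0 in range(0, H, patch):
        for c0 in range(0, W, patch):
            Y = hr_annotations[:, r0:r0 + patch, c0:c0 + patch].reshape(R, -1)
            f_bar = gprma_gold_standard(X, Y, sigma2, length_scale=3.0)
            out[r0:r0 + patch, c0:c0 + patch] = f_bar.reshape(patch, patch)
    return out
```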

Experimental Setup. In order to validate the proposed methodology, we first down-sample each image in the dataset described in Sect. 2.1, obtaining 31 LR images of 180\(\,\times \,\)175 pixels. Next, we use our GPRMA-based methodology to build the 360\(\,\times \,\)370-pixel HR images. We compare our approach with two super-resolution schemes based on Gaussian process regression: one uses the position of each pixel as features [6]; the other uses the pixel intensities of the nearest neighbors as features [7]. We compare all the HR images with the respective gold standard (i.e. the images in the original dataset). The performance of all methods is measured in terms of the mean squared error (MSE) of each HR image with respect to the ground truth. Finally, we validate the methodologies morphologically by segmenting nerve structures. This segmentation is carried out using an active shape model [13], initialized using a methodology based on graph cuts [14]. The performance of the morphological validation is measured in terms of the Dice coefficient [15].
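For reference, the two validation measures can be written as in the minimal NumPy sketch below; the segmentation step itself (the active shape model initialized with graph cuts) is outside the scope of this snippet, and the function names are illustrative.

```python
# Minimal sketch of the validation measures: pixel-wise MSE against the
# ground-truth HR image and the Dice coefficient between binary nerve masks.
import numpy as np

def mse(hr_estimate, hr_ground_truth):
    """Mean squared error between an up-sampled image and the ground truth."""
    diff = hr_estimate.astype(float) - hr_ground_truth.astype(float)
    return np.mean(diff ** 2)

def dice_coefficient(mask_a, mask_b):
    """Dice coefficient DC = 2 |A ∩ B| / (|A| + |B|) between two boolean masks."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    return 2.0 * intersection / (mask_a.sum() + mask_b.sum())
```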

3 Results and Discussions

First, we perform a direct comparison between the up-sampled ultrasound images and the gold standard (i.e. the images in the dataset). As pointed out in Sect. 2.3, this comparison is carried out in terms of the MSE (specifically, the average MSE computed over all the images in the dataset). Table 1 shows the average MSE results for the interpolation methods used as annotations in the GPRMA scheme. Moreover, this table reports the average MSE for the methodology proposed in [6] (GPR-1), the methodology proposed in [7] (GPR-2), and our methodology (GPRMA). In Fig. 2, we show the graphical error between the up-sampled images and the gold standard. We choose as gold standard one of the images in the dataset, corresponding to an ulnar nerve. Sub-figures (a)–(e) show the absolute error for the interpolation methods used as annotations for the regression scheme with multiple annotators. Sub-figures (f), (g) and (h) show the absolute error for the methodologies proposed in [6, 7] and the approach proposed in this paper.

Table 1. Average MSE between the up-sampled images and the ground truth.
Fig. 2. Graphical errors for the spatial resolution enhancement in ultrasound images. Sub-figures (a)–(e) correspond to the error images for the interpolation methods considered as annotations. Similarly, (f), (g) and (h) show the graphical error for the methods proposed in [6, 7] and our methodology, respectively.

From Table 1, it is possible to note that our methodology for spatial resolution enhancement outperforms the approaches based on Gaussian process regression (GPR-1 and GPR-2) in terms of the MSE. However, according to Sub-figure (h), our methodology interpolates efficiently all the pixel intensities in the HR image except for those located at the edges, where a blurring effect is observed. This is due to the naturally smooth behavior of Gaussian processes. Furthermore, we note that GPR-1 has a low performance on ultrasound images. Taking into account that GPR-1 was originally developed for resolution enhancement in magnetic resonance images, which are affected by a different type of noise than the one present in ultrasound images, its low performance is explained by its sensitivity to the speckle noise present in ultrasound images.

Then, we validate the HR images morphologically in order to determine which methods are appropriate. The morphological validation is performed by segmenting nerve structures in the image. The segmentation of nerves is a key issue in anesthesiology, since it is necessary for performing peripheral nerve blocking (which is used for regional anesthesia and for pain management). Table 2 shows the results for the morphological validation, measured in terms of the DC between the gold standard (manual segmentation by a specialist) and the segmentations obtained from the HR images and from the original images in the dataset.

Table 2. Morphological validation in terms of the Dice coefficient.

From Table 2, it is possible to note that there are no significant differences between the performance of our methodology (GPRMA), the interpolation methods, and the methodology proposed in [7] (GPR-2). Moreover, we can note that the methodology proposed in [6] has the lowest performance. On the other hand, taking into account both validation schemes proposed in this work, it is possible to determine that the proposed methodology outperforms the other approaches considered.

4 Conclusion

In this paper, we introduced a methodology for spatial resolution enhancement in ultrasound images based on an emerging area of supervised learning known as learning from multiple annotators. The results achieved with the proposed methodology outperform the GPR methods proposed in [6, 7] under both validation schemes: interpolation validation and morphological validation. Hence, it is possible to conclude that GPRMA is a promising methodology for enhancing spatial resolution in ultrasound images. Future work may be oriented towards developing a post-processing step aimed at highlighting the edges in the HR image. Moreover, the GPRMA model could be extended to model the dependence between the annotator expertise and the samples in the input space.