Keywords

1 Introduction

In magnetic resonance imaging (MRI) practice, a dilemma is inevitably considered in the balance between image resolution, signal-to-noise ratio (SNR), and scan time [1]. High resolution is required for obtaining more image details, correspondingly leading to increased acquisition time. Since SNR is proportional to the slice thickness and square root of acquisition time, long scan time results in reduced SNR, and in high probability introduces artifacts caused by patient motion. Considering MRI resource is limited and costly, thick slices and low scan time have to be leveraged in order to achieve a desired SNR for distinguishing the signal of interest from noise. Many techniques have been developed to handle this dilemma at the acquisition level, including parallel imaging [11] and robust k-space sampling [8]. Recently, it has been shown that super-resolution reconstruction (SRR) achieves better trade-off between image resolution, SNR, and scan time than direct high-resolution (HR) acquisition, for certain image contrasts and sequences, such as T\(_2\)-weighted images [9]. As a result, SRR is widely used in MRI to improve the image quality by means of post-processing.

SRR scheme is inspired by the work [17], where multiple low-resolution (LR) natural images are combined into an HR image. The multi-acquisition-based SRR scheme for MRI was introduced in [3]. A milestone work demonstrated that the SRR over multiple anisotropic acquisitions with rotation of the slice selection direction outperforms those with translation [14]. As an extension, the slices with arbitrary orientations and displacements were leveraged in [10]. With these SRR schemes, extensive approaches have been proposed, including frequency-based, image-based, and feature transform-based methods [5].

Image-based SRR has recently been receiving significant interests [4, 12, 13, 16] as it is straightforward to incorporate a forward model with image priors. The forward model describes the image acquisition process. Because the forward model poses an ill-posed problem, images priors are required to embed desired properties into the reconstructed image, e.g., spatial smoothness and edge enhancements. The state-of-the-art priors are Tikhonov cost [9], total variance (TV) [16], and bilateral TV, also known as multi-scale TV (MSTV) [2]. Tikhonov cost is computationally efficient due to the convex relaxation. However, it may result in reduced edge sharpness. TV-based priors are thus leveraged to preserve edges. Edge-directed SRR [15] is emerging as a robust prior to enhance edges in the reconstruction, and shows promising results in obtaining sharper images. However, it focuses on edge preservation only while leaving spatial smoothness unconsidered. Furthermore, that work is data-hungry since it relies on learning the gradient profile prior from a larger collection of natural images.

In this work, we consider the imaged-based MRI SRR based on anisotropic acquisition schemes. Inspired by the success of the edge-directed method [15] for single natural image super-resolution, we proposed a new multi-scale gradient field prior that guides the HR image reconstruction. The prior improves both spatial smoothness and edge preservation. The inverse of the forward model of image formation is used to propagate the gradient guidance from the LR images to the HR image space. The gradient fields over multiple scales are exploited for more accurate edge localization. Our method allows inter-volume motion during the MRI scans and can incorporate with the LR images with arbitrary orientations and displacements in frequency space, such as orthogonal and origin-shifted scans. We evaluate our method on the synthetic data as well as the data acquired on a Siemens 3T MRI scanner containing 45 MRI scans from 14 subjects. The evaluation results demonstrate that our method achieves superior reconstruction over state-of-the-art methods. In particular, the anatomical structures of hippocampus can be clearly shown in the reconstructed images by our method. This is a significant improvement for the in vivo studies on hippocampus.

2 Materials and Methods

We have a set of n LR images, denoted by \(\{{\mathbf {y}}_k\}_{k=1}^n\), acquired from n scans with arbitrary orientations and displacements in frequency space. The reconstructed HR image \({\mathbf {x}}\) is achieved from the following maximum a posterior estimation

$$\begin{aligned} \max _{\mathbf {x}}p\left( {\mathbf {x}}|\{{\mathbf {y}}_k\}_{k=1}^n\right) = \max _{\mathbf {x}}\prod _{k=1}^n p\left( {\mathbf {y}}_k|{\mathbf {x}}\right) p\left( {\mathbf {x}}\right) . \end{aligned}$$
(1)

The probability density function \(p\left( \{{\mathbf {y}}_k\}_{k=1}^n|{\mathbf {x}}\right) \) is expanded as the products of \(p\left( {\mathbf {y}}_k|{\mathbf {x}}\right) \) in Eq. (1) since the LR images are independently acquired. The likelihood \(p\left( {\mathbf {y}}_k|{\mathbf {x}}\right) \) can be formulated by a forward model that describes the MRI acquisition process. The prior \(p\left( {\mathbf {x}}\right) \) can be implemented by a regularization term to embed desired properties into the reconstructed HR image.

2.1 Forward Model

Let the images \({\mathbf {x}}\) and \({\mathbf {y}}_k\) be expressed as the column vectors by a lexicographical order of their voxels. The forward model of MRI is thus formulated as

$$\begin{aligned} {\mathbf {y}}_k = {\mathbf {D}}_k{\mathbf {H}}_k{\mathbf {T}}_k{\mathbf {x}}+ {\varvec{\mu }}_k,\qquad k=1,2,\dots ,n. \end{aligned}$$
(2)

\({\mathbf {T}}_k\) is a coordinate transform operator, and implemented as a rigid transform in this work to represent the inter-volume motion during the acquisition. \({\mathbf {H}}_k\) is a spatially invariant blur kernel. It describes the point spread function (PSF) of the MRI acquisition process, and can be decomposed into three kernels in the slice selection, the phase- and the frequency-encoding directions. We consider the PSF in the slice selection direction only and implement it as a truncated Gaussian function. \({\mathbf {D}}_k\) is a downsampling operator. \({\varvec{\mu }}_k\) denotes the noise during the acquisition. Referring to [6], \({\varvec{\mu }}_k\) can be assumed to be additive and Gaussian when the SNR > 3. As a result, the likelihood in Eq. (1) can be found by

$$\begin{aligned} p\left( {\mathbf {y}}_k|{\mathbf {x}}\right) =\frac{1}{\sqrt{2\pi }\sigma _k}\exp \left( -\frac{\left\| {\mathbf {y}}_k-{\mathbf {D}}_k{\mathbf {H}}_k{\mathbf {T}}_k{\mathbf {x}}\right\| _2^2}{2\sigma _k^2}\right) , \end{aligned}$$
(3)

where \(\sigma _k\) denotes the power of the noise \({\varvec{\mu }}_k\).

2.2 Multi-scale Gradient Field Prior

SRR is an ill-posed problem, and thus, the priors are used to compensate missing information, particularly the loss of high frequencies. Image gradient is usually exploited to preserve edges in the reconstruction, such as Tikhonov and TV priors. Edge-directed methods have recently been emerging as a robust prior [15], where a desired gradient field is imposed on the reconstructed image as a guidance of the edge locations as well as edge strengths. Specifically, the prior is implemented by enforcing \(\nabla {\mathbf {x}}=\widehat{\nabla {\mathbf {x}}}\), where \(\nabla \) computes the gradient of \({\mathbf {x}}\), and \(\widehat{\nabla {\mathbf {x}}}\) is an input and denotes the desired gradient field. In [15], \(\widehat{\nabla {\mathbf {x}}}\) is learned from a large collection of natural images. However, the learning is designed for edge preservation only while leaving spatial smoothness unconsidered. In this work, we propose a new analytical operator to guide the reconstructed image for both edge preservation and spatial smoothness:

$$\begin{aligned} g_\tau \left( x\right) =x \left( 1+\left( \frac{\tau }{x}\right) ^4\right) ^{-1}. \end{aligned}$$
(4)

By inputting an image gradient field \(\nabla {\mathbf {x}}'\), \(g_\tau \left( \cdot \right) \) enhances large edges for sharpness and punishes gradient perturbations for smoothness. The parameter \(\tau >0\) balances between the gradient enhancement and punishment. The gradient field prior for the HR image \({\mathbf {x}}\) is therefore obtained from \(\widehat{\nabla {\mathbf {x}}}=g_\tau \left( \nabla {\mathbf {x}}'\right) \). To this end, a strategy is required to estimate the image gradient field from the LR images and transform it to the HR image space to form \(\nabla {\mathbf {x}}'\).

The difficulty in obtaining \(\nabla {\mathbf {x}}'\) is that the blurring in the LR images results in inaccurate edge localization, leading to inaccurate gradient guidance for the reconstructed image. We improve the accuracy of the edge location estimation from two aspects: compute the gradient over multiple scales of the LR images and dynamically update the estimation during the optimization of the reconstruction.

We align up all the LR images and interpolate them by a 3rd-order B-spline method to the images of the same size as \({\mathbf {x}}\). The mean image of these interpolated images is used as \({\mathbf {x}}'\). The gradient field in the HR image space at a scale s is then computed from \(\nabla _s{\mathbf {x}}'\), and correspondingly \(\widehat{\nabla _s{\mathbf {x}}}=g_\tau \left( \nabla _s{\mathbf {x}}'\right) \). In this work, each scale s is related to a unique integer triplet \(\left( \alpha ,\beta ,\gamma \right) \) that indicates the scales in x, y, and z directions, respectively. We thus define the operator \(\nabla _s\) as

$$\begin{aligned} \nabla _s = \omega ^{|\alpha |+|\beta |+|\gamma |} \left( {\mathbf {I}}-\varPsi _\alpha ^x\right) \left( {\mathbf {I}}-\varPsi _\beta ^y\right) \left( {\mathbf {I}}-\varPsi _\gamma ^z\right) , \end{aligned}$$
(5)

where \(0<\omega <1\) denotes a weight parameter, \({\mathbf {I}}\) denotes an identity matrix, and the operators \(\varPsi _\alpha ^x\), \(\varPsi _\beta ^y\), and \(\varPsi _\gamma ^z\) shift \({\mathbf {I}}\) by \(\alpha \), \(\beta \), and \(\gamma \) voxels in x, y, and z directions, respectively. To exclude the repeated scales, we constrain \(\alpha \in \left[ -p,p\right] \), \(\beta \in \left[ 0,p\right] \), \(\gamma \in \left[ 0,p\right] \), and \(\alpha +\beta +\gamma \ge 0\). Since the scale at 0 is trivial, we further exclude the case \(\alpha =\beta =\gamma =0\). We set \(p=2\) to create 40 gradient fields.

For a robust gradient guidance, \(\nabla _s{\mathbf {x}}\) is desired to be located at \(\widehat{\nabla _s{\mathbf {x}}}\) and yield a heavy-tailed distribution, such as Laplace distribution. As a result, the prior of our SRR approach is formulated as

$$\begin{aligned} p\left( {\mathbf {x}}\right) =\prod _{s=1}^S \frac{1}{2\sigma } \exp \left( -\frac{\left\| \nabla _s{\mathbf {x}}-g_\tau \left( \nabla _s{\mathbf {x}}'\right) \right\| _1}{\sigma }\right) , \end{aligned}$$
(6)

over S gradient fields, where \(\sigma \) is the variance of the distribution. It is evident that the MSTV prior [2] is a special case of Eq. (6) when \(g_\tau \left( \nabla _x'\right) \) is always zero. However, the proposed multi-scale gradient field prior is essentially different from the MSTV prior and its variants. The desired gradient field is exploited in the proposed approach to guide both the spatial smoothness and edge preservation over multiple scales in the SRR.

2.3 Super-Resolution Reconstruction

Referring to Eqs. (1), (3), and (6), the SRR for the HR image \({\mathbf {x}}\) is achieved from

$$\begin{aligned} \min _{\mathbf {x}}\sum _{k=1}^n\left\| {\mathbf {y}}_k-{\mathbf {D}}_k{\mathbf {H}}_k{\mathbf {T}}_k{\mathbf {x}}\right\| _2^2+\lambda \sum _{s=1}^S\left\| \nabla _s{\mathbf {x}}-g_\tau \left( \nabla _s{\mathbf {x}}'\right) \right\| _1, \end{aligned}$$
(7)

where \(\lambda >0\) is a parameter balancing between the likelihood and the prior. We solve Eq. (7) by a gradient descent algorithm, where \({\mathbf {x}}\) is estimated at iteration t by \({\mathbf {x}}^t={\mathbf {x}}^{(t-1)}+\eta \varDelta {\mathbf {x}}^{t}\) with the learning rate \(\eta \) and the increment \(\varDelta {\mathbf {x}}^{t}\). Meanwhile, to obtain an accurate image gradient guidance, we dynamically update the gradient field over scale-s by \(g_\tau \left( \nabla _s{\mathbf {x}}^t\right) \) at certain t, e.g., every 50 iterations. \(\tau \) in Eq. (4) is set to 0.065. According to [2], \(\omega \) in Eq. (5) is set to 0.6. We suggest \(\lambda \) in Eq. (7) is set in [0.03, 0.1] according to our empirical results.

Our SRR approach can incorporate with various acquisition schemes. We use the orthogonal (axial, coronal, and sagittal) and the origin-shifted acquisitions to demonstrate the efficacy of our method by comparing to those based on state-of-the-art priors, including Tikhonov, TV, MSTV, Huber loss, and Rousseau [12]. We have the synthetic data and the data acquired on a Siemens 3T MRI scanner containing 45 scans from 14 subjects for the demonstrations. All scans were performed in accordance with the local institutional review board protocol.

Synthetic Orthogonal LR Acquisitions. We simulated a T\(_1\)-weighted data set based on the Dryad data package [7]. We downsampled the original image of isotropic resolution of 250 \(\upmu \)m to that of 500 \(\upmu \)m, and used it as the ground truth. We then downsampled the ground truth image by the factors of \(\{2,3,4,5,6,8\}\) in the x, y, and z directions, respectively, to obtain an axial, an coronal, and an sagittal LR image at each factor. Simulated motion was added to these downsampled images. We evaluate the reconstructed HR image quality by peak signal-to-noise ratio (PSNR), SSIM [18], and root square mean error (RSME).

Axial and Coronal LR Acquisitions (AC). We acquired a T\(_2\)-weighted turbo spin-echo (T\(_2\)W-TSE) data set containing 36 scans from 12 subjects. For each subject, an axial and a coronal LR images were acquired with in-plane resolutions of 0.4 \(\times \) 0.4 mm\(^2\) and slice thickness of 2 mm, and correspondingly a T\(_2\)-weighted HR image was acquired as the reference. It takes about 2 min in acquiring a T\(_2\)W-TSE LR image on a Siemens 3 T scanner. The HR images are reconstructed at the resolution of 0.4 mm\(^3\). Instead of RSME, we employ the sharpness metric [4], which is an image gradient-based criteria, to evaluate the edge preservation in the reconstruction. We also qualitatively investigate the edge enhancement.

Fig. 1.
figure 1

Evaluation results of our and other three methods on the synthetic data set.

Axial, Coronal, and Sagittal LR Acquisitions (ACS). We acquired a T\(_2\)-weighted data set containing an axial, a coronal, and a sagittal LR image from a volunteer. The in-plane and through-plan resolutions of the 3 images are 0.4 \(\times \) 0.4 mm\(^2\) and 2 mm, respectively. The HR image is reconstructed at isotropic 0.4 mm. We qualitatively compare our reconstruction results to those of baseline methods.

Origin-Shifted Coronal LR Acquisitions (OSC). We acquired a T\(_2\)W-TSE data set containing 3 scans of a coronal LR image of resolution 0.4 \(\times \) 0.4 \(\times \) 0.8 mm, and 3 scans of a coronal LR image at the same resolution but with its origin shifted by 0.4 mm. The HR image is reconstructed at isotropic 0.4 mm. We conducted 3 scans for each coronal image for noise suppression in the reconstruction. We aim at clearly showing the hippocampus by our SRR approach.

3 Results and Discussions

Figure 1 shows the evaluation results on the synthetic data set. It is evident that our approach outperforms the 3 baselines that are respectively based on Tikhonov, TV, and MSTV priors. It indicates that the proposed multi-scale gradient field prior improves the SRR on the synthetic data set.

Figure 2 shows the evaluation results obtained from our approach and the baselines on the AC data set. The average performance of our method over the 12 sets of reconstructions is: PSNR = 25.96 ± 1.49, SSIM = 0.79 ± 0.047, sharpness = 1.03 ± 0.024, which is better than the 5 baselines in terms of all the 3 metrics. Figure 3 shows a representative reconstructed result on the AC data set. A representative edge is analyzed and the results are shown in Fig. 3(d). It is evident that our proposed prior leads to better edge enhancement than the 3 TV-based baseline priors. The reference image has low contrast due to the long time scan. Therefore, the MSTV, TV and our methods obtain better edges than the reference.

Fig. 2.
figure 2

Evaluation results of our and other five methods on the AC data set.

Fig. 3.
figure 3

Reconstructed results on the AC data set. (a) A slice from the axial LR image. (b) A slice from the coronal LR image. (c) A slice form the reconstructed HR image by the proposed approach. (d) Edge profile plot over the line marked in red in (c). (Color figure online)

Fig. 4.
figure 4

Reconstructed results on the ACS data set, obtained from (a) Tikhonov, (b) TV, (c) MSTV, and (d) the proposed approaches, respectively.

Figure 4 shows the qualitative results on the ACS data set obtained from the Tikhonov, TV, MSTV, and our approach. It can be seen that the SRR over the orthogonal LR scans can achieves good HR images. Our approach performs the best in showing image details and produces sharper edges than the 3 baselines.

Figure 5 shows the qualitative results on the OSC data set. We can see that our method produces both region-smoothed and edge-sharpened HR image, as compared to the 3 baselines. In particular, the anatomical structures of the hippocampus can be clearly shown in different image planes of our reconstructed image. This is a significant improvement for the in vivo studies on hippocampus.

Fig. 5.
figure 5

Reconstructed results on the OSC data set. Two coronal slices from (a) an LR image and (b) our reconstruction. The hippocampus in sagittal slices from (c) an LR image, and the images reconstructed via (d) Tikhonov, (e) TV, (f) MSTV, and (g) the proposed approach, respectively.

4 Conclusions

In this work, we have proposed a novel image-based MRI SRR approach based on anisotropic acquisition schemes. We have achieved superior reconstruction to state-of-the-art methods by introducing a new multi-scale gradient field prior that guides the reconstruction of the HR image. The inverse of the forward model of image formation has been used to propagate the gradient guidance from the LR images to the HR image space. The proposed approach has been evaluated on the synthetic data as well as the data acquired on a Siemens 3T MRI scanner containing 45 MRI scans from 14 subjects. The evaluation results have demonstrated that our proposed prior leads to improved SRR as compared to state-of-the-art priors, and that, based on the orthogonal and the origin-shifted LR acquisition schemes, the proposed SRR obtains better results at lower or the same cost in scan time than the direct HR acquisition.