1 Introduction

Spinal fusion surgery is one of the most commonly employed procedures for treating various spinal conditions, including scoliosis, spinal stenosis, degenerative disc disease and spondylolisthesis [1]. The procedure consists of using a bone graft to fuse two or more vertebral bodies together into one single rigid structure. In most cases, the surgeon additionally uses metal plates, screws and rods to support the vertebrae while the bones fuse. A crucial part of the spinal instrumentation procedure is the placement of pedicle screws, which has been associated with a high rate of complications related to screw malpositioning [2]. The accuracy required for pedicle screw placement varies significantly depending on the size of the screw, the vertebra level and the anatomy. Rampersaud et al. [3] reported a maximum error tolerance of screw malpositioning below 1 mm translation and \(5^\circ \) rotation at the midcervical spine, the midthoracic spine, and the thoracolumbar junction. The tolerance is higher in the thoracolumbar spine, where \(3.8\,\text {mm}/12.7^\circ \) at the L5 vertebra was estimated.

Image-guided navigation systems (IGNS) have been shown to reduce the screw malpositioning rate by providing information on instrument localization with respect to the patient’s anatomy. For IGNS to be functional during surgery, the registration step that aligns pre-operative images to the current state of the patient’s anatomy must be accurate. In a standard clinical procedure, the registration is achieved by manually identifying homologous anatomical landmarks on both the pre-operative images and the patient. The procedure lasts approximately 10 to 15 min for each vertebra [4, 5]. This approach is tedious, extends the operating time and is subject to operator variability. Moreover, during navigation, a dynamic reference object (DRO) (i.e., a spatially tracked tool) is rigidly attached to the spinous process of a vertebra and serves as a reference coordinate frame to account for patient positioning and motion during surgery. Once the registration is achieved, changes in the position of the DRO caused by patient movement, surgical interventions or inadvertent contact with the DRO may invalidate the registration.

Common commercial IGNS, such as the O-arm (Medtronic Inc., Minneapolis, MN), Airo Mobile (Brainlab, Feldkirchen, Germany), SpineMask (Stryker, Kalamazoo, MI) or Ziehm Vision FD Vario 3D (Ziehm Imaging, Orlando, FL), rely on intra-operative fluoroscopy or computed tomography (CT) imaging. These imaging modalities introduce risks of harmful radiation exposure for both the patient and the operating room (OR) personnel. Moreover, they require a typical setup time of \({\sim }15\,\text {min}\) [6] and extra personnel for manipulating the equipment, which significantly extends the surgical procedure time.

Intra-operative ultrasound (iUS) has been investigated as a possible alternative imaging modality in orthopedic and spine surgery applications [7,8,9]. With its low cost, absence of ionizing radiation, small footprint and significantly shorter setup time in the OR, iUS imaging is a good candidate for image-guided navigation. However, three limitations hinder its application in clinical environments: ultrasound images can have a low image quality that affects the registration accuracy, a limited field of view that precludes imaging large or distant structures, and shadow artifacts induced by the high acoustic absorption of bones.

The goal of this paper is to propose a fast CT-to-iUS image registration method designed for use in the OR during spine surgery. Specifically, we present a rigid registration framework to align pre-operative CT to iUS images of a single vertebra. Considering the rigid anatomical structure of the bones, single vertebra registration is a common step to achieve a more global group-wise multi-vertebrae registration to capture the spine curvature [10,11,12]. The motivations behind this work are three-fold: (i) to develop a radiation-free approach that relies solely on iUS imaging, (ii) to design an unobtrusive and straightforward procedure compatible with the surgical workflow, and (iii) to design a fast registration method that allows the surgeon to rapidly realign the images to correct for patient-to-image misregistration during surgery. The remainder of this paper is organized as follows: Sect. 2 reviews previous work using iUS-based IGNS in spine surgery. Section 3 presents the proposed registration framework. The experimental setup is described in Sect. 4 and results are presented in Sect. 5.

2 Related Work

In order to achieve high CT-to-iUS registration accuracy, several authors have exploited unique properties of iUS imaging. Strong ultrasound reflections on bone structures cause the vertebra to appear black on iUS images, with a hyperechoic edge several mm thick on the bone surface [13]. Yan et al. [4, 14] proposed to use backward and forward tracing approaches to first extract the posterior surface of the vertebra on iUS and CT images, respectively. Then, a rigid cross-correlation registration is applied to align the vertebra surfaces. The authors reported a median target registration error (TRE) ranging from 1.65 to 2.31 mm on porcine cadavers. A slice-to-volume variant of the approach, proposed in [15], in which the registration is performed without iUS volume reconstruction to accelerate the computations, achieved comparable accuracy. The reported registration time was around 120 s per vertebra. Both methods require an initial alignment, assumed to be achieved prior to the registration with a landmark-based manual registration.

A hierarchical CT-to-iUS vertebra registration framework was proposed by Koo et al. [8]. The registration involves three steps. First, similar to Yan’s approach [4], a landmark-based manual registration provides an initial guess of the alignment. Second, a rigid cross-correlation registration is applied to the posterior vertebral surface extracted on both iUS and CT images with the backward and forward tracing methods. Finally, an additional intensity-based rigid registration is performed between the vertebra surface on the CT image and the original iUS image. The average TRE reported was \(2.18\,\text {mm}\pm 0.82\,\text {mm}\) (ranging from 0.89 to 4.45 mm) on a porcine cadaver. Excluding the manual landmark registration, intensity-based registrations were achieved in \({\sim }100\,\text {s}\) per vertebra.

Nagpal et al. [10] proposed a multi-vertebrae CT-to-iUS registration framework. Here again, the posterior surface of the vertebrae is exploited [16] and the registration is achieved in three steps. First, with the assumption that both CT and iUS images represent similar structures, the initial alignment is obtained by applying a rigid registration using mutual information on the vertebra surface images. Then, the registration is refined by applying a global point-based registration using the vertebra surface coordinate points. To account for the spine curvature over multiple vertebrae, an additional group-wise vertebra registration is performed, in which intervertebral points are manually added to prevent physically incoherent transformations. Because the study was conducted on clinical data of human subjects, a gold standard registration was not possible. Instead, the authors used a manual landmark registration combined with the proposed method to serve as the ground truth registration. They reported an average TRE of anatomical landmarks ranging from 0.71 to 1.70 mm and a computation time ranging from 50 to 185 s.

3 Registration Framework

Figure 1 shows an overview of the proposed registration framework. The approach involves four intra-operative steps: (1) extract the posterior surface of the vertebra on iUS images, (2) create an iUS compounded volume from iUS acquisition slices, (3) estimate the initial alignment, and (4) perform gradient alignment of the vertebra surfaces of CT and iUS images. The posterior vertebra surfaces on CT and iUS images are extracted using the forward and backward tracing methods [4], respectively. This extraction approach has the advantage of being fast while providing good results.

Fig. 1. Flowchart of the proposed registration framework.

3.1 Intra-operative Ultrasound Image Acquisition

We use an iUS-based navigation system composed of an optical tracking camera (Polaris, Northern Digital Inc., Ontario, Canada), an ultrasound machine with a tracked phased array probe (HDI 5000/P4-7, Philips, Amsterdam, Netherlands) and a tracked tool used as a DRO. The ultrasound probe is calibrated such that collected images are associated with their respective spatial position and orientation relative to the DRO. The Intraoperative Brain Imaging System (IBIS) [17] open-source platform is used for navigation, i.e., probe calibration, tracking and 3D visualization.

The acquisition frame rate of iUS images is around 25 Hz, which may introduce redundant information in successive frames. To reduce the computation time of vertebra surface extraction and volume compounding, the number of acquired frames is reduced such that a minimum distance \(d \in \mathbb {R}_{\ge 0}\) separates the centroids of successive frames. A high value of d yields a sparse volume and fast computations, while a value of zero does not modify the acquisition. The frames satisfying the distance criterion are selected for the next steps.
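The frame reduction can be illustrated with a minimal sketch. It assumes a greedy rule in which a frame is kept when its centroid lies at least d away from the centroid of the last kept frame; the function name `select_frames` and the synthetic centroids are hypothetical and only serve the illustration.

```python
import math


def select_frames(centroids, d):
    """Return indices of frames kept so that successive kept centroids
    are at least d apart (assumed greedy rule, not the authors' code).

    centroids: sequence of (x, y, z) frame centroids in mm, in acquisition order.
    d: minimum distance in mm; d = 0 keeps every frame.
    """
    if not centroids:
        return []
    kept = [0]  # the first frame is always kept
    for i in range(1, len(centroids)):
        # Compare against the last *kept* frame, not the previous raw frame
        if math.dist(centroids[i], centroids[kept[-1]]) >= d:
            kept.append(i)
    return kept


# Synthetic sweep: 11 centroids spaced 0.2 mm apart along one axis
centroids = [(0.0, 0.0, 0.2 * i) for i in range(11)]
print(select_frames(centroids, 0.5))  # -> [0, 3, 6, 9]
print(len(select_frames(centroids, 0.0)))  # -> 11 (acquisition unchanged)
```

With a spacing of 0.2 mm and d = 0.5 mm, every third frame is kept, which shows how d trades volume density for computation time.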

3.2 Ultrasound Volume Compounding

The selected frames are combined into a single volume by aggregating the ultrasound slices to form a resampled volume, the compounded volume, to avoid a full volume reconstruction as proposed in [15]. Because the relationship between the spatial positions of the ultrasound slices is fixed, registering the compounded volume to the CT volume is analogous to simultaneously optimizing for a slice-to-volume rigid body registration of each individual iUS slice to the CT volume. In our implementation, each iUS pixel intensity is resampled at its corresponding 3D location in the compounded volume, and the intensities are averaged for overlapping pixels. It is important to consider the spatial resolution of the resampled compounded volume. Figure 2 shows examples of volume compounding with different resolutions. While a fine resolution results in a large but highly sparse volume, a coarse resolution results in a small but dense volume. Note that because we use gradient information in the final alignment step, an overly sparse volume precludes capturing inter-slice gradient information.

Fig. 2. Examples of ultrasound volume compounding with a resolution of \(2\times 2 \times 2\,\text {mm}^3\) (left), \(1\times 1 \times 1\,\text {mm}^3\) (middle) and \(0.5\times 0.5 \times 0.5\,\text {mm}^3\) (right).
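The averaging of overlapping pixels described in this section can be sketched as follows. This is an illustrative NumPy version under stated assumptions: pixel positions are already expressed in mm in the reference frame, the grid is isotropic, and empty voxels are set to zero. The function name `compound_volume` and its parameters are hypothetical.

```python
import numpy as np


def compound_volume(points_mm, intensities, origin_mm, spacing_mm, shape):
    """Average the pixel intensities that fall into each voxel of a regular grid.

    points_mm: (N, 3) pixel positions in mm (assumed already tracked/calibrated).
    intensities: (N,) pixel intensities.
    origin_mm: (3,) position of the corner of voxel (0, 0, 0).
    spacing_mm: isotropic voxel size in mm.
    shape: (nx, ny, nz) grid size.
    Returns the averaged volume and the per-voxel pixel counts.
    """
    points_mm = np.asarray(points_mm, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    # Voxel index of every pixel
    idx = np.floor((points_mm - np.asarray(origin_mm, dtype=float)) / spacing_mm).astype(int)
    # Discard pixels that fall outside the grid
    inside = np.all((idx >= 0) & (idx < np.asarray(shape)), axis=1)
    flat = np.ravel_multi_index(tuple(idx[inside].T), shape)
    n_vox = int(np.prod(shape))
    sums = np.bincount(flat, weights=intensities[inside], minlength=n_vox)
    counts = np.bincount(flat, minlength=n_vox)
    # Average where at least one pixel landed; empty voxels stay at 0 (assumption)
    volume = np.divide(sums, counts, out=np.zeros(n_vox), where=counts > 0)
    return volume.reshape(shape), counts.reshape(shape)


# Three synthetic pixels: two overlap in voxel (0, 0, 0), one lands in (1, 0, 0)
pts = [[0.2, 0.2, 0.2], [0.4, 0.4, 0.4], [1.2, 0.1, 0.1]]
vol, counts = compound_volume(pts, [10.0, 20.0, 30.0], (0, 0, 0), 1.0, (2, 2, 2))
fill_fraction = np.count_nonzero(counts) / counts.size  # density of the volume
```

The `fill_fraction` value gives a simple way to quantify the sparsity discussed above: for a fixed set of slices, it decreases as the voxel size is reduced.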

3.3 Initial Alignment

In order to estimate the initial alignment, we define a simple sweep procedure to limit the variability in the translational and angular positioning of the iUS probe during the acquisition. The quality of the iUS acquisition has a significant impact on the registration [14]. Thus, our acquisition procedure consists of a single axial sweep along the cranio-caudal direction, starting from the inferior extremity up to the superior extremity of the vertebra, with the probe orientation normal to the coronal plane (Fig. 3).

This acquisition procedure has three key properties: (1) assuming that the same number of vertebrae is imaged with both CT and iUS, the center of mass of the selected iUS frame centroids roughly corresponds to the center of the CT image, (2) the scan trajectory is approximately linear along the inferior to superior axis, and (3) on the iUS image plane, the proximal to distal axis from the probe’s transducers corresponds to the posterior to anterior axis on the vertebra. Based on this, three anatomical points are created in physical space: a center of mass \(\mathbf {p}_{\text {mass}}^{\text {US}}\), a superior point \(\mathbf {p}_{\text {sup}}^{\text {US}}\) at a 10 mm distance from \(\mathbf {p}_{\text {mass}}^{\text {US}}\) toward the superior direction, and a distal point \(\mathbf {p}_{\text {distal}}^{\text {US}}\) at a 10 mm distance from \(\mathbf {p}_{\text {mass}}^{\text {US}}\) toward the anterior direction. Similarly, three homologous points \(\mathbf {p}_{\text {mass}}^{\text {CT}}\), \(\mathbf {p}_{\text {sup}}^{\text {CT}}\) and \(\mathbf {p}_{\text {distal}}^{\text {CT}}\) are created on the CT image. Finally, the initial alignment transform is obtained by applying a Procrustes point-based rigid registration, minimizing the least-square distances between the CT and the iUS points.

Fig. 3. Illustration of the acquisition procedure.
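The point-based initial alignment can be sketched with the standard SVD solution of the rigid Procrustes problem. The sketch below is illustrative only: the helper names `anchor_points` and `rigid_procrustes`, the choice of coordinate axes for the superior and anterior directions, and the mapping direction (CT points onto iUS points) are assumptions made for the example.

```python
import numpy as np


def anchor_points(center, superior_dir, anterior_dir, offset_mm=10.0):
    """Build the three points (mass, superior, distal) from a center
    and two directions, each offset by offset_mm along a unit vector."""
    c = np.asarray(center, dtype=float)
    s = np.asarray(superior_dir, dtype=float)
    a = np.asarray(anterior_dir, dtype=float)
    s = s / np.linalg.norm(s)
    a = a / np.linalg.norm(a)
    return np.stack([c, c + offset_mm * s, c + offset_mm * a])


def rigid_procrustes(src, dst):
    """Least-squares rigid transform (R, t) such that R @ src[i] + t ~ dst[i]."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)  # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so that the result is a rotation, not a reflection
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t


# Assumed axes for the example: CT superior = +z, CT anterior = +y
ct_pts = anchor_points([0, 0, 0], [0, 0, 1], [0, 1, 0])
# Synthetic iUS points: same configuration rotated 90 degrees about x and shifted
us_pts = anchor_points([5, -3, 12], [0, -1, 0], [0, 0, 1])
R, t = rigid_procrustes(ct_pts, us_pts)
```

Because the three points are not collinear, they determine the rigid transform uniquely; the reflection guard matters here since three points only span a plane.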

3.4 GPU-Based Gradient Alignment Registration

The initial alignment approach roughly registers the CT to iUS images, based on the acquisition procedure described in Sect. 3.3. To refine the registration, we perform a gradient alignment registration [18]. Originally, the approach was designed for brain MR-to-iUS registration. First, the gradients of both the fixed iUS image and the moving CT image are extracted. Then, a covariance matrix adaptation (CMA) evolution strategy [19] is used to maximize the inner product of the normalized gradients:

$$\begin{aligned} S(\nabla I_\text {US}(\mathbf {x}), \nabla I_\text {CT}(\mathbf {x})) = \left\langle \frac{\nabla I_\text {US}(\mathbf {x})}{|\nabla I_\text {US}(\mathbf {x})|}, \frac{\nabla I_\text {CT}(\mathbf {x})}{|\nabla I_\text {CT}(\mathbf {x})|} \right\rangle ^n, \end{aligned}$$
(1)

where \(\mathbf {x}\) is the image coordinate vector, \(\nabla I_\text {US}\) and \(\nabla I_\text {CT}\) are the fixed iUS and moving CT image gradients, respectively, and \(n \in \mathbb {N}\) is a free parameter which characterizes the matching criterion and was set to \(n=64\). To reduce the computation time, the metric is computed on a subset of points sampled among the most confident gradients on the image. We slightly modified the algorithm to take into account the vertebra surface on the intra-operative images. Instead of a random sampling over the entire image, the points are sampled from a 2 mm thick region around the iUS extracted bone surface. Candidates satisfying the low uncertainty criterion (see [18] for details) among the bone surface points are then selected to be used in Eq. (1). Gradient image computations of \(\nabla I_\text {US}\) and \(\nabla I_\text {CT}\), and evaluation of the similarity metric in Eq. (1) are performed on a GPU. The final registration transform is given by:

$$\begin{aligned} T^\text {reg} = \arg \max _T S\Big (\nabla I_\text {US}(\mathbf {x}), \nabla I_\text {CT}(T(\mathbf {x}))\Big ). \end{aligned}$$
(2)

Finally, we perform the registration using a multi-scale approach. Two different scales are used. In the first pass, the images are smoothed using a Gaussian filter with \(\sigma =2\,\text {mm}\) to capture large structures, e.g., thicker surface of the vertebra. A second registration pass is then performed on images filtered with \(\sigma =1\,\text {mm}\).
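The similarity metric of Eq. (1) can be illustrated with a short CPU sketch. It assumes that the per-point values of Eq. (1) are aggregated by their mean over the sampled points, which the text does not specify, and it does not reproduce the GPU implementation, the point selection or the CMA optimizer. The function name `gradient_alignment` and the `eps` guard are hypothetical.

```python
import numpy as np


def gradient_alignment(grad_us, grad_ct, n=64, eps=1e-12):
    """Mean over sampled points of <g_us/|g_us|, g_ct/|g_ct|>^n, following Eq. (1).

    grad_us, grad_ct: (N, 3) gradient vectors at the same N sample locations.
    n: exponent of the matching criterion (n = 64 in the text).
    eps: guard against division by zero for null gradients (assumption).
    """
    g_us = np.asarray(grad_us, dtype=float)
    g_ct = np.asarray(grad_ct, dtype=float)
    u = g_us / np.maximum(np.linalg.norm(g_us, axis=1, keepdims=True), eps)
    v = g_ct / np.maximum(np.linalg.norm(g_ct, axis=1, keepdims=True), eps)
    cos = np.sum(u * v, axis=1)  # cosine of the angle between the gradients
    return float(np.mean(cos ** n))


# Effect of a 10 degree angular mismatch on a single sample point
angle = np.deg2rad(10.0)
aligned = gradient_alignment([[1, 0, 0]], [[2, 0, 0]])
tilted = gradient_alignment([[1, 0, 0]], [[np.cos(angle), np.sin(angle), 0]])
```

With an even exponent, parallel and anti-parallel gradients score equally, so the metric is insensitive to the sign of the intensity transition. A large n makes the metric sharply peaked: in the example, `tilted` equals \(\cos (10^\circ )^{64} \approx 0.38\) whereas `aligned` is 1.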

4 Experiments

We validated our proposed registration framework on the same dataset used in [14]. The dataset contains a CT scan of a lumbosacral section of a porcine cadaver in supine position, in which vertebrae T15 and L1 to L6 were present. The CT scan consists of superior-to-inferior axial slices acquired using a Picker International PQ6000 CT scanner with an in-slice resolution of \(0.35 \times 0.35\,\text {mm}^2\) and a slice thickness of 2 mm.

For each vertebra, three to four fiducials were implanted on the anterior/ventral part of the cadaver, such that they do not interfere with the iUS acquisition. The fiducials are made of pipette tips that can be nested together. Each fiducial is composed of three parts: a fiducial base which is rigidly fixed to the vertebra, an imaging marker which is a steel sphere inside the pipette that appears bright in CT images, and a reference marker which is a filled pipette such that its center corresponds to the center of the sphere in the imaging marker. Imaging fiducial positions were collected by computing the centers of the segmented bright spheres that appear on the CT image. Reference fiducial positions were manually collected using a tracked pointer with IBIS. The ground truth registration transform of each vertebra was obtained by applying a point-based registration on its corresponding fiducials.

The fiducials were used to establish the ground truth registration transform. Therefore, using the fiducial positions to assess vertebrae alignment is not suitable. In fact, computing the fiducial registration error (FRE) may not be representative of the TRE at the vertebra surface. Moreover, because the fiducials were placed far from the vertebra surface, a small misalignment of the fiducial points (i.e., small FRE) may result in a large TRE at the vertebra surface. To evaluate the TRE of the registration, seven landmarks were manually identified on the surface of each vertebra on the CT images. The anatomical landmarks correspond to: a point on the apex of the spinous process, two points on the left and right laminae, two points on the left and right superior articular processes and two points on the left and right inferior articular processes. The TRE of each vertebra is obtained by:

$$\begin{aligned} \textit{TRE}_v = \sqrt{\frac{1}{7} \sum _{i=1}^{7} |T^\text {gt} \mathbf {p}_i - T^\text {reg} \mathbf {p}_i|^2}, \end{aligned}$$
(3)

where v is the vertebra level, \(T^{\text {gt}}\) is the ground truth registration transform obtained from fiducial point-based registration, \(T^{\text {reg}}\) is the registration transform obtained from Eq. (2) and \(\mathbf {p}_i\) is the \(i^\textit{th}\) landmark point manually positioned on the vertebra surface. In the literature [10, 14], a threshold of 2 mm is commonly used to characterize a successful registration, i.e., the registration is considered successful if its associated TRE is below 2 mm. Similarly, in our experiment, we use a 2 mm threshold to report the success rate of the registration.
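As an illustration of Eq. (3), the TRE can be computed as below. The sketch assumes that both transforms are available as 4×4 homogeneous matrices; the function name `tre` and the synthetic landmarks are hypothetical.

```python
import numpy as np


def tre(T_gt, T_reg, landmarks):
    """Root-mean-square distance between the landmarks mapped by the
    ground truth transform and by the estimated transform, as in Eq. (3).

    T_gt, T_reg: 4x4 homogeneous rigid transforms (assumed representation).
    landmarks: (K, 3) landmark positions in mm (K = 7 in the text).
    """
    p = np.asarray(landmarks, dtype=float)
    ph = np.hstack([p, np.ones((len(p), 1))])  # homogeneous coordinates
    diff = (ph @ np.asarray(T_gt, dtype=float).T
            - ph @ np.asarray(T_reg, dtype=float).T)[:, :3]
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


# Synthetic check: a pure translation error of (1, 2, 2) mm gives a TRE of 3 mm
landmarks = np.arange(21, dtype=float).reshape(7, 3)
T_gt = np.eye(4)
T_reg = np.eye(4)
T_reg[:3, 3] = [1.0, 2.0, 2.0]
error_mm = tre(T_gt, T_reg, landmarks)
is_success = error_mm < 2.0  # 2 mm success threshold used in the text
```

For a pure translation error, every landmark is displaced by the same amount, so the TRE equals the norm of the translation regardless of where the landmarks are; rotational errors, in contrast, grow with the distance of the landmarks from the rotation axis.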

In addition to the registration accuracy, we measured the computation time required to perform the registration of each vertebra. The computations involve three main tasks: extracting the vertebra surface (backward tracing), compounding the iUS volume and aligning CT to iUS volumes (i.e., solving Eq. (2)). We also report the computation time required to perform the initial alignment, although it can be neglected due to its small contribution to the overall registration time. Note that the iUS acquisition time, i.e., the time required to manipulate the iUS probe and perform the sweep, is not reported in this study. All computations were performed on a workstation with an Intel\(^\copyright \) Core™ i7-3820 CPU at \(3.6\,\text {GHz} \times 8\) and an NVIDIA GeForce GTX 670 graphics card with \(4\,\text {GB}\) of memory.

Fig. 4. Example of qualitative results showing superimposition of a registered CT image and an iUS image of the L4 vertebra: (red) iUS image, (blue) vertebra surface extracted on iUS with backward tracing, (gray) CT image, (green) vertebra surface extracted on CT with forward tracing. (Color Figure online)

Using a stochastic CMA evolution strategy to optimize Eq. (2) yields non-deterministic results. To measure the overall registration accuracy, for each vertebra, 10 batches of registrations are performed. Each batch involves two steps. The first step, referred to as Reconstruction, consists of performing a vertebra surface extraction on the iUS image and a slice compounding into a volume. The second step, referred to as Registration, consists of performing 10 repetitions of the CT-to-iUS registration, i.e., an initial alignment followed by a gradient alignment. In total, 100 registration trials were performed for each vertebra. It should be noted that the trials use the same CT and iUS acquisition images for each vertebra. We set the distance threshold for acquisition frame reduction to \(d=0.5\,\text {mm}\) (see Sect. 3.1). The resolution of the iUS compounded volume (see Sect. 3.2) is set to \(1.5\times 1.5 \times 1.5\,\text {mm}^3\), to produce sufficiently dense volumes.
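Since the trials are non-deterministic, their outcome is naturally summarized by robust statistics and a success rate. The following sketch shows one way to aggregate a list of per-trial TRE values; the quartile convention (`method="inclusive"`), the function name `summarize_trials` and the sample values are assumptions for illustration and are not data from this study.

```python
import statistics


def summarize_trials(tre_values_mm, threshold_mm=2.0):
    """Median, interquartile range and success rate of a list of TRE values.

    A trial counts as successful when its TRE is strictly below threshold_mm.
    The quartile convention is an assumption; other conventions give
    slightly different IQR values on small samples.
    """
    q1, q2, q3 = statistics.quantiles(tre_values_mm, n=4, method="inclusive")
    successes = sum(v < threshold_mm for v in tre_values_mm)
    return {
        "median": q2,
        "iqr": q3 - q1,
        "success_rate": successes / len(tre_values_mm),
    }


# Made-up TRE values in mm, for illustration only
summary = summarize_trials([1.0, 1.2, 1.4, 1.6, 2.4])
```

In the protocol above, such a summary would be computed over the 100 trials of each vertebra, and over all trials for the overall figures.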

5 Results

Figure 4 shows a qualitative result obtained with the proposed registration framework and the quantitative accuracy results are summarized in Table 1. The overall TRE is slightly better than the results reported in [14] with the inferior to superior axial iUS acquisition scan (ultrasound sweep No. 1). The median TRE is 1.48 mm (IQR 0.68 mm), ranging from 0.45 mm to 2.78 mm, which is below the acceptance threshold of 2 mm. This is highlighted by a success rate of \(84.42\%\). However, the L4 vertebra yields the least accurate results, with a median TRE of 2.03 mm. This larger error may be related to the large FRE of 0.593 mm induced when the ground truth registration was generated at L4.

The number of selected frames and the computation time for each vertebra registration are summarized in Table 2. The average overall registration time is \(0.742\,\text {s} \pm 0.037\,\text {s}\) per vertebra. This includes both the initial alignment and the gradient alignment processing time. This is significantly lower than the 2 min reported by Yan et al. [15] and the 100 s reported by Koo et al. [8] for each vertebra registration. The registration time ranging from 50 to 185 s reported by Nagpal et al. [10] involved multiple vertebrae registration, precluding a direct comparison. It should be noted that none of the aforementioned works included the iUS volume reconstruction time, which is expected to be performed after acquiring the iUS images during the surgery. In our approach, the computational bottleneck is associated with the reconstruction step, with an average time of \(10.05\,\text {s} \pm 1.26\,\text {s}\). This is expected since the reconstruction task is performed on a CPU. Including the reconstruction and the registration, the total time to align the pre-operative CT image to the iUS image is \(10.79\,\text {s} \pm 1.27\,\text {s}\), which is practical in the OR. Reducing the number of acquisition frames reduces the reconstruction time, particularly for the vertebra surface extraction step, which accounts for \({\sim }58\%\) of the overall registration time.

Table 1. Registration accuracy results for each vertebra level: (left) target registration error (TRE) after the initial alignment, (middle left) TRE after final registration, (middle right) success rate below 2 mm, and (right) fiducial registration error of the ground truth registration.
Table 2. Computation time results: GPU computations are indicated by a \(^{\star }\).

6 Conclusion

In this paper, we presented a registration framework to rigidly align a CT volume to iUS images of a single vertebra. We demonstrated that our approach can achieve a median accuracy of 1.48 mm ranging from 0.45 mm to 2.78 mm on a lumbosacral section of a porcine cadaver. This is below the clinical acceptance threshold of 2 mm suggested in the literature. More importantly, with a straightforward iUS acquisition procedure and a highly efficient computation time of \({\sim }11\,\text {s}\), the registration framework can be easily integrated into the surgical workflow. We estimate the entire registration procedure (including the iUS acquisition) to be completed in less than one minute, rather than the 15 min required using an intra-operative CT imaging system. This allows fast corrections of registration misalignment during the surgery, without additional exposure to radiation.

Future work will involve an extended validation of the registration framework. Because the quality of the iUS acquisition may have a significant impact on the registration outcome [14], we will analyze how violations of the proposed iUS acquisition procedure affect the registration results. We will also investigate efficient methods to perform the reconstruction step on a GPU. Parallelizing the extraction of the vertebra surface and the iUS volume compounding is expected to further reduce the computation time.