1 Introduction

The extraocular muscles (EOMs) implement eye movements. Magnetic resonance imaging (MRI) has shown that many forms of binocular misalignment (strabismus) are associated with anatomical abnormalities of the EOMs [1]. In clinical practice, EOM enlargement is a key quantity to examine when diagnosing several complex forms of strabismus [2], including thyroid eye disease [3, 4]. How to reliably and efficiently outline EOM boundaries in clinical MRI is therefore an important practical and research question. In all published studies to date, investigators have segmented EOM boundaries manually [1, 5, 6], which is labor-intensive and may introduce user-dependent artifacts.

Several computer-aided semi-automatic [7-9] and automatic [10, 11] segmentation methods have been developed. However, all of these methods use image pixels as the underlying representation primitive. Pixels are not the most natural representation of visual scenes: they ignore the local patterns shared among neighboring pixels and are subject to noise. It is more natural and efficient to process the image as perceptually meaningful patches, each containing many pixels that share similar features.

We propose a fully automatic EOM segmentation method based on superpixels, region adjacency graphs and Normalized Cuts, integrating prior shape information. Rather than processing images at the pixel level, the approach builds on local features of the EOMs. We treat small image patches obtained by superpixel over-segmentation [12-15] as the basic unit of all further image processing, such as filtering, detection and segmentation. We show that by building a region adjacency graph over the superpixels, we can robustly outline the eye socket boundary and the EOMs within it. The performance of our automatic method was evaluated against manual segmentation and showed high accuracy.

2 Related Work

Firbank et al. [8] showed the feasibility of segmenting EOMs with active contours. However, this approach is sensitive to the boundary initialization, since it can easily be trapped in local minima [16]. In addition, its accuracy is governed by the convergence criteria: higher accuracy requires tighter criteria and longer computation time [17]. Souza et al. proposed a mathematical morphology method to semi-automatically segment EOMs [9, 18]. They performed iterative grayscale closing operations to segment the orbital wall, which was then used as the region of interest. The EOMs were outlined within this region through a Laplacian of Gaussian detector and opening operations. However, the flat disk used for the morphology operations had a fixed size and operated only at the pixel level, and the number of iterations had to be carefully supervised. A more recent semi-automatic approach deformed 3D geometric template models of the EOMs to the MRI images of individual patients [19]. Image features of the EOMs were detected and filtered to guide fitting of the generic anatomical template. However, the template model has to be built from anatomical characteristics of the EOMs, and a global registration between the image sequence and the template model must be performed at the outset.

3 Methodology

3.1 Superpixel Over-Segmentation

The Superpixel algorithm [13] groups pixels with coherent intensities and spatial locations into patches. These superpixels provide a higher-level representation of the original image that can be used for further processing. The geometric shapes of superpixels are not restricted to rectangles; this flexibility lets them represent features more naturally by preserving the boundaries of the objects in the image. Accurate segmentation can then be performed by merging local superpixels that share similar features.

We applied the k-means algorithm to group nearby pixels into superpixels of uniform size [12]. Unlike other superpixelization methods [20, 21], the k-means method produces a more regular grid of superpixels, which is important for building the region adjacency graph. Figure 1(a) shows a T1-weighted quasi-coronal MRI image perpendicular to the long axis of the orbit, with 312 micron pixels and 2 mm plane thickness. Figure 1(b) illustrates the result of the superpixel over-segmentation. The superpixel boundaries preserve the true structure boundaries. More importantly, the shape and area characteristics of the EOMs and the eye socket are relatively consistent [9]. By construction of the algorithm, the number of pixels in each superpixel is nearly constant across the image.

Fig. 1.
figure 1

(a) A representative coronal MRI image of the eye; (b) labeled ocular structures to segment; (c) superpixel over-segmentation; (d) manually segmented structures overlaid on the superpixels.

3.2 Region Adjacency Graph Construction

Fig. 2.
figure 2

Region adjacency graph built from superpixels.

Human visual perception is good at recognizing individual objects even under varying intensities or textures. Lacking such priors, superpixel segmentation algorithms [12, 13, 22, 23] tend to over-segment the image (Fig. 1(c)(d)). The region adjacency graph (RAG) is a data structure common to many segmentation algorithms [24]. We used a RAG to encode the spatial connections between neighboring superpixels: each superpixel was defined as a node in a graph (Fig. 2) and connected through edges to all of its neighbors. The RAG was used to merge adjacent regions provided that they have similar intensity distributions. Let \(n_i\) be a node in the RAG with mean intensity \(I_i\), and let \(n_j\) be one of its neighbors. The edge weight between \(n_i\) and \(n_j\) is defined as \(w_{ij}=\exp \bigl (\frac{-\Vert I_{i}-I_{j}\Vert ^2}{\sigma ^2}\bigr )\), where \(\sigma ^2\) is the overall image variance; \(w_{ij}\) measures the intensity similarity between \(n_i\) and \(n_j\).
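A minimal sketch of this construction, assuming 4-connectivity between pixels; the helper name and toy image are ours:

```python
import numpy as np

def build_rag(image, labels):
    """Region adjacency graph: nodes are superpixels, edges weighted by
    w_ij = exp(-(I_i - I_j)^2 / sigma^2), sigma^2 = overall image variance."""
    n = labels.max() + 1
    # Mean intensity per superpixel (node attribute).
    means = (np.bincount(labels.ravel(), weights=image.ravel(), minlength=n)
             / np.bincount(labels.ravel(), minlength=n))
    sigma2 = image.var()
    # Adjacent label pairs: compare each pixel with its right and lower neighbor.
    pairs = set()
    for a, b in [(labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])]:
        mask = a != b
        pairs |= {tuple(sorted(p)) for p in zip(a[mask].tolist(), b[mask].tolist())}
    edges = {(i, j): float(np.exp(-(means[i] - means[j]) ** 2 / sigma2))
             for i, j in pairs}
    return means, edges

# Toy image: two flat superpixels with intensities 0 and 1.
labels = np.repeat(np.array([[0, 0, 1, 1]]), 4, axis=0)
image = labels.astype(float)
means, edges = build_rag(image, labels)
```

For this toy image the variance is 0.25, so the single edge has weight \(\exp(-1/0.25)=e^{-4}\), correctly marking the two regions as dissimilar.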

3.3 Normalized Cuts Segmentation

Fig. 3.
figure 3

Normalized Cuts segmentation produced regions labeled in different gray scales.

Superpixel segmentation is a bottom-up approach, as it merges individual pixels together. Once the superpixels are represented by the RAG, we apply the Normalized Cuts algorithm (Ncut) [25], a top-down approach, to partition the graph recursively until the ocular structures are found. Applying Normalized Cuts to superpixels rather than pixels is more robust to image noise. It is also substantially more computationally efficient, since the size of the affinity matrix and the complexity of the RAG used for image representation are greatly reduced.

In each division, the Ncut algorithm optimally divides one region into two subregions \(N_1\) and \(N_2\) by removing edges connecting them in RAG:

$$\begin{aligned} Ncut(N_1,N_2)=\frac{cut(N_1, N_2)}{assoc(N_1,N)}+\frac{cut(N_1,N_2)}{assoc(N_2,N)}. \end{aligned}$$
(1)

\( cut(N_1, N_2)=\sum _{n_i\in N_1, n_j\in N_2}w_{ij}\) computes the degree of dissimilarity between \(N_1\) and \(N_2\) as the summed weight of all removed edges. \(assoc(N_1,N)\) is the total edge weight from nodes in \(N_1\) to all nodes in the current region \(N\).
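The quantities in Eq. (1) can be checked on a toy weighted graph; the helper below is an illustrative sketch of the criterion only, not the recursive partitioning itself:

```python
import numpy as np

def ncut_value(W, N1, N2):
    """Ncut(N1, N2) of Eq. (1) for a symmetric weight matrix W:
    cut(N1,N2)/assoc(N1,N) + cut(N1,N2)/assoc(N2,N)."""
    cut = W[np.ix_(N1, N2)].sum()
    assoc1 = W[N1, :].sum()  # total weight from N1 to all nodes
    assoc2 = W[N2, :].sum()
    return cut / assoc1 + cut / assoc2

# Two tightly coupled pairs (0,1) and (2,3), weakly linked by edge 0-2.
W = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])

good = ncut_value(W, [0, 1], [2, 3])  # cuts only the weak edge
bad = ncut_value(W, [0, 2], [1, 3])   # cuts both strong edges
```

The balanced partition that severs only the weak edge yields a much smaller Ncut value, which is what the recursive minimization exploits.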

Figure 3 shows the segmentation results after applying Normalized Cuts. The final nodes each contain many superpixels and are colored in gray scale. Normalized Cuts successfully highlights some of the ocular structures, such as the superior rectus muscle, the superior oblique muscle and the optic nerve. However, the lateral and medial rectus muscles had incomplete boundaries that merged with the orbital wall, making them difficult to segment. Further automatic operations are needed to resolve these discontinuities and label each structure.

3.4 Orbital Wall and Extraocular Muscle Segmentation

Segmentation of the orbital wall has been studied previously. Souza et al. [9] applied an iterative grayscale mathematical morphology operation to segment the orbits, but their method required the user to specify a flat disk template and the number of recursive erosion iterations. Firbank et al. [8] manually outlined the boundary around the eye socket. We propose an automatic method to extract the orbital wall using prior shape knowledge. Laplacian of Gaussian [26] and connected-component labeling methods are applied to detect the connected boundaries of the orbital wall, rectus muscles and optic nerve from the Normalized Cuts segmentation shown in Fig. 4(a).
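On a toy image, LoG edge detection followed by connected-component labeling looks roughly as follows; the sigma and threshold are assumptions chosen for the synthetic disc, not tuned values from the paper:

```python
import numpy as np
from scipy import ndimage

# Synthetic slice: a bright disc standing in for the orbit.
yy, xx = np.mgrid[0:64, 0:64]
img = ((xx - 32) ** 2 + (yy - 32) ** 2 < 12 ** 2).astype(float)

# Laplacian of Gaussian responds strongly at intensity boundaries.
log = ndimage.gaussian_laplace(img, sigma=2.0)
edge_mask = np.abs(log) > 0.03  # illustrative threshold

# Connected-component labeling groups each closed boundary into one region.
labeled, n_regions = ndimage.label(edge_mask)
```

Each label in `labeled` then corresponds to one candidate boundary, so closed structures like the orbital wall emerge as single components.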

Fig. 4.
figure 4

Region of interest extraction. (a) Initial boundaries after Ncut based on Fig. 3; (b) center (*) of each boundary; (c) optic nerve and orbit regions found by k-means clustering; (d) orbit and EOM boundaries identified using the convex hull; (e) region of interest in the original image; (f) region of interest in the superpixel image (Color figure online).

To extract the orbital wall, we used the prior knowledge that the eye socket is always located near the image center. The center of each region produced by Normalized Cuts was calculated and is shown in Fig. 4(b). The centers of the optic nerve and the orbital wall are the two centers closest to the image center, so these two regions can be identified from the boundary map by a nearest-neighbor rule, and any regions outside the orbital wall are removed (Fig. 4(c)). Two of the closed boundaries inside the orbital wall were identified as the superior and inferior rectus muscles (Fig. 4(c)). To segment the lateral and medial rectus muscles, whose boundaries were incomplete, the convex hull around the orbital wall was calculated, shown as the red closed curve in Fig. 4(d). The resulting closed orbital wall completes the initially discontinuous boundaries of the lateral and medial rectus muscles, which are in contact with the orbital wall. The convex hull served as the region of interest and preserves the natural boundary of the eye socket in the original image (Fig. 4(e)) and the superpixel image (Fig. 4(f)). Finally, region- and hole-filling algorithms were used to complete the EOM segmentation (Fig. 5(a)). Figure 5(b) shows the segmented boundaries overlaid on the MRI and superpixel images.
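The convex-hull completion step can be sketched as follows. The C-shaped arc is a stand-in for a boundary broken where a muscle touches the orbital wall, and scikit-image's `convex_hull_image` is our choice of implementation, not necessarily the paper's:

```python
import numpy as np
from skimage.morphology import convex_hull_image

# Incomplete boundary: an open ring with a gap on the right-hand side.
yy, xx = np.mgrid[0:64, 0:64]
r = np.sqrt((xx - 32) ** 2 + (yy - 32) ** 2)
arc = (np.abs(r - 20) < 1.5) & (xx < 45)

# The convex hull closes the gap, yielding a filled region whose border
# can serve as the completed boundary / region of interest.
hull = convex_hull_image(arc)
```

The hull spans the gap in the arc and encloses the interior, which is exactly the property used to close the discontinuous lateral and medial rectus boundaries against the orbital wall.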

Fig. 5.
figure 5

(a) Segmented extraocular muscles and optic nerve. (b) Superpixels mapped with the orbital wall and the segmented ocular structures.

4 Experiments and Results

4.1 Materials

Fig. 6.
figure 6

The shape error between the manually segmented boundaries and the superpixel over-segmented boundaries decreases as the number of superpixels n increases, for the orbit, superior rectus (SR), inferior rectus (IR), medial rectus (MR), lateral rectus (LR) and all four rectus (All) muscles.

T1-weighted MRI images of both eyes were acquired from 5 healthy subjects and provided by Dr. Joseph Demer at UCLA. Eight coronal images with a slice thickness of 2 mm were segmented for each eye. All images were digitized at \(256\times 256\) pixels with 16-bit gray-level resolution and a voxel size of 0.3 mm \(\times \) 0.3 mm \(\times \) 2.0 mm. Two operators independently and manually traced the ocular structure boundaries, which were used as ground truth for accuracy assessment.

4.2 Shape Error Analysis

One critical issue is choosing the number of superpixels n for the k-means algorithm. With too few superpixels, important structures may be missed; with too many, the superpixels lose their advantage of being representative, robust and efficient. To determine an appropriate n, we analyzed its influence on segmentation accuracy. We first manually segmented one set of MRI images by tracing the boundaries of the ocular structures, then applied superpixel over-segmentation to the same images while varying \(200<n<2600\). For each n, we overlaid the manually traced boundaries on the superpixel image (Fig. 1(c)(d)). The shape error between the manual and superpixel segmentations was computed with a boundary-based measurement [13] quantifying how close the superpixel boundaries are to the manual segmentation (approximating ground truth): the mean absolute distance between the superpixel boundaries and the ground-truth boundaries. Figure 6 plots the shape error as a function of n for the different ocular structures. Unsurprisingly, the error drops monotonically as the number of superpixels increases, reaching zero when each pixel becomes its own superpixel. Following the "elbow" selection criterion [27, 28], \(n=1800\) was chosen for our MRI dataset in all subsequent operations.
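The mean absolute boundary distance can be computed efficiently with a distance transform; a minimal sketch, with the function name and toy boundaries being ours:

```python
import numpy as np
from scipy import ndimage

def mean_boundary_distance(boundary, gt_boundary):
    """Mean absolute distance (in pixels) from each point of `boundary`
    to the nearest point of the ground-truth boundary."""
    # Distance from every pixel to the nearest ground-truth boundary pixel.
    dist_to_gt = ndimage.distance_transform_edt(~gt_boundary)
    return dist_to_gt[boundary].mean()

# Toy check: two vertical boundary lines two pixels apart.
gt = np.zeros((16, 16), dtype=bool)
gt[:, 5] = True
sp = np.zeros((16, 16), dtype=bool)
sp[:, 7] = True
err = mean_boundary_distance(sp, gt)  # every point lies 2 px from gt
```

As n grows, the superpixel boundary mask comes to include every true boundary pixel and this error tends to zero, matching the trend in Fig. 6.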

4.3 Performance Evaluation

Area-based [9] and volume-based [7, 8] metrics are commonly used to evaluate segmentation accuracy. One drawback is that neither considers the overlap between the ground truth and the computer-generated segmentation. We therefore assessed our approach with region-based metrics: Variation of Information (VI) [29], Probabilistic Rand Index (RI) [30] and the Segmentation Covering criterion (Covering) [31].

Variation of Information. The Variation of Information metric was introduced for the purpose of clustering comparison [29]. It measures the distance between two segmentations in terms of their average conditional entropy defined as

$$\begin{aligned} VI(S,S')=H(S)+H(S')-2I(S,S'), \end{aligned}$$
(2)

where H and I denote the entropy and the mutual information between the two segmentations S and \(S'\).
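VI can be computed directly from the joint label histogram of the two segmentations; a pure-NumPy sketch, with the function name being ours:

```python
import numpy as np

def variation_of_information(S, Sp):
    """VI(S, S') = H(S) + H(S') - 2 I(S, S'), from the joint label
    histogram of two segmentations (entropies in nats)."""
    S, Sp = np.ravel(S), np.ravel(Sp)
    joint = np.zeros((S.max() + 1, Sp.max() + 1))
    np.add.at(joint, (S, Sp), 1.0)
    joint /= S.size

    def H(dist):
        d = dist[dist > 0]
        return float(-(d * np.log(d)).sum())

    Hs, Hsp, Hj = H(joint.sum(1)), H(joint.sum(0)), H(joint.ravel())
    mutual = Hs + Hsp - Hj  # mutual information I(S, S')
    return Hs + Hsp - 2 * mutual

vi_same = variation_of_information([0, 0, 1, 1], [0, 0, 1, 1])  # identical -> 0
vi_diff = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])  # independent labels
```

Identical segmentations give VI = 0, and the independent labelings above give VI = 2 ln 2, the sum of the two conditional entropies.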

Rand Index. The Rand Index [30] was first proposed for general clustering evaluation. It compares the compatibility of label assignments between pairs of elements in the clusterings. The Rand Index between the automatic segmentation X and the manual segmentation G is the number of pixel pairs that have the same label in both X and G, plus the number of pairs with different labels in both segmentations, divided by the total number of pixel pairs.
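A direct pairwise sketch of this definition (the O(n²) loop is for clarity; a contingency-table formula scales better for full images):

```python
import numpy as np
from itertools import combinations

def rand_index(X, G):
    """Fraction of pixel pairs on which segmentations X and G agree:
    same label in both, or different labels in both."""
    X, G = np.ravel(X), np.ravel(G)
    pairs = list(combinations(range(X.size), 2))
    agree = sum((X[i] == X[j]) == (G[i] == G[j]) for i, j in pairs)
    return agree / len(pairs)

ri_same = rand_index([0, 0, 1, 1], [0, 0, 1, 1])  # perfect agreement
ri_diff = rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # frequent disagreement
```

Perfect agreement gives 1.0; the second pairing agrees on only 2 of 6 pairs, giving 1/3.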

Segmentation Covering. We define the covering of segmentation S by segmentation \(S'\) as

$$\begin{aligned} C(S' \rightarrow S) = \frac{1}{N} \sum _{R \in S} \left| R \right| \cdot \max _{R' \in S'} O(R, R'), \end{aligned}$$
(3)

where N is the total number of pixels in the image and \(O(R,R')=\frac{\left| R \cap R' \right| }{\left| R \cup R' \right| }\) is the overlap between two regions R and \(R'\) [31].
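Eq. (3) translates directly into code; a pure-NumPy sketch, with the function name and toy segmentations being ours:

```python
import numpy as np

def covering(S_prime, S):
    """C(S' -> S) of Eq. (3): size-weighted best overlap O (intersection
    over union) of each region of S by any region of S'."""
    total = 0.0
    for r in np.unique(S):
        R = (S == r)
        best = max(np.logical_and(R, S_prime == rp).sum()
                   / np.logical_or(R, S_prime == rp).sum()
                   for rp in np.unique(S_prime))
        total += R.sum() * best
    return total / S.size

S = np.array([[0, 0], [1, 1]])                    # rows
c_same = covering(S, S)                           # identical segmentations
c_diff = covering(np.array([[0, 1], [0, 1]]), S)  # columns vs. rows
```

An identical segmentation covers itself perfectly (score 1.0), while the column split only partially overlaps each row region, so the score drops.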

Table 1. Evaluation of the proposed segmentation against manual segmentation using region-based metrics (Segmentation Covering, Rand Index and Variation of Information) for the four rectus muscles.

Table 1 summarizes the evaluation results using the region-based metrics. The IR, LR, MR and SR muscles of both left and right eyes were analyzed. For Segmentation Covering we computed two scores, at the optimal dataset scale (ODS) and the optimal image scale (OIS). We also compared the Rand Index and Variation of Information against the manual segmentation. A lower Variation of Information indicates greater similarity; our average value of 1.51 outperforms other image segmentation outcomes [32]. The Rand Index lies in the range [0, 1], with higher values indicating greater similarity between two segmentations. Across the different EOMs, the average Rand Index exceeds 0.82, showing the strong performance of our algorithm, and its standard deviation of about 0.03 demonstrates that our method produces consistent results for the four EOMs. For Segmentation Covering we computed the normalized overlap score in the range [0, 1], where larger values indicate a more accurate segmentation; the average value across EOMs is 0.78, consistent with the Rand Index result. In summary, as Table 1 shows, our automatic algorithm segments boundaries close to the manually segmented ones, illustrating the accuracy and effectiveness of our approach.

5 Conclusions

Extraocular muscle segmentation from MRI is an important and challenging task for clinical diagnosis. This study demonstrated an automatic method using superpixels, region adjacency graphs and Normalized Cuts. The results were compared to manual segmentations, and region-based evaluation metrics showed that our method segments boundaries accurately. In the future, we plan to improve the efficiency of Normalized Cuts on superpixels by accelerating it on the GPU, or to employ other, more efficient graph cut algorithms to segment the images. To improve the reliability of optic nerve identification, we will exploit the shape prior of the optic nerve and locate it by fitting circles to the segmented shapes. The method will also be applied to automatically reconstruct 3D patient-specific EOM models for use in clinical diagnosis and surgical planning.