Statistical analysis of manual segmentations of structures in medical images

https://doi.org/10.1016/j.cviu.2012.11.014

Highlights

  • We model segmentation uncertainty in CT images.

  • We consider the tasks of computing summary statistics and identifying outliers.

  • This framework is based on shape analysis of 2D closed, planar curves.

  • The underlying statistical models are efficient in summarizing data variability.

  • The models can be used in image registration and 3D surface reconstruction.

Abstract

The problem of extracting anatomical structures from medical images is both very important and difficult. In this paper we are motivated by a new paradigm in medical image segmentation, termed Citizen Science, which involves a volunteer effort from multiple, possibly non-expert, human participants. These contributors observe 2D images and generate their estimates of anatomical boundaries in the form of planar closed curves. The challenge, of course, is to combine these different estimates in a coherent fashion and to develop an overall estimate of the underlying structure. Treating these curves as random samples, we use statistical shape theory to generate joint inferences and to analyze the data produced by the citizen scientists. The specific goals in this analysis are: (1) to find a robust estimate of the representative curve that provides an overall segmentation, (2) to quantify the level of agreement between segmentations, both globally (full contours) and locally (parts of contours), and (3) to automatically detect outliers and help reduce their influence in the estimation. We demonstrate these ideas using a number of artificial examples and real applications in medical imaging, and summarize their potential use in future scenarios.

Introduction

Segmentation of medical images to extract anatomical structures is arguably the most important problem in medical image analysis. A variety of techniques, ranging from variational methods to statistical methods to fuzzy logic, have been applied to this problem, but a general solution still remains elusive. The difficulty lies in encoding the vast observed variability, in the form of pixel values and anatomical shapes, into appropriate objective functions that enable high-quality segmentations. This is further compounded by the low resolution, high noise and blurriness associated with common medical imaging modalities. The segmentation performance achieved by trained humans – physicians, clinicians, technicians, etc. – has been very difficult to match by automated systems. Since manual segmentations by skilled practitioners are generally expensive and the amount of imaging data being generated far outpaces the experts’ availability, it is not practical to rely on these trained personnel for all our segmentation needs.

There are several directions for addressing this gap between the need and the available expertise. Firstly, motivated by the success of manual segmentations by trained experts, an important focus of current research in segmentation has been on training machine learning algorithms to improve their performance. A similar idea has been to develop structured prior models on shapes of interest and perform Bayesian segmentations – segmentations guided by prior knowledge of the statistical shape variability associated with the anatomical structures to be segmented. A second, completely different paradigm is to involve humans, albeit non-experts, who can volunteer their time and effort in segmenting medical images. This approach is part of a larger effort involving public participation in scientific data analysis. It is this second direction that motivates the methods presented in this paper. The last decade has seen a tremendous increase in what has been termed Citizen Science: non-experts helping scientists collect and/or analyze data. These projects range from Galaxy Zoo, where non-experts help classify galaxies [1], to Fold-It, where people help fold proteins [2], to measuring trees and counting birds in nests (Cornell Lab of Ornithology) [3]. These projects take advantage of human perceptual abilities and intuitions about shape (Galaxy Zoo/Fold-It) and the ability of volunteers to take measurements over a large spatial extent. We are currently trying to build the foundation for Citizen Science projects that further exploit the growth of digital imaging, both to support volunteers annotating 2D and 3D structures in imagery, and in building tools to allow volunteers to capture calibrated imagery to support more quantitative measurements. We believe these tools may support a diverse set of Citizen Science projects.
This paper is about analyzing data arising from multiple manual segmentations of 3D medical image data, but could be applied to 2D photograph segmentations as well (e.g. finding birds, trees or cars in a scene). Shown in Fig. 1 is an illustration of a manual segmentation of the liver. The segmentation is performed using multiple oblique and parallel image slices and the resulting two-dimensional curves can be used for full three-dimensional surface reconstruction.

One challenge any data collection and analysis project faces is data verification and validation. This is particularly true in Citizen Science projects, where the participants can come from a wide variety of backgrounds and skill sets. To date, these projects have largely relied on carefully constructed training tutorials along with simple statistics to filter out incorrect data (be it malicious or unintentional). Simple statistics work well in the case of category labels (e.g. Galaxy Zoo or bird counting) because any given dataset is viewed by multiple people and each person marks multiple datasets. This provides fairly reliable statistics for both the datasets and reliability data on the individuals marking the datasets.

We would like to extend the same statistical approach to support Citizen Science projects where the volunteers contribute contour data rather than category labels. An example of such data (brainstem) is provided in Fig. 2. We show a 2D CT image with an example segmentation, a zoom-in on part of the image where the boundary of the brainstem is partly visible, and five expert segmentations that form the raw data for techniques developed in this paper. In particular, we would like to compute sample statistics for multiple contours that let us answer the following specific questions about the data:

  (1) What is the mean or median contour? This helps filter out small noise and discrepancies in any individual person’s contour.

  (2) What is the magnitude of the standard deviation of a set of contours? This provides a measure of confidence for the segmentations.

  (3) Is a contour an outlier?

  (4) What is the average distance – under a shape metric – of an individual’s contours from the average contour?

  (5) Do the segmentations follow a Gaussian distribution or is the data better described by a mixture of Gaussian distributions? (This happens quite often when there is more than one interpretation of the instructions. For instance, one group might contour an inner wall, while the other group contours an outer wall.)

In addition to answering the above questions, we are interested in providing the scientists (in this case physicians) with visualization tools that will help them analyze both inter and intra variation in their datasets, and to provide confidence analysis on their reconstructions. To this end, we provide visualizations of the major modes of variation, and their magnitudes, which provide some estimate of the reliability of different segments of the contour. It is important to note that these measures of confidence or reliability do not necessarily correlate with segmentation accuracy, but rather just consistency.
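As a concrete illustration of questions (1)–(3), the sketch below computes a pointwise mean contour and flags outliers using a robust median-absolute-deviation rule. This deliberately uses a plain Euclidean treatment of contours assumed to be pre-registered (each an (n, 2) array sampled at corresponding points); the paper’s actual framework replaces these pointwise operations with elastic shape statistics. The function name, the fixed-correspondence assumption, and the 3.5 cutoff are ours, for illustration only.

```python
import numpy as np

def contour_statistics(contours, z_thresh=3.5):
    """Euclidean sketch of questions (1)-(3): mean contour, per-contour
    spread, and outlier flags. Assumes each contour is an (n, 2) array
    sampled at corresponding points (i.e. already registered)."""
    stack = np.stack(contours)               # (N, n, 2)
    mean_contour = stack.mean(axis=0)        # pointwise average contour
    # distance of each contour from the mean: RMS deviation over points
    dists = np.sqrt(((stack - mean_contour) ** 2).sum(axis=2).mean(axis=1))
    # robust flags via the modified z-score (median absolute deviation);
    # only contours unusually FAR from the mean are flagged
    med = np.median(dists)
    mad = np.median(np.abs(dists - med))
    z = 0.6745 * (dists - med) / mad if mad > 0 else np.zeros_like(dists)
    return mean_contour, dists, z > z_thresh
```

In the elastic setting, the pointwise mean above would be replaced by a Karcher mean or median under the shape metric, and `dists` by elastic distances.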

Historically, the problem of shape analysis of curves has been studied with a variety of different mathematical representations of curves. These representations include sample points or landmarks [4], [5], [6], level sets [7], deformable templates [8], medial axes [9], [10], algebraic invariants, and others. However, the most natural representation – a parameterized curve – is used relatively infrequently. The difficulty in studying shapes of parameterized curves lies in the parameterization variability: a curve can be re-parameterized in infinitely many ways but still have the same shape. Davies et al. [11], [12] use landmark models and compute optimal re-parameterizations using a minimum description length (MDL)-type energy. Although such models have proven very useful in medical image analysis and have good specificity, generalizability and compactness (as compared to manually selected landmarks and arc-length parameterization), the method used to compute them has three main limitations. First, the optimization problem is defined in terms of ensembles, and thus the distance between any two shapes depends on the other shapes in the ensemble. This contrasts with the standard mathematical definition of a distance between objects. Second, the MDL energy does not preserve distances between shapes when they are re-parameterized in the same way. Intuitively, shape metrics and statistical analysis should not change with a change in parameterization, but rather only with a change in shape. Finally, the MDL-driven optimization problem requires a pre-selection of a template, a choice that can be rather arbitrary, and different templates may lead to different solutions.

An emerging body of work has proposed using an algebraic approach to handle the problem of parameterization. In this approach, one unifies all possible re-parameterizations of a curve by forming equivalence relations, and each shape is denoted uniquely by an equivalence class. Shape metrics are defined and computed between equivalence classes, rather than individual curves. A very important advantage of this framework is that the process of comparing any two shapes, i.e. two equivalence classes, involves finding optimal registrations between the corresponding curves. (Note that the registration of points along curves corresponds to re-parameterizations of curves.) Thus, this framework, termed elastic shape analysis, leads to a simultaneous registration and comparison of shapes of curves, under a single metric, and to a principled approach to shape analysis. Notably, the methods involving landmarks, level sets, medial axes, and others, do not provide this ability to simultaneously register and analyze shapes. Several papers, including [13], [14], [15], have utilized an elastic framework, but we will use the approach presented in [16], [17] for this paper. An important advantage of this approach is that amongst all methods for elastic shape analysis of contours, this is the only one so far that extends to contours in higher dimensions.
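The central representation in the elastic framework of [16], [17] is the square-root velocity function (SRVF), q(t) = β̇(t)/√|β̇(t)|. The sketch below computes a discretized SRVF and an L2 distance between scale-normalized SRVFs under a fixed parameterization. This is only the simplest piece of the framework: the full elastic distance additionally optimizes over rotations and re-parameterizations (e.g. by dynamic programming) and works with closed-curve equivalence classes, all of which are omitted here. The function names and the uniform-sampling assumption are ours.

```python
import numpy as np

def srvf(beta):
    """Square-root velocity function q(t) = beta'(t) / sqrt(|beta'(t)|)
    of a discretized curve beta, an (n, 2) array assumed uniformly
    sampled over the parameter interval [0, 1]."""
    n = len(beta)
    vel = np.gradient(beta, 1.0 / (n - 1), axis=0)          # finite-difference beta'(t)
    speed = np.maximum(np.linalg.norm(vel, axis=1), 1e-12)  # guard zero speed
    return vel / np.sqrt(speed)[:, None]

def srvf_distance(beta1, beta2):
    """Sketch: L2 distance between scale-normalized SRVFs under a FIXED
    parameterization. The full elastic distance also optimizes over
    rotations and re-parameterizations, which is not done here."""
    q1, q2 = srvf(beta1), srvf(beta2)
    dt = 1.0 / (len(beta1) - 1)
    q1 = q1 / np.sqrt((q1 ** 2).sum() * dt)  # unit L2 norm removes scale;
    q2 = q2 / np.sqrt((q2 ** 2).sum() * dt)  # differentiation removed translation
    return np.sqrt(((q1 - q2) ** 2).sum() * dt)
```

Note that translation drops out through differentiation and scale through the normalization, so this toy distance already depends only on the oriented, parameterized shape of the curves.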

Thus, the main contributions of this work are:

  (1) The application of elastic analysis to modeling contour segmentation uncertainty in medical images.

  (2) A method for visualization of segmentation reliability based on principal component analysis.

  (3) An extension of the shape analysis methods presented in [16], [17] that enables one to model rotation, scale and position in addition to shape.

  (4) A definition and computation of the elastic median based on a general algorithm presented in [18].

  (5) A formal statistical procedure to identify outliers in datasets based on elastic distances from the median.
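Contribution (4) builds on the geometric-median algorithm of [18]. Its flat-space analogue is the classical Weiszfeld iteration, sketched below under the assumption of Euclidean data; the elastic version of the paper replaces these Euclidean weighted averages with updates along geodesics in the shape space. The function name, initialization, and stopping rule are illustrative choices, not the paper’s.

```python
import numpy as np

def weiszfeld_median(points, iters=100, eps=1e-8):
    """Weiszfeld iteration for the Euclidean geometric median, the
    flat-space analogue of the Riemannian algorithm in [18].
    points: (N, d) array; returns a (d,) array."""
    m = points.mean(axis=0)                        # initialize at the mean
    for _ in range(iters):
        d = np.linalg.norm(points - m, axis=1)
        w = 1.0 / np.maximum(d, eps)               # inverse-distance weights
        m_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(m_new - m) < eps:        # converged
            break
        m = m_new
    return m
```

Because the weights downweight far-away samples, the median is far less sensitive to gross outliers than the mean, which is precisely why the paper estimates the covariance structure at the median rather than at the mean.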

The rest of this paper is organized as follows. In Section 2, we summarize a mathematical/statistical framework presented in past papers, such as [17], and adapt these tools for use in the current problem. Specifically, we introduce an algorithm for estimating the sample median in the space of contours and use that median estimate for studying dominant modes of data variability. Experimental results involving brainstem and prostate images are presented in Section 3, and the paper concludes with a short summary in Section 4.

Mathematical framework

The basic problem we face requires a technique for statistical analysis of planar, closed contours. There are several theories available for this task. In this section we summarize a recent approach for elastic analysis of contours that is particularly attractive for the current problem.

Experimental results on medical images

In this section we display examples of elastic statistical models computed for real segmentation data. The data used here comes from oblique slices of images of the brainstem and prostate, and a few parallel image slices of the same anatomical structures. In some examples we also compare expert segmentations to those performed by non-experts. In each example, we estimate both the mean and median of the manual segmentations, and we compute the covariance structure at the median.

Summary

We have presented a comprehensive set of methods for statistical analysis of closed, planar contours. These methods are especially useful in what is termed Citizen Science, where non-experts help scientists collect and analyze real-world data. In our framework, we model the uncertainty of manual segmentations performed on medical images. Such statistical analyses of contours are very important in medical surface reconstruction and medical image registration.

Acknowledgments

This research was supported in part by: NSF DMS 0915003, NSF IIS 1217515, NSF DMS 1208959, NSF DBI 1053554 and NSF DBI 1313810.

References (26)

  • P. Fletcher et al., The geometric median on Riemannian manifolds with application to robust atlas estimation, NeuroImage (2009).

  • C. Lintott et al., Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey, Mon. Not. R. Astron. Soc. (2008).

  • S. Cooper et al., Predicting protein structures with a multiplayer online game, Nature (2010).

  • Cornell Lab of Ornithology, Citizen Science.

  • I.L. Dryden et al., Statistical Shape Analysis (1998).

  • D.G. Kendall et al., Shape and Shape Theory (1999).

  • C.G. Small, The Statistical Theory of Shape (1996).

  • S. Osher et al., Level Set Methods and Dynamic Implicit Surfaces (2003).

  • U. Grenander et al., Computational anatomy: an emerging discipline, Q. Appl. Math. LVI (1998).

  • P.T. Fletcher et al., Principal geodesic analysis for the study of nonlinear statistics of shape, IEEE Trans. Med. Imaging (2004).

  • K. Gorczowski et al., Multi-object analysis of volume, pose, and shape using statistical discrimination, IEEE Trans. Pattern Anal. Mach. Intell. (2010).

  • R. Davies et al., Building 3-D statistical shape models by direct optimization, IEEE Trans. Med. Imaging (2010).

  • R. Davies et al., A minimum description length approach to statistical shape modeling, IEEE Trans. Med. Imaging (2002).