Learning Deep Features for Automated Placement of Correspondence Points on Ensembles of Complex Shapes

Agrawal, Praful; Whitaker, Ross T.; Elhabian, Shireen Y.

doi:10.1007/978-3-319-66182-7_22

Learning Deep Features for Automated Placement of Correspondence Points on Ensembles of Complex Shapes

Praful Agrawal²¹,
Ross T. Whitaker²¹ &
Shireen Y. Elhabian²¹

Conference paper
First Online: 04 September 2017

11k Accesses
2 Citations

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 10433))

Abstract

Correspondence-based shape models are an enabling technology for various medical imaging applications that rely on statistical analysis of populations of anatomical shape. One strategy for automatic correspondence placement is to simultaneously learn a compact representation of the underlying anatomical variation in the shape space while capturing the geometric characteristics of individual shapes. The inherent geometric complexity and population-level shape variation in anatomical structures introduce significant challenges in finding optimal shape correspondence models. Existing approaches adopt iterative optimization schemes with objective functions derived from probabilistic modeling of shape space, e.g. entropy of Gaussian-distributed shape space, to find useful sets of dense correspondence on shape ensembles. Nonetheless, anatomical shape distributions can be far more complex than this Gaussian assumption, which entails linear shape variation. Recent works address this limitation by adopting an application-specific notion of correspondence through lifting positional data to a higher-dimensional feature space (e.g. sulcal depth, brain connectivity, and geodesic distance to anatomical landmarks), with the goal of simplifying the optimization problem. However, this typically requires a careful selection of hand-crafted features and their success heavily rely on expertise in finding such features consistently. This paper proposes an automated feature learning approach using deep convolutional neural networks for optimization of dense point correspondence on shape ensembles. The proposed method endows anatomical shapes with learned features that enhance the shape correspondence objective function to deal with complex objects and populations. Results demonstrate that deep learning based features perform better than methods that rely on position and compete favorably with hand-crafted features.

You have full access to this open access chapter, Download conference paper PDF

1 Introduction

Statistical shape models have important applications is various medical imaging tasks, such as image segmentation (e.g. [8]), hypothesis testing (e.g. [3]), anatomical reconstruction (e.g. [1]), and pathology detection (e.g. [15]). The utility of such models is influenced by a shape representation that facilitates statistical analysis. In this regard, landmarks are a popular choice as a light-weight representation—compared to embedding shapes in the image intensity values at voxels—that is easy to understand and implement while promoting visual communication of the results [14]. To perform shape statistics, landmarks should be defined consistently within a given population to refer to the same anatomical position on every shape instance; a concept known as correspondence. Such correspondences are often created manually, but this is time-/labor-intensive, requiring qualified specialists (e.g. radiologists), and cost-prohibitive for 3D images and large collections of imaging data.

Ad-hoc automated methods, such as choosing nearest points between surfaces, are prone to mismatches and result in substandard analytical results. Alternatively, other methods try to match parameterizations of surfaces while maintaining regularity. For example, the spherical harmonics point distribution model (SPHARM-PDM) [16] is a parameterization-based correspondence scheme that relies on a smooth one-to-one mapping from each shape instance to the unit sphere. The mapping to a sphere as well as the a-priori reliance on smoothness, rather than shape features in the population, is a shortcoming. Other automated approaches have introduced the statistics of the population itself to drive the matching of shape features. For instance, minimum description length (MDL) [7] optimizes point correspondences using an information content objective, but it too relies on intermediate spherical surface parameterizations, which places limitations on the types of shapes and the optimization process. Forgoing the parameterization, the particle-based shape modeling (PSM) approach has offered a flexible nonparametric and general framework to establish dense point correspondences across shape ensembles without constraining surface topology or parameterization [4, 10]. The PSM approach formulates a trade-off between geometric description of each shape and the statistical compactness of the population, formulated in terms of entropy.

Current PSM implementations rely on a Gaussian models in the shape space, which is (approximately) the vector space formed by the x-y-z coordinates of the collection of correspondences (modulo a similarity transform). However, the distribution of anatomical structures can be far more complex than the Gaussian assumes, and surface locations are not always indicative of their correspondence. To address these shortcomings, previous work (in the context of brain imaging) has included sulcal depth [11] and brain connectivity [12] as additional features in the PSM optimization. While promising, such approaches are tailored to particular data set, anatomy and application. Datar et al. [6] proposed geodesic distances to user-identified landmarks on individual shapes and hence were able to effectively guide entropy-based optimization to a compact statistical model for complex shapes. However, this predefined mapping requires a careful selection of hand-crafted features and their success heavily rely on expertise (and time) required to find such features consistently.

In this paper, we propose an automated feature-learning approach for establishing dense correspondence models to alleviate the need for engineered shape descriptors or guiding landmarks. The idea is motivated by recent advances in computer vision and computer graphics that use deep convolutional neural networks (CNNs) to learn shape features for establishing pairwise correspondences between shapes (e.g. [2, 5]). In leveraging the feature-based PSM with deep-learned features, this paper makes several contributions including: (1) effectively learning features to establish shape correspondence using direct surface geometry information rather than relying on predefined surface descriptors, (2) incorporating the deep-learned features in the optimization of dense correspondences to automatically find a useful set of dense correspondences across complex shape ensembles, and (3) thereby relaxing the linear/Gaussian assumption of the current PSM approach while minimizing the required domain expertise by learning shape-specific characteristics in a fully automated fashion. Experiments on synthetic and real medical data demonstrate that the proposed deep feature-based PSM compares favorably with existing state-of-the-art particle-based shape modeling approaches.

2 Methodology

In this paper, we use deep convolutional neural networks (CNNs) to learn local shape features for establishing shape correspondence. The goal is to make some of the hidden layers of the network respond similarly at corresponding locations across the given shape ensemble, and dissimilarly at noncorresponding ones. In this regard, we use a Siamese network setting similar to [5], which consists of two identical copies of the same deep CNN and is fed with pairs of corresponding (positives) and noncorresponding (negatives) local surface descriptors. As input to the deep neural network, we extract local surface geometrical characteristics using circular geodesic patches in the neighborhood of given point locations on a shape’s surface (see Fig. 1). These patches are then used as the input layer to the Siamese network. After training, one copy of the two identical deep CNNs in the Siamese network becomes a nonlinear filter bank (we use the last convolutional layer, before the final classification, as in Fig. 2) that is computed on all mesh vertices to produce feature maps on given surface. These feature maps are then used to aid correspondence optimization in the entropy-based PSM. In what follows, we further expand on details of patch extraction, Siamese network configuration, and the modified correspondence objective function to obtain a useful set of dense correspondences across anatomical shapes.

Geodesic patch extraction: The success of deep CNNs have been demonstrated on analyzing functions defined on Euclidean grid-like domains such as images. Nonetheless, non-Euclidean data, particularly surface geometry, does not directly benefit from these deep models in which operations such as convolution and pooling are not readily defined. Recently, local charting methods (e.g. [2]) have been proposed as a generalization of convolution to non-Euclidean domains where a patch operator is defined to extract local surface patches that are subsequently associated with predefined shape descriptors. Here we rely directly on surface geometry to compute such local shape descriptors. Specifically, the graph of surface distance to the tangent space can encode local geometrical properties around a point. Hence, we propose to use signed normal distance to the tangent space sampled in the geodesic neighborhood of a surface point \(\mathbf {x}\) as a local snapshot of surface geometry. We use principal curvature directions to define the local intrinsic coordinate system of the patch constructed at a surface point \(\mathbf {x}\). As illustrated in Fig. 1, neighboring points that lie in the geodesic ball \(\mathcal {B}(\mathbf {x}) = \{\mathbf {x}' : d_X(\mathbf {x}, \mathbf {x}') \le \rho _o\} \) with radius \(\rho _o > 0\) are sampled on a circular geodesic patch where each ring corresponds to a particular geodesic distance \(d_X(\mathbf {x}, \mathbf {x}') = \rho \) with \(\rho \in [0, \rho _o]\) and every ray originating from \(\mathbf {x}\) is perpendicular to the geodesic rings are inclined at an angle \(\theta \in [0,2\pi )\) to the first principal curvature direction at \(\mathbf {x}\). \(\rho _o\) is automatically set to \(5\%\) of maximum shape diameter in the ensemble. Principal curvature directions are estimated using least squares fitting of a quadratic [13] while enforcing the right-hand on local coordinates, with the normal direction representing the z-axis. To ensure consistent and smooth principal curvature directions, meshes are generated by triangulating the DT’s isosurface using the dynamic particle-based implicit surface sampling [9] that yields water-tight triangular meshes. However, the principal curvature directions are accurate only up to the sign of the vector. To address this ambiguity, we extract two patches per surface point (see Fig. 1e), one with bases defined by the positive sign of principal directions and other by the negative sign and use them as a two channel image for input to the neural network.

Correspondence feature learning: Given sets of corresponding points on shapes in the ensemble, we train a Siamese CNN to learn a binary class problem, where positive class indicates correspondence between pair of input patches and negative class suggests that patches possess different shape characteristics. We use the network settings of Chopra et al. [5], see Fig. 2. During experiments we found that using \(p_{drop} > 0\) does not impact classification performance, therefore, we set it to zero for our experiments. All layers use the softplus activation function for nonlinearity and weights are initialized using a zero mean normal distribution with 0.05 standard deviation. Choice of softplus activation ensures smooth output features, crucial for correspondence optimization (see Fig. 2) and faster training convergence, being softer version of ReLU. We subtract mean of the training data from all training patches (and same mean is subtracted from testing patches for feature extraction) so that the input data can align with the distribution of initial network weights (spanning the hypersphere).

Deep feature-based PSM: The proposed PSM uses a set of dynamic particle systems, one for each shape, in which particles interact with one another with mutually repelling forces to sample (and therefore describe) the surface geometry. Consider a cohort of shapes \(\mathcal {S}= \{\mathbf {z}_1, \mathbf {z}_2, ..., \mathbf {z}_N\}\) containing N surfaces, each with its own set of M corresponding particles \(\mathbf {z}_n = [\mathbf {z}_n^1, \mathbf {z}_n^2, ..., \mathbf {z}_n^M] \in \mathbb {R}^{dM}\) such that \(\mathbf {z}_n^m \in \mathbb {R}^d\) and ordering implies correspondence across shapes. This representation involves two random variables: a shape space variable \(\mathbf {Z}\in \mathbb {R}^{dM}\) and a position variable \(\mathbf {X}_n \in \mathbb {R}^d\) that encodes particle distribution on the n–th shape (configuration space). For groupwise modeling, shapes in \(\mathcal {S}\) should share the same world coordinate system. We use generalized Procrustes alignment to estimate a rigid transformation matrix \(\mathbf {T}_n\) per shape instance such that \(\mathbf {z}_n^m = \mathbf {T}_n\mathbf {x}_n^m\). Correspondences are established by minimizing a combined shape correspondence and surface sampling cost function \(Q = H(\mathbf {Z}) - \sum _{n=1}^N H(\mathbf {X}_n)\), where H is an entropy estimation assuming Gaussian distribution on the joint vector of features and positions and a nonparametric density estimator in the (3D) configuration space, as in [4]. In particular, correspondence entropy solely relies on particle positions to achieve compact ensemble representation in shape-feature space (first term) against a uniform distribution of particles on each surface for accurate shape representation (second term). The entropy-based PSM formulation is generalized to a more flexible notion of correspondence by minimizing the entropy of arbitrary, vector-valued functions of the particle position [6, 11, 12]. Thus, the proposed feature-based PSM extension modifies only the correspondence term \(H(\mathbf {Z})\) where the sampling term, \(H(\mathbf {X}_n)\), still constrains the particles to lie on the shape’s surface by maximizing the positional entropy.

In this paper, we propose to use features extracted from the trained CNN to optimize correspondences on shape ensembles to generate compact statistical shape models. We also keep the particle positions to guide optimization for locating corresponding points as extracted features only encode local geometric properties and require positional information, unlike geodesic distances which encode global shape characteristics. Hence, \(q = d + L\) where L is the number of features extracted from the trained network. Hence, we have \(\mathbf {z}_n^m = f(\mathbf {x}_n^m) = \left[ f_d^1(\mathbf {x}_n^m), ..., f_d^L(\mathbf {x}_n^m), (\mathbf {T}_n \mathbf {x}_n^m)^T \right] \) where \(f_d^l(\mathbf {x}_n^m)\) is the \(l{-}\)th deep feature extracted at the \(m{-}\)th particle on the n–th shape, which is defined by \(\mathbf {z}_n = [\mathbf {z}_n^1, ..., \mathbf {z}_n^M]^T \in \mathbb {R}^{qM}\). Let \(\mathbf {y}_n = \mathbf {z}_n - \varvec{\mu }\) with the mean . The centered data matrix is defined as \(\mathbf {Y}= [\mathbf {y}_1, ... , \mathbf {y}_N] \in \mathbb {R}^{qM \times N}\). Since \(N \ll qM\), feature space covariance is estimated in the dual space of dimension N which defines the hyperplane in \(\mathbb {R}^{qM}\) that the N–samples inhabit. With a Gaussian assumption in the lifted feature space, shape space entropy estimate becomes , where \(\mathbf {I}\) is the identity matrix in the \(qM{-}\)dimensional feature space and \(\alpha > 0\) is a regularization parameter that prevents nondominant modes (smallest eigenvalues) from dominating the optimization process. Let \(\mathbf {C}= \left( \mathbf {Y}^T\mathbf {Y}+ \alpha \mathbf {I}\right) ^{-1} = [\mathbf {c}_1, ..., \mathbf {c}_N]\). By the chain rule, the partial derivative of the correspondence entropy term with respect to the data vector \(\mathbf {z}_n\) becomes \(\partial H(\mathbf {Z}) / \partial \mathbf {z}_n = \mathbf {J}_n^T \mathbf {c}_n\), where \(\mathbf {J}_n\) is the Jacobian of the functional data for the n–th shape that has the structure of a block diagonal matrix with \(M \times M\) blocks, each with diagonal block of \(q \times d\) sub-matrices of the function gradients at each particle, which is computed numerically by projecting the correspondence points on the surface. This gradient is used to update point correspondences in accordance with the optimization described in [4].

3 Results and Discussion

Two synthetic datasets and a real dataset of medical shapes is used to demonstrate the ability of Deep feature-based PSM in generating optimal statistical models for complex shapes. Proposed method is compared with Positional PSM – Cates et al. [4] and Geodesics PSM – Datar et al. [6], using mean shape from the optimized statistical models and following quantitative measures: Scree plot: We plot the percentage of variance w.r.t. PCA (Principal Component Analysis) modes to demonstrate the model compactness. Generalization: Reconstruction error of the correspondence model for an unseen shape instance, using PCA model built from training samples, is evaluated in a leave-one-out fashion. We plot mean of point-wise physical distance between test samples and their reconstruction w.r.t. number of PCA modes (lower is better). Specificity: Ability of the model to generate plausible shapes, quantified as the Euclidean distance between a sampled shape (from PCA model built using all training samples) and its closest training sample based on \(l_2{-}\)norm (lower is better). Average over 1000 samples is used.

Proof-of-concept: First synthetic dataset (Bean2) contains 30 shapes that represent a coffee bean with a spatially varying structure. Second dataset (Bean4) comprises of 23 samples of more complex coffee bean shape, with a group of closely located thin structures collectively varying in position (see Fig. 3). In order to generate training samples for the Siamese network, we use an optimized statistical shape model with 3072 points for Bean2, obtained using Geodesic-PSM. Patches from six shapes were randomly selected for every correspondence point to accumulate training data. Siamese network with \(L=10\) output features yield optimal classification performance of 0.92 AUC (area under the curve) of the ROC (Receiver Operating Characteristics) curve, resulting in \(90\%\) True Positive Ratio (TPR) at an expense of \(20\%\) False Positive Ratio (FPR). Given that there are multiple regions with similar shape characteristics, which may lead to higher FPR, therefore 20% is a relatively small penalty. Moreover, using position information in Deep feature-based PSM will help in reducing impact of false positives. All 10 features are used in PSM to generate a shape model with 4096 correspondence points. Figure 4 presents mean shape and quantitative evaluation of the correspondence model. Results indicate compactness of the statistical shape model and better generalization and specificity performance over Geodesics-based PSM.

Bean4 shapes have similar characteristics as Bean2, therefore, we use the same trained Bean2 network to extract features for Bean4 shapes. Figure 2 presents samples of same feature on the two datasets. Comparative results on Bean4 data, presented in Fig. 5, clearly highlight the superiority of proposed method over positional-PSM in compactness of the resulting statistical shape model as well as it outperforms in its ability to generalize over unseen data and to produce plausible shape samples.

Real data: 20 Scapula shapes (10 controls and 10 patients) were retrospectively chosen from a shoulder CT database of controls and patients with osseous Hill-Sachs lesions. Samples were rigidly aligned w.r.t. the glenoidal plane and a set of 16 anatomical landmarks were defined manually. Reconstructed scapula meshes were then clipped to the glenoid, acromion and coracoid to model the areas of high geometric curvature related to constraint of the humeral head, Fig. 3 illustrates shape samples. Geodesic-PSM based shape model with 2432 points was used to generate training data. Using \(L=5\) output features resulted in optimal classification performance of AUC \(=\) 0.80. Figure 6 showcases superior results using proposed method, especially in generalization and specificity. It is important to note that proposed method is able to achieve errors of about 2 voxels using dominant modes (5 modes) in contrast to minimum 3 voxels from other methods.

References

Balestra, S., Schumann, S., Heverhagen, J., Nolte, L., Zheng, G.: Articulated statistical shape model-based 2D-3D reconstruction of a hip joint. In: Stoyanov, D., Collins, D.L., Sakuma, I., Abolmaesumi, P., Jannin, P. (eds.) IPCAI 2014. LNCS, vol. 8498, pp. 128–137. Springer, Cham (2014). doi:10.1007/978-3-319-07521-1_14
Chapter Google Scholar
Boscaini, D., Masci, J., Rodolà, E., Bronstein, M.: Learning shape correspondence with anisotropic convolutional neural networks. In: NIPS, pp. 3189–3197 (2016)
Google Scholar
Bredbenner, T.L., Eliason, T.D., Potter, R.S., Mason, R.L., Havill, L.M., Nicolella, D.P.: Statistical shape modeling describes variation in tibia and femur surface geometry between control and incidence groups from the osteoarthritis initiative database. J. Biomech. 43(9), 1780–1786 (2010)
Article Google Scholar
Cates, J., Fletcher, P.T., Styner, M., Shenton, M., Whitaker, R.: Shape modeling and analysis with entropy-based particle systems. In: Karssemeijer, N., Lelieveldt, B. (eds.) IPMI 2007. LNCS, vol. 4584, pp. 333–345. Springer, Heidelberg (2007). doi:10.1007/978-3-540-73273-0_28
Chapter Google Scholar
Chopra, S., Hadsell, R., LeCun, Y.: Learning a similarity metric discriminatively, with application to face verification. In: CVPR, vol. 1, pp. 539–546 (2005)
Google Scholar
Datar, M., Lyu, I., Kim, S.H., Cates, J., Styner, M.A., Whitaker, R.: Geodesic distances to landmarks for dense correspondence on ensembles of complex shapes. In: Mori, K., Sakuma, I., Sato, Y., Barillot, C., Navab, N. (eds.) MICCAI 2013. LNCS, vol. 8150, pp. 19–26. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40763-5_3
Chapter Google Scholar
Davies, R.H., Twining, C.J., Cootes, T.F., Waterton, J.C., Taylor, C.J.: A minimum description length approach to statistical shape modeling. IEEE TMI 21(5), 525–537 (2002)
MATH Google Scholar
Heimann, T., Meinzer, H.P.: Statistical shape models for 3D medical image segmentation: a review. MedIA 13(4), 543–563 (2009)
Google Scholar
Meyer, M., Kirby, R.M., Whitaker, R.: Topology, accuracy, and quality of isosurface meshes using dynamic particles. IEEE TVCG 13(6), 1704–1711 (2007)
Google Scholar
Oguz, I., Cates, J., Datar, M., Paniagua, B., Fletcher, T., Vachet, C., Styner, M., Whitaker, R.: Entropy-based particle correspondence for shape populations. IJCARS 11(7), 1221–1232 (2016)
Google Scholar
Oguz, I., Cates, J., Fletcher, T., Whitaker, R., Cool, D., Aylward, S., Styner, M.: Cortical correspondence using entropy-based particle systems and local features. In: ISBI, pp. 1637–1640 (2008)
Google Scholar
Oguz, I., Niethammer, M., Cates, J., Whitaker, R., Fletcher, T., Vachet, C., Styner, M.: Cortical correspondence with probabilistic fiber connectivity. In: Prince, J.L., Pham, D.L., Myers, K.J. (eds.) IPMI 2009. LNCS, vol. 5636, pp. 651–663. Springer, Heidelberg (2009). doi:10.1007/978-3-642-02498-6_54
Chapter Google Scholar
Rusinkiewicz, S.: Estimating curvatures and their derivatives on triangle meshes. In: IEEE 3DPVT, pp. 486–493 (2004)
Google Scholar
Sarkalkan, N., Weinans, H., Zadpoor, A.A.: Statistical shape and appearance models of bones. Bone 60, 129–140 (2014)
Article Google Scholar
Shen, K.K., Fripp, J., Mériaudeau, F., Chételat, G., Salvado, O., Bourgeat, P.: Detecting global and local hippocampal shape changes in Alzheimer’s disease using statistical shape models. Neuroimage 59(3), 2155–2166 (2012)
Article Google Scholar
Styner, M., Oguz, I., Xu, S., Brechbühler, C., Pantazis, D., Levitt, J.J., Shenton, M.E., Gerig, G.: Framework for the statistical shape analysis of brain structures using SPHARM-PDM. Insight J. 1071, 242 (2006)
Google Scholar

Download references

Acknowledgment

Authors would like to thank Heath B. Henninger, PhD and Matthijs Jacxsens, MD for providing Scapula shapes with anatomical landmarks. This work was supported by NIH grants P41-GM103545-19 and R01-EB016701.

Author information

Authors and Affiliations

Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, USA
Praful Agrawal, Ross T. Whitaker & Shireen Y. Elhabian

Authors

Praful Agrawal
View author publications
You can also search for this author in PubMed Google Scholar
Ross T. Whitaker
View author publications
You can also search for this author in PubMed Google Scholar
Shireen Y. Elhabian
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Praful Agrawal .

Editor information

Editors and Affiliations

Université de Sherbrooke, Sherbrooke, QC, Canada
Maxime Descoteaux
DKFZ, Heidelberg, Germany
Lena Maier-Hein
Ulm University of Applied Sciences, Ulm, Germany
Alfred Franz
Université de Rennes 1, Rennes, France
Pierre Jannin
McGill University, Montreal, QC, Canada
D. Louis Collins
Université Laval, Québec, QC, Canada
Simon Duchesne

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Agrawal, P., Whitaker, R.T., Elhabian, S.Y. (2017). Learning Deep Features for Automated Placement of Correspondence Points on Ensembles of Complex Shapes. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D., Duchesne, S. (eds) Medical Image Computing and Computer Assisted Intervention − MICCAI 2017. MICCAI 2017. Lecture Notes in Computer Science(), vol 10433. Springer, Cham. https://doi.org/10.1007/978-3-319-66182-7_22

Download citation

DOI: https://doi.org/10.1007/978-3-319-66182-7_22
Published: 04 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66181-0
Online ISBN: 978-3-319-66182-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

The Medical Image Computing and Computer Assisted Intervention Society (opens in a new tab)