Unsupervised Discovery of Spatially-Informed Lung Texture Patterns for Pulmonary Emphysema: The MESA COPD Study

Yang, Jie; Angelini, Elsa D.; Balte, Pallavi P.; Hoffman, Eric A.; Austin, John H. M.; Smith, Benjamin M.; Song, Jingkuan; Barr, R. Graham; Laine, Andrew F.

doi:10.1007/978-3-319-66182-7_14

Jie Yang²¹,
Elsa D. Angelini^21,22,
Pallavi P. Balte²³,
Eric A. Hoffman²⁵,
John H. M. Austin²⁴,
Benjamin M. Smith^23,26,
Jingkuan Song²¹,
R. Graham Barr^23,27 &
…
Andrew F. Laine²¹

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 10433))

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

12k Accesses
1 Altmetric

Abstract

Unsupervised discovery of pulmonary emphysema subtypes offers the potential for new definitions of emphysema on lung computed tomography (CT) that go beyond the standard subtypes identified on autopsy. Emphysema subtypes can be defined on CT as a variety of textures with certain spatial prevalence. However, most existing approaches for learning emphysema subtypes on CT are limited to texture features, which are sub-optimal due to the lack of spatial information. In this work, we exploit a standardized spatial mapping of the lung and propose a novel framework for combining spatial and texture information to discover spatially-informed lung texture patterns (sLTPs). Our spatial mapping is demonstrated to be a powerful tool to study emphysema spatial locations over different populations. The discovered sLTPs are shown to have high reproducibility, ability to encode standard emphysema subtypes, and significant associations with clinical characteristics.

You have full access to this open access chapter, Download conference paper PDF

Explaining Radiological Emphysema Subtypes with Unsupervised Texture Prototypes: MESA COPD Study

Unsupervised Discovery of Emphysema Subtypes in a Large Clinical Cohort

Emphysema Quantification on Cardiac CT Scans Using Hidden Markov Measure Field Model: The MESA Lung Study

1 Introduction

Pulmonary emphysema overlaps considerably with chronic obstructive pulmonary disease (COPD), and is traditionally subcategorized into 3 standard subtypes: centrilobular emphysema (CLE), panlobular emphysema (PLE) and paraseptal emphysema (PSE). These subtypes were initially defined on autopsy. Radiologists’ labeling of these subtypes on CT is labor-intensive, with substantial intra- and inter-rater variability [1]. Moreover, pathologists have disagreements on the very existence of such pure subtypes.

CT-based automated emphysema labeling has received increasing interest recently, in both supervised manners for replicating standard subtyping [2, 3], and unsupervised manners to discover new subtypes [4,5,6]. Preliminary CT-based clinical studies suggest that regional analysis will be instrumental in advancing the understanding of multiple pulmonary diseases [7]. Most existing approaches for learning emphysema subtypes on CT are limited to texture-based features, which are sub-optimal due to the lack of spatial information. Previous studies [5, 6] proposed to generate unsupervised lung texture patterns (LTPs) based on texture appearance, and to group them based on their spatial co-occurrence. However, such approaches only account for relative spatial occurrence at the scale of local regions of interest (ROIs). Also, post-grouping could not guarantee spatial homogeneity of the generated LTPs. Regarding spatial lung partitioning, using lung lobes leads to coarse spatial precision while using subdivisions of Cartesian coordinates lacks relative information such as peripheral versus central positioning which is important in defining PSE. Therefore a dedicated lung shape spatial mapping is designed in this work that adapts to individual shapes while enabling cross-subject comparison without registration. We then introduce an unsupervised framework for combining spatial and texture information to discover localized emphysematous LTPs, which we call the spatially-informed LTPs (sLTPs). We evaluate our lung shape spatial mapping for studying emphysema spatial patterns on CLE/PLE/PSE-predominant populations, and evaluate the discovered sLTPs in terms of reproducibility, ability to encode standard emphysema subtypes, and association with clinical characteristics.

2 Method

The pipeline for generating sLTPs, illustrated in Fig. 1, consists of the following three steps: (1) generate spatial mapping of the lungs; (2) generate LTPs using texture-based features and augment them with spatial features; (3) discover a distinct set of sLTPs.

2.1 Spatial Mapping of the Lung Shape

We use Poisson distance map (PDM) [8] to encode the shapes of individual lungs V, and label voxel positions in the range of [0, 1], measuring the “peel to core” distance between a given voxel and the external lung surface $\partial V$. Formally, we compute the Poisson solution U on the binary segmentation V using the following diffusion conditions:

$$\begin{aligned} {}&\varDelta U(x,y,z)=-1, \ \mathrm {for} \ (x,y,z)\in V \nonumber \\&\mathrm {subject \ to} \ U(x,y,z)=0, \ \mathrm {for} \ (x,y,z)\in \partial V \end{aligned}$$

(1)

where $\varDelta U = U_{xx}+U_{yy}+U_{zz}$. We further compute $U^*$ as the post-relaxed version of U to ensure robustness [8].

To uniquely encode 3D voxel positions, we add conformal mapping of the PDM solution onto a sphere, which we call the Poisson distance conformal map (PDCM). We define $r=1-U^*$, and encode superior versus inferior, anterior versus posterior and medial versus lateral voxel positioning via latitude and longitude angles $(\theta ,\phi )$ with respect to the PDM core position ($r=0$) and standard image axis.

2.2 Augmented Lung Texture Patterns

First, voxels are labeled as emphysema if they have intensity values below $-950$ HU [9] or are selected by the hidden Markov measure field [10] segmentation method, with parameters adapted to the scanner type. Then emphysema-specific LTPs are generated on ROIs with volumetric percentage of emphysema ($\%emph$) above 1%. We first generate an initial set of 100 LTPs $\{LTP_k\}_{k=1,..,100}$ with texture features, and then augment them iteratively via spatial regularization as detailed in Algorithm 1. The texture features are texton-histogram features with pre-learned textons, which were shown to be superior in a similar task in [6]. We follow the parameter settings in [6], with a codebook of 40 textons (defined as centers of clusters of 3$\,\times \,$3$\,\times \,$3 pixels patches). ROIs size is set to 25 mm$^3$ (approximating the size of secondary pulmonary lobules). The texture centroid of $LTP_k$ is where $FT_x$ is the texture feature of a ROI x, and $\varLambda _k$ denotes the set of ROIs that are labeled as $LTP_k$.

The spatial centroid of $LTP_k$ can be modeled as the average spatial density of ROIs in $\varLambda _k$. Computationally, we define lung sub-regions by dividing $r \in [0, 1]$, $\theta \in [0, 2\pi ]$ and $\phi \in [-\pi /2, \pi /2]$ into 3, 4 and 3 regular intervals to distinguish core to peel, anterior/medial/posterior/lateral, and inferior to superior regions. The spatial feature $FS_x$ of a ROI x is a one-hot vector of length 3$\,\times \,$4$\,\times \,$3 indicating the sub-region to which x belongs. The spatial centroid of $LTP_k$ is computed as . To augment LTPs with spatial features, dedicated metrics are used to enforce intra- and inter-class similarity constraints. The spatial histogram bins are well-aligned given the spatial sub-divisions while the texture histogram bins are more ambiguous. We therefore propose in Algorithm 1 a mixed $\chi ^2$-$\ell ^2$ similarity metric to enforce spatial regularity while preserving textural intra-class similarities. Spatial regularization will inevitably decrease the textural homogeneity of individual LTPs. Given no ground-truth justification, we hereby tune a regularization weight $\lambda $ with an empirically acceptable decrease in texture homogeneity. A penalty term is added with $\gamma =\infty $ to prevent a ROI from being labeled as a spatially preferred but texturally dissimilar LTP.

2.3 Final Spatially-Informed LTPs

The final set of sLTPs is expected to preserve distinct augmented LTPs and discard redundant ones. We generate sLTPs by partitioning a weighted undirected graph with similarity weights G defined as:

$$\begin{aligned} G_{i,j}=\frac{N_{i\rightarrow j}+N_{j\rightarrow i}}{N_{i}+N_{j}}\cdot \mathbbm {1}(\frac{\sum _k{N_{i\rightarrow k}}}{N_{i}}>T)\cdot \mathbbm {1}(\frac{\sum _k{N_{j\rightarrow k}}}{N_{j}}>T) \end{aligned}$$

(3)

where $N_{i\rightarrow j}$ denotes $|\varLambda _j|$ when removing $LTP_i$, and $N_i$ denotes $|\varLambda _i|$ when all LTPs exist. A ROI with a texture distance to its alternative LTP label $LTP_k$ exceeding the maximum texture within-cluster distance of $LTP_k$ is not re-labeled, which makes $(\sum _k{N_{i\rightarrow k}})/N_{i}\leqslant 1$. The indicator function $\mathbbm {1}(\cdot )$ is designed to preserve distinct patterns, and the threshold T is set to 0.5. In contrast to previous unsupervised emphysema subtyping algorithms [4,5,6] that rely on an arbitrarily pre-defined number of subtypes, we use Infomap [11] for the partition of G. Infomap is a community detection method that efficiently describes information flow on a network graph through Huffman coding, and returns a final number of sLTPs with guaranteed global optimality.

3 Experimental Results

3.1 Data

The data consists of 321 full-lung CT scans from MESA COPD study [1] (4 scans are discarded due to excessive noise [6]). The global extents of the three standard emphysema subtypes (%CLE, %PLE and %PSE over the total lung volume per scan) are available, corresponding to the average of visually assessed scores by four experienced radiologists. All scans were acquired at full inspiration, using either a Siemens 64-slice scanner or a GE 64-slice scanner, and were reconstructed using B35/Standard kernels. The slice thickness was 0.625 mm, and isotropic in-plane resolution was in the range [0.58, 0.88] mm.

3.2 Population Evaluation Using PDCM

We first demonstrate the ability of PDCM to study population-level emphysema spatial patterns. In Fig. 2(a) the average lung field intensity per angle $(\theta ,\phi )$ is projected onto each individual PDCM surface, and then averaged on normal subjects without emphysema and CLE-, PLE-, and PSE-predominant subjects with $\%emph > 5\%$. Similarly, averaged intensity across r from core to peel are visualized in Fig. 2(b). From Fig. 2(a), attenuation values for all groups are higher in anterior versus posterior regions, which agrees with the gravity effect. Maps of CLE- and PSE-predominant subjects appear to have lower attenuation (more emphysema) in superior versus inferior regions, while this is not obvious for PLE-predominant subjects. This agrees with the observation in [1] on this dataset that CLE and PSE severity were greater in higher versus lower lung zones, whereas severity of PLE did not vary by lung zone. Furthermore, low attenuation regions are more diffused, and clear regions of normal attenuation (blue) are absent for PLE-predominant subjects, which agrees with the definition of PLE. From Fig. 2(b), PSE-predominant subjects appear to have higher attenuation in the core and lower attenuation on the peel than CLE- and PLE-predominant subjects, which agrees with the definition of PSE. Attenuation values appear to be higher in the peel versus core, which is likely due to the presence of mediastinal/costal pleura.

3.3 Qualitative and Quantitative Evaluations of sLTPs

A random 3/4 of the total dataset is used as training scans (N = 238), while the others (N = 79) are used for testing. An average of 2,726 ROIs are extracted per scan to densely (with overlap) cover the emphysematous areas. A final set of 12 sLTPs is discovered using the full training set, and are illustrated in Fig. 2 (c). ROIs belonging to the same sLTP appear to be textually homogeneous, and each sLTP appears to have a distinct pattern, either textually or spatially. Since we jointly enforce spatial prevalence and textural homogeneity, a sLTP can have spatial “outliers” that were texturally favored.

Reproducibility. Four training subsets are generated by randomly eliminating 25% of the training scans. Reproducibility of sLTPs is measured by computing the overlap of test ROI labels using the Hungarian method for optimal sLTP matching [12], and the sLTPs learned from the full training set as the ground-truth. We discovered 12, 12, 13, and 13 sLTPs on training subsets. The average labeling Dice ratio is 0.91, which corresponds to a very high reproducibility level. The number of discovered sLTPs varies slightly between training subsets. This can be caused by a large change in the proportion of certain rare LTPs within the subsets, which modifies the weights of the similarity graph.

Table 1. ICC and 95% confidence interval between predicted standard emphysema subtype scores and ground truth.

Full size table

Ability to Encode Standard Subtypes. We expect the 12 learned sLTPs to be able to encode the standard emphysema subtypes. We evaluate here the prediction ability using a constrained multivariate regression [6], and compare our method with two previous algorithms [5, 6] (implemented with our training data, and setting the numbers of LTPs to 12 for comparison with a constant number of CT-based predictors). Intraclass correlation (ICC) values between predicted standard emphysema subtype scores and ground truth on the full dataset (N = 317), computed in a 4-fold cross validation manner, are reported in Table 1. Our sLTP model returns comparable ICC values and even higher for CLE and PSE standard subtypes.

Clinical Significance. Spearman’s partial correlations between sLTP percentage within the lung and clinical characteristics [1] after adjusting for demographical factors (age, race, gender, height and weight) are visualized in Fig. 3. Correlation values for MRC dyspnea scale, post six minute walk test (6MWT) breathlessness and fatigue are flipped so that a negative correlation always corresponds to more symptoms. Strong partial correlations were present for FEV1, 6MWT total distance, MRC dyspnea scale, and pre (baseline) and post 6MWT oxygen saturation. While sLTP 7 and sLTP 8 seem to be associated with healthier subjects (positive correlations), the other sLTPs are present often together with symptoms (negative correlations). We then additionally adjusted for $\%emph_{-950}$ in the partial correlation (not shown in the figure), and found that 12 sLTPs, 7 sLTPs, 6 sLTPs, and 5 sLTPs remain significantly correlated with FEV1, 6MWT total distance, post 6MWT oxygen saturation and MRC dyspnea scale respectively. These results indicate that the clinical relevance captured by the sLTPs would not be available when using the standard measure of $\%emph_{-950}$.

4 Discussions and Conclusions

In this work, we exploit a conformal spatial mapping of the lung shape to uniquely encode 3D voxel position in unregistered CT scans. We propose an unsupervised learning framework for discovering lung texture patterns of emphysema that incorporates spatial information. Algorithmic designs include an original similarity metric of spatio-textural features combining $\chi ^2$-$\ell ^2$ distances, data-driven weight parameters, and Infomap graph partitioning. Lung shape spatial mapping enables straightforward population-wide discovery of emphysema spatial patterns in CLE/PLE/PSE-predominant subjects. Spatially-informed emphysema lung texture patterns (sLTPs) generated in this study are reproducible, able to encode standard emphysema subtypes, and have significant correlations with clinical characteristics. In the future, the proposed method will be applied on a cohort of COPD patients for longitudinal progression analysis.

References

Smith, B.M., Austin, J.H., et al.: Pulmonary emphysema subtypes on computed tomography: the MESA COPD study. Am. J. Med. 127(1), 94.e7–94.e23 (2014)
Article Google Scholar
Sørensen, L., Shaker, S.B., et al.: Quantitative analysis of pulmonary emphysema using local binary patterns. IEEE Trans. Med. Imaging 29(2), 559–569 (2010)
Article Google Scholar
Gangeh, M.J., Sørensen, L., Shaker, S.B., Kamel, M.S., Bruijne, M., Loog, M.: A texton-based approach for the classification of lung parenchyma in CT images. In: Jiang, T., Navab, N., Pluim, J.P.W., Viergever, M.A. (eds.) MICCAI 2010. LNCS, vol. 6363, pp. 595–602. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15711-0_74
Chapter Google Scholar
Binder, P., Batmanghelich, N.K., Estepar, R.S.J., Golland, P.: Unsupervised discovery of emphysema subtypes in a large clinical cohort. In: Wang, L., Adeli, E., Wang, Q., Shi, Y., Suk, H.-I. (eds.) MLMI 2016. LNCS, vol. 10019, pp. 180–187. Springer, Cham (2016). doi:10.1007/978-3-319-47157-0_22
Chapter Google Scholar
Hame, Y., et al.: Sparse sampling and unsupervised learning of lung texture patterns in pulmonary emphysema: MESA COPD study. In: IEEE ISBI, pp. 109–113 (2015)
Google Scholar
Yang, J., et al.: Explaining radiological emphysema subtypes with unsupervised texture prototypes: MESA COPD study. In: Müller, H., et al. (eds.) MCV 2016, BAMBI 2016. LNCS, vol. 10081, pp. 69–80. Springer, Cham (2016)
Chapter Google Scholar
Murphy, K., et al.: Toward automatic regional analysis of pulmonary function using inspiration and expiration thoracic CT. Med. Phys. 39(3), 1650–1662 (2012)
Article Google Scholar
Gorelick, L., et al.: Shape representation and classification using the poisson equation. IEEE Trans. Pattern Anal. Mach. Intell. 28(12), 1991–2005 (2006)
Article Google Scholar
Yang, J., Angelini, E.D., Balte, P.P., Hoffman, E.A., Wu, C.O., Venkatesh, B.A., Barr, R.G., Laine, A.F.: Emphysema quantification on cardiac CT scans using hidden markov measure field model: the MESA lung study. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (eds.) MICCAI 2016. LNCS, vol. 9901, pp. 624–631. Springer, Cham (2016). doi:10.1007/978-3-319-46723-8_72
Chapter Google Scholar
Hame, Y., Angelini, E.D., et al.: Adaptive quantification and longitudinal analysis of pulmonary emphysema with a hidden markov measure field model. IEEE Trans. Med. Imaging 33(7), 1527–1540 (2014)
Article Google Scholar
Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. Proc. Nat. Acad. Sci. 105(4), 1118–1123 (2008)
Article Google Scholar
Roth, V., Lange, T., Braun, M., Buhmann, J.: A resampling approach to cluster validation. In: Härdle, W., Rönz, B. (eds.) Compstat, pp. 123–128. Physica, Heidelberg (2002)
Chapter Google Scholar

Download references

Acknowledgements

Thanks NIH/NHLBI R01-HL121270, R01-HL077612, RC1-HL100543, R01-HL093081 and N01-HC095159 through N01-HC-95169, UL1-RR-024156 and UL1-RR-025005 for funding.

Author information

Authors and Affiliations

Department of Biomedical Engineering, Columbia University, New York, NY, USA
Jie Yang, Elsa D. Angelini, Jingkuan Song & Andrew F. Laine
ITMAT Data Science Group, NIHR Imperial BRC, Imperial College, London, UK
Elsa D. Angelini
Department of Medicine, Columbia University Medical Center, New York, NY, USA
Pallavi P. Balte, Benjamin M. Smith & R. Graham Barr
Department of Radiology, Columbia University Medical Center, New York, NY, USA
John H. M. Austin
Department of Radiology, Medicine and Biomedical Engineering, University of Iowa, Iowa City, IA, USA
Eric A. Hoffman
Department of Medicine, McGill University Health Center, Montreal, QC, Canada
Benjamin M. Smith
Department of Epidemiology, Columbia University Medical Center, New York, NY, USA
R. Graham Barr

Authors

Jie Yang
View author publications
You can also search for this author in PubMed Google Scholar
Elsa D. Angelini
View author publications
You can also search for this author in PubMed Google Scholar
Pallavi P. Balte
View author publications
You can also search for this author in PubMed Google Scholar
Eric A. Hoffman
View author publications
You can also search for this author in PubMed Google Scholar
John H. M. Austin
View author publications
You can also search for this author in PubMed Google Scholar
Benjamin M. Smith
View author publications
You can also search for this author in PubMed Google Scholar
Jingkuan Song
View author publications
You can also search for this author in PubMed Google Scholar
R. Graham Barr
View author publications
You can also search for this author in PubMed Google Scholar
Andrew F. Laine
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Andrew F. Laine .

Editor information

Editors and Affiliations

Université de Sherbrooke, Sherbrooke, QC, Canada
Maxime Descoteaux
DKFZ, Heidelberg, Germany
Lena Maier-Hein
Ulm University of Applied Sciences, Ulm, Germany
Alfred Franz
Université de Rennes 1, Rennes, France
Pierre Jannin
McGill University, Montreal, QC, Canada
D. Louis Collins
Université Laval, Québec, QC, Canada
Simon Duchesne

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Yang, J. et al. (2017). Unsupervised Discovery of Spatially-Informed Lung Texture Patterns for Pulmonary Emphysema: The MESA COPD Study. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D., Duchesne, S. (eds) Medical Image Computing and Computer Assisted Intervention − MICCAI 2017. MICCAI 2017. Lecture Notes in Computer Science(), vol 10433. Springer, Cham. https://doi.org/10.1007/978-3-319-66182-7_14

Download citation

DOI: https://doi.org/10.1007/978-3-319-66182-7_14
Published: 04 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66181-0
Online ISBN: 978-3-319-66182-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

The Medical Image Computing and Computer Assisted Intervention Society (opens in a new tab)