Abstract
We present a generative probabilistic approach to discovering disease subtypes determined by genetic variants. In many diseases, multiple types of pathology may be present simultaneously in a patient, making quantification of the disease challenging. Our method seeks common co-occurring image and genetic patterns in a population as a way to model these two data types jointly. We assume that each patient is a mixture of multiple disease subtypes and use a joint generative model of image and genetic markers to identify disease subtypes guided by known genetic influences. Our model is based on a variant of topic models, which uncover the latent structure in a collection of data. We derive an efficient variational inference algorithm to extract patterns of co-occurrence and to quantify the presence of heterogeneous disease processes in each patient. We evaluate the method on simulated data and illustrate its use in the context of Chronic Obstructive Pulmonary Disease (COPD) to characterize the relationship between image and genetic signatures of COPD subtypes in a large patient cohort.
N.K. Batmanghelich and A. Saeedi—equal contribution.
Acknowledgements
This work was supported by NIH NIBIB NAMIC U54-EB005149, NIH NCRR NAC P41-RR13218 and NIH NIBIB NAC P41-EB015902, NHLBI R01HL089856, R01HL089897, K08HL097029, R01HL113264, 5K25HL104085, 5R01HL116931, and 5R01HL116473. The COPDGene study (NCT00608764) is also supported by the COPD Foundation through contributions made to an Industry Advisory Board comprised of AstraZeneca, Boehringer Ingelheim, Novartis, Pfizer, Siemens, GlaxoSmithKline and Sunovion.
Appendix: Variational Bayes Inference Procedure
Combining all components of the model defined in Sect. 2, we construct the joint distribution of all variables in the model (Fig. 6):
where N and M are the numbers of supervoxels and minor alleles, respectively, identified for subject s.
Fig. 6. Left: Graphical model that represents the joint distribution. The open gray and white circles correspond to the observed and the latent random variables, respectively. The full circles represent fixed hyper-parameters. Superscripts I and G denote the image and genetic parts of the model, respectively. Right: Update rules for the variational parameters.
We choose a factorization for the distribution q that captures most model assumptions and yet is computationally tractable:
where we choose an appropriate approximating distribution for each latent variable and use a tilde, \(\tilde{\cdot}\), to denote parameters of the approximating distributions. The optimization is defined in the space of the variational parameters \(\left\{ \tilde{\eta }^I,\tilde{\eta }^G,\tilde{\omega },{\xi }, \tilde{\alpha }, {\phi }^I,{\phi }^G \right\} \). We omit the derivation of the updates due to space constraints; Algorithm 1 provides pseudocode for the resulting updates. We run the algorithm five times starting from different random initializations and report the result with the highest value of the lower bound F(q).
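The restart strategy described above can be sketched as follows. This is only an illustration under stated assumptions: the updates of Algorithm 1 are not reproduced in this text, so a toy EM fit of a one-dimensional Gaussian mixture (unit variances, equal weights) stands in for a single run, and its final log-likelihood plays the role of the lower bound F(q). The names `fit_toy_mixture` and `best_of_restarts` are hypothetical and do not come from the paper.

```python
import math
import random


def fit_toy_mixture(xs, k, rng, n_iters=50):
    """One run of a stand-in algorithm (NOT the paper's Algorithm 1).

    EM for a 1-D mixture of k unit-variance, equally weighted Gaussians.
    Returns the fitted means and the final log-likelihood, used here in
    place of the variational lower bound F(q).
    """
    means = rng.sample(xs, k)  # random initialization from the data
    for _ in range(n_iters):
        sums = [0.0] * k
        weights = [0.0] * k
        for x in xs:
            # E-step: normalized responsibilities, computed stably.
            logs = [-0.5 * (x - m) ** 2 for m in means]
            top = max(logs)
            resp = [math.exp(v - top) for v in logs]
            total = sum(resp)
            for j in range(k):
                r = resp[j] / total
                sums[j] += r * x
                weights[j] += r
        # M-step: responsibility-weighted means (keep old mean if unused).
        means = [s / w if w > 0 else m
                 for s, w, m in zip(sums, weights, means)]
    const = -0.5 * math.log(2 * math.pi) - math.log(k)
    bound = 0.0
    for x in xs:
        logs = [const - 0.5 * (x - m) ** 2 for m in means]
        top = max(logs)
        bound += top + math.log(sum(math.exp(v - top) for v in logs))
    return means, bound


def best_of_restarts(xs, k, n_restarts=5, seed=0):
    """Run the fit from several random initializations; keep the best."""
    rng = random.Random(seed)
    runs = [fit_toy_mixture(xs, k, rng) for _ in range(n_restarts)]
    best = max(runs, key=lambda run: run[1])  # highest objective wins
    return best, [bound for _, bound in runs]


# Toy data: two well-separated clusters.
data_rng = random.Random(1)
xs = ([data_rng.gauss(-3.0, 1.0) for _ in range(100)]
      + [data_rng.gauss(3.0, 1.0) for _ in range(100)])
(best_means, best_bound), all_bounds = best_of_restarts(xs, k=2)
```

Selecting by the objective value rather than by any held-out quantity mirrors the text: the bound is available at no extra cost once each run has converged, and a higher bound indicates a tighter approximation to the posterior.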
Once the algorithm converges, we estimate the population-level quantities of interest as means of the corresponding approximating distributions:
Each expectation above can be easily evaluated from the parameters of the corresponding distribution. In addition, we construct spatial maps that display the posterior probability of each population topic for each supervoxel in a particular subject s to visually evaluate the disease structure in that subject.
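As a rough sketch of these two steps, the snippet below assumes (this excerpt does not confirm it) that a population-level quantity has a Dirichlet approximating distribution, whose mean is its parameter vector normalized to sum to one, and that the image-side variational parameters can be arranged as a table `phi[n][k]` holding the posterior probability of topic k for supervoxel n. The function names, the layout of `phi`, and the use of -1 for background in the label grid are all hypothetical choices made for illustration.

```python
def dirichlet_mean(params):
    """Mean of a Dirichlet distribution with the given parameters."""
    total = sum(params)
    return [p / total for p in params]


def topic_probability_map(labels, phi, topic):
    """Paint each pixel with its supervoxel's posterior for one topic.

    labels: 2-D grid of supervoxel indices, -1 marking background
            (hypothetical convention).
    phi:    phi[n][k] = posterior probability of topic k for supervoxel n
            (assumed layout of the variational parameters).
    """
    return [[phi[n][topic] if n >= 0 else 0.0 for n in row]
            for row in labels]


# Toy example: three supervoxels, two topics.
mean = dirichlet_mean([2.0, 1.0, 1.0])
labels = [[0, 0, 1],
          [2, -1, 1]]
phi = [[0.9, 0.1],
       [0.2, 0.8],
       [0.5, 0.5]]
topic_1_map = topic_probability_map(labels, phi, topic=1)
```

In this toy case `mean` is `[0.5, 0.25, 0.25]`, and `topic_1_map` assigns 0.1 to the pixels of supervoxel 0, 0.8 to those of supervoxel 1, 0.5 to supervoxel 2, and 0.0 to the background pixel.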
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Batmanghelich, N.K., Saeedi, A., Cho, M., Estepar, R.S.J., Golland, P. (2015). Generative Method to Discover Genetically Driven Image Biomarkers. In: Ourselin, S., Alexander, D., Westin, CF., Cardoso, M. (eds) Information Processing in Medical Imaging. IPMI 2015. Lecture Notes in Computer Science, vol 9123. Springer, Cham. https://doi.org/10.1007/978-3-319-19992-4_3
DOI: https://doi.org/10.1007/978-3-319-19992-4_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19991-7
Online ISBN: 978-3-319-19992-4
eBook Packages: Computer Science, Computer Science (R0)