Deep Unsupervised Clustering for Conditional Identification of Subgroups Within a Digital Pathology Image Set

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 (MICCAI 2023)

Abstract

Consideration of subgroups or domains within medical image datasets is crucial for the development and evaluation of robust and generalizable machine learning systems. To tackle the domain identification problem, we examine deep unsupervised generative clustering approaches for representation learning and clustering. The Variational Deep Embedding (VaDE) model is trained to learn lower-dimensional representations of images based on a Mixture-of-Gaussians latent space prior distribution while optimizing cluster assignments. We propose the Conditionally Decoded Variational Deep Embedding (CDVaDE) model which incorporates additional variables of choice, such as the class labels, as conditioning factors to guide the clustering towards subgroup structures in the data which have not been known or recognized previously. We analyze the behavior of CDVaDE on multiple datasets and compare it to other deep clustering algorithms. Our experimental results demonstrate that the considered models are capable of separating digital pathology images into meaningful subgroups. We provide a general-purpose implementation of all considered deep clustering methods as part of the open source Python package DomId (https://github.com/DIDSR/DomId).
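The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the DomId implementation, and every function and parameter name below is hypothetical. The first function computes soft cluster assignments for a latent vector under a Mixture-of-Gaussians prior with diagonal covariances, which is how VaDE-style models are usually described. The second shows one simple way a decoder could be conditioned on a class label, by appending a one-hot encoding to the latent vector; the abstract does not say how CDVaDE feeds the condition to its decoder, so concatenation is an assumption.

```python
import math


def cluster_responsibilities(z, weights, means, variances):
    """Return p(c | z) for each cluster c of a diagonal Gaussian mixture.

    z: latent vector (list of floats)
    weights: mixture weights p(c), one per cluster
    means, variances: per-cluster lists of per-dimension means and variances
    """
    log_joint = []
    for pi_c, mu_c, var_c in zip(weights, means, variances):
        # log p(c) + log N(z | mu_c, diag(var_c))
        log_p = math.log(pi_c)
        for z_d, mu_d, var_d in zip(z, mu_c, var_c):
            log_p -= 0.5 * (math.log(2.0 * math.pi * var_d)
                            + (z_d - mu_d) ** 2 / var_d)
        log_joint.append(log_p)
    # Normalize in a numerically stable way (log-sum-exp trick).
    shift = max(log_joint)
    unnormalized = [math.exp(v - shift) for v in log_joint]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def conditioned_decoder_input(z, label, num_classes):
    """Append a one-hot class label to z (assumed conditioning scheme)."""
    one_hot = [1.0 if k == label else 0.0 for k in range(num_classes)]
    return list(z) + one_hot


if __name__ == "__main__":
    # Two clusters in a 2-D latent space; z lies close to the first mean.
    gamma = cluster_responsibilities(
        z=[0.1, -0.2],
        weights=[0.5, 0.5],
        means=[[0.0, 0.0], [3.0, 3.0]],
        variances=[[1.0, 1.0], [1.0, 1.0]],
    )
    print("responsibilities:", gamma)
    print("decoder input:", conditioned_decoder_input([0.1, -0.2], 1, 3))
```

Because the decoder in this sketch receives the label directly, the latent code no longer needs to encode class information. That is one intuition for how conditioning could steer the clusters towards subgroup structure other than the known classes.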

Acknowledgments

The authors would like to thank Dr. Marios Gavrielides for providing access to the HER2 dataset and for helpful discussion. This project was supported in part by an appointment to the Research Participation Program at the U.S. Food and Drug Administration administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and the U.S. Food and Drug Administration. XS acknowledges support from the Hightech Agenda Bayern.

Author information

Corresponding author

Correspondence to Alexej Gossmann.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1031 KB)

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sidulova, M., Sun, X., Gossmann, A. (2023). Deep Unsupervised Clustering for Conditional Identification of Subgroups Within a Digital Pathology Image Set. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14227. Springer, Cham. https://doi.org/10.1007/978-3-031-43993-3_64

  • DOI: https://doi.org/10.1007/978-3-031-43993-3_64

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43992-6

  • Online ISBN: 978-3-031-43993-3

  • eBook Packages: Computer Science, Computer Science (R0)
