Active Data Enrichment by Learning What to Annotate in Digital Pathology

  • Conference paper
Cancer Prevention Through Early Detection (CaPTion 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13581))

Abstract

Our work aims to link pathology with radiology, with the goal of improving the early detection of lung cancer. Rather than utilising a set of predefined radiomics features, we propose to learn a new set of features from histology. Generating a comprehensive lung histology report is the first vital step toward this goal. Deep learning has revolutionised the computational assessment of digital pathology images. Today, we have mature algorithms for assessing morphological features at the cellular and tissue levels. In addition, there are promising efforts that link morphological features with biologically relevant information. However, these efforts mostly focus on narrow, well-defined questions. Developing the comprehensive report required in our setting demands an annotation strategy that captures all clinically relevant patterns specified in the WHO guidelines. Here, we propose and compare approaches that aim to balance the dataset and mitigate biases in learning by automatically prioritising regions with clinical patterns that are underrepresented in the dataset. Our study demonstrates the opportunities that active data enrichment can provide and results in a new lung-cancer dataset annotated to a degree that is not readily available in the public domain.

GB is supported by FG and the EPSRC Centre for Doctoral Training in Health Data Science (EP/S02428X/1). TC is supported by Linacre College, Oxford. The work was done as part of the UKRI DART Lung Health Program.

Corresponding author

Correspondence to George Batchkala.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Batchkala, G., Chakraborti, T., McCole, M., Gleeson, F., Rittscher, J. (2022). Active Data Enrichment by Learning What to Annotate in Digital Pathology. In: Ali, S., van der Sommen, F., Papież, B.W., van Eijnatten, M., Jin, Y., Kolenbrander, I. (eds) Cancer Prevention Through Early Detection. CaPTion 2022. Lecture Notes in Computer Science, vol 13581. Springer, Cham. https://doi.org/10.1007/978-3-031-17979-2_12

  • DOI: https://doi.org/10.1007/978-3-031-17979-2_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17978-5

  • Online ISBN: 978-3-031-17979-2

  • eBook Packages: Computer Science (R0)
