Abstract
Our work aims to link pathology with radiology, with the goal of improving the early detection of lung cancer. Rather than utilising a set of predefined radiomics features, we propose to learn a new set of features from histology. Generating a comprehensive lung histology report is the first vital step toward this goal. Deep learning has revolutionised the computational assessment of digital pathology images. Today, we have mature algorithms for assessing morphological features at the cellular and tissue levels. In addition, there are promising efforts that link morphological features with biologically relevant information. However, these efforts mostly focus on narrow, well-defined questions. The comprehensive report our setting calls for requires an annotation strategy that captures all clinically relevant patterns specified in the WHO guidelines. Here, we propose and compare approaches that aim to balance the dataset and mitigate learning biases by automatically prioritising regions with clinical patterns underrepresented in the dataset. Our study demonstrates the opportunities that active data enrichment can provide and results in a new lung-cancer dataset annotated to a degree that is not readily available in the public domain.
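The idea of prioritising regions whose patterns are underrepresented can be illustrated with a minimal sketch. This is not the method evaluated in the paper, which the abstract does not specify; it is one plausible scoring rule under stated assumptions. The function names, the inverse-frequency weighting, the smoothing constant, and the example counts and probabilities are all hypothetical; the pattern names are used only as illustrative labels.

```python
def rarity_weights(label_counts, smoothing=1.0):
    """Return one weight per class, inversely proportional to its count.

    Assumption: rarity is measured by inverse frequency with additive
    smoothing. Weights are normalised to sum to 1.
    """
    inverse = {c: 1.0 / (n + smoothing) for c, n in label_counts.items()}
    total = sum(inverse.values())
    return {c: w / total for c, w in inverse.items()}


def prioritise_regions(region_probs, label_counts, top_k=2):
    """Rank candidate regions by the expected rarity of their pattern.

    region_probs maps a region id to a classifier's predicted class
    probabilities; the score is the probability-weighted rarity.
    """
    weights = rarity_weights(label_counts)
    scores = {
        region_id: sum(p * weights.get(c, 0.0) for c, p in probs.items())
        for region_id, probs in region_probs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical annotation counts: one common pattern, two rare ones.
label_counts = {"acinar": 900, "lepidic": 80, "micropapillary": 20}

# Hypothetical predictions for three unannotated regions.
region_probs = {
    "r1": {"acinar": 0.9, "lepidic": 0.1},
    "r2": {"micropapillary": 0.7, "acinar": 0.3},
    "r3": {"lepidic": 0.8, "acinar": 0.2},
}

to_annotate = prioritise_regions(region_probs, label_counts, top_k=2)
print(to_annotate)
```

With these made-up numbers, the regions most likely to contain the rarer patterns are sent for annotation first, which is how such a rule would gradually rebalance the dataset.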
GB is supported by FG and the EPSRC Center for Doctoral Training in Health Data Science (EP/S02428X/1). TC is supported by Linacre College, Oxford. The work was done as part of the UKRI DART Lung Health Program.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Batchkala, G., Chakraborti, T., McCole, M., Gleeson, F., Rittscher, J. (2022). Active Data Enrichment by Learning What to Annotate in Digital Pathology. In: Ali, S., van der Sommen, F., Papież, B.W., van Eijnatten, M., Jin, Y., Kolenbrander, I. (eds) Cancer Prevention Through Early Detection. CaPTion 2022. Lecture Notes in Computer Science, vol 13581. Springer, Cham. https://doi.org/10.1007/978-3-031-17979-2_12
DOI: https://doi.org/10.1007/978-3-031-17979-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17978-5
Online ISBN: 978-3-031-17979-2
eBook Packages: Computer Science, Computer Science (R0)