Abstract
With the development of deep learning and computational pathology, whole-slide images (WSIs) are widely used in clinical diagnosis. A WSI, a conventional glass slide scanned into a digital image, usually contains billions of pixels (gigapixels). Most existing computer-vision methods split a WSI into many individual patches and run the model on the patches one by one before merging the predictions into a final result, neglecting the intrinsic WSI-wise global correlations among the patches. In this paper, we propose the PATHology TRansformer (PathTR), which combines the global information of a WSI with local information. In PathTR, the local context is first aggregated by a self-attention mechanism; a recursive mechanism then encodes the global context as additional states, yielding an end-to-end model. Experiments on detecting lymph-node tumor metastases in breast cancer show that PathTR achieves a Free-response Receiver Operating Characteristic (FROC) score of 87.68%, outperforming the baseline and the NCRF method by 8.99% and 7.08%, respectively. Our method also reaches a sensitivity of 94.25% at 8 false positives per image.
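The two-stage idea in the abstract (local self-attention, then global context carried as recurrent states) can be illustrated with a toy sketch. This is not the PathTR architecture: the grouping of patches, the identity query/key/value projections, the memory update rule, and the names `attend` and `encode_slide` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys_values):
    """Scaled dot-product attention with identity projections (assumption:
    no learned weights, so keys and values are the same vectors)."""
    dim = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        outputs.append([sum(w * v[i] for w, v in zip(weights, keys_values))
                        for i in range(dim)])
    return outputs


def encode_slide(patch_groups, memory):
    """Process groups of patch features in order (hypothetical layout).

    patch_groups: list of groups, each a list of d-dimensional feature vectors.
    memory: list of d-dimensional state vectors carried across groups.
    """
    encoded = []
    for group in patch_groups:
        # Local + global: each patch attends to its group and to the memory.
        context = attend(group, group + memory)
        encoded.append(context)
        # Recursive step (assumed rule): memory re-reads itself and the new context.
        memory = attend(memory, memory + context)
    return encoded, memory
```

Because the memory is updated after every group, features from early groups can influence how later groups are encoded, which is the property the abstract attributes to the global states.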
W. Qin and R. Xu—These authors contributed equally to this work.
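For readers unfamiliar with the FROC score reported in the abstract, the following minimal sketch shows how such a score is commonly computed. It assumes the CAMELYON16 challenge convention, in which the score is the mean sensitivity at 1/4, 1/2, 1, 2, 4 and 8 false positives per slide; the paper's exact evaluation code is not shown here. The function name `froc_score` and the input layout are hypothetical, and the test of whether a detection falls inside an annotated tumor is assumed to have been done already.

```python
from bisect import bisect_right

# Average false positives per slide at which sensitivity is sampled
# (CAMELYON16-style convention, assumed here).
FP_RATES = (0.25, 0.5, 1, 2, 4, 8)


def froc_score(detections, n_tumors, n_slides, fp_rates=FP_RATES):
    """Return (score, sensitivities) for detections pooled over all slides.

    detections: list of (confidence, tumor_id) pairs; tumor_id identifies the
    annotated tumor that was hit, or is None for a false positive.
    """
    # Lower the confidence threshold one detection at a time.
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    found, false_positives = set(), 0
    avg_fps, sensitivity = [0.0], [0.0]
    for _, tumor_id in ordered:
        if tumor_id is None:
            false_positives += 1
        else:
            found.add(tumor_id)  # repeated hits on one tumor count once
        avg_fps.append(false_positives / n_slides)
        sensitivity.append(len(found) / n_tumors)
    # Both lists are non-decreasing, so the last point with avg_fps <= rate
    # gives the best sensitivity achievable at that false-positive budget.
    at_rates = [sensitivity[bisect_right(avg_fps, rate) - 1] for rate in fp_rates]
    return sum(at_rates) / len(at_rates), at_rates
```

Under this convention, the last entry of the returned list corresponds to the "sensitivity at 8 false positives per image" quoted in the abstract.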
Notes
1. The need for informed consent was waived by the institutional review board of Radboud University Medical Center (RUMC) [7].
References
Wang, D., Khosla, A., Gargeya, R., Irshad, H., Beck, A.H.: Deep learning for identifying metastatic breast cancer. ArXiv preprint abs/1606.05718 (2016)
Li, Y., Ping, W.: Cancer metastasis detection with neural conditional random field. ArXiv preprint abs/1806.07064 (2018)
Shen, Y., Ke, J.: A deformable CRF model for histopathology whole-slide image classification. In: Martel, A.L., Abolmaesumi, P., Stoyanov, D., Mateus, D., Zuluaga, M.A., Zhou, S.K., Racoceanu, D., Joskowicz, L. (eds.) MICCAI 2020. LNCS, vol. 12265, pp. 500–508. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59722-1_48
Zhang, W., Zhu, C., Liu, J., Wang, Y., Jin, M.: Cancer metastasis detection through multiple spatial context network. In: Proceedings of the 2019 8th International Conference on Computing and Pattern Recognition, pp. 221–225 (2019)
Shen, Y., Ke, J.: Sampling based tumor recognition in whole-slide histology image with deep learning approaches. IEEE/ACM Trans. Comput. Biol. Bioinform. (2021)
Wang, X., Yang, S., Zhang, J., Wang, M., Zhang, J., Huang, J., Yang, W., Han, X.: TransPath: transformer-based self-supervised learning for histopathological image classification. In: de Bruijne, M., Cattin, P.C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., Essert, C. (eds.) MICCAI 2021. LNCS, vol. 12908, pp. 186–195. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87237-3_18
Bejnordi, B.E., et al.: Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318, 2199–2210 (2017)
Liu, Y., et al.: Detecting cancer metastases on gigapixel pathology images. ArXiv preprint abs/1703.02442 (2017)
Liu, Y., et al.: Artificial intelligence-based breast cancer nodal metastasis detection: insights into the black box for pathologists. Arch. Pathol. Lab. Med. 143, 859–868 (2019)
Shen, Y., Ke, J.: A deformable CRF model for histopathology whole-slide image classification. In: Martel, A.L., Abolmaesumi, P., Stoyanov, D., Mateus, D., Zuluaga, M.A., Zhou, S.K., Racoceanu, D., Joskowicz, L. (eds.) MICCAI 2020. LNCS, vol. 12265, pp. 500–508. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59722-1_48
Ye, J., Luo, Y., Zhu, C., Liu, F., Zhang, Y.: Breast cancer image classification on WSI with spatial correlations. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, 12–17 May 2019, pp. 1219–1223. IEEE (2019)
Vang, Y.S., Chen, Z., Xie, X.: Deep learning framework for multi-class breast cancer histology image classification. In: Campilho, A., Karray, F., ter Haar Romeny, B. (eds.) ICIAR 2018. LNCS, vol. 10882, pp. 914–922. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93000-8_104
Zanjani, F.G., Zinger, S., et al.: Cancer detection in histopathology whole-slide images using conditional random fields on deep embedded spaces. In: Medical Imaging 2018: Digital Pathology, vol. 10581, p. 105810I. International Society for Optics and Photonics (2018)
Kong, B., Wang, X., Li, Z., Song, Q., Zhang, S.: Cancer metastasis detection via spatially structured deep network. In: Niethammer, M., Styner, M., Aylward, S., Zhu, H., Oguz, I., Yap, P.-T., Shen, D. (eds.) IPMI 2017. LNCS, vol. 10265, pp. 236–248. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59050-9_19
Mahbod, A., Ellinger, I., Ecker, R., Smedby, Ö., Wang, C.: Breast cancer histological image classification using fine-tuned deep network fusion. In: Campilho, A., Karray, F., ter Haar Romeny, B. (eds.) ICIAR 2018. LNCS, vol. 10882, pp. 754–762. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93000-8_85
Oh, S.W., Lee, J., Xu, N., Kim, S.J.: Video object segmentation using space-time memory networks. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019, pp. 9225–9234. IEEE (2019)
Woo, S., Kim, D., Cho, D., Kweon, I.S.: LinkNet: relational embedding for scene graph. In: Bengio, S., Wallach, H.M., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3–8, 2018, Montréal, Canada, pp. 558–568 (2018)
Wu, C., Feichtenhofer, C., Fan, H., He, K., Krähenbühl, P., Girshick, R.B.: Long-term feature banks for detailed video understanding. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16–20, 2019, pp. 284–293. Computer Vision Foundation / IEEE (2019)
Xu, J., Cao, Y., Zhang, Z., Hu, H.: Spatial-temporal relation networks for multi-object tracking. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019, pp. 3987–3997. IEEE (2019)
Chen, Y., Cao, Y., Hu, H., Wang, L.: Memory enhanced global-local aggregation for video object detection. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13–19, 2020, pp. 10334–10343. IEEE (2020)
Wright, A.I., Magee, D., Quirke, P., Treanor, D.: Incorporating local and global context for better automated analysis of colorectal cancer on digital pathology slides. Procedia Comput. Sci. 90, 125–131 (2016). 20th Conference on Medical Image Understanding and Analysis (MIUA 2016)
Chomphuwiset, P., Magee, D.R., Boyle, R.D., Treanor, D.E.: Context-based classification of cell nuclei and tissue regions in liver histopathology. In: MIUA (2011)
Zormpas-Petridis, K., Failmezger, H., Roxanis, I., Blackledge, M.D., Jamin, Y., Yuan, Y.: Capturing global spatial context for accurate cell classification in skin cancer histology. ArXiv preprint abs/1808.02355 (2018)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. In: Guyon, I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., Garnett, R. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4–9, 2017, Long Beach, CA, USA, pp. 5998–6008 (2017)
Otsu, N.: A threshold selection method from gray level histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66 (1979)
Wallace, G.K.: The JPEG still picture compression standard. Commun. ACM 34, 30–44 (1991)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016, pp. 770–778. IEEE Computer Society (2016)
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An image is worth 16x16 words: transformers for image recognition at scale. In: 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3–7, 2021. OpenReview.net (2021)
Egan, J.P., Greenberg, G.Z., Schulman, A.I.: Operating characteristics, signal detectability, and the method of free response. J. Acoust. Soc. Am. 33, 993–1007 (1961)
Acknowledgement
This research was supported in part by the Foundation of Shenzhen Science and Technology Innovation Committee (JCYJ20180507181527806). We also thank Qiuchuan Liang (Beijing Haidian Kaiwen Academy, Beijing, China) for preprocessing data.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Qin, W., Xu, R., Jiang, S., Jiang, T., Luo, L. (2023). PathTR: Context-Aware Memory Transformer for Tumor Localization in Gigapixel Pathology Images. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13846. Springer, Cham. https://doi.org/10.1007/978-3-031-26351-4_8
DOI: https://doi.org/10.1007/978-3-031-26351-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26350-7
Online ISBN: 978-3-031-26351-4
eBook Packages: Computer Science, Computer Science (R0)