Hierarchical Pretrained Backbone Vision Transformer for Image Classification in Histopathology

Zedda, Luca; Loddo, Andrea; Di Ruberto, Cecilia

doi:10.1007/978-3-031-43153-1_19

Luca Zedda¹⁰,
Andrea Loddo¹⁰ &
Cecilia Di Ruberto¹⁰

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14234))

Included in the following conference series:

International Conference on Image Analysis and Processing

600 Accesses

Abstract

Histopathology plays a crucial role in clinical diagnosis, treatment planning, and research by enabling the examination of diseases in tissues and organs. However, the manual analysis of histopathological images is time-consuming and labor-intensive, requiring expert pathologists. To address this issue, this work proposes a novel architecture called Hierarchical Pretrained Backbone Vision Transformer for automated histopathological image classification, a critical tool in clinical diagnosis, treatment planning, and research. Current deep learning-based methods for image classification require a large amount of labeled data and significant computational resources to be trained effectively. By leveraging pretrained Visual Transformer backbones, our approach can classify histopathology images, achieve state-of-the-art performance, and take advantage of the pretrained backbones’ weights. We evaluated it on the Chaoyang histopathology dataset, comparing it with other state-of-the-art Visual Transformers. The experimental results demonstrate that the proposed architecture outperforms the others, indicating its potential to be an effective tool for histopathology image classification.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 69.99; Price excludes VAT (USA)

Softcover Book: USD 89.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Chen, H., et al.: Gashis-transformer: a multi-scale visual transformer approach for gastric histopathological image detection. Pattern Recogn. 130, 108827 (2022)
Article Google Scholar
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Li, F.-F.: A large-scale hierarchical image database. In: Imagenet (2009)
Google Scholar
Dosovitskiy, A., et al.: An image is worth 16\(\times \)16 words: transformers for image recognition at scale. In: 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, 3–7 May 2021 (2021)
Google Scholar
Glotsos, D., et al.: Improving accuracy in astrocytomas grading by integrating a robust least squares mapping driven support vector machine classifier into a two level grade classification scheme. Comput. Methods Progr. Biomed. 90(3), 251–261 (2008)
Article Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Google Scholar
Hendrycks, D., Lee, K., Mazeika, M.: Using pre-training can improve model robustness and uncertainty. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning, ICML 2019, Long Beach, California, USA, 9–15 June 2019, vol. 97 of Proceedings of Machine Learning Research, pp. 2712–2721. PMLR (2019)
Google Scholar
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Chapter Google Scholar
Liu, Z., et al.: Swin transformer V2: scaling up capacity and resolution. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022, pp. 11999–12009. IEEE (2022)
Google Scholar
Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, 10–17 October 2021, pp. 9992–10002. IEEE (2021)
Google Scholar
Putzu, L., Fumera, G.: An empirical evaluation of nuclei segmentation from h &e images in a real application scenario. Appl. Sci. 10(22), 7982 (2020)
Article Google Scholar
Srinidhi, C.L., Ciga, O., Martel, A.L.: Deep neural network models for computational histopathology: a survey. Medical Image Anal. 67, 101813 (2021)
Article Google Scholar
Steiner, A.P., Kolesnikov, A., Zhai, X., Wightman, R., Uszkoreit, J., Beyer, L.: How to train your vit? data, augmentation, and regularization in vision transformers. In: Transactions on Machine Learning Research (2022)
Google Scholar
Szegedy, C., et al.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, 7–12 June 2015, pp. 1–9. IEEE Computer Society (2015)
Google Scholar
Vaswani, A., et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017)
Google Scholar
Wu, H., et al.: Introducing convolutions to vision transformers. In: Cvt (2021)
Google Scholar
Xu, X., Kapse, S., Gupta, R., Prasanna, P.: Vit-dae: transformer-driven diffusion autoencoder for histopathology image analysis. CoRR, abs/2304.01053 (2023)
Google Scholar
Li, Y., et al.: Training vision transformers from scratch on imagenet. In: Tokens-to-Token Vit (2021)
Google Scholar
Zhang, X., Chan, F.T.S., Mahadevan, S.: Explainable machine learning in image classification models: an uncertainty quantification perspective. Knowl. Based Syst 243, 108418 (2022)
Article Google Scholar
Zhou, D., et al.: Towards deeper vision transformer. In: Deepvit (2021)
Google Scholar
Zhou, X., Tang, C., Huang, P., Tian, S., Mercaldo, F., Santone, A.: Asi-dbnet: an adaptive sparse interactive resnet-vision transformer dual-branch network for the grading of brain cancer histopathological images. Interdisc. Sci. Comput. Life Sci. 15(1), 15–31 (2023)
Google Scholar
Zhu, C., Chen, W., Peng, T., Wang, Y., Jin, M.: Hard sample aware noise robust learning for histopathology image classification. IEEE Trans. Med. Imaging 41, 881–894 (2021)
Article Google Scholar

Download references

Acknowledgements

We acknowledge financial support under the National Recovery and Resilience Plan (NRRP), Mission 4 Component 2 Investment 1.5 - Call for tender No.3277 published on December 30, 2021 by the Italian Ministry of University and Research (MUR) funded by the European Union – NextGenerationEU. Project Code ECS0000038 – Project Title eINS Ecosystem of Innovation for Next Generation Sardinia – CUP F53C22000430001- Grant Assignment Decree No. 1056 adopted on June 23, 2022 by the Italian Ministry of University and Research (MUR)”

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Cagliari, via Ospedale 72, 09124, Cagliari, Italy
Luca Zedda, Andrea Loddo & Cecilia Di Ruberto

Authors

Luca Zedda
View author publications
You can also search for this author in PubMed Google Scholar
Andrea Loddo
View author publications
You can also search for this author in PubMed Google Scholar
Cecilia Di Ruberto
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Andrea Loddo .

Editor information

Editors and Affiliations

University of Udine, Udine, Italy
Gian Luca Foresti
University of Udine, Udine, Italy
Andrea Fusiello
University of York, York, UK
Edwin Hancock

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zedda, L., Loddo, A., Di Ruberto, C. (2023). Hierarchical Pretrained Backbone Vision Transformer for Image Classification in Histopathology. In: Foresti, G.L., Fusiello, A., Hancock, E. (eds) Image Analysis and Processing – ICIAP 2023. ICIAP 2023. Lecture Notes in Computer Science, vol 14234. Springer, Cham. https://doi.org/10.1007/978-3-031-43153-1_19

Download citation

DOI: https://doi.org/10.1007/978-3-031-43153-1_19
Published: 05 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43152-4
Online ISBN: 978-3-031-43153-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Hierarchical Pretrained Backbone Vision Transformer for Image Classification in Histopathology