Detection of Oral Potentially Malignant Lesions Through Transformer-Based Segmentation Models

Goswami, Buddhadev; Hazra, Shubham; Das, Sandipan; Nagar, Saurabh R.; Gudi, Ravindra; Punjabi, Nirmal

doi:10.1007/978-3-031-78169-8_21

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15305)

Included in the following conference series: International Conference on Pattern Recognition


Abstract

The early detection of Oral Potentially Malignant Disorders (OPMDs) is critical for successful intervention and improved patient outcomes. Before lesions can be classified into the classes and subclasses of OPMDs, the oral cavity must first be screened for potentially malignant lesions. This study compares two transformer-based segmentation models, Med-SAM and SegFormer-B5, with CNN models for detecting OPMDs in photographic images of the oral cavity. Using a dataset of 1,435 images from 486 patients, the models were evaluated in four scenarios that combine two image types, full images and cropped images, for the training and testing data. Three metrics were used to assess the models: percentage overlap with the actual lesions, F1 score, and IoU. The performance of SegFormer-B5 varied across scenarios. It achieved its best results when trained and tested on cropped images, with an F1 score of 0.83, an IoU of 0.72, and a percentage overlap of 0.83. When trained and tested on full images, it achieved an F1 score of 0.59, an IoU of 0.45, and a percentage overlap of 0.67. Med-SAM showed moderate performance on full images, with an F1 score of 0.38, an IoU of 0.26, and a percentage overlap of 0.73, but performed considerably better on cropped regions of interest (RoIs), reaching an F1 score of 0.73, an IoU of 0.59, and a percentage overlap of 0.82. The high F1 scores, IoU values, and percentage overlaps achieved by these models underscore their capability to screen for OPMDs effectively and to support classification models by enhancing the precision and reliability of OPMD diagnostics.
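As a rough illustration of the three metrics named in the abstract, the following minimal sketch computes them for a single pair of binary masks. It is not the authors' evaluation code. The function name `segmentation_metrics` is hypothetical, and the abstract does not define "percentage overlap", so the sketch assumes it means the fraction of ground-truth lesion pixels covered by the predicted mask. Returning 1.0 when both masks are empty is also an assumed convention.

```python
def segmentation_metrics(pred, truth):
    """Return (f1, iou, overlap) for two binary masks given as nested lists of 0/1.

    Assumption: "overlap" is the share of ground-truth lesion pixels that the
    prediction covers; the abstract does not give a formal definition.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # lesion pixel correctly predicted
            elif p:
                fp += 1      # predicted lesion on healthy tissue
            elif t:
                fn += 1      # lesion pixel missed by the prediction

    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    iou = tp / union if union else 1.0
    overlap = tp / (tp + fn) if (tp + fn) else 1.0
    return f1, iou, overlap


# Toy 3x4 masks, purely illustrative (not data from the study).
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

f1, iou, overlap = segmentation_metrics(pred, truth)
print(f"F1={f1:.2f}  IoU={iou:.2f}  overlap={overlap:.2f}")
```

For a single mask pair, F1 and IoU are tied by IoU = F1 / (2 - F1), so they rank predictions identically. The overlap measure, as assumed here, ignores false positives, which is one way a model can show a high overlap alongside a low F1 score.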



Author information

Authors and Affiliations

Koita Centre for Digital Health, Indian Institute of Technology Bombay, Mumbai, India
Buddhadev Goswami & Nirmal Punjabi
Department of CSE, Indian Institute of Technology Bombay, Mumbai, India
Shubham Hazra
Department of Civil Engineering, Indian Institute of Technology Bombay, Mumbai, India
Sandipan Das
Department of Pathology, Tata Memorial Centre, Mumbai, India
Saurabh R. Nagar
Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai, India
Ravindra Gudi
Sensing and Monitoring Foundation, Mumbai, India
Nirmal Punjabi


Corresponding author

Correspondence to Nirmal Punjabi.

Editor information

Editors and Affiliations

University of Salford, Salford, Lancashire, UK
Apostolos Antonacopoulos
Indian Institute of Technology Bombay, Mumbai, Maharashtra, India
Subhasis Chaudhuri
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa
Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu
IIT Kharagpur, Kharagpur, West Bengal, India
Saumik Bhattacharya
Indian Statistical Institute Kolkata, Kolkata, West Bengal, India
Umapada Pal


About this paper

Cite this paper

Goswami, B., Hazra, S., Das, S., Nagar, S.R., Gudi, R., Punjabi, N. (2025). Detection of Oral Potentially Malignant Lesions Through Transformer-Based Segmentation Models. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15305. Springer, Cham. https://doi.org/10.1007/978-3-031-78169-8_21


DOI: https://doi.org/10.1007/978-3-031-78169-8_21
Published: 30 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78168-1
Online ISBN: 978-3-031-78169-8
eBook Packages: Computer Science, Computer Science (R0)


Societies and partnerships

The International Association for Pattern Recognition
