Detection of Oral Potentially Malignant Lesions Through Transformer-Based Segmentation Models

  • Conference paper
Pattern Recognition (ICPR 2024)

Abstract

The early detection of Oral Potentially Malignant Disorders (OPMDs) is critical for successful intervention and improved patient outcomes. Screening for potentially malignant oral lesions is a crucial step before classifying them into the different classes and subclasses of OPMDs. This study compares two transformer-based segmentation models, Med-SAM and SegFormer-B5, with CNN models for detecting OPMDs in the oral cavity from photographic images. Using a dataset of 1,435 images from 486 patients, the models were evaluated in four scenarios that combine different image types, namely full images and cropped images, for the training and testing data. Three metrics were used to assess the models: percentage overlap with the actual lesions, F1 score, and Intersection over Union (IoU). The performance of the SegFormer-B5 model varied across scenarios; it achieved its best results, an F1 score of 0.83, an IoU of 0.72, and a percentage overlap of 0.83, when trained and tested on cropped images. When trained and tested on full images, it achieved an F1 score of 0.59, an IoU of 0.45, and a percentage overlap of 0.67. The Med-SAM model showed moderate performance on full images, with an F1 score of 0.38, an IoU of 0.26, and a percentage overlap of 0.73, but performed considerably better on cropped Regions of Interest (RoIs), reaching an F1 score of 0.73, an IoU of 0.59, and a percentage overlap of 0.82. The high F1 scores, IoU values, and percentage overlaps achieved on cropped images underscore the capability of these models to screen for OPMDs effectively and to support classification models by enhancing the precision and reliability of OPMD diagnostics.
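
To make the three reported metrics concrete, the following is a minimal sketch of how they could be computed for a single pair of binary masks. It is not taken from the paper: the function name `segmentation_metrics` is hypothetical, the toy masks are invented for illustration, and "percentage overlap with the actual lesions" is assumed here to mean the fraction of ground-truth lesion pixels covered by the prediction, which may differ from the authors' exact definition.

```python
def segmentation_metrics(pred_mask, true_mask):
    """Pixel-wise F1, IoU, and lesion overlap for two binary masks.

    Masks are equally sized nested lists of 0/1 values.
    Assumption: "percentage overlap" = TP / (TP + FN), i.e. the share of
    ground-truth lesion pixels that the prediction covers.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1          # lesion pixel correctly predicted
            elif p and not t:
                fp += 1          # predicted lesion on healthy tissue
            elif t and not p:
                fn += 1          # lesion pixel that was missed

    def safe_div(num, den):
        # Convention chosen for this sketch: an undefined ratio becomes 0.0.
        return num / den if den else 0.0

    return {
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
        "iou": safe_div(tp, tp + fp + fn),
        "overlap": safe_div(tp, tp + fn),
    }


# Toy 4x4 example: a 2x2 lesion, of which the prediction finds 3 pixels
# and adds 1 false-positive pixel.
true_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

metrics = segmentation_metrics(pred_mask, true_mask)
print(metrics)  # f1 = 0.75, iou = 0.6, overlap = 0.75
```

Under this assumed definition, the overlap metric ignores false positives, so a model can score higher on overlap than on F1 or IoU when it over-segments.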

Author information

Corresponding author

Correspondence to Nirmal Punjabi.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Goswami, B., Hazra, S., Das, S., Nagar, S.R., Gudi, R., Punjabi, N. (2025). Detection of Oral Potentially Malignant Lesions Through Transformer-Based Segmentation Models. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15305. Springer, Cham. https://doi.org/10.1007/978-3-031-78169-8_21

  • DOI: https://doi.org/10.1007/978-3-031-78169-8_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78168-1

  • Online ISBN: 978-3-031-78169-8

  • eBook Packages: Computer Science, Computer Science (R0)
