ICDAR 2023 Competition on Detecting Tampered Text in Images

  • Conference paper
  • In: Document Analysis and Recognition - ICDAR 2023 (ICDAR 2023)

Abstract

Document analysis and recognition techniques are evolving quickly and have been widely used in real-world applications. However, detecting tampered text in images has rarely been studied. Existing image forensics research mainly focuses on detecting tampered objects in natural images. Text manipulation in images exhibits different characteristics, e.g., text consistency and imperceptibility, which bring new challenges for image forensics. Therefore, we organized the ICDAR 2023 Competition on Detecting Tampered Text in Images (DTTI) and established a new dataset named TTI, which consists of 11,385 images; 5,500 of them are tampered with various manipulation techniques and annotated with pixel-level masks. Two tasks are set up: text manipulation classification and text manipulation detection. The competition started on 15 February 2023 and ended on 20 March 2023. Over a thousand teams registered, with 277 and 176 valid submissions in the two tasks, respectively. In this competition report, we describe the proposed dataset, the tasks, the evaluation protocols, and a summary of the results. We hope that this competition will promote research on detecting text manipulation in images.
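
For the detection task, submissions are scored against the pixel-level tampering masks. As a concrete illustration, the sketch below computes a pixel-level F1 score between a predicted and a ground-truth binary mask; the metric choice and the helper pixel_f1 are assumptions for illustration only, not the competition's official evaluation protocol, which is defined in the body of the report.

```python
import numpy as np

def pixel_f1(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Pixel-level F1 between a predicted and a ground-truth binary tampering mask.

    Illustrative only; the official DTTI protocol may differ.
    """
    pred = pred_mask.astype(bool).ravel()
    gt = gt_mask.astype(bool).ravel()
    tp = np.sum(pred & gt)    # tampered pixels correctly flagged
    fp = np.sum(pred & ~gt)   # clean pixels wrongly flagged
    fn = np.sum(~pred & gt)   # tampered pixels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Example: a 4x4 image whose top-left 2x2 block is tampered.
gt = np.zeros((4, 4), dtype=np.uint8)
gt[:2, :2] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[:2, :3] = 1  # over-predicts two extra pixels
print(f"pixel F1 = {pixel_f1(pred, gt):.3f}")  # tp=4, fp=2, fn=0 -> 0.800
```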


Notes

  1. https://tianchi.aliyun.com/competition/entrance/532048/introduction
  2. https://tianchi.aliyun.com/competition/entrance/532052/introduction
  3. https://tianchi.aliyun.com/competition/entrance/532048/rankingList
  4. https://tianchi.aliyun.com/competition/entrance/532052/rankingList


Acknowledgments

This competition was supported by the National Natural Science Foundation of China (NSFC #62225603).

Author information

Correspondence to Xiang Bai.


Appendix

A More Examples from TTI

Figure 3 gives more examples from the TTI dataset, including tampered documents and their ground truth.

Fig. 3. More examples from the TTI dataset. The forged images and their ground truth are presented in the odd and even rows, respectively.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Luo, D. et al. (2023). ICDAR 2023 Competition on Detecting Tampered Text in Images. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14188. Springer, Cham. https://doi.org/10.1007/978-3-031-41679-8_36


  • DOI: https://doi.org/10.1007/978-3-031-41679-8_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41678-1

  • Online ISBN: 978-3-031-41679-8

  • eBook Packages: Computer Science, Computer Science (R0)
