EnsExam: A Dataset for Handwritten Text Erasure on Examination Papers

  • Conference paper
Document Analysis and Recognition - ICDAR 2023 (ICDAR 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14189))

Abstract

Handwritten text erasure on examination papers is an important new research topic with high practical value: it can restore examination papers and collect incorrectly answered questions for review, thereby improving educational efficiency. However, to the best of our knowledge, there is no publicly available dataset for handwritten text erasure on examination papers. To facilitate the development of this field, we build a real-world dataset called SCUT-EnsExam (EnsExam for short). The dataset consists of 545 examination paper images, each of which has been carefully annotated to provide a visually reasonable erasure target. With EnsExam, we propose an end-to-end model that introduces a soft stroke mask to erase the handwritten text precisely. Furthermore, we propose a simple yet effective loss called stroke normalization (SN) loss to alleviate the imbalance between text and non-text regions. Extensive numerical experiments show that our proposed method outperforms previous state-of-the-art methods on EnsExam. In addition, quantitative experiments on the scene text removal benchmark SCUT-EnsText demonstrate the generalizability of our method. EnsExam will be made available at https://github.com/SCUT-DLVCLab/SCUT-EnsExam.

Acknowledgements

This research is supported in part by NSFC (Grant No. 61936003), the Zhuhai Industry Core and Key Technology Research Project (No. 2220004002350), the Science and Technology Foundation of Guangzhou Huangpu Development District (No. 2020GH17), and GD-NSF (No. 2021A1515011870).

Corresponding author

Correspondence to Lianwen Jin.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, L. et al. (2023). EnsExam: A Dataset for Handwritten Text Erasure on Examination Papers. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14189. Springer, Cham. https://doi.org/10.1007/978-3-031-41682-8_29

  • DOI: https://doi.org/10.1007/978-3-031-41682-8_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41681-1

  • Online ISBN: 978-3-031-41682-8

  • eBook Packages: Computer Science, Computer Science (R0)
