Abstract
This paper examines whether synthetic handwritten text can improve machine learning models for handwritten text detection. We generate synthetic handwritten text using an existing model based on a style-conditioned GAN and add it to scanned documents to mimic handwritten annotations. Object detection models (YOLOv5 and YOLOv8) are trained using synthetic data as a baseline to distinguish handwritten text from the remaining content. We study labels of different granularity (word, line, and paragraph level) and different model sizes in our evaluation. Applying these models to real data yields a mAP@50 of 0.88 and a pixel-level F1@50 of 0.96 on the CVL dataset, and a mAP@50 of 0.72 and an F1@50 of 0.89 on SCAN, a custom dataset created by adding real handwritten annotations to a scientific paper.
Notes
1. The value of \(\pm 1^{\circ}\) was determined empirically to balance clearly visible line rotations against extreme line spacings in paragraphs with long lines.
2. Each document has a randomly chosen number of paragraphs between 1 and 12 (bounds determined empirically). The algorithm stops after 1000 iterations (also determined empirically), so a document may contain fewer paragraphs than the randomly chosen number.
3. https://datatorch.io, last accessed on 2023-07-16.
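The stopping rule described in note 2 can be illustrated with a minimal sketch. Only the bounds (1 to 12 paragraphs) and the cap of 1000 iterations come from the note. Everything else is an assumption made for illustration: the function names, the box size ranges, the uniform sampling, and the choice to reject candidates that overlap an already placed paragraph. The paper's actual placement procedure is not described in this section and may differ.

```python
import random
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels

MAX_PARAGRAPHS = 12    # upper bound stated in note 2
MAX_ITERATIONS = 1000  # iteration cap stated in note 2


def overlaps(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes share any interior area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def place_paragraphs(page_w: int, page_h: int, rng: random.Random) -> List[Box]:
    """Hypothetical placement loop: try random boxes until the target count
    is reached or the iteration cap is hit, whichever comes first."""
    target = rng.randint(1, MAX_PARAGRAPHS)  # randomly chosen number in 1..12
    placed: List[Box] = []
    for _ in range(MAX_ITERATIONS):
        if len(placed) == target:
            break
        # Assumed size ranges; not taken from the paper.
        w = rng.randint(page_w // 8, page_w // 2)
        h = rng.randint(page_h // 20, page_h // 5)
        x = rng.randint(0, page_w - w)
        y = rng.randint(0, page_h - h)
        candidate = (x, y, w, h)
        # Assumed rejection rule: skip candidates that collide with earlier ones.
        if not any(overlaps(candidate, other) for other in placed):
            placed.append(candidate)
    return placed


if __name__ == "__main__":
    boxes = place_paragraphs(2480, 3508, random.Random(0))
    print(f"placed {len(boxes)} paragraph boxes")
```

Because rejected candidates still consume iterations, a crowded page can exhaust the cap before the target count is reached, which is why a document may end up with fewer paragraphs than the number drawn.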
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Muth, M., Peer, M., Kleber, F., Sablatnig, R. (2025). Advancing Handwritten Text Detection by Synthetic Text. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15319. Springer, Cham. https://doi.org/10.1007/978-3-031-78495-8_8
DOI: https://doi.org/10.1007/978-3-031-78495-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78494-1
Online ISBN: 978-3-031-78495-8
eBook Packages: Computer Science, Computer Science (R0)