DOI: 10.1145/3573128.3609343
short-paper

Read-Write-Learn: Self-Learning for Handwriting Recognition

Published: 22 August 2023

Abstract

Handwriting recognition relies on supervised data for training. Annotations typically include both the written text and the author's identity to facilitate the recognition of a particular style. A large annotation set is required for robust recognition, which is not always available in historical texts and low-annotation languages. To mitigate this challenge, we propose the Read-Write-Learn framework. In this setting, we augment the training process of handwriting recognition with a language model and a handwriting generator. Specifically, in the first reading step, we employ a language model to identify text that is likely detected correctly by the recognition model. Then, in the writing step, we generate more training data in the same writing style. Finally, in the learning step, we use the newly generated data in the same writing style to finetune the recognition model. Our Read-Write-Learn framework allows the recognition model to incrementally converge on the new style. Our experiments on historical handwritten documents demonstrate the benefits of the approach, and we present several examples to showcase improved recognition.
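
The three steps described in the abstract can be sketched as a simple loop. This is a minimal illustration, not the authors' implementation: the function names (`read_step`, `write_step`, `read_write_learn`), the callables `recognize`, `lm_score`, `generate`, and `finetune`, and the fixed acceptance `threshold` are all assumptions introduced here, since the abstract does not specify the models, the selection criterion, or the stopping rule.

```python
from typing import Any, Callable, List, Sequence, Tuple

Image = Any                      # placeholder type for a handwriting image
Recognizer = Callable[[Image], str]
Sample = Tuple[Image, str]       # (image, transcription)


def read_step(images: Sequence[Image], recognize: Recognizer,
              lm_score: Callable[[str], float], threshold: float) -> List[Sample]:
    """Reading: keep transcriptions the language model finds plausible."""
    selected = []
    for image in images:
        text = recognize(image)
        # Hypothetical criterion: accept when the score reaches the threshold.
        if lm_score(text) >= threshold:
            selected.append((image, text))
    return selected


def write_step(selected: Sequence[Sample],
               generate: Callable[[str, List[Image]], Image],
               new_texts: Sequence[str]) -> List[Sample]:
    """Writing: synthesize labelled images in the style of the accepted ones."""
    style_refs = [image for image, _ in selected]
    return [(generate(text, style_refs), text) for text in new_texts]


def read_write_learn(images: Sequence[Image], recognize: Recognizer,
                     lm_score: Callable[[str], float],
                     generate: Callable[[str, List[Image]], Image],
                     finetune: Callable[[Recognizer, List[Sample]], Recognizer],
                     new_texts: Sequence[str], threshold: float,
                     rounds: int) -> Recognizer:
    """Repeat read, write, learn for a fixed number of rounds (an assumption)."""
    for _ in range(rounds):
        selected = read_step(images, recognize, lm_score, threshold)
        if not selected:
            break  # nothing trustworthy to imitate, so stop early
        synthetic = write_step(selected, generate, new_texts)
        # Learning: fine-tune the recognizer on the generated samples.
        recognize = finetune(recognize, synthetic)
    return recognize
```

Passing the models in as callables keeps the sketch independent of any particular recognizer, language model, or generator; in practice each would wrap a trained network.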

    Published In

    DocEng '23: Proceedings of the ACM Symposium on Document Engineering 2023
    August 2023
    187 pages
    ISBN:9798400700279
    DOI:10.1145/3573128

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. handwriting generation
    2. handwriting recognition
    3. self-learning

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Conference

    DocEng '23: ACM Symposium on Document Engineering 2023
    August 22 - 25, 2023
    Limerick, Ireland

    Acceptance Rates

    DocEng '23 Paper Acceptance Rate 9 of 27 submissions, 33%;
    Overall Acceptance Rate 194 of 564 submissions, 34%

