Text-line-up: Don’t Worry About the Caret

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12823)

Abstract

In freestyle handwritten text-lines, words are sometimes inserted with a caret symbol (\(^\wedge \)) as corrections or annotations. Such insertions disrupt the reading sequence of the words. In this paper, we aim to line up the words of a text-line in their reading order, so that the result can assist an OCR engine. Previous text-line segmentation techniques in the literature have scarcely addressed this issue. We formulate the task as a path planning problem and propose a novel multi-agent hierarchical reinforcement learning-based architecture to solve it. The method uses no linguistic knowledge. We evaluated the proposed architecture on English and Bengali offline handwriting, which yielded some interesting results.
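
To make the reading-order problem concrete, the following is a minimal toy sketch. It is not the paper's reinforcement learning method. It assumes that word positions, the set of inserted words, and the caret position are already known, which is exactly what a real system would have to infer from the image. The names `Word`, `line_up`, and `caret_x` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float                # left edge of the word along the line
    inserted: bool = False  # True if written above the line at a caret


def line_up(words, caret_x):
    """Return word texts in reading order, splicing inserted words in at the caret.

    Toy assumption: a single caret per line, at horizontal position caret_x.
    """
    baseline = sorted((w for w in words if not w.inserted), key=lambda w: w.x)
    inserted = sorted((w for w in words if w.inserted), key=lambda w: w.x)
    before = [w for w in baseline if w.x < caret_x]
    after = [w for w in baseline if w.x >= caret_x]
    return [w.text for w in before + inserted + after]


# "quick brown" is written above the line, with a caret between "the" and "fox".
words = [
    Word("the", 0),
    Word("fox", 40),
    Word("jumps", 80),
    Word("quick", 20, inserted=True),
    Word("brown", 60, inserted=True),
]

# Sorting every word left to right ignores the caret and scrambles the order.
naive = [w.text for w in sorted(words, key=lambda w: w.x)]
order = line_up(words, caret_x=35)
```

Here the plain left-to-right sort places "brown" after "fox", because the inserted words overhang the baseline words, while `line_up` keeps the inserted phrase together at the caret. The sketch only illustrates the target output; the paper instead treats the task as path planning over the text-line image.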

Acknowledgment

All the people who contributed to generating the database are gratefully acknowledged. The authors also heartily thank all the consulted linguistic and handwriting experts.

Author information

Corresponding author

Correspondence to Chandranath Adak.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Adak, C., Chaudhuri, B.B., Lin, CT., Blumenstein, M. (2021). Text-line-up: Don’t Worry About the Caret. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12823. Springer, Cham. https://doi.org/10.1007/978-3-030-86334-0_14

  • DOI: https://doi.org/10.1007/978-3-030-86334-0_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86333-3

  • Online ISBN: 978-3-030-86334-0

  • eBook Packages: Computer Science, Computer Science (R0)
