Text Line Detection and Recognition of Greek Polytonic Documents

  • Conference paper
Document Analysis and Recognition – ICDAR 2023 Workshops (ICDAR 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14194)

Abstract

In this work we highlight the significance of text line detection in documents. By using a well-known deep neural network and proposing some simple but efficient modifications to how such models are trained, we can achieve very accurate results on highly diverse datasets. Moreover, such models can remain robust even when trained on little data. Our focus is on Greek polytonic documents (typewritten and handwritten), and we release a new public dataset (GTLD-small) for text line detection. We evaluate our method in scenarios covering both the detection and recognition tasks, and it shows promising results when compared to popular commercial and open-source systems.

Notes

  1. https://github.com/hukaixuan19970627/yolov5_obb

  2. https://doi.org/10.5281/zenodo.8020403

  3. https://www.piop.gr/en/istoriko-arxeio.aspx

  4. https://github.com/ultralytics/yolov5

  5. https://cloud.google.com/vision

  6. https://github.com/tesseract-ocr/tesseract

  7. https://github.com/Calamari-OCR/calamari

Acknowledgment

This research has been partially co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH-CREATE-INNOVATE, project Culdile (Cultural Dimensions of Deep Learning, project code: T1EDK-03785), the Operational Program Attica 2014-2020, under the call RESEARCH AND INNOVATION PARTNERSHIPS IN THE REGION OF ATTICA, project reBook (Digital platform for re-publishing Historical Greek Books, project code: ATTP4-0331172) and the project “Corpus-assisted drama translation research: Shakespeare In Translation - ShakeIT”, under the call of internal research project funding of University of Cyprus (UCY https://www.ucy.ac.cy/directory/en/).

Author information

Corresponding author

Correspondence to Panagiotis Kaddas.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kaddas, P., Gatos, B., Palaiologos, K., Christopoulou, K., Kritsis, K. (2023). Text Line Detection and Recognition of Greek Polytonic Documents. In: Coustaty, M., Fornés, A. (eds) Document Analysis and Recognition – ICDAR 2023 Workshops. ICDAR 2023. Lecture Notes in Computer Science, vol 14194. Springer, Cham. https://doi.org/10.1007/978-3-031-41501-2_15

  • DOI: https://doi.org/10.1007/978-3-031-41501-2_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41500-5

  • Online ISBN: 978-3-031-41501-2

  • eBook Packages: Computer Science, Computer Science (R0)
