Abstract
In this work we highlight the significance of text line detection in documents. Using a well-known deep neural network and proposing some simple but efficient modifications applied during training, we achieve very accurate results on several highly diverse datasets. Moreover, such models can remain robust even when trained on little data. Our focus is on Greek polytonic documents (typewritten and handwritten), and we release a new public dataset (GTLD-small) for text line detection. We evaluate our method in scenarios covering both the detection and recognition tasks, and demonstrate promising results compared to popular commercial or open-source systems.
Acknowledgment
This research has been partially co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH-CREATE-INNOVATE, project Culdile (Cultural Dimensions of Deep Learning, project code: T1EDK-03785), the Operational Program Attica 2014-2020, under the call RESEARCH AND INNOVATION PARTNERSHIPS IN THE REGION OF ATTICA, project reBook (Digital platform for re-publishing Historical Greek Books, project code: ATTP4-0331172) and the project “Corpus-assisted drama translation research: Shakespeare In Translation - ShakeIT”, under the call of internal research project funding of University of Cyprus (UCY https://www.ucy.ac.cy/directory/en/).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kaddas, P., Gatos, B., Palaiologos, K., Christopoulou, K., Kritsis, K. (2023). Text Line Detection and Recognition of Greek Polytonic Documents. In: Coustaty, M., Fornés, A. (eds) Document Analysis and Recognition – ICDAR 2023 Workshops. ICDAR 2023. Lecture Notes in Computer Science, vol 14194. Springer, Cham. https://doi.org/10.1007/978-3-031-41501-2_15
DOI: https://doi.org/10.1007/978-3-031-41501-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41500-5
Online ISBN: 978-3-031-41501-2
eBook Packages: Computer Science, Computer Science (R0)