Abstract
Baseline detection is an important step in document image analysis and recognition systems. It is used extensively in many preprocessing stages, such as text normalization, skew correction, character segmentation, and slant and slope correction, as well as in feature extraction. In this work, we propose a new baseline detection method for Arabic script based on the horizontal projection histogram and the direction features of the sub-word skeleton. Sub-words form the main component of the text; each consists of at least one letter, in addition to diacritics and dots. The efficiency of the proposed method has been demonstrated by experimental results on the IFN/ENIT Arabic benchmark dataset.
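As a rough illustration of the horizontal projection histogram mentioned above, the following minimal Python sketch counts the ink pixels in each row of a binary image and takes the densest row as a baseline estimate. It covers only the histogram idea and is not the authors' method, which also relies on direction features of the sub-word skeleton. The function names and the small test image are hypothetical, and the image is assumed to be already binarized with 1 for ink and 0 for background.

```python
def horizontal_projection(image):
    """Count foreground (1) pixels in each row of a binary image."""
    return [sum(row) for row in image]


def estimate_baseline_row(image):
    """Return the index of the row with the most foreground pixels.

    Assumption: the baseline is the densest row, which is the classic
    projection-histogram heuristic, not the paper's full algorithm.
    """
    histogram = horizontal_projection(image)
    if not histogram or max(histogram) == 0:
        raise ValueError("image has no foreground pixels")
    # list.index returns the first (topmost) row reaching the peak.
    return histogram.index(max(histogram))


# Hypothetical 6x8 binary sub-word image: 1 = ink, 0 = background.
subword = [
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

baseline = estimate_baseline_row(subword)
print("estimated baseline row:", baseline)
```

A single peak row works reasonably for short, unskewed sub-words; skewed or curved writing would need the estimate applied per sub-word or combined with other features.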
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Abu-Ain, T., Sheikh Abdullah, S.N.H., Bataineh, B., Omar, K., Abu-Ein, A. (2013). A Novel Baseline Detection Method of Handwritten Arabic-Script Documents Based on Sub-Words. In: Noah, S.A., et al. Soft Computing Applications and Intelligent Systems. M-CAIT 2013. Communications in Computer and Information Science, vol 378. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40567-9_6
DOI: https://doi.org/10.1007/978-3-642-40567-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40566-2
Online ISBN: 978-3-642-40567-9
eBook Packages: Computer Science (R0)