DOI: 10.1145/1815330.1815348

Handwritten Arabic text line segmentation using affinity propagation

Published: 09 June 2010


In this paper, we present a novel graph-based method for extracting handwritten text lines from monochromatic Arabic document images. Our approach consists of two steps: coarse text line estimation using the primary components that define each line, and assignment of diacritic components, which are more difficult to associate with a given line. We first estimate the local orientation at each primary component to build a sparse similarity graph. We then use a shortest-path algorithm to compute similarities between non-neighboring components. From this graph, we obtain coarse text lines using two estimates obtained from affinity propagation and breadth-first search. In the second step, we assign secondary components to each text line. The proposed method is very fast and robust to the non-uniform skew and character size variations normally present in handwritten text lines. We evaluate our method using a pixel-matching criterion and report 96% accuracy on a dataset of 125 Arabic document images. We also present a proximity analysis on datasets generated by artificially decreasing the spacing between text lines to demonstrate the robustness of our approach.
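The graph stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the component centroids, the `max_dx`/`max_dy` gate (a crude stand-in for the local-orientation estimate), and the Euclidean edge cost are all assumptions made for illustration, and the affinity propagation estimate is omitted. The sketch builds a sparse graph over primary components, runs Dijkstra's algorithm to relate non-neighboring components, and groups components into coarse lines with breadth-first search.

```python
import heapq
import math
from collections import deque


def build_sparse_graph(centroids, max_dx, max_dy):
    """Link components that are close horizontally and nearly level.

    Assumption: the dx/dy gate replaces the paper's local-orientation
    estimate, and the edge cost is the plain Euclidean distance.
    """
    graph = {i: {} for i in range(len(centroids))}
    for i, (xi, yi) in enumerate(centroids):
        for j in range(i + 1, len(centroids)):
            xj, yj = centroids[j]
            if abs(xi - xj) <= max_dx and abs(yi - yj) <= max_dy:
                cost = math.hypot(xi - xj, yi - yj)
                graph[i][j] = cost
                graph[j][i] = cost
    return graph


def dijkstra(graph, source):
    """Shortest-path cost from source to every reachable component."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def bfs_lines(graph):
    """Coarse text lines as connected components found by BFS."""
    seen = set()
    lines = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            u = queue.popleft()
            members.append(u)
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        lines.append(sorted(members))
    return lines


# Hypothetical centroids: two roughly horizontal lines of three components.
centroids = [(0, 0), (10, 1), (20, 2), (0, 30), (10, 31), (20, 33)]
graph = build_sparse_graph(centroids, max_dx=12, max_dy=5)
print(bfs_lines(graph))
print(dijkstra(graph, 0))
```

In this toy layout, components 0 and 2 share no edge, so their relation comes only from the path through component 1; components on the other line are unreachable and receive no path cost at all. A fuller pipeline would turn these path costs into similarities and pass them to a clustering step such as affinity propagation.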


Published In

DAS '10: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
June 2010
490 pages

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Arabic
  2. Arabic documents
  3. Dijkstra's shortest path algorithm
  4. affinity propagation
  5. breadth-first search
  6. clustering
  7. handwritten documents
  8. line detection
  9. text line segmentation


  • Research-article

Cited By

  • (2024) A Survey on Text-Line Segmentation in Arab Historical Manuscripts. International Journal of Informatics and Applied Mathematics 7:1 (14-32). DOI: 10.53508/ijiam.1407236. Online publication date: 13-Jun-2024
  • (2022) Learning-free, divide and conquer text-line extraction algorithm for printed Arabic text with diacritics. Journal of King Saud University - Computer and Information Sciences 34:9 (7699-7709). DOI: 10.1016/j.jksuci.2022.04.021. Online publication date: Oct-2022
  • (2022) A Review of Various Line Segmentation Techniques Used in Handwritten Character Recognition. Information and Communication Technology for Competitive Strategies (ICTCS 2021) (353-365). DOI: 10.1007/978-981-19-0095-2_34. Online publication date: 23-Jun-2022
  • (2022) Deep Learning-Based Segmentation of Connected Components in Arabic Handwritten Documents. Intelligent Systems and Pattern Recognition (93-106). DOI: 10.1007/978-3-031-08277-1_8. Online publication date: 17-Jun-2022
  • (2021) Arabic handwritten text line segmentation using a multi-agent system and a directed CNN. 2021 Fifth International Conference On Intelligent Computing in Data Sciences (ICDS) (1-7). DOI: 10.1109/ICDS53782.2021.9626747. Online publication date: 20-Oct-2021
  • (2021) Combination of deep neural networks and logical rules for record segmentation in historical handwritten registers using few examples. International Journal on Document Analysis and Recognition (IJDAR). DOI: 10.1007/s10032-021-00362-8. Online publication date: 3-Mar-2021
  • (2020) A Robust Progressive Text Line Segmentation Framework with Markov Line Descriptors. Proceedings of the 2020 4th International Conference on Video and Image Processing (199-212). DOI: 10.1145/3447450.3447482. Online publication date: 25-Dec-2020
  • (2020) Survey on Segmentation and Recognition of Handwritten Arabic Script. SN Computer Science 1:4. DOI: 10.1007/s42979-020-00187-y. Online publication date: 6-Jun-2020
  • (2020) A Robust Method for Text, Line, and Word Segmentation for Historical Arabic Manuscripts. Data Analytics for Cultural Heritage (147-172). DOI: 10.1007/978-3-030-66777-1_7. Online publication date: 10-Dec-2020
  • (2019) Efficient Algorithms for Text Lines and Words Segmentation for Recognition of Arabic Handwritten Script. Energy Transfer and Dissipation in Plasma Turbulence (387-401). DOI: 10.1007/978-981-13-5953-8_32. Online publication date: 3-May-2019
