A fast and accurate contour-based method for writer-dependent offline handwritten Farsi/Arabic subwords recognition

  • Original Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

This paper concerns the recognition of offline Farsi/Arabic handwriting. The overall appearance of each subword in Farsi/Arabic script is described by its shape contour, which provides a rich set of discriminative characteristics. Our approach is writer-dependent; that is, the system is trained to recognize the subwords written by a particular writer. A fast contour alignment is the central part of the proposed algorithm, where the alignment is performed based on a handful of feature points. An efficient lexicon reduction algorithm based on the characteristic loci feature, which works directly on the subwords' binary images, is proposed as well. Fast and precise alignment, along with efficient lexicon reduction and appropriate similarity matching, yields a high recognition rate while keeping the speed high. Our experiment on the IBN SINA database shows that the correct classification rate can be as high as 91.08%. This figure is achieved merely by subword shape matching, without dots and diacritics, and without any statistical language model.
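
To make the lexicon reduction idea concrete, the following is a minimal sketch of a characteristic loci descriptor computed directly on a binary subword image, followed by a simple candidate-pruning step. It is not the authors' implementation: the cap of 2 crossings per direction, the L1 histogram distance, and the names characteristic_loci, reduce_lexicon and keep are all assumptions made for illustration.

```python
import numpy as np


def characteristic_loci(img, cap=2):
    """Normalized histogram of characteristic loci codes for a binary image.

    For every background pixel, count how many stroke runs lie to its left,
    right, above and below (each count capped at `cap`, an assumed value),
    then combine the four counts into one base-(cap + 1) code.
    """
    img = (np.asarray(img) > 0).astype(np.int64)

    def runs_before(a):
        # Number of stroke runs that start at or before each column,
        # scanning every row from left to right.
        padded = np.concatenate(
            [np.zeros((a.shape[0], 1), dtype=a.dtype), a], axis=1
        )
        starts = np.diff(padded, axis=1) == 1
        return np.minimum(np.cumsum(starts, axis=1), cap)

    left = runs_before(img)
    right = runs_before(img[:, ::-1])[:, ::-1]
    up = runs_before(img.T).T
    down = runs_before(img.T[:, ::-1])[:, ::-1].T

    base = cap + 1
    codes = ((left * base + right) * base + up) * base + down
    # Only background pixels contribute to the descriptor.
    hist = np.bincount(codes[img == 0], minlength=base ** 4)
    return hist / max(hist.sum(), 1)


def reduce_lexicon(query_hist, lexicon_hists, keep):
    """Indices of the `keep` lexicon entries closest to the query (L1 distance).

    The distance measure and the fixed-size shortlist are illustrative
    assumptions, not taken from the paper.
    """
    dists = np.abs(np.asarray(lexicon_hists) - query_hist).sum(axis=1)
    return np.argsort(dists, kind="stable")[:keep]
```

In a pipeline of this kind, only the shortlisted entries would then be passed to the more expensive contour alignment and similarity matching stage, which is how a cheap descriptor can help keep overall recognition fast.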

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments and constructive suggestions, which helped improve the content and presentation of the paper.

Author information

Corresponding author

Correspondence to Babak N. Araabi.

About this article

Cite this article

Fouladi, K., Araabi, B.N. & Kabir, E. A fast and accurate contour-based method for writer-dependent offline handwritten Farsi/Arabic subwords recognition. IJDAR 17, 181–203 (2014). https://doi.org/10.1007/s10032-013-0210-7

  • DOI: https://doi.org/10.1007/s10032-013-0210-7
