Binarization, character extraction, and writer identification of historical Hebrew calligraphy documents

  • Original Paper
International Journal of Document Analysis and Recognition (IJDAR)

Abstract

We present a paleographic analysis and recognition system for processing historical Hebrew calligraphy documents. The main goal is to analyze documents of different writing styles in order to identify the locations, dates, and writers of test documents. Using interactive software tools, we have established a database of extracted characters. It now contains about 20,000 characters from 34 different writers and will be substantially expanded in the near future. We present preliminary results of automatic extraction of pre-specified letters using the erosion operator. We further propose and test topological features for handwriting style classification based on a selected subset of the Hebrew alphabet. A writer identification experiment using 34 writers yielded 100% correct classification.
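
To illustrate the idea of locating a pre-specified letter with the erosion operator, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `erode`, the toy `TEMPLATE` and `IMAGE` arrays, and the choice to use the letter shape itself as the structuring element are all assumptions made for illustration. Binary erosion keeps a position only if every ink pixel of the template falls on an ink pixel of the page, so the surviving positions are candidate locations of the letter.

```python
def erode(image, template):
    """Binary erosion of `image` by `template` (both lists of 0/1 rows).

    Returns the (row, col) positions of the template's top-left corner at
    which every foreground pixel of the template lies on a foreground pixel
    of the image. Illustrative helper, not taken from the paper.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    # Offsets of the template's foreground (ink) pixels.
    ink = [(r, c) for r in range(tpl_h) for c in range(tpl_w) if template[r][c]]
    hits = []
    for top in range(img_h - tpl_h + 1):
        for left in range(img_w - tpl_w + 1):
            if all(image[top + r][left + c] for r, c in ink):
                hits.append((top, left))
    return hits


# Hypothetical toy data: a 2x2 "letter" shape and a small binarized page.
TEMPLATE = [
    [1, 0],
    [1, 1],
]
IMAGE = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

print(erode(IMAGE, TEMPLATE))  # candidate top-left positions of the shape
```

Plain erosion tests only the ink pixels, so a thicker or larger stroke that contains the template also matches. A practical system would likely need additional checks on the background and some tolerance to stroke variation, which this sketch leaves out.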

Author information

Corresponding author

Correspondence to Itay Bar-Yosef.

About this article

Cite this article

Bar-Yosef, I., Beckman, I., Kedem, K. et al. Binarization, character extraction, and writer identification of historical Hebrew calligraphy documents. IJDAR 9, 89–99 (2007). https://doi.org/10.1007/s10032-007-0041-5

  • DOI: https://doi.org/10.1007/s10032-007-0041-5
