Abstract
We present our work on a paleographic analysis and recognition system for processing historical Hebrew calligraphy documents. The main goal is to analyze documents written in different styles in order to identify the locations, dates, and writers of test documents. Using interactive software tools, we have established a database of extracted characters. It currently contains about 20,000 characters from 34 different writers and will be substantially expanded in the near future. We present preliminary results on the automatic extraction of pre-specified letters using the erosion operator. We further propose and test topological features for handwriting style classification based on a selected subset of the Hebrew alphabet. A writer identification experiment using 34 writers yielded 100% correct classification.
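The abstract does not say how the erosion operator is applied, so the following is only a minimal sketch of generic binary erosion with a letter-shaped structuring element, not the authors' implementation. The function name `erode`, the toy `page` and `template` arrays, and the choice of the template's top-left corner as the reference point are all assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # 1 = ink (foreground), 0 = background


def erode(image: Image, template: Image) -> List[Tuple[int, int]]:
    """Binary erosion of `image` by `template` (hypothetical helper).

    Returns the (row, col) positions of the template's top-left corner at
    which every foreground pixel of the template lies on ink in the image.
    """
    rows, cols = len(image), len(image[0])
    t_rows, t_cols = len(template), len(template[0])
    # Offsets of the template's ink pixels relative to its top-left corner.
    offsets = [
        (r, c)
        for r in range(t_rows)
        for c in range(t_cols)
        if template[r][c]
    ]
    hits = []
    for r in range(rows - t_rows + 1):
        for c in range(cols - t_cols + 1):
            if all(image[r + dr][c + dc] for dr, dc in offsets):
                hits.append((r, c))
    return hits


# Toy example: an L-shaped "letter" that occurs twice on a small binary page.
template = [
    [1, 0],
    [1, 1],
]
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
locations = erode(page, template)
print(locations)
```

Plain erosion only checks that the template's ink fits inside the page's ink, so a thicker or larger stroke would also match. A practical letter extractor would likely need an additional background constraint (for example a hit-or-miss transform) and some tolerance for the noise typical of degraded manuscripts.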
Cite this article
Bar-Yosef, I., Beckman, I., Kedem, K. et al. Binarization, character extraction, and writer identification of historical Hebrew calligraphy documents. IJDAR 9, 89–99 (2007). https://doi.org/10.1007/s10032-007-0041-5
DOI: https://doi.org/10.1007/s10032-007-0041-5