An efficient segmentation-free approach to assist old Greek handwritten manuscript OCR

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

Recognizing old Greek manuscripts is essential for quick and efficient exploitation of the content of valuable old Greek historical collections. In this paper, we focus on the problem of recognizing early Christian Greek manuscripts written in lowercase letters. Exploiting the closed cavity regions present in the majority of characters and character ligatures in these scripts, we propose a novel, fast, segmentation-free technique that assists the recognition procedure by tracing and recognizing the most frequently appearing characters or character ligatures. First, we detect the closed cavities that exist in the character body. Then, the protrusions in the outer contour of the connected components that contain these closed cavities are used to classify the area around each closed cavity as a specific character or character ligature. The proposed method gives highly accurate results and offers great assistance to old Greek handwritten manuscript OCR. We also present additional OCR applications that not only demonstrate the robustness of the proposed method but also show that it generalizes to cases in which segmentation and text location are very difficult to perform.
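
The first step named in the abstract, detecting closed cavities in the character body, can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a binarized image stored as a list of rows (1 = ink, 0 = background) and treats a closed cavity as any background region that is not 4-connected to the image border. The function name `find_closed_cavities` and the toy glyphs are hypothetical.

```python
from collections import deque


def find_closed_cavities(image):
    """Return closed cavities as a list of sets of (row, col) background pixels.

    Assumption: `image` is a non-empty list of equal-length rows,
    with 1 for ink (foreground) and 0 for background.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(start_r, start_c):
        # Breadth-first flood fill over 4-connected background pixels.
        region = set()
        queue = deque([(start_r, start_c)])
        seen[start_r][start_c] = True
        while queue:
            r, c = queue.popleft()
            region.add((r, c))
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and image[nr][nc] == 0):
                    seen[nr][nc] = True
                    queue.append((nr, nc))
        return region

    # Step 1: mark all background reachable from the border as "outside".
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and image[r][c] == 0 and not seen[r][c]:
                flood(r, c)

    # Step 2: every background region still unvisited is enclosed by ink.
    cavities = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 and not seen[r][c]:
                cavities.append(flood(r, c))
    return cavities


# Toy glyph shaped like an "o": a ring of ink around one background pixel.
closed_glyph = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

# Toy glyph shaped like a "c": the ring is open on the right side.
open_glyph = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

closed_cavities = find_closed_cavities(closed_glyph)
open_cavities = find_closed_cavities(open_glyph)
```

In this sketch the "o"-shaped glyph yields one cavity and the "c"-shaped glyph yields none, which is the property that makes cavities usable as anchors without prior character segmentation. The second step, classifying each cavity from the protrusions of the surrounding outer contour, is not sketched here because the abstract does not specify the features used.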


Acknowledgements

This research was carried out within the framework of D-SCRIBE, a Greek GSRT-funded R&D project that aims to develop an integrated system for the digitization and processing of old Greek manuscripts.

Author information

Corresponding author

Correspondence to B. Gatos.

About this article

Cite this article

Gatos, B., Ntzios, K., Pratikakis, I. et al. An efficient segmentation-free approach to assist old Greek handwritten manuscript OCR. Pattern Anal Applic 8, 305–320 (2006). https://doi.org/10.1007/s10044-005-0013-7

  • DOI: https://doi.org/10.1007/s10044-005-0013-7
