Elsevier

Pattern Recognition

Volume 29, Issue 7, July 1996, Pages 1161-1177

Processing of binary images of handwritten text documents

https://doi.org/10.1016/0031-3203(95)00142-5

Abstract

This paper addresses three problems in the processing of binary images of handwritten text documents. First, an integrated algorithm is described that finds a straight-line approximation of a textual stroke. It uses the distance transform of thinned binary images to identify the spurious bifurcation points that thinning algorithms unavoidably introduce, to remove them, and to recover the original ones. The resulting straight-line approximations preserve the structural information of the original pattern, and the algorithm does not resort to distortable geometrical properties. Second, a method is presented to recover loops that have become blobs due to blotting. The method removes the pixels whose distance transform exceeds a calculated threshold. Unfortunately, it does not seem possible to recover such loops with a high rate of success. The authors suggest that including thickness information in the line segments that connect the vertices of the straight-line approximations produced by the first algorithm is a step towards a solution of this problem. Finally, a method is developed to extract lines from pages of handwritten text by finding the shortest spanning tree of a graph formed from the set of main strokes. The main strokes of each extracted line are then arranged in the order in which they were written by following the path that contains them, and every secondary stroke is assigned to its closest main stroke. The result is an ordered list of main strokes, each with the corresponding number of assigned secondary strokes. Each combination of main and secondary strokes can be the input to a subsequent recognition stage. The method proved to be powerful and more suited to variable handwriting.
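
The loop-recovery step described in the abstract (removing pixels whose distance transform exceeds a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which distance metric is used or how the threshold is calculated, so the sketch assumes a city-block distance computed with a standard two-pass scan and takes the threshold as a caller-supplied parameter. The function names `distance_transform` and `open_blobs` are hypothetical.

```python
def distance_transform(img):
    """City-block distance from each ink pixel to the nearest background pixel.

    img is a list of rows containing 0 (background) or 1 (ink).
    Assumption: everything outside the image counts as background.
    """
    h, w = len(img), len(img[0])
    inf = h + w  # larger than any distance that can occur in the image
    dist = [[inf if img[y][x] else 0 for x in range(w)] for y in range(h)]

    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                up = dist[y - 1][x] if y > 0 else 0
                left = dist[y][x - 1] if x > 0 else 0
                dist[y][x] = min(dist[y][x], up + 1, left + 1)

    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if dist[y][x]:
                down = dist[y + 1][x] if y < h - 1 else 0
                right = dist[y][x + 1] if x < w - 1 else 0
                dist[y][x] = min(dist[y][x], down + 1, right + 1)
    return dist


def open_blobs(img, threshold):
    """Remove ink pixels whose distance value exceeds `threshold`.

    The threshold is an input here; how the paper calculates it is not
    stated in the abstract.
    """
    dist = distance_transform(img)
    return [
        [1 if img[y][x] and dist[y][x] <= threshold else 0
         for x in range(len(img[0]))]
        for y in range(len(img))
    ]


# A 5x5 solid blob inside a 7x7 image, standing in for a blotted loop.
blob = [[1 if 1 <= y <= 5 and 1 <= x <= 5 else 0 for x in range(7)]
        for y in range(7)]
loop = open_blobs(blob, threshold=1)
for row in loop:
    print("".join("#" if v else "." for v in row))
```

With a threshold of 1, only the outermost ring of the blob survives and the interior opens into a hole. On real handwriting a single threshold would also erode legitimately thick strokes, which is consistent with the abstract's remark that a high recovery rate does not seem achievable this way.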

Cited by (29)

  • Cascaded-Automatic Segmentation for Schistosoma japonicum eggs in images of fecal samples

    2014, Computers in Biology and Medicine
    Citation Excerpt:

    Then, a point set filter was applied to filter out the noisy points and improve the segment performance. This filter is based on a number of existing methods in the domain of image thinning, labeling and connected component analysis [13–20]. Next, the Randomized Hough Transform method was used to extract the point set of a raw edge [21–26].

  • Image thinning using pulse coupled neural network

    2004, Pattern Recognition Letters
  • A novel triangulation procedure for thinning hand-written text

    2001, Pattern Recognition Letters
    Citation Excerpt:

    Spurious tails are a frequently occurring consequence of the thinning process (Parker, 1997), particularly when applying distance transform based methods. Although several techniques (Stentiford and Mortimer, 1983; Huang, 1996; Abuhaiba et al., 1996) have been suggested to remove spurious tails, none of these are appropriate additions to triangulation-based thinning. Instead, two methods have been developed to reduce the number of such tails by manipulating the triangle database.
