Pattern skeletonization using run-length-wise processing for intersection distortion problem

https://doi.org/10.1016/S0167-8655(99)00047-1

Abstract

Existing skeletonization methods are largely pixel-wise and present problems at the intersection regions of line patterns. To address these problems, we have developed a run-length-wise method that vectorizes line pattern intersections and solves the intersection distortion problem in skeletonization. We tested the method on handwritten Chinese character and numeral character skeletonization, finding points of convergence and divergence and linking them into meaningful sets of data. Using these data sets, we construct windows around the intersection regions and replace each distorted intersection with a straight-line intersection. We thus overcome the distortion problem while retaining the smoothness of pixel-wise thinning.

Introduction

A pixel-wise operation usually uses a 3 × 3 window to check the 8 neighbors and then processes the central pixel. The number of possible neighbor configurations is 2^8 = 256. For a 5 × 5 window, with 24 neighbors, the number of possible configurations is 2^24 = 16,777,216, which is why the 3 × 3 window is the usual choice. Because of this window size, only a very limited local property of the pattern is examined, and this has caused many pixel-wise thinning algorithms to produce intersection distortion in their thinned results. This paper proposes an innovative method for solving the intersection distortion problem. The proposal is to use run-length-wise processing to find the intersection regions in an input line image, then thin the line image with pixel-wise thinning, and finally use the intersection region location data to clip each distorted intersection and draw a straight-line intersection in the pixel-wise thinned image.
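
The configuration counts quoted above follow from treating each neighbor of the central pixel as a binary value. A minimal sketch of that arithmetic is given below; the function name `neighbor_configurations` is hypothetical and used only for illustration.

```python
def neighbor_configurations(window_size: int) -> int:
    """Number of binary neighbor patterns around the center of a square window."""
    if window_size < 1 or window_size % 2 == 0:
        raise ValueError("window_size must be a positive odd integer")
    # Every pixel in the window except the central one is a neighbor.
    neighbors = window_size * window_size - 1
    # Each neighbor is either foreground or background.
    return 2 ** neighbors


for size in (3, 5):
    print(size, neighbor_configurations(size))
```

The jump from 256 to more than 16 million patterns illustrates why template-based thinning rarely goes beyond the 3 × 3 window.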

Section 2 of the paper reviews the literature. Section 3 discusses the intersection problem and the proposed solution. Section 4 explains the proposed algorithm. Section 5 summarizes the proposed algorithm in pseudo code form. In Section 6, we apply the proposed algorithm to Chinese character skeletonization. In Section 7, we apply the algorithm to numeral character skeletonization. Section 8 provides some comments on the experiment and results.

Section snippets

Review of literature

In handwritten character recognition, thinning is usually used as a key preprocessing step (Bentley and Ottmann, 1979; Blum, 1964; Brown et al., 1988). There are many existing methods for skeletonizing line image patterns. They can be categorized into two major groups: pixel-wise and non-pixel-wise.

The pixel-wise methods involve the pixel-wise thinning operation. In these methods contour pixels are classified as removable or retainable by template matching against 3 × 3 patterns, then the removable contour

Intersection distortion problem and the proposed solution

The distortion problem can be described as follows: (1) the straight stroke of a 'T' junction is reduced to a curved skeleton line, (2) '+' and 'X' intersections become split, and (3) the coarse intersection centers miss their true center positions. These effects can be clearly seen in Fig. 1, the result of a traditional parallel pixel-wise thinning similar to the Zhang and Suen (1984) algorithm. From the skeleton, we can see the intersections are all 'Y' shaped: some of these Y's come from original T intersections and others

The run-length-wise processing algorithm

The run-length scan is carried out along a line from left to right and is repeated line by line from top to bottom. After the current run-length is found, it is checked against the run-lengths on the previous scan line for any overlap. There can be several cases:

  • Case 1: no overlap (stroke-start condition);

  • Case 2: one overlap (stroke-continuing condition);

  • Case 3: two overlaps (converge condition);

  • Case 4: more than two overlaps;

where the overlap is based on 8-connectedness.
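
The case analysis above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the image is assumed to be a list of rows of 0/1 values, a run-length is represented as an inclusive `(start_col, end_col)` pair, and the names `find_runs`, `overlaps_8` and `classify_run` are hypothetical. It only counts overlaps with the previous scan line and does not handle the diverge condition or labeling.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start_col, end_col), both inclusive


def find_runs(row: List[int]) -> List[Run]:
    """Return the maximal runs of foreground (non-zero) pixels in one scan line."""
    runs: List[Run] = []
    start = None
    for col, value in enumerate(row):
        if value and start is None:
            start = col  # a new run begins here
        elif not value and start is not None:
            runs.append((start, col - 1))  # the run ended on the previous pixel
            start = None
    if start is not None:
        runs.append((start, len(row) - 1))  # run reaches the end of the line
    return runs


def overlaps_8(a: Run, b: Run) -> bool:
    """8-connected overlap of runs on adjacent lines (diagonal contact counts)."""
    return a[0] <= b[1] + 1 and b[0] <= a[1] + 1


def classify_run(current: Run, previous_runs: List[Run]) -> int:
    """Return the case number (1 to 4) from the number of overlapping runs."""
    count = sum(overlaps_8(current, p) for p in previous_runs)
    return min(count, 3) + 1  # 0 -> Case 1, 1 -> Case 2, 2 -> Case 3, 3+ -> Case 4


previous = find_runs([1, 1, 0, 0, 1, 1])  # two separate strokes
current = find_runs([0, 1, 1, 1, 1, 0])   # one run touching both of them
print(classify_run(current[0], previous))
```

In the usage lines, the single run on the current line touches both runs on the previous line, so it falls under Case 3 (converge condition).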

Pseudo code of the algorithm

1. Intersection finding

  • horizontal raster line scan

  • Begin

    • first line scan

    • next line scan (to find a new run-length and its case belonging)

      • if (Case 1)

        • assign new label to the run-length.

      • if (Case 2)

        • if (diverge condition)

          • label the run-length with updated (incremented) label

          • record the diverge position and the two diverging strokes’ labels

        • else

          • label the run-length with the continuing stroke’s label

          • if (X intersection condition)

            • pair the Convergence and Divergence records and record the linking

      • if (Case 3)

Application to Chinese character skeletonization

Intersections in Chinese characters are largely in the rectangular ( [ ] ) configuration, with few in the diamond (◊) configuration, so we have to use a diagonal raster line scan. If a diagonal line '/' sweeping from the top-left corner to the bottom-right corner of an image is called a forward diagonal line scan, then a diagonal line '⧹' sweeping from the top-right corner to the bottom-left corner of an image is called a backward diagonal line scan.
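
The two diagonal scans can be illustrated by enumerating the pixel coordinates visited on each scan line. The sketch below assumes `(row, col)` indexing with the origin at the top-left corner; the function names and the traversal direction within each diagonal line are assumptions made for illustration, since the text does not specify them.

```python
from typing import Iterator, List, Tuple

Pixel = Tuple[int, int]  # (row, col), origin at the top-left corner


def forward_diagonal_lines(height: int, width: int) -> Iterator[List[Pixel]]:
    """Yield '/' scan lines (row + col constant), sweeping top-left to bottom-right."""
    for s in range(height + width - 1):
        row_hi = min(s, height - 1)
        row_lo = max(0, s - (width - 1))
        # Assumed order: from the lower-left end to the upper-right end of the line.
        yield [(r, s - r) for r in range(row_hi, row_lo - 1, -1)]


def backward_diagonal_lines(height: int, width: int) -> Iterator[List[Pixel]]:
    """Yield backslash-oriented scan lines (col - row constant), top-right to bottom-left."""
    for d in range(width - 1, -height, -1):
        row_lo = max(0, -d)
        row_hi = min(height - 1, width - 1 - d)
        # Assumed order: from the upper-left end to the lower-right end of the line.
        yield [(r, r + d) for r in range(row_lo, row_hi + 1)]


for line in forward_diagonal_lines(2, 3):
    print(line)
```

Each yielded list can then be treated like one raster line for run-length finding, with consecutive lists playing the role of the current and previous scan lines.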

We apply a forward diagonal line scan first. To simplify the programming, the

Application to numeral character skeletonization

The result of applying the run-length-wise operation to numeral character skeletonization is shown in Fig. 10, Fig. 11, Fig. 12.

Handwritten numerals contain arbitrary cross intersections. We need not only the horizontal line scan but also the vertical line scan and the two diagonal line scans, so that any intersection missed by one type of line scan can be found by another. For example, for the numeral character in Fig. 10(a), the horizontal line scan would not

Comments on experiment and results

We have tested the proposed algorithm on a number of handwritten Chinese characters and handwritten numeral characters. Some of the results are shown in Fig. 5, Fig. 6, Fig. 7, Fig. 8, Fig. 9, Fig. 10, Fig. 11, Fig. 12. The results show that the proposed algorithm works well to a certain standard. The detectable line crossing angle is less than 35°.

There are two major complex areas in computation: (1) is at implementing diagonal line scan boundary condition, that is reduced by using enlarged

Conclusion

The most frequently quoted image skeletonization methods are largely based on pixel-wise operations, which are too local in relation to the human viewing domain and thus present problems that cannot be solved at that level. We have experimented with run-length-wise processing and found that it can be used effectively to vectorize line pattern intersections and to overcome these problems at a higher level, closer to the human viewing domain.

The run-length-wise operation is in nature a sequential process. It

Acknowledgements

Thanks are due to Dr. Jianming Hu and Ms. Connie Bao for initial discussion and for providing the input characters together with reading/writing facility; also to Ms. Colleen Moore for her linguistic input. Also special respect and gratitude should be paid to the referees for their constructive input and professional guidance.

References (35)

  • Sebok, T.J., et al., 1981. An algorithm for line intersection identification. Pattern Recognition.
  • Shapiro, B., et al., 1981. Skeleton generation from x, y boundary sequences. Comput. Graph. & Image Process.
  • Zhou, R.W., et al., 1995. A novel single-pass thinning algorithm and an effective set of performance criteria. Pattern Recognition Letters.
  • Bentley, J.L., et al., 1979. Algorithms for reporting and counting geometric intersections. IEEE Trans. Comput.
  • Blum, H., 1964. A transformation for extracting new descriptors of shape. In: Wathen-Dunn, W. (Ed.), Proceedings of the...
  • Deutsch, E.S., 1972. Thinning algorithms on rectangular, hexagonal, and triangular arrays. Commun. ACM.
  • Guo, Z., et al., 1989. Parallel thinning with two-subiteration algorithms. Commun. ACM.