Pattern skeletonization using run-length-wise processing for intersection distortion problem
Introduction
Pixel-wise operations usually use a 3 × 3 window to check the 8 neighbors and then process the central pixel. The number of possible neighbor configurations is 2^8 = 256. For a 5 × 5 window, the number of possible configurations is 2^24 = 16,777,216, which is why the 3 × 3 window is usually the only one used. Because of this small window size, only very limited local pattern properties are examined, and this has caused many pixel-wise thinning algorithms to produce intersection distortion in their thinned results. This paper proposes an innovative method for this intersection distortion problem. The proposal is to use run-length-wise processing to find the intersection regions in an input line image, then to thin the line image with pixel-wise thinning, and finally to use the intersection region location data to clip the distorted intersections and draw straight-line intersections in the pixel-wise thinned image.
Section 2 of the paper is dedicated to a review of the literature. Section 3 discusses the intersection problem and the proposed solution. Section 4 explains the proposed algorithm. Section 5 summarizes the proposed algorithm in pseudo code form. In Section 6, we apply the proposed algorithm to Chinese character skeletonization. In Section 7, we apply the algorithm to numeral character skeletonization. Section 8 provides some comments on the experiment and results.
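The window-size arithmetic can be checked with a few lines of Python. This is only an illustrative sketch; the function name `neighbor_configurations` is hypothetical and does not come from the paper.

```python
def neighbor_configurations(window_size: int) -> int:
    """Count the binary neighbor patterns around the center of a square window."""
    # Every pixel in the window except the central one is a neighbor.
    neighbors = window_size * window_size - 1
    # Each neighbor is either foreground or background.
    return 2 ** neighbors


for size in (3, 5):
    print(size, neighbor_configurations(size))
```

The count grows as 2^(n^2 - 1), which is why a lookup table for every neighbor pattern is practical for a 3 × 3 window but far larger for a 5 × 5 one.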
Section snippets
Review of literature
In handwritten character recognition, thinning is usually used as a key preprocessing step (Bentley and Ottmann, 1979; Blum, 1964; Brown et al., 1988). There are many existing methods for skeletonizing line image patterns. They can be categorized into two major groups: pixel-wise and non-pixel-wise.
The pixel-wise methods involve the pixel-wise thinning operation. In these methods contour pixels are classified as removable or retainable by template matching against 3 × 3 patterns, then the removable contour
Intersection distortion problem and the proposed solution
The distortion problem can be described as follows: (1) the straight stroke of a 'T' junction is reduced to a curved skeleton line, (2) '+' and 'X' intersections become split, (3) the coarse intersection centers miss their true center positions. These can be clearly seen in Fig. 1, the result of a traditional parallel pixel-wise thinning similar to the Zhang and Suen (1984) algorithm. From the skeleton, we can see that the intersections are all 'Y' shaped: some of these Y's come from original T intersections and others
The run-length-wise processing algorithm
The run-length scan is carried out along each line from left to right, and the lines are scanned from top to bottom. After the current run-length is found, it is checked against the run-lengths on the previous scan line for overlap. There can be several cases:
Case 1: no overlap, (stroke-start condition);
Case 2: there is one overlap, (stroke-continuing condition);
Case 3: there are two overlaps, (converge condition);
Case 4: there are more than two overlaps;
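The overlap test behind the cases listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `find_runs`, `overlaps` and `classify` are hypothetical, a run is assumed to be an inclusive `(start_col, end_col)` pair, and two runs are assumed to overlap only when they share at least one column (whether diagonally touching runs also count is not stated in the text).

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start_col, end_col), both inclusive


def find_runs(row: List[int]) -> List[Run]:
    """Scan one line from left to right and return its foreground runs."""
    runs: List[Run] = []
    start = None
    for col, pixel in enumerate(row):
        if pixel and start is None:
            start = col                    # a run begins
        elif not pixel and start is not None:
            runs.append((start, col - 1))  # the run ended on the previous pixel
            start = None
    if start is not None:                  # a run touching the right border
        runs.append((start, len(row) - 1))
    return runs


def overlaps(a: Run, b: Run) -> bool:
    """Assumed overlap rule: the two runs share at least one column."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify(run: Run, previous_runs: List[Run]) -> int:
    """Return the case number (1 to 4) from the number of overlapping runs."""
    count = sum(overlaps(run, prev) for prev in previous_runs)
    return min(count, 3) + 1  # 0 -> Case 1, 1 -> Case 2, 2 -> Case 3, more -> Case 4


previous_line = [0, 1, 1, 0, 0, 1, 1, 0]
current_line = [0, 0, 1, 1, 1, 1, 0, 0]
for run in find_runs(current_line):
    print(run, "Case", classify(run, find_runs(previous_line)))
```

In this example the single run on the current line overlaps both runs on the previous line, so it falls under Case 3 (the converge condition).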
Pseudo code of the algorithm
1. Intersection finding
horizontal raster line scan
Begin
    first line scan
    next line scan (to find a new run-length and its case belonging)
        if (Case 1)
            assign new label to the run-length.
        if (Case 2)
            if (diverge condition)
                label the run-length with updated (incremented) label
                record the diverge position and the two diverging strokes’ labels
            else
                label the run-length with the continuing stroke’s label
            if (X intersection condition)
                pair the Convergence and Divergence records and record the linking
        if (Case 3)
Application to Chinese character skeletonization
Chinese characters contain intersections largely in the rectangular (□) configuration and few in the diamond (◊) configuration, so we have to use the diagonal raster line scan. If a diagonal line '/' sweeping from the top-left corner to the bottom-right corner of an image is called the forward diagonal line scan, then a diagonal line '⧹' sweeping from the top-right corner to the bottom-left corner of an image is called the backward diagonal line scan.
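One way to enumerate the pixels visited by the two diagonal scans is sketched below. It is an illustration only: the function names are hypothetical, pixels are addressed as `(row, col)` with the origin at the top-left corner, and the order in which pixels are visited along each scan line (top to bottom) is an assumption that the text does not specify.

```python
from typing import Iterator, List, Tuple

Pixel = Tuple[int, int]  # (row, col), origin at the top-left corner


def forward_diagonal_lines(height: int, width: int) -> Iterator[List[Pixel]]:
    """Yield '/' scan lines (row + col constant), swept from top-left to bottom-right."""
    for k in range(height + width - 1):
        r_min = max(0, k - (width - 1))  # keep col = k - row inside the image
        r_max = min(height - 1, k)
        yield [(r, k - r) for r in range(r_min, r_max + 1)]


def backward_diagonal_lines(height: int, width: int) -> Iterator[List[Pixel]]:
    """Yield backslash-shaped scan lines (row - col constant), swept from top-right to bottom-left."""
    for d in range(-(width - 1), height):
        r_min = max(0, d)                # keep col = row - d inside the image
        r_max = min(height - 1, d + width - 1)
        yield [(r, r - d) for r in range(r_min, r_max + 1)]


for line in forward_diagonal_lines(2, 3):
    print(line)
```

Each yielded list can then be treated as a single scan line for run-length extraction, in the same way as a row in the horizontal scan.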
We apply a forward diagonal line scan first. To simplify the programming, the
Application to numeral character skeletonization
The results of applying the run-length-wise operation to numeral character skeletonization are shown in Fig. 10, Fig. 11 and Fig. 12.
Handwritten numerals contain arbitrary cross intersections. We need not only the horizontal line scan but also the vertical line scan and the two diagonal line scans, so that any intersection missed by one type of line scan can be found by another. For example, for the numeral character in Fig. 10(a), the horizontal line scan would not
Comments on experiment and results
We have tested the proposed algorithm on some handwritten Chinese characters and handwritten numeral characters. Part of the results are shown in Fig. 5 through Fig. 12. The results show the proposed algorithm works well to a certain standard. The detectable line crossing angle is less than 35°.
There are two major areas of computational complexity: (1) implementing the diagonal line scan boundary condition, which is reduced by using enlarged
Conclusion
The most frequently quoted image skeletonization methods are largely based on pixel-wise operations, which are too local in relation to the human viewing domain and thus present problems that cannot be solved at that level. We have experimented with run-length-wise processing and found that it can be used effectively in vectorizing line pattern intersections and can overcome these problems at a higher level, closer to the human viewing domain.
The run-length-wise operation is in nature a sequential process. It
Acknowledgements
Thanks are due to Dr. Jianming Hu and Ms. Connie Bao for initial discussion and for providing the input characters together with reading/writing facility; also to Ms. Colleen Moore for her linguistic input. Also special respect and gratitude should be paid to the referees for their constructive input and professional guidance.
References (35)
- et al. Handprinted symbol recognition system. Pattern Recognition (1988)
- et al. A modified fast parallel algorithm for thinning digital patterns. Pattern Recognition Letters (1988)
- et al. An alternate smoothing and stripping algorithm for thinning digital binary patterns. Signal Processing (1986)
- et al. Skeletonization of binary images with nonuniform width via block decomposition and contour vector matching. Pattern Recognition (1998)
- et al. Skeleton generation of engineering drawings via contour matching. Pattern Recognition (1994)
- et al. Structural primitive extraction and coding for handwritten numeral recognition. Pattern Recognition (1998)
- et al. A Chinese-character-stroke-extraction algorithm based on contour information. Pattern Recognition (1998)
- et al. A knowledge-based thinning algorithm. Pattern Recognition (1991)
- K*K thinning. Computer Vision, Graphics and Image Processing (1990)
- A vectorizer and feature extractor for document recognition. Comput. Vision Graph. Image Process. (1986)
- An algorithm for line intersection identification. Pattern Recognition
- Skeleton generation from x, y boundary sequences. Comput. Graph. & Image Process.
- A novel single-pass thinning algorithm and an effective set of performance criteria. Pattern Recognition Letters
- Algorithms for reporting and counting geometric intersections. IEEE Trans. Comput.
- Thinning algorithms on rectangular, hexagonal, and triangular arrays. Commun. ACM
- Parallel thinning with two-subiteration algorithms. Commun. ACM
Cited by (6)
- Extraction of embedded and/or line-touching character-like objects. 2002, Pattern Recognition
- Image Analysis and Computer Vision: 1999. 2000, Computer Vision and Image Understanding
- License plate detection and segmentation using cluster run length smoothing algorithm. 2012, Journal of Information Technology Research
- A fast CBIR system of old ornamental letter. 2008, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
- The use of digital pattern recognition techniques for virtual reconstruction of eroded and visually complicated archeological geometric patterns. 2008, International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives
- EDT based tracing maximum thinning algorithm on grey scale images. 2000, Proceedings - International Conference on Pattern Recognition