Extracting color halftones from printed documents using texture analysis
Introduction
Printed documents often contain text, halftones, and other illustrations on the same page. When processing documents electronically, it is important to separate these different data types, because, typically, their processing requirements differ. For example, text might be sent to an OCR system for interpretation, whereas illustrations might be compressed for efficient storage. The process of segmenting a page into homogeneous regions and labeling the regions by data type is called document layout analysis (DLA).
A variety of DLA methods appear in the literature and can be divided into three fundamental approaches: top-down, bottom-up, and texture-based. Top-down methods split a page into blocks, check each block for homogeneity, and divide it further if necessary. Most of these methods are based on run length smoothing [1], [2] or projection cuts [3]. Bottom-up methods group pixels into connected components, which are then merged into progressively larger regions [4], [5], [6], [7]. Methods combining top-down and bottom-up concepts have also been proposed [8]. Texture-based methods treat the different data types as different textures, where each texture is characterized by some distinctive repeating pattern. Some methods attempt to detect these patterns directly using spatial masks. Examples include the Texture Co-occurrence Spectrum method of Payne [9] and the neural network-trained masks used by Jain [10]. Other methods exploit the fact that repeating patterns produce distinct spectral signatures [11], [12], [13], [14]. Bandpass filters are then employed to isolate regions associated with those signatures.
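The claim that a repeating pattern produces a distinct spectral signature can be illustrated with a minimal NumPy sketch. The 64×64 dot texture below (2×2 ink dots repeated every 4 pixels) is a hypothetical stand-in, not data from the paper; the sketch only checks that the strongest non-DC component of its Fourier magnitude spectrum lies at the radial frequency set by the dot spacing.

```python
import numpy as np

# Hypothetical 64x64 dot texture: 2x2 ink dots repeated every 4 pixels.
y, x = np.mgrid[0:64, 0:64]
dots = ((x % 4 < 2) & (y % 4 < 2)).astype(float)

# Magnitude spectrum with the mean (DC term) removed.
spectrum = np.abs(np.fft.fft2(dots - dots.mean()))

# Radial frequency (cycles/pixel) of every FFT bin.
fy = np.fft.fftfreq(64)[:, None]
fx = np.fft.fftfreq(64)[None, :]
radius = np.hypot(fy, fx)

# Radial frequency of the strongest spectral component.
peak_radius = radius.flat[np.argmax(spectrum)]
print(peak_radius)
```

With a dot period of 4 pixels, the dominant component is expected at 1/4 cycle per pixel, which is the kind of signature a bandpass filter can isolate.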
Although the methods mentioned above are all effective in certain domains, robust DLA has not yet been achieved. For example, some methods are intolerant [1], [2], [3] or have limited tolerance [8] to document skew. Thus, a page that is rotated significantly when scanned would be problematic. Also, several approaches are restricted to documents having a Manhattan layout; i.e., clear horizontal and vertical white gaps between and within regions [1], [2], [3], [4], [7], [8]. While this is often the case, it is not true in general. What makes DLA so difficult is that no single approach seems to be suitable for all the data types encountered on a page; e.g., effective methods for isolating text may require assumptions (Manhattan layout, for example) that are not always valid for halftones.
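The sensitivity of projection-based methods to skew can be illustrated with a small sketch. The page below is a hypothetical stand-in (4-pixel "text lines" repeated every 12 rows), and the 10 degree skew is approximated by shifting each column vertically with wraparound; none of this is taken from the cited methods. The sketch counts the empty rows in the horizontal projection profile, which are the white gaps a projection cut relies on.

```python
import numpy as np

rows, cols = 60, 64

# Hypothetical upright page: 4-pixel-thick "text lines" every 12 rows.
upright = np.zeros((rows, cols), dtype=int)
upright[np.arange(rows) % 12 < 4, :] = 1

# Approximate a 10 degree skew by shifting each column down
# (with wraparound) in proportion to its horizontal position.
shifts = np.round(np.arange(cols) * np.tan(np.deg2rad(10))).astype(int)
skewed = np.column_stack(
    [np.roll(upright[:, c], shifts[c]) for c in range(cols)]
)

def empty_rows(img):
    """Number of rows whose horizontal projection contains no ink."""
    return int((img.sum(axis=1) == 0).sum())

print(empty_rows(upright), empty_rows(skewed))
```

On the upright page the profile has clear gaps between lines; once the page is skewed enough that neighboring lines overlap in projection, the gaps disappear and a horizontal cut can no longer separate the lines.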
The approach advocated here is to simplify DLA by eliminating halftones from the page, using an algorithm customized for that purpose. The idea is to preprocess the digitized page to identify regions containing halftones and replace them with the background color. The extracted halftones can be discarded or processed separately. In this way, existing DLA techniques can be applied to a more homogeneous input. Implementing this strategy requires a reliable and efficient algorithm for segmenting halftones from other information on the page. This paper presents such an algorithm. We show that a single circularly symmetric bandpass filter can robustly segment both grayscale and color halftones from a digitized page, independently of other information on the page. It can do so regardless of the number, size, shape, or orientation of the halftone region(s). We further show how this filter can be easily designed, using a sample page from the document. Although other methods have been proposed for segmenting halftones, the ability to accurately segment color and grayscale halftones of arbitrary shape is novel to this approach. It is also important to note that only a single filter is required for all halftones in the entire document. In contrast, a bank of filters was required by others [11], [12].
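The idea of a single circularly symmetric bandpass filter can be sketched as follows. This is a minimal illustration under stated assumptions, not the filter design or segmentation procedure of this paper: the function names, the band limits (0.2 to 0.3 cycles/pixel), the 9-pixel smoothing window, the relative threshold, and the synthetic page are all hypothetical choices. The sketch handles one grayscale band; a color scan would presumably apply the same steps to each of the three bands.

```python
import numpy as np

def annular_bandpass(shape, f_lo, f_hi):
    """Binary frequency mask: 1 where f_lo <= radial frequency <= f_hi."""
    fy = np.fft.fftfreq(shape[0])[:, None]   # cycles/pixel, vertical
    fx = np.fft.fftfreq(shape[1])[None, :]   # cycles/pixel, horizontal
    radius = np.hypot(fy, fx)
    return ((radius >= f_lo) & (radius <= f_hi)).astype(float)

def box_smooth(img, size):
    """Separable box filter (row pass, then column pass)."""
    kernel = np.ones(size) / size
    img = np.apply_along_axis(np.convolve, 1, img, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, img, kernel, mode="same")

def halftone_mask(gray, f_lo, f_hi, window=9, rel_thresh=0.25):
    """Mark pixels whose local energy in the annular band is high."""
    filtered = np.fft.fft2(gray) * annular_bandpass(gray.shape, f_lo, f_hi)
    band = np.fft.ifft2(filtered).real
    energy = box_smooth(band ** 2, window)
    return energy > rel_thresh * energy.max()

# Hypothetical 128x128 page: white paper (1.0), black ink (0.0).
y, x = np.mgrid[0:128, 0:128]
theta = np.deg2rad(30)                        # halftone screen angle
u = x * np.cos(theta) + y * np.sin(theta)
v = -x * np.sin(theta) + y * np.cos(theta)
dots = ((np.cos(2 * np.pi * u / 4) + np.cos(2 * np.pi * v / 4)) > 0).astype(float)
stripes = ((x // 8) % 2).astype(float)        # coarse stand-in for non-halftone ink

page = np.ones((128, 128))
page[16:80, 16:80] = dots[16:80, 16:80]       # rotated halftone, 4-pixel dot period
page[96:112, 16:112] = stripes[96:112, 16:112]

mask = halftone_mask(page, f_lo=0.2, f_hi=0.3)
cleaned = np.where(mask, 1.0, page)           # replace halftone pixels with background
```

Because the passband depends only on radial frequency, the same mask responds to the dot pattern at any screen angle or page rotation. In this sketch the detected region is eroded near the halftone boundary, so a practical implementation would need additional post-processing that is not shown here.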
The remaining sections provide more detail. Section 2 describes grayscale and color halftones and their spectral characteristics. Section 3 describes the segmentation algorithm. Section 4 presents experimental results, and Section 5 is the conclusion.
Section snippets
Halftones and their spectra
Commercial printing processes produce documents by depositing ink on paper. Only a few colors of ink are used, however, so shades of gray or color cannot be generated directly. Consequently, continuous-tone images (color or grayscale) are typically printed as a halftone. A halftone consists of an array of closely spaced micro-dots, all produced with the same color of ink. At normal viewing distances, the individual dots are imperceptible to the naked eye. Instead, the eye integrates the dots
Halftone segmentation algorithm
This section presents an algorithm for segmenting regions containing grayscale or color halftones from other information on a printed page. The algorithm handles rotated images and regions having arbitrary shape. Fig. 6 provides a schematic overview of the algorithm. An input page is digitized by a scanner. The scanner produces three grayscale images, which represent the input in terms of the color bands (RGB). Each image then undergoes a filtering operation. First, the image is filtered with a
Experimental results
This section demonstrates the effectiveness of the algorithm for extracting color and grayscale halftones from the other information in a document. A variety of examples are provided. All input images were 512×512 pixels.
Conclusion
The advent of color in the printing industry provides new challenges in the area of document processing. Many of the earlier document layout analysis techniques dealt only with grayscale halftones. The goal of this research was to develop methods to extract both color and grayscale halftones from other information in the document. The approach capitalizes on the spectral characteristics of halftones, which allows a single bandpass filter with an annular passband to extract halftones. The
References (18)
- et al., Block segmentation and text extraction in mixed text/image documents, Comp. Vision Graphics Image Proc. (1982)
- et al., Classifications of newspaper image blocks using texture analysis, Comp. Vision Graphics Image Proc. (1989)
- et al., Page segmentation and classification, Comp. Vision Graphics Image Proc. (1992)
- et al., Page segmentation using texture discrimination, Pattern Recognition (1996)
- et al., Document analysis system, IBM J. Res. (1982)
- et al., A robust algorithm for text string separation from mixed text/graphics images, IEEE Trans. Pattern Anal. Mach. Intell. (1988)
- The document spectrum for page layout analysis, IEEE Trans. Pattern Anal. Mach. Intell. (1993)
- et al., A fast algorithm for bottom-up document layout analysis, IEEE Trans. Pattern Anal. Mach. Intell. (1997)
- et al., Document representation and its application to page decomposition, IEEE Trans. Pattern Anal. Mach. Intell. (1998)
Cited by (9)
Scanned document enhancement based on fast text detection
2016, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Low complexity pixel-based halftone detection
2011, Proceedings of SPIE - The International Society for Optical Engineering
Text extraction and enhancement of binary images using cellular automata
2009, International Journal of Automation and Computing
Text extraction from document images using edge information
2009, Proceedings of INDICON 2009 - An IEEE India Council Conference
Gravitational reallocation of halftone dots for moiré-free color proofing
2005, Final Program and Proceedings - IS and T/SID Color Imaging Conference
Gravitational Re-allocation of Halftone Dots for Moire-Free Color Proofing
2005, Journal of the Institute of Image Electronics Engineers of Japan
About the Author—DENNIS F. DUNN received the B.S. degree in chemical engineering from Case Western Reserve University in 1969, the M.Eng. degree in engineering science from The Pennsylvania State University in 1981, and the Ph.D. degree in computer science from The Pennsylvania State University in 1992. He was an Assistant Professor with the Computer Science and Engineering Department, The Pennsylvania State University from 1992 to 1997. He is presently an Assistant Professor with the Computer Science Department, the University of Wyoming. His main research interests are computer vision, document analysis, image processing, texture analysis, and human-vision modeling.
About the Author—NILOUFER MATHEW received the B.S. and M.S. degrees in Electrical Engineering in 1993 and 1997, respectively, from the National Institute of Engineering, Mysore, India and The Pennsylvania State University, University Park, PA. Her main research interest was in digital halftoning. She is currently a Software Engineer at Decision Consultants Incorporated in Southfield, MI.