Abstract
Page segmentation and classification are essential steps in a document layout analysis system, performed before a page is passed to an OCR system or to any other subsequent processing step. In this paper, we propose an accurate system for segmenting complex documents, based on the steerable pyramid transform. Features extracted from the pyramid sub-bands are used to locate regions and classify them as text (machine-printed or handwritten) or non-text (images, graphics, drawings or paintings) in noisy, deformed, multilingual, multi-script document images. These documents contain tabular structures, logos, stamps, handwritten script blocks, photographs, etc. We present encouraging results obtained on a data set of 1,000 official complex document images and compare them with those of existing state-of-the-art methods. The comparison shows that the proposed method performs consistently well on large sets of complex document images.
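To make the idea of sub-band features concrete, the following is a minimal sketch and not the system described in this paper. It assumes a heavily simplified stand-in for the steerable pyramid: central-difference gradients replace the pyramid's band-pass oriented filters, 2x2 averaging replaces its low-pass and decimation stage, and the oriented response is obtained through the first-derivative steering relation cos(θ)·gx + sin(θ)·gy. All function names (`gradients`, `downsample`, `oriented_energy`, `pyramid_features`) are hypothetical and chosen for illustration only.

```python
import math


def gradients(img):
    """Central-difference gradients (gx, gy) of a 2-D list; border stays zero."""
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return gx, gy


def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (crude low-pass + decimate)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def oriented_energy(img, n_orient=4):
    """Mean squared response of the steered derivative for each orientation."""
    gx, gy = gradients(img)
    h, w = len(img), len(img[0])
    energies = []
    for k in range(n_orient):
        theta = math.pi * k / n_orient
        c, s = math.cos(theta), math.sin(theta)
        # Steering: the derivative along theta is a linear mix of gx and gy.
        total = sum(
            (c * gx[y][x] + s * gy[y][x]) ** 2
            for y in range(h)
            for x in range(w)
        )
        energies.append(total / (h * w))
    return energies


def pyramid_features(img, n_levels=2, n_orient=4):
    """Concatenate oriented energies over n_levels scales (assumed feature set)."""
    feats = []
    level = img
    for _ in range(n_levels):
        feats.extend(oriented_energy(level, n_orient))
        level = downsample(level)
    return feats


if __name__ == "__main__":
    # Synthetic block with vertical stripes: intensity changes only along x.
    stripes = [[1.0 if (x // 2) % 2 == 0 else 0.0 for x in range(16)]
               for _ in range(16)]
    print(pyramid_features(stripes))
```

A block-wise classifier would compute such a vector for each region of the page and feed it to a text/non-text decision rule. The choice of energy as the statistic, two levels and four orientations is an assumption of this sketch, not a setting reported in the abstract.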
Cite this article
Benjelil, M., Kanoun, S., Mullot, R. et al. Complex documents images segmentation based on steerable pyramid features. IJDAR 13, 209–228 (2010). https://doi.org/10.1007/s10032-010-0113-9
DOI: https://doi.org/10.1007/s10032-010-0113-9