Complex documents images segmentation based on steerable pyramid features

Benjelil, Mohamed; Kanoun, Slim; Mullot, Rémy; Alimi, Adel M.

doi:10.1007/s10032-010-0113-9

Full Paper
Published: 10 March 2010

Volume 13, pages 209–228 (2010)

International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Page segmentation and classification are essential steps in a document layout analysis system: they must be completed before a page is passed to an OCR engine or to any other downstream processing step. In this paper, we propose an accurate system designed for the segmentation of complex documents. The system is based on the steerable pyramid transform. Features extracted from the pyramid sub-bands are used to locate regions and classify them as text (machine-printed or handwritten) or non-text (images, graphics, drawings or paintings) in noisy, deformed, multilingual, multi-script document images. These documents contain tabular structures, logos, stamps, handwritten script blocks, photographs, etc. We present encouraging results obtained on a data set of 1,000 official complex document images, and we compare them with those of existing state-of-the-art methods. The comparison shows that the proposed method performs consistently well on large sets of complex document images.
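The abstract does not give the filter design or the feature set, so the following is only an illustrative sketch of the general idea of oriented, multi-scale sub-band features. It is not the authors' implementation. It assumes first-derivative-of-Gaussian basis filters, which are steerable in the sense of Freeman and Adelson, a 2×2 averaging step between levels in place of a true pyramid low-pass filter, and mean squared response as the per-band feature. The function names, `n_levels`, `n_orient` and the kernel parameters are all hypothetical.

```python
import numpy as np

def gaussian_derivative_kernels(sigma=1.0, radius=3):
    """Return the x- and y-derivative-of-Gaussian basis kernels."""
    ax = np.arange(-radius, radius + 1, dtype=float)
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    gx, gy = -xx * g, -yy * g
    return gx / np.abs(gx).sum(), gy / np.abs(gy).sum()

def filter2d(image, kernel):
    """Same-size correlation with edge padding, written in plain NumPy."""
    r = kernel.shape[0] // 2
    h, w = image.shape
    padded = np.pad(image, r, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def block_features(block, n_levels=2, n_orient=4):
    """Mean energy of each oriented band at each scale (assumed feature)."""
    gx, gy = gaussian_derivative_kernels()
    level = np.asarray(block, dtype=float)
    feats = []
    for _ in range(n_levels):
        rx, ry = filter2d(level, gx), filter2d(level, gy)
        for k in range(n_orient):
            theta = np.pi * k / n_orient
            # Steering: the response at angle theta is a linear
            # combination of the two basis responses.
            band = np.cos(theta) * rx + np.sin(theta) * ry
            feats.append(np.mean(band**2))
        # Crude stand-in for the pyramid's low-pass + subsample step.
        h, w = (level.shape[0] // 2) * 2, (level.shape[1] // 2) * 2
        level = level[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return np.array(feats)

# Horizontal stripes: intensity varies only along the vertical axis.
stripes = np.zeros((32, 32))
stripes[::4] = 1.0
features = block_features(stripes)
```

For the striped block, the band steered to the vertical derivative (index 2 at the first level) carries the energy, while the band steered to the horizontal derivative (index 0) is close to zero. A classifier separating text from non-text regions would take vectors like `features` as input; which classifier the paper uses is not stated in the abstract.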


Author information

Authors and Affiliations

REGIM–ENIS, B.P. 1173, 3038, Sfax, Tunisia
Mohamed Benjelil, Slim Kanoun & Adel M. Alimi
L3I, University of La Rochelle, Avenue Michel Crépeau, 17042, La Rochelle, France
Mohamed Benjelil & Rémy Mullot

Corresponding author

Correspondence to Mohamed Benjelil.

About this article

Cite this article

Benjelil, M., Kanoun, S., Mullot, R. et al. Complex documents images segmentation based on steerable pyramid features. IJDAR 13, 209–228 (2010). https://doi.org/10.1007/s10032-010-0113-9

Received: 22 June 2009
Revised: 20 December 2009
Accepted: 14 February 2010
Published: 10 March 2010
Issue Date: September 2010
DOI: https://doi.org/10.1007/s10032-010-0113-9
