
Complex documents images segmentation based on steerable pyramid features

  • Full Paper
  • Published in: International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Page segmentation and classification are important steps in a document layout analysis system, performed before a page is passed to an OCR system or to any other subsequent processing step. In this paper, we propose an accurate system designed for the segmentation of complex documents. The system is based on the steerable pyramid transform. Features extracted from the pyramid sub-bands are used to locate regions and classify them as text (either machine-printed or handwritten) or non-text (images, graphics, drawings or paintings) in noisy, deformed, multilingual, multi-script document images. These documents contain tabular structures, logos, stamps, handwritten script blocks, photographs, etc. We present encouraging results obtained on a data set of 1,000 official complex document images and compare them with those of existing state-of-the-art methods. The comparison shows that the proposed method performs consistently well on large sets of complex document images.
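To make the idea of sub-band features concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simplified stand-in for the steerable pyramid: first-derivative-of-Gaussian filters, which can be steered to any angle as a linear combination of an x-derivative and a y-derivative response, applied on a blur-and-subsample pyramid. The function names, the number of levels and orientations, and the choice of mean squared response as the feature are all illustrative assumptions.

```python
import numpy as np

def gaussian_kernels(sigma=1.0, radius=3):
    """1-D Gaussian and its derivative (sigma and radius are assumed values)."""
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    dg = -x / sigma**2 * g
    return g, dg

def sep_filter(img, k_rows, k_cols):
    """Separable filtering: k_cols runs along x (each row), k_rows along y.

    Borders are implicitly zero-padded by np.convolve(mode="same"), and each
    image side must be at least as long as the kernel.
    """
    tmp = np.apply_along_axis(
        lambda r: np.convolve(r, k_cols, mode="same"), 1, img)
    return np.apply_along_axis(
        lambda c: np.convolve(c, k_rows, mode="same"), 0, tmp)

def oriented_energy_features(img, n_levels=3, n_orients=4):
    """Mean squared response per (level, orientation) of a block.

    Hypothetical feature vector: n_levels * n_orients values, ordered by
    level first, then by orientation angle pi * k / n_orients.
    """
    g, dg = gaussian_kernels()
    level = np.asarray(img, dtype=float)
    feats = []
    for _ in range(n_levels):
        gx = sep_filter(level, g, dg)   # derivative along x, smoothed along y
        gy = sep_filter(level, dg, g)   # derivative along y, smoothed along x
        for k in range(n_orients):
            theta = np.pi * k / n_orients
            # Steering: the response at angle theta is a linear combination
            # of the two basis responses.
            band = np.cos(theta) * gx + np.sin(theta) * gy
            feats.append(float(np.mean(band**2)))
        # Next pyramid level: low-pass, then subsample by 2.
        level = sep_filter(level, g, g)[::2, ::2]
    return np.array(feats)
```

In a text/non-text setting, such a vector could be computed per image block and passed to any classifier; which features and classifier the paper actually uses is not stated in the abstract. With the default 7-tap kernels and three levels, each block side should be at least 28 pixels.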



Author information

Correspondence to Mohamed Benjelil.

About this article

Cite this article

Benjelil, M., Kanoun, S., Mullot, R. et al. Complex documents images segmentation based on steerable pyramid features. IJDAR 13, 209–228 (2010). https://doi.org/10.1007/s10032-010-0113-9

  • DOI: https://doi.org/10.1007/s10032-010-0113-9
