
Skew detection in document images based on rectangular active contour

  • Full Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Document digitization frequently produces images with small rotation angles. These skew angles degrade the performance of optical character recognition (OCR) tools, so skew detection plays an important role in automatic document analysis systems. In this paper, we propose a Rectangular Active Contour Model (RAC Model) for content region detection and skew angle calculation. The model imposes a rectangular shape constraint on the zero-level set of the Chan–Vese Model (C-V Model), exploiting the rectangular shape of content regions in document images. Unlike other skew detection methods, our algorithm does not rely on local image features; instead, it uses global image features and a shape constraint, which makes skew angle detection highly robust. We experimented on different types of document images. Compared with other skew detection algorithms, ours detects skew more accurately in complex document images with different fonts, tables, illustrations, and layouts. The original image needs no pre-processing, even if it is noisy, and the rectangular content region of the document image is detected at the same time.
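To make the idea of a rectangle-constrained region model concrete, the following is a minimal sketch and not the RAC Model itself. It assumes the rectangle's centre and size are already known and replaces level-set evolution with a brute-force search over candidate angles, scoring each rotated rectangle with the two-phase piecewise-constant data term of the Chan–Vese energy. All function names and parameters (`rect_mask`, `cv_energy`, `estimate_skew`, the synthetic test image) are hypothetical and chosen only for illustration.

```python
import numpy as np

def rect_mask(shape, cx, cy, w, h, theta_deg):
    """Boolean mask of a rotated rectangle (centre cx, cy; size w x h)."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    t = np.deg2rad(theta_deg)
    # Express each pixel in the rectangle's own (u, v) frame.
    u = (xx - cx) * np.cos(t) + (yy - cy) * np.sin(t)
    v = -(xx - cx) * np.sin(t) + (yy - cy) * np.cos(t)
    return (np.abs(u) <= w / 2) & (np.abs(v) <= h / 2)

def cv_energy(img, mask):
    """Chan-Vese data term: squared deviation from the mean
    intensity inside the region plus the same outside it."""
    inside, outside = img[mask], img[~mask]
    if inside.size == 0 or outside.size == 0:
        return np.inf
    return (((inside - inside.mean()) ** 2).sum()
            + ((outside - outside.mean()) ** 2).sum())

def estimate_skew(img, cx, cy, w, h, angles):
    """Return the candidate angle whose rectangle minimises the energy.
    Assumption: centre and size are given; only the angle is searched."""
    energies = [cv_energy(img, rect_mask(img.shape, cx, cy, w, h, a))
                for a in angles]
    return float(angles[int(np.argmin(energies))])

# Synthetic "content region": a bright rectangle rotated by 4 degrees.
img = rect_mask((120, 160), 80, 60, 100, 50, 4.0).astype(float)
angles = np.arange(-10.0, 10.5, 0.5)
best = estimate_skew(img, 80, 60, 100, 50, angles)
print("estimated skew (degrees):", best)
```

Because the score depends on region statistics over the whole image rather than on local edges or text lines, the sketch reflects the global-feature idea described in the abstract. A full implementation would also have to estimate the rectangle's position and size and would evolve the contour rather than enumerate angles. The sign of the angle follows image coordinates with the y axis pointing down, which is a convention chosen here.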



Author information

Corresponding author

Correspondence to Yandong Tang.

About this article

Cite this article

Fan, H., Zhu, L. & Tang, Y. Skew detection in document images based on rectangular active contour. IJDAR 13, 261–269 (2010). https://doi.org/10.1007/s10032-010-0119-3

  • DOI: https://doi.org/10.1007/s10032-010-0119-3
