Visual Similarity Based Document Layout Analysis

  • Artificial Intelligence
Journal of Computer Science and Technology

Abstract

This paper proposes a visual similarity based document layout analysis (DLA) scheme that uses a clustering strategy to adapt to documents in different languages and with different layout structures and skew angles. Aiming at a robust and adaptive DLA approach, the authors first use a visual similarity testing process to find a set of representative filters and statistics that characterize typical texture patterns in document images. Texture features are then extracted with these filters and passed to a dynamic clustering procedure, called visual similarity clustering. Finally, text content is located from the clustering results. As a result, the algorithm demonstrates robustness and adaptability across a wide variety of documents that traditional DLA approaches do not possess.
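
The pipeline outlined in the abstract (filter responses, texture features, clustering, text location) can be illustrated with a toy sketch. This is not the authors' method: the abstract does not name the filters, the statistics, or the clustering rule, so the sketch assumes simple horizontal and vertical difference filters, a per-block mean absolute response as the texture feature, and a fixed two-cluster k-means in place of the paper's dynamic clustering. All function names and the synthetic page are hypothetical.

```python
def make_page(size=32):
    """Synthetic page (assumption): striped 'text' in the top half, blank below."""
    page = [[0] * size for _ in range(size)]
    for y in range(size // 2):
        if y % 4 < 2:  # two-pixel-high "text lines" separated by gaps
            for x in range(size):
                page[y][x] = x % 2  # alternating ink/background pixels
    return page


def block_features(page, block=8):
    """Per block: mean absolute horizontal and vertical pixel differences.

    The difference filters stand in for the paper's unspecified filter set.
    """
    size = len(page)
    feats = []
    for by in range(0, size, block):
        for bx in range(0, size, block):
            h = v = 0.0
            for y in range(by, by + block):
                for x in range(bx, bx + block):
                    if x + 1 < size:
                        h += abs(page[y][x + 1] - page[y][x])
                    if y + 1 < size:
                        v += abs(page[y + 1][x] - page[y][x])
            n = block * block
            feats.append((h / n, v / n))
    return feats


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def two_means(feats, iters=20):
    """Plain k-means with k=2 (a stand-in for the paper's dynamic clustering)."""
    first = feats[0]
    # Deterministic start: the first vector and the vector farthest from it.
    centers = [first, max(feats, key=lambda f: dist2(f, first))]
    labels = [0] * len(feats)
    for _ in range(iters):
        labels = [
            0 if dist2(f, centers[0]) <= dist2(f, centers[1]) else 1
            for f in feats
        ]
        for k in (0, 1):
            members = [f for f, lab in zip(feats, labels) if lab == k]
            if members:  # keep the old center if a cluster empties
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


def text_blocks(page, block=8):
    """Return one boolean per block (row-major): True where the block looks like text."""
    labels, centers = two_means(block_features(page, block))
    # Assumption: the cluster with the larger total filter energy is text.
    text_label = 0 if sum(centers[0]) > sum(centers[1]) else 1
    return [lab == text_label for lab in labels]
```

Calling `text_blocks(make_page())` returns 16 flags in row-major order for the 4 by 4 grid of 8 by 8 blocks; on this synthetic page the top two block rows are flagged as text and the bottom two are not. A fixed cluster count is the main simplification here, since the abstract describes the clustering as dynamic.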

Author information

Corresponding author

Correspondence to Di Wen.

Additional information

This work is supported by the National Natural Science Foundation of China under Grant No. 60472002.

About this article

Cite this article

Wen, D., Ding, XQ. Visual Similarity Based Document Layout Analysis. J Comput Sci Technol 21, 459–465 (2006). https://doi.org/10.1007/s11390-006-0459-0

  • DOI: https://doi.org/10.1007/s11390-006-0459-0
