Fourier–Mellin registration of line-delineated tabular document images

  • Original Paper
International Journal of Document Analysis and Recognition (IJDAR)

Abstract

Image registration (or alignment) is a useful preprocessing tool for assisting in manual data extraction from handwritten forms, as well as for preparing documents for batch OCR of specific page regions. A new technique is presented for fast registration of lined tabular document images in the presence of a global affine transformation, using the Discrete Fourier–Mellin Transform (DFMT). Each component of the affine transform is handled separately, which dramatically reduces the total parameter space of the problem. This method is robust and deals with all components of the affine transform in a uniform way by working in the frequency domain. The DFMT is extended to handle shear, which can approximate a small amount of perspective distortion. In order to limit registration to foreground pixels only, and to eliminate Fourier edge effects, a novel, locally adaptive foreground-background segmentation algorithm is introduced, based on the median filter, which eliminates the need for Blackman windowing as usually required by DFMT image registration. A novel information-theoretic optimization of the median filter is presented. An original method is demonstrated for automatically obtaining blank document templates from a set of registered document images.
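
The abstract gives no implementation details, so the following is only a generic sketch of phase correlation, the frequency-domain step that Fourier–Mellin registration schemes conventionally use to recover translation. In such schemes, rotation and scale are usually recovered by applying the same correlation to a log-polar resampling of the magnitude spectrum. That step, the shear extension, and the median-filter segmentation described above are not shown. The function name `phase_correlate`, the synthetic ruled page, and the NumPy-based implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np


def phase_correlate(reference, moved):
    """Estimate the circular shift (dy, dx) for which
    np.roll(reference, (dy, dx), axis=(0, 1)) best matches `moved`.

    Hypothetical helper: a textbook phase-correlation sketch,
    not the method as implemented in the paper.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)

    # Cross-power spectrum, normalised to unit magnitude so that
    # only the phase difference between the two images remains.
    cross = f_mov * np.conj(f_ref)
    cross /= np.maximum(np.abs(cross), 1e-12)

    # The inverse transform of a pure phase ramp is an impulse
    # located at the displacement between the two images.
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)

    # Peak indices past the midpoint correspond to negative shifts.
    dy, dx = (int(p) if p <= n // 2 else int(p) - n
              for p, n in zip(peak, corr.shape))
    return dy, dx


# Synthetic stand-in for a lined tabular page (an assumption made
# purely for illustration): irregularly spaced rules plus sparse "ink".
rng = np.random.default_rng(0)
page = np.zeros((64, 64))
page[[4, 13, 27, 40, 58], :] = 1.0   # horizontal rules
page[:, [6, 22, 47]] = 1.0           # vertical rules
page += 0.5 * (rng.random(page.shape) > 0.95)

shifted = np.roll(page, (5, -7), axis=(0, 1))
print(phase_correlate(page, shifted))
```

Because the synthetic shift is circular, the correlation surface here is an almost ideal impulse. Scanned pages are not periodic, which is why the abstract discusses Fourier edge effects and alternatives to windowing.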

Author information

Corresponding author

Correspondence to Luke A. D. Hutchison.

About this article

Cite this article

Hutchison, L.A.D., Barrett, W.A. Fourier–Mellin registration of line-delineated tabular document images. IJDAR 8, 87–110 (2006). https://doi.org/10.1007/s10032-005-0003-8

  • DOI: https://doi.org/10.1007/s10032-005-0003-8
