Abstract
Image registration (or alignment) is a useful preprocessing step for manual data extraction from handwritten forms and for preparing documents for batch OCR of specific page regions. A new technique is presented for fast registration of lined tabular document images in the presence of a global affine transformation, using the Discrete Fourier–Mellin Transform (DFMT). Each component of the affine transform is handled separately, which dramatically reduces the total parameter space of the problem. The method is robust and, by working in the frequency domain, deals with all components of the affine transform in a uniform way. The DFMT is extended to handle shear, which can approximate a small amount of perspective distortion. To limit registration to foreground pixels and to eliminate Fourier edge effects, a novel, locally adaptive foreground-background segmentation algorithm based on the median filter is introduced; it eliminates the need for the Blackman windowing usually required by DFMT image registration. A novel information-theoretic optimization of the median filter is presented. Finally, an original method is demonstrated for automatically obtaining blank document templates from a set of registered document images.
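The abstract does not spell out the individual registration steps, so the following is only a minimal sketch of one standard frequency-domain building block, phase correlation, which recovers a pure translation between two images. It is not the authors' implementation: the function name `phase_correlation_shift` is hypothetical, the shift is assumed to be circular and integer-valued, and rotation, scale and shear are not handled here.

```python
import numpy as np


def phase_correlation_shift(reference, moved):
    """Estimate the integer (row, col) circular shift taking `reference` to `moved`.

    Illustrative helper only (hypothetical name); assumes both images have
    the same shape and differ by a pure circular translation.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)

    # Normalized cross-power spectrum: keep only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)

    # The inverse transform of a pure phase ramp is a peak at the shift.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    return tuple(
        int(p) if p <= n // 2 else int(p) - n
        for p, n in zip(peak, correlation.shape)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64))
    shifted = np.roll(image, (5, -9), axis=(0, 1))
    print(phase_correlation_shift(image, shifted))
```

Normalizing the cross-power spectrum discards magnitude and keeps only phase, which is what makes the correlation peak sharp. Real scans are not circularly shifted, so a practical pipeline would also need some treatment of image borders, such as the windowing or foreground segmentation the abstract mentions.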
Cite this article
Hutchison, L.A.D., Barrett, W.A. Fourier–Mellin registration of line-delineated tabular document images. IJDAR 8, 87–110 (2006). https://doi.org/10.1007/s10032-005-0003-8
DOI: https://doi.org/10.1007/s10032-005-0003-8