Fast correction of bleed-through distortion in grayscale documents by a blind source separation technique

  • Original Paper
International Journal of Document Analysis and Recognition (IJDAR)

Abstract

Ancient documents are usually degraded by the presence of strong background artifacts. These are often caused by the so-called bleed-through effect, a pattern that interferes with the main text due to seeping of ink from the reverse side. A similar effect, called show-through and due to the imperfect opacity of the paper, may appear in scans of even modern, well-preserved documents. These degradations must be removed to improve human or automatic readability. For this purpose, when a color scan of the document is available, we have shown that a simplified linear pattern overlapping model allows us to use very fast blind source separation techniques. This approach, however, cannot be applied to grayscale scans. This is a serious limitation, since many collections in our libraries and archives are now only available as grayscale scans or microfilms. We propose here a new model for bleed-through in grayscale document images, based on the availability of the recto and verso pages, and show that blind source separation can be successfully applied in this case too. Some experiments with real ancient documents are presented and described.
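To make the recto/verso idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the recto scan and the mirrored verso scan can be treated as two instantaneous linear mixtures of two ink patterns, that a plain horizontal flip registers the two sides, and that symmetric whitening (decorrelation by the inverse square root of the 2x2 data covariance) is an acceptable separation step. The abstract does not spell out the model or the algorithm, so the function names, the mixing matrix, and the synthetic data are all hypothetical.

```python
import numpy as np


def symmetric_whitening(x):
    """Decorrelate the rows of x (shape: n_channels x n_pixels).

    Returns (y, w), where y = w @ (x - mean) has identity covariance and
    w is the symmetric inverse square root of the data covariance.
    """
    xc = x - x.mean(axis=1, keepdims=True)
    cov = xc @ xc.T / xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    w = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return w @ xc, w


def separate_recto_verso(recto, verso):
    """Treat recto and mirrored verso as two linear mixtures (assumption)."""
    # Assumption: a horizontal flip is enough to register the two sides.
    verso_aligned = np.fliplr(verso)
    x = np.stack([recto.ravel(), verso_aligned.ravel()]).astype(float)
    y, w = symmetric_whitening(x)
    return y[0].reshape(recto.shape), y[1].reshape(recto.shape), w


# Synthetic demo: two sparse random "ink" patterns and a hypothetical
# symmetric mixing matrix standing in for ink seepage between the sides.
rng = np.random.default_rng(0)
shape = (64, 64)
front = (rng.random(shape) < 0.1).astype(float)
back = (rng.random(shape) < 0.1).astype(float)
mixing = np.array([[1.0, 0.3], [0.3, 1.0]])
recto = mixing[0, 0] * front + mixing[0, 1] * back
# The verso is scanned from the other side, so it appears mirrored.
verso = np.fliplr(mixing[1, 0] * front + mixing[1, 1] * back)

est_front, est_back, w = separate_recto_verso(recto, verso)
```

In this toy setting the two outputs are uncorrelated by construction and each one closely follows one of the synthetic patterns, up to offset and scale. Real scans would additionally need careful registration of the two sides, and the whitening step could be swapped for an independent component analysis routine.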

Author information

Corresponding author

Correspondence to Anna Tonazzini.

Additional information

Anna Tonazzini graduated cum laude in Mathematics from the University of Pisa, Italy, in 1981. In 1984 she joined the Istituto di Scienza e Tecnologie dell'Informazione of the Italian National Research Council (CNR) in Pisa, where she is currently a researcher at the Signals and Images Laboratory. She has cooperated in special programs for basic and applied research on image processing and computer vision, and is co-author of over 60 scientific papers. Her present interests are in inverse problems theory, image restoration and reconstruction, document analysis and recognition, independent component analysis, neural networks, and learning.

Emanuele Salerno graduated in Electronic Engineering from the University of Pisa, Italy, in 1985. In September 1987 he joined the Italian National Research Council (CNR) at the Department of Signal and Image Processing, Information Processing Institute (now Institute of Information Science and Technologies, ISTI, Signals and Images Laboratory), Pisa, Italy, where he has been working on applied inverse problems, image reconstruction and restoration, microwave nondestructive evaluation, and blind signal separation. He has held various responsibilities in research programs on nondestructive testing, robotics, numerical models for image reconstruction and computer vision, and neural network techniques in astrophysical imagery. At present, he has local scientific responsibility in the framework of the European Space Agency's “Planck Surveyor Satellite” mission, and takes part in the European CRAFT project “ISyReADeT” for document image restoration.

Luigi Bedini graduated cum laude in Electronic Engineering from the University of Pisa, Italy, in 1968. Since 1970 he has been a researcher at the Italian National Research Council, Istituto di Scienza e Tecnologie dell'Informazione, Pisa, Italy. His interests have been in modelling, identification, and parameter estimation of biological systems applied to non-invasive diagnostic techniques. At present, his research interests are in digital signal processing, image reconstruction, and neural networks applied to image processing. He is co-author of more than 80 scientific papers. From 1971 to 1989, he was Associate Professor of System Theory at the Computer Science Department, University of Pisa, Italy.

About this article

Cite this article

Tonazzini, A., Salerno, E. & Bedini, L. Fast correction of bleed-through distortion in grayscale documents by a blind source separation technique. IJDAR 10, 17–25 (2007). https://doi.org/10.1007/s10032-006-0015-z

  • DOI: https://doi.org/10.1007/s10032-006-0015-z
