Research article, DOI: 10.1145/1378889.1378934

User-assisted ink-bleed correction for handwritten documents

Published: 16 June 2008

Abstract

We describe a user-assisted framework for correcting ink-bleed in old handwritten documents housed at the National Archives of Singapore (NAS). Our approach departs from traditional correction techniques that strive for full automation. Fully automated approaches make assumptions about ink-bleed characteristics that are not valid for all inputs. Furthermore, they often require algorithmic parameters that have no meaning for the end user. In our system, the user needs only to provide simple examples of ink-bleed, foreground ink, and background. These training examples are used to classify the remaining pixels in the document, producing a computer-generated result that is equal to or better than that of existing fully automated approaches.
To offer a complete system, we also provide tools that allow the user to quickly "clean up" any errors in the computer-generated results. The initial training markup, the computer-generated results, and the manual edits are all recorded with the final output, allowing subsequent viewers to see how a corrected document was created and to make changes or updates. Although the project is ongoing, feedback from the NAS staff has been overwhelmingly positive, indicating that this user-assisted framework is a practical way to address the ink-bleed problem.
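
The abstract does not say which classifier the system uses, so the following is only a minimal sketch of the general idea: a few user-labeled pixels per class train a model that then labels the remaining pixels. It assumes a nearest-centroid rule on RGB values and assumes that pixels classified as ink-bleed are simply repainted with the average background color. The function names (`train_centroids`, `classify_pixel`, `correct_image`) and all sample colors are hypothetical and are not taken from the paper.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical user markup: a few RGB samples per class (values are made up).
LABELED = {
    "foreground": [(30, 25, 20), (45, 38, 30)],
    "ink_bleed": [(140, 120, 95), (150, 128, 100)],
    "background": [(225, 215, 190), (235, 225, 200)],
}


def train_centroids(labeled):
    """Average the labeled samples of each class into one RGB centroid."""
    centroids = {}
    for label, samples in labeled.items():
        n = len(samples)
        centroids[label] = tuple(sum(s[c] for s in samples) / n for c in range(3))
    return centroids


def classify_pixel(pixel, centroids):
    """Return the class whose centroid is closest to the pixel in RGB space."""
    return min(centroids, key=lambda label: dist(pixel, centroids[label]))


def correct_image(image, centroids):
    """Repaint pixels classified as ink-bleed with the background centroid.

    Assumption: a flat background fill stands in for whatever correction
    the real system applies.
    """
    bg = tuple(round(v) for v in centroids["background"])
    return [
        [bg if classify_pixel(p, centroids) == "ink_bleed" else p for p in row]
        for row in image
    ]


centroids = train_centroids(LABELED)
# A one-row toy image: foreground ink, ink-bleed, and background paper.
image = [[(35, 30, 25), (145, 125, 98), (230, 220, 195)]]
corrected = correct_image(image, centroids)
```

In this toy setup the middle pixel is closest to the ink-bleed centroid and is repainted, while the foreground and background pixels are left unchanged. A real document would likely need richer features than raw color, which is one reason user-supplied examples and manual clean-up tools are useful.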


Published In

JCDL '08: Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries
June 2008
490 pages
ISBN:9781595939982
DOI:10.1145/1378889

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. document processing
  2. ink-bleed
  3. restoration
  4. user-assisted systems

Qualifiers

  • Research-article

Conference

JCDL '08: Joint Conference on Digital Libraries
June 16 - 20, 2008
Pittsburgh, PA, USA

Acceptance Rates

JCDL '08 Paper Acceptance Rate: 33 of 117 submissions, 28%
Overall Acceptance Rate: 415 of 1,482 submissions, 28%
