DOI: 10.1145/2037342.2037361
research-article

Towards a faithful visualization of historical books on e-book readers

Published: 16 September 2011

ABSTRACT

This paper addresses the faithful visualization of historical documents on e-book devices and tablet computers. To this end, digitized books should be converted to re-flowable formats in which the characters can be easily resized. The conversion first analyzes the document to extract the characters, which are then clustered and replaced by prototypes. The prototypes are represented as SVG objects and arranged in the proper positions in the converted document.
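The cluster-and-replace step can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which clustering algorithm or distance is used, so the sketch assumes glyph bitmaps already normalized to a common size, substitutes a simple greedy threshold clustering on Hamming distance, and builds each prototype by pixel-wise majority vote. The names `cluster_glyphs`, `prototype`, `svg_use`, and the `proto<N>` id scheme are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: every glyph is a binary bitmap of the same size (rows of 0/1).
Glyph = Tuple[Tuple[int, ...], ...]


def hamming(a: Glyph, b: Glyph) -> int:
    """Count the pixels at which two same-sized bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def cluster_glyphs(glyphs: Sequence[Glyph], max_dist: int) -> List[int]:
    """Greedy clustering (illustrative stand-in): one label per glyph.

    A glyph joins the nearest existing cluster seed if it lies within
    max_dist pixels of it; otherwise it becomes the seed of a new cluster.
    """
    seeds: List[Glyph] = []
    labels: List[int] = []
    for g in glyphs:
        dists = [hamming(g, s) for s in seeds]
        if dists and min(dists) <= max_dist:
            labels.append(dists.index(min(dists)))
        else:
            seeds.append(g)
            labels.append(len(seeds) - 1)
    return labels


def prototype(members: Sequence[Glyph]) -> Glyph:
    """Pixel-wise majority vote over one cluster (ties resolve to 1)."""
    n = len(members)
    rows, cols = len(members[0]), len(members[0][0])
    return tuple(
        tuple(int(2 * sum(m[r][c] for m in members) >= n) for c in range(cols))
        for r in range(rows)
    )


def svg_use(label: int, x: float, y: float) -> str:
    """Place a prototype by reference; the id scheme is hypothetical."""
    return f'<use xlink:href="#proto{label}" x="{x}" y="{y}"/>'
```

In such a scheme each prototype would be defined once in the output document and every occurrence of the character would only reference it, which is what keeps the converted text compact and resizable.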

Among other applications, the proposed conversion can be used to allow visitors of archives and exhibitions to easily browse and consult historical documents on dedicated devices or on personal mobile devices that support standard re-flowable formats.

The system is quantitatively tested on the well-known UW-I dataset by computing OCR errors on the original images and on the reconstructed ones. The visual rendering of historical documents is evaluated on a digitized book from the 19th century.
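One plausible way to quantify OCR errors in such a comparison is a character error rate based on edit distance. The abstract does not state the exact metric, so the following is only a sketch under that assumption; `edit_distance` and `char_error_rate` are illustrative names.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between a reference and an OCR hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, rc in enumerate(ref, start=1):
        cur = [i]
        for j, hc in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,               # character missing from hyp
                cur[j - 1] + 1,            # extra character in hyp
                prev[j - 1] + (rc != hc),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def char_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the length of the reference text."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Running the same OCR engine on an original page image and on its reconstructed counterpart, then comparing the two error rates against the ground-truth text, would indicate how much the prototype substitution degrades or improves recognition.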

        • Published in

          HIP '11: Proceedings of the 2011 Workshop on Historical Document Imaging and Processing
          September 2011
          195 pages
          ISBN:9781450309165
          DOI:10.1145/2037342

          Copyright © 2011 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall acceptance rate: 52 of 90 submissions, 58%
