Information retrieval and OCR: from converting content to grasping meaning

Published: 01 September 2002

Abstract

IR and OCR have largely developed independent standards and metrics, with OCR focused on literal accuracy and IR focused on essential "content/meaning". As more and more media appear not only on paper but also in multiple image formats, the opportunities and challenges for OCR on new formats (video and still images) are enormous. While OCR is assessed with metrics that emphasize words and characters, IR has learned to apply end-to-end metrics that ask whether the needs of the users can be met by existing systems. The same considerations also apply to the problem of providing permanent worldwide access to millions of pages of legacy print documents, representing the shared human record as it existed until just a few years ago.

The International Society for Optical Engineering (SPIE) has held a series of Document Recognition and Retrieval (DRR) conferences. The tenth, DRR X, will be held in January 2003 in Santa Clara, California. In 2001, Dan LoPresti of Bell Labs decided that the area would benefit from more intense collaboration between those who specialize in finding the words on a page image and those researchers who know how to find the right documents, given the words. He invited Paul Kantor (Rutgers) to join the DRR Chairs, and together they invited Dave Lewis (Consultant) to give a keynote address at DRR VIII. Dan then stepped down. Paul chaired DRR IX (2002) and then handed the reins to Tapas Kanungo (IBM, Almaden). Together they invited Jamie Callan (CMU), David Grossman (IIT) and Alex Hauptmann (CMU) to join the conference committee for DRR X. To improve communication between SIGIR and DRR, this group proposed a SIGIR workshop in this area.

The workshop on "Information Retrieval and OCR: From Converting Content to Grasping Meaning" was intended to stimulate cross-fertilization between OCR and IR, in the hope that better use of IR will enable the OCR community to avoid expensive hand processing. It also aimed to demonstrate that the combination of present static and dynamic image processing and present state-of-the-art robust information retrieval can generate substantial advances in both extraction of messages from image streams and conversion of existing paper variants. It solicited papers dealing with future applications, such as the indexing and retrieval of text embedded in static or video graphic images, with problems of skew, distortion, and obscuration, as well as state-of-the-art discussions of the storage and retrieval of handwritten or print legacy materials.

The workshop was held on August 15, 2002 in Tampere, Finland, immediately following the SIGIR 2002 conference. Although the workshop was intended to appeal to a wide range of IR and OCR researchers (and indeed was proposed at the request of colleagues from the OCR community), it primarily drew people with a background in IR. About a dozen people participated. The small size allowed a very interactive, seminar-style format and very vigorous discussion between and during presentations. Most presentations ran 30% to 50% longer than planned, and our impression is that most of the participants found it very productive.


  • Published in

    ACM SIGIR Forum, Volume 36, Issue 2
    Fall 2002
    99 pages
    ISSN: 0163-5840
    DOI: 10.1145/792550

    Copyright © 2002 Authors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • article
