Fast Selection of Small and Precise Candidate Sets from Dictionaries for Text Correction Tasks


Abstract:

Lexical text correction relies on a central step where approximate search in a dictionary is used to select the best correction suggestions for an ill-formed input token. In previous work we introduced the concept of a universal Levenshtein automaton and showed how to use these automata for efficiently selecting from a dictionary all entries within a fixed Levenshtein distance to the garbled input word. In this paper we look at refinements of the basic Levenshtein distance that yield more sensible notions of similarity in distinct text correction applications, e.g. OCR. We show that the concept of a universal Levenshtein automaton can be adapted to these refinements. In this way we obtain a method for selecting correction candidates which is very efficient, at the same time selecting small candidate sets with high recall.
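As a point of reference for the candidate-selection step described above, the following is a minimal brute-force sketch in Python. It is not the universal Levenshtein automaton from the paper: it simply computes the plain Levenshtein distance between the input token and every dictionary entry, abandoning an entry as soon as the bound is exceeded. The function names `within_distance` and `select_candidates`, and the toy dictionary in the usage note, are hypothetical and chosen only for illustration.

```python
def within_distance(word, entry, k):
    """Return the Levenshtein distance between word and entry if it is <= k,
    otherwise None. Plain dynamic programming, one row at a time."""
    # The length difference is a lower bound on the distance.
    if abs(len(word) - len(entry)) > k:
        return None
    prev = list(range(len(entry) + 1))
    for i, wc in enumerate(word, start=1):
        curr = [i]
        for j, ec in enumerate(entry, start=1):
            cost = 0 if wc == ec else 1
            curr.append(min(prev[j] + 1,          # delete wc
                            curr[j - 1] + 1,      # insert ec
                            prev[j - 1] + cost))  # match or substitute
        # The row minimum never decreases, so the bound can be checked early.
        if min(curr) > k:
            return None
        prev = curr
    return prev[-1] if prev[-1] <= k else None


def select_candidates(word, dictionary, k):
    """Return all dictionary entries within distance k of word,
    ordered by distance and then alphabetically."""
    scored = []
    for entry in dictionary:
        d = within_distance(word, entry, k)
        if d is not None:
            scored.append((d, entry))
    return [entry for _, entry in sorted(scored)]
```

For example, `select_candidates("helo", ["hello", "help", "world", "halo"], 1)` returns `["halo", "hello", "help"]`. This baseline visits every dictionary entry, which is the cost that an automaton-based traversal of the dictionary is meant to avoid.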
Date of Conference: 23-26 September 2007
Date Added to IEEE Xplore: 12 November 2007
Conference Location: Curitiba, Brazil
