A comparative study of two recent word spotting techniques in the run-length compressed domain | IEEE Conference Publication | IEEE Xplore

A comparative study of two recent word spotting techniques in the run-length compressed domain


Abstract:

This paper presents a comparative study of two recent word spotting techniques ([1] and [2]) directly in the run-length compressed domain. The first technique is based on...Show More

Abstract:

This paper presents a comparative study of two recent word spotting techniques ([1] and [2]) directly in the run-length compressed domain. The first technique is based on partial decompression and limited usage of OCR, and the second technique is completely decompression-less and OCR-less. Both the word spotting techniques use word bounding box ratio feature initially for matching words in the database of compressed document images. For all the matching test-words, the word spotting strategy in the first model is to decompress and OCR first two characters, and then match with the keyword characters. If the matching is successful, then the remaining characters of the test-word are decompressed and OCRed, and eventually matched with the keyword. The word spotting strategy applied in the second model is to extract run based features like number of run transitions and the corresponding correlation of runs along the selected regions of the matching test word, and then match with that of the specified keyword. The proposed methods work in Run-Length Compressed Domain (RLCD) with the capability of operating on CCITT Group 3 1D, CCITT Group 3 2D, and CCITT Group 4 2D compressed documents supported by TIFF and PDF file formats. In the current paper, the efficacy of the proposed models is demonstrated through experimental results and comparative analysis.
Date of Conference: 13-16 September 2017
Date Added to IEEE Xplore: 04 December 2017
ISBN Information:
Conference Location: Udupi, India

Contact IEEE to Subscribe

References

References is not available for this document.