ABSTRACT
This paper proposes a robust table registration method for extracting structured information from scanned table images. Scanned images can be heavily degraded by scanning artifacts, binarization, or the condition of the document itself. When batches of images sharing the same table structure are processed, a table model is normally available and can be used to overcome most of these quality problems. In this paper, the given table model serves as the ground truth. Only rough precision is required for the table cell dimensions, however, which makes providing the table model an easier task. The method was tested on Multilingual Automatic Document Classification, Analysis and Translation (MADCAT) images and achieved promising performance.