ABSTRACT
This paper presents a new approach to detecting tabular structures in document images and in low-resolution video images. The table detection algorithm is based on identifying the unique table start pattern and table trailer pattern, which we characterize using perceptual attributes that we have formulated. We test the performance of our table detection system on a set of document images drawn from the UW-III (University of Washington) dataset, the UNLV dataset, video images from NPTEL videos, and our own dataset. Our approach demonstrates improved detection for different types of table layouts, with or without ruling lines. It also correctly localizes tables on pages with multiple tables aligned side by side.