ABSTRACT
Tables are ubiquitous in digital libraries and on the Web, where they serve a variety of data delivery and document formatting purposes. For example, scientific documents widely use tables to present experimental results or statistical data in condensed form. Identifying and organizing tables of different types is essential for better table understanding and for data sharing and reuse. This paper makes three contributions: 1) we propose a table functional classification for scientific documents based on the Introduction, Methods, Results, and Discussion (IMRAD) structure; 2) we introduce a fine-grained table taxonomy based on extensive observation and investigation of tables in digital libraries; and 3) we investigate table characteristics and classify tables automatically according to the defined taxonomy. Preliminary experimental results show that our table taxonomy, combined with salient features, can significantly improve scientific table classification performance.
Index Terms
- Scientific table type classification in digital library