A toolkit to facilitate the querying and integration of tabular data from semistructured documents

Hodge, L. E.; Gray, W. A.; Fiddian, N. J.

doi:10.1007/BFb0053482

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1405)

Included in the following conference series:

British National Conference on Databases

Abstract

The advent of the World Wide Web (WWW) has created a growing need to link information held in databases and files with information held in other types of structure [2]. This poster presents a toolkit that enables information held in tables within documents to be linked with data held in conventional databases. The toolkit addresses the problems of:

• extracting tabular data from various types of document e.g. semi-structured text [4], HTML and spreadsheets
• providing a standard interface to these tables via a standard query language
• extracting useful table meta-data from the table and accompanying text.
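
The first two problems can be illustrated with a minimal sketch. It is not the toolkit described in the poster, whose design is not given here; it only assumes that "a standard query language" can be stood in for by SQL, and it uses Python's standard `html.parser` and `sqlite3` modules. The names `TableExtractor`, `load_html_table`, `SAMPLE_HTML` and the table `stock` are hypothetical, and the sample data is invented for illustration.

```python
import sqlite3
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Hypothetical helper: collects table rows from an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # row currently being built, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

    def handle_data(self, data):
        # Only text inside a cell is kept; whitespace between tags is ignored.
        if self._cell is not None:
            self._cell.append(data)


def load_html_table(html, table_name, conn):
    """Load an HTML table into SQLite, treating the first row as the header.

    Assumption: table_name and the header cells are trusted identifiers.
    """
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', body)
    return header


# Invented sample document, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>item</th><th>qty</th></tr>
  <tr><td>bolt</td><td>40</td></tr>
  <tr><td>nut</td><td>5</td></tr>
</table>
"""

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_html_table(SAMPLE_HTML, "stock", conn)
    query = "SELECT item FROM stock WHERE CAST(qty AS INTEGER) > 10"
    print(conn.execute(query).fetchall())
```

Once the table sits in a relational store, it can be joined with conventional database tables using ordinary SQL, which is the kind of linkage the abstract describes. Meta-data extraction from the accompanying text is not attempted in this sketch.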

References

[1] Ashish, N., Knoblock, C.: Wrapper Generation for Semi-structured Internet Sources. University of Southern California.
[2] Soderland, S.: Learning to Extract Text-based Information from the World Wide Web. University of Washington. In: Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97).
[3] Kushmerick, N., Weld, D., Doorenbos, R.: Wrapper Induction for Information Extraction. In: Proceedings of IJCAI 97.
[4] Douglas, S., Hurst, M., Quinn, D.: Using Natural Language Processing for Identifying and Interpreting Tables in Plain Text. December 1994.

Author information

Authors and Affiliations

Department of Computer Science, University of Wales, Cardiff, UK
L. E. Hodge, W. A. Gray & N. J. Fiddian

Editor information

Suzanne M. Embury, Nicholas J. Fiddian, W. Alex Gray, Andrew C. Jones

Cite this paper

Hodge, L.E., Gray, W.A., Fiddian, N.J. (1998). A toolkit to facilitate the querying and integration of tabular data from semistructured documents. In: Embury, S.M., Fiddian, N.J., Gray, W.A., Jones, A.C. (eds) Advances in Databases. BNCOD 1998. Lecture Notes in Computer Science, vol 1405. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0053482

DOI: https://doi.org/10.1007/BFb0053482
Published: 23 May 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64659-4
Online ISBN: 978-3-540-69112-9
eBook Packages: Springer Book Archive
