ABSTRACT
In this paper we propose an attribute retrieval approach that extracts and ranks attributes from Web tables. We combine simple heuristics to filter out improbable attributes, and we rank the remaining attributes by their frequencies and a table match score. Ranking is reinforced with external evidence from Web search, DBPedia and Wikipedia. Our approach can be applied to any instance (e.g. Canada) to retrieve its attributes (e.g. capital, GDP). We show that it achieves much higher recall than DBPedia and Wikipedia, and that it outperforms lexico-syntactic rules on the same task.
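The filter-then-rank idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `is_plausible` heuristic, its length limit of 30 characters, the `rank_attributes` function, the input format, and the choice to combine frequency and table match score by summing the match scores of the tables in which an attribute occurs.

```python
from collections import defaultdict


def is_plausible(attr: str) -> bool:
    """Hypothetical filter: reject empty, very long, or digit-containing labels."""
    text = attr.strip()
    return 0 < len(text) <= 30 and not any(ch.isdigit() for ch in text)


def rank_attributes(tables):
    """Rank candidate attributes from (match_score, [labels]) pairs.

    Assumed scoring: each attribute accumulates the match score of every
    table it appears in, so attributes that are frequent and come from
    well-matching tables rank highest.
    """
    scores = defaultdict(float)
    for match_score, attrs in tables:
        # Normalise labels and count each attribute at most once per table.
        for attr in {a.strip().lower() for a in attrs if is_plausible(a)}:
            scores[attr] += match_score
    # Sort by descending score, then alphabetically to break ties.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy candidate tables for the instance "Canada" (invented data).
tables = [
    (0.9, ["Capital", "GDP", "Population"]),
    (0.6, ["capital", "Population", "2010"]),
    (0.2, ["GDP", "Click here to see more details about this"]),
]
ranking = rank_attributes(tables)
for attr, score in ranking:
    print(f"{attr}: {score:.2f}")
```

In this toy input the labels "2010" and the long navigation string are filtered out, and the remaining attributes are ordered by accumulated score. The external evidence from Web search, DBPedia and Wikipedia mentioned in the abstract is not modelled here.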
Index Terms
- Retrieving attributes using web tables